How to avoid downloading the same page with different URLs?
|Mercie||03/03/2006 06:05 am|
|I think I have run into a huge problem. The site I'm downloading changes its URLs every few hours from:
These two are exactly identical pages with just different URLs. How do I stop it from downloading the same pages over and over?
|Oleg Chernavin||03/03/2006 06:11 am|
|What about using URL Substitutes to place all files on the same server?
Uncheck the rule.
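
As a rough illustration of the idea of mapping every rotating host onto one server, here is a minimal Python sketch that works outside the program. It is not the program's URL Substitutes feature. It assumes that only the leftmost host label changes (as in the 324lkj23l.sitetourl.com example from this thread) and that the rest of the URL stays the same; the names BASE_DOMAIN, CANONICAL_HOST and canonicalize are made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: only the leftmost host label rotates; path and query are stable.
BASE_DOMAIN = "sitetourl.com"
# Hypothetical fixed host that every rotating host is mapped onto.
CANONICAL_HOST = "www." + BASE_DOMAIN


def canonicalize(url: str) -> str:
    """Rewrite any host under BASE_DOMAIN to CANONICAL_HOST."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == BASE_DOMAIN or host.endswith("." + BASE_DOMAIN):
        netloc = CANONICAL_HOST
        if parts.port:
            # Keep an explicit port if the original URL had one.
            netloc += f":{parts.port}"
        parts = parts._replace(netloc=netloc)
    # URLs on unrelated hosts are returned unchanged.
    return urlunsplit(parts)
```

With this, two URLs that differ only in the rotating label produce the same string, so files saved under the rewritten name end up in one place.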
|Mercie||03/03/2006 06:38 am|
|That would work, but I forgot to mention that it would also parse the same pages over and over. How would I stop that?
I think it's still parsing the same pages, just under different URLs, again and again.
Example of what I mean is:
It parses http://324lkj23l.sitetourl.com/watever.html
then it also parses
These are identical files, but the URL keeps changing every few hours, meaning it will keep parsing the same pages over and over.
How do I stop it from parsing the same page over and over? My queue list has over 2,000,000 URLs, and I know for sure that the site doesn't have that many.
|Oleg Chernavin||03/03/2006 09:11 am|
|I am afraid there is no solution right now. Sorry.
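
For anyone writing their own downloader, one common way to avoid re-parsing identical pages that arrive under different URLs is to fingerprint the page content rather than the URL. The sketch below is only an illustration of that general technique, not something the program discussed in this thread offers; the class name ParsedPages and its method are hypothetical. It assumes the repeated pages are byte-for-byte identical, which would not hold if the page body itself embeds the changing host name.

```python
import hashlib


class ParsedPages:
    """Remembers a SHA-256 fingerprint of every page body already parsed."""

    def __init__(self):
        self._seen = set()

    def should_parse(self, body: bytes) -> bool:
        """Return True the first time a body is seen, False on repeats."""
        digest = hashlib.sha256(body).hexdigest()
        if digest in self._seen:
            # Same content already handled under another URL: skip it.
            return False
        self._seen.add(digest)
        return True
```

A crawler loop would call should_parse(body) after each download and only extract links when it returns True, so links from a duplicate page are never added to the queue a second time.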