For example, suppose I wanted to prevent OE from spidering any page on www.somedomain.com. As I understand it, I could accomplish this by entering:
http://www.somedomain.com
in the Skip URLs box.
However, couldn't I also accomplish the same thing by creating a Server filter with the keyword "www.somedomain.com"?
Is there any difference between these two methods?
http://www.somedomain.com/
- please pay attention to the ending slash.
Excluded server keywords should contain:
www.somedomain.com
and it will do exactly the same: stop downloading anything on the specified server. The difference is that the Skip URLs box accepts only full URLs to be skipped, while the other URL Filters sections can use keywords (parts of URLs, with wildcards, etc.).
The Skip URLs feature was added to make processing long lists of exclusions really fast while downloading files. Keywords take much more time to process.
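To illustrate why a list of full URLs can be checked faster than a list of keywords, here is a rough Python sketch. It is not how OE is implemented internally; the names skip_urls, server_keywords, is_skipped and matches_keyword are made up for the example, and exact string comparison is only an assumption about how a full-URL list could be checked.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion lists, for illustration only.
skip_urls = {"http://www.somedomain.com/"}   # full URLs, stored in a hash set
server_keywords = ["www.somedomain.com"]     # keywords, may contain wildcards

def is_skipped(url):
    # One hash lookup, no matter how long the list of full URLs is.
    return url in skip_urls

def matches_keyword(url):
    # Every keyword has to be tried against the URL in turn,
    # so the cost grows with the number of keywords.
    return any(fnmatchcase(url, "*" + kw + "*") for kw in server_keywords)
```

With exact matching like this, "http://www.somedomain.com/" and "http://www.somedomain.com" are different strings, which is one reason the ending slash matters in such a list. A keyword, by contrast, matches any URL that contains it.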
Best regards,
Oleg Chernavin
MP Staff
OK, just to clarify:
When you say that it will "stop downloading", does this mean that it will stop spidering the site entirely, or will it continue to spider the site as per the Level setting, but not save the excluded pages?
For example:
* Suppose I enter http://www.somedomain.com/index.htm in the Skip URLs box and suppose that page contains a link to http://www.anotherdomain.com.
* Now, further suppose that OE comes across a page that contains a link to http://www.somedomain.com/index.htm
I realize that it won't download the somedomain.com page. However, will it still parse that page and follow the link to anotherdomain.com, or will it ignore the somedomain.com page entirely and not spider any of its links?
Oleg.