Problem: unwanted downloading of pages from other sites
|anakunda||01/13/2005 08:40 am|
I want to download a whole site that is located in a directory on a server.
I've set up a new project with all content types checked (Text set to match the URL Filters, everything else set to load from any URL) and no level limit. The URL filter is set to starting server only, starting directory and below, all files.
Now the problem:
I've noticed many web pages from another server (yahoo.com) being collected in the queue, so it apparently parsed a page from that server. I'm confused about how those URLs passed the URL Filters settings when I specified the starting server only for text content.
Can I get some advice on how to set up URL Filters properly for such a case?
One more question: is it possible to update the extensions in existing templates from previous versions with the default set from a newer version (i.e., so that new page extensions get used)?
Thanks in advance.
|Oleg Chernavin||01/13/2005 09:03 am|
Please look at each of the File Filters sections of the Project: the Location box should be set to "Load using URL Filters".
Regarding templates - you can adjust them using the File | Templates dialog easily.
|anakunda||01/13/2005 09:34 am|
Thanks for the quick response.
It helps, but I guess this will restrict content of every type to loading from the starting server only. Is that right? I also tried adding all the unwanted domains to the URL filter list, and that worked too.
In my project I would like to disable filtering for the Image, Audio, Video, Archive, User defined, and Other types, so that those files can be downloaded whenever they are referenced from my project site, but keep filtering Text. That way it would not download and parse, for example, the URL http://order.sbs.yahoo.com/ linked from my project server, but it could still download an image from a URL like http://order.sbs.yahoo.com/logo.gif (if my project site points to it). Is there a workaround for that?
|Oleg Chernavin||01/13/2005 10:14 am|
Please try setting the Other File types category to be loaded using URL Filters. I think this will be enough to avoid following outside pages.
|anakunda||01/13/2005 01:41 pm|
Although I don't quite understand why the external page URL did not match the Text category, this workaround worked exactly as needed.
|Oleg Chernavin||01/13/2005 03:49 pm|
Some HTML pages on external sites use non-standard extensions that are not listed in the Text category. This is why the Other category should also be limited to use URL Filters.
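To illustrate the reasoning, here is a minimal Python sketch of how extension-based categories can interact with URL filters. It is only a simplified model under stated assumptions, not the program's actual logic: the extension lists, the function names `categorize` and `should_download`, the starting site `example.com/site/`, and the path `/page.xyz` are all hypothetical.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical extension lists; the real category lists are configurable
# and will differ from these.
TEXT_EXTENSIONS = {".htm", ".html", ".shtml", ".php", ".asp"}
IMAGE_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png"}


def categorize(url):
    """Assign a URL to a file category by its extension alone."""
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    if extension in TEXT_EXTENSIONS:
        return "text"
    if extension in IMAGE_EXTENSIONS:
        return "image"
    # Anything unrecognized, including HTML served under a
    # non-standard extension, lands here.
    return "other"


def should_download(url, start_host, start_dir, filtered_categories):
    """Apply the URL filter only to categories set to use it."""
    if categorize(url) not in filtered_categories:
        return True  # category loads from any location
    parsed = urlparse(url)
    return parsed.hostname == start_host and parsed.path.startswith(start_dir)


external_page = "http://order.sbs.yahoo.com/page.xyz"  # hypothetical path
external_image = "http://order.sbs.yahoo.com/logo.gif"

# Filtering Text only: the external page counts as "other" and gets through.
leaks = should_download(external_page, "example.com", "/site/", {"text"})

# Filtering Text and Other: the page is blocked, the image still loads.
blocked = not should_download(
    external_page, "example.com", "/site/", {"text", "other"}
)
image_ok = should_download(
    external_image, "example.com", "/site/", {"text", "other"}
)
```

In this model, an external page with an unlisted extension never reaches the Text filter at all, which is why restricting the Other category as well closes the gap while images remain unrestricted.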