URL filter questions

10/31/2008 12:14 pm
Hi - two questions:


I'm trying to follow and download HTML pages that (for instance) have 'jane' in the server name OR filename:


I am also excluding some server keywords like 'comments' in http://comments.theserver/*/*/janepage.html and some keyword exclusions in directories and filenames.

What would be great is the functional equivalent of ['jane' and not ('comments' or 'journals')] anywhere in the URL.

Any suggestions?

Question 2:

I've never understood the relationship between included and excluded keywords. For instance, what if the URL contains some excluded keywords AND some included keywords? Does one override the other? And if I'm using both included and excluded keywords, what happens if a URL does not contain any of them? Will such a URL be downloaded or not? Perhaps the answer is not to use both included and excluded at the same time?

Thanks for your help,
Oleg Chernavin
10/31/2008 01:00 pm
You need to use URL Filters - Filename keywords:


(any server with comments in the server name, any directory /*/ and any filename * after the last slash).



(two keywords in the Included keywords list - the last one means any file with jane in filename).

Exclude overrides include. The general rule is the following - if any filter or limit doesn't allow a file to be downloaded, it will be skipped.

Best regards,
Oleg Chernavin
MP Staff
11/02/2008 10:32 pm
Your filter suggestion works perfectly. Thank you.

Just to clarify one last point. If I have zero include filters then all URLs are allowed (subject to exclude filters etc.). If I have one or more include filters then it will only allow URLs that pass the include filter criteria. Is that correct?

Thanks yet again - you are a wonderful resource.
Oleg Chernavin
11/03/2008 03:12 pm
Yes, an empty included list means all are allowed. If you add even one keyword, all links that do not match at least one keyword in the list will not be followed.
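
The rules confirmed in this thread (exclude overrides include, and an empty included list allows everything) can be sketched roughly as below. This is only an illustration, not the program's actual code: the function name `is_followed` is made up, and plain substring matching is assumed in place of the real wildcard keywords and the separate server/directory/filename lists.

```python
def is_followed(url, included, excluded):
    """Hypothetical sketch of the include/exclude decision for one URL."""
    # Exclude overrides include: any excluded keyword rejects the URL.
    if any(keyword in url for keyword in excluded):
        return False
    # An empty included list allows everything that was not excluded.
    if not included:
        return True
    # Otherwise the URL must match at least one included keyword.
    return any(keyword in url for keyword in included)


# Roughly ['jane' and not ('comments' or 'journals')] anywhere in the URL.
included = ["jane"]
excluded = ["comments", "journals"]
print(is_followed("http://www.theserver/a/b/janepage.html", included, excluded))
print(is_followed("http://comments.theserver/a/b/janepage.html", included, excluded))
```

Checking the excluded list first is what makes a URL with both kinds of keywords get skipped.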

11/08/2014 12:18 pm
Hi there,
I want to filter all URLs that contain "type=5", like http://www.example.com/Default.aspx?type=5&year=1393&month=8&day=17
I added the *type=15* filter to URL Filters -> Filenames, but it doesn't work.
Why? Please help me.
Oleg Chernavin
11/08/2014 12:18 pm
What if you would use the keyword:


(type=15 looks like a mistake to me).
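
To see why the typo matters, here is a small check using Python's `fnmatch` module as a stand-in for wildcard matching. It is an assumption that the program's keywords behave like these shell-style patterns, and `*type=5*` below is simply the poster's own filter with the typo corrected, used for illustration.

```python
from fnmatch import fnmatchcase

url = "http://www.example.com/Default.aspx?type=5&year=1393&month=8&day=17"


def matches(url, pattern):
    # In fnmatch, '*' matches any run of characters, including '/', '?' and '&'.
    return fnmatchcase(url, pattern)


print(matches(url, "*type=15*"))  # the filter as posted
print(matches(url, "*type=5*"))   # the same filter with the typo corrected
```

Note that under this kind of matching a pattern like `*type=5*` would also match `type=50` or `type=51`, so a tighter pattern may be needed if those values exist on the site.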