I'm trying to download a website and skip (don't save) the files that contain these two strings:
<title> 404
<TITLE>302 File moved</TITLE>
I've set the "Content Filters" dialogue in Project Properties like this:
-- Text keywords
Keywords: "<title> 404" "<TITLE>302 File moved</TITLE>"
Search for all keywords: checked
Search inside HTML tags: checked
-- When keywords are not found in the page
Save these pages: checked
All the other checkboxes are left unchecked.
My assumption is that it works like this:
1) Page is downloaded
2) Parser searches for "<title> 404" and "<TITLE>302 File moved</TITLE>"
3) If none of these is found, page is saved, otherwise it's discarded
However, even with these settings, the pages containing "<TITLE>302 File moved</TITLE>" are still saved, so I guess it works in a different way. Can you please help me with finding out where my settings are wrong?
You need to uncheck the "Search for all keywords" box. When it is checked, Offline Explorer requires both of these keywords to be present in a single web page, and as I understand it, only one of these strings can appear in any given page.
So, it is either 302 or 404.
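To make the difference concrete, here is a rough Python model of the behavior described above. It is only a sketch, not Offline Explorer's actual code: the names `keywords_found` and `is_saved`, the sample pages, and the plain case-sensitive substring matching are all assumptions made for illustration.

```python
# Hypothetical model of the keyword filter; not Offline Explorer's real implementation.
KEYWORDS = ["<title> 404", "<TITLE>302 File moved</TITLE>"]


def keywords_found(page: str, search_all: bool) -> bool:
    """Return True if the page counts as "keywords found".

    search_all=True models "Search for all keywords" checked (logical AND);
    search_all=False models it unchecked (logical OR).
    Assumes plain, case-sensitive substring matching.
    """
    hits = [kw in page for kw in KEYWORDS]
    return all(hits) if search_all else any(hits)


def is_saved(page: str, search_all: bool) -> bool:
    # Only "Save these pages" under "When keywords are not found" is checked,
    # so a page is saved exactly when the keywords count as not found.
    return not keywords_found(page, search_all)


# Made-up sample pages for illustration.
moved_page = "<HTML><HEAD><TITLE>302 File moved</TITLE></HEAD></HTML>"
normal_page = "<html><head><title>Welcome</title></head></html>"
```

In this model, `is_saved(moved_page, search_all=True)` is true: only one of the two keywords is present, so the page counts as "keywords not found" and is saved. With `search_all=False` the same page counts as "found" and is discarded, while `normal_page` is saved either way.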
Would this work?
However, I find this dialogue (especially the "Search for all keywords" option) counter-intuitive from a user's standpoint. I'll try to explain.
Let's say I have two strings that I want to filter out. So I put them in the "Keywords" field and then check "Search for all keywords", because that's what I want to do - search for all of them and only do the action if none is found. Then I go to "When keywords are not found in a page" and check "Save these pages".
So in my thinking, I checked the options that mean "search for all keywords and when none of them is found, save page". However, it doesn't work that way, which might be slightly confusing.
I think it would be way more intuitive if there were two separate checkboxes - "Only apply when all keywords are found" in "When keywords are found in a page" and "Only apply when none of the keywords are found" in "When keywords are not found in a page".
Or, even better, radio buttons to switch between logical AND and logical OR - so it would be possible to choose between "Only when all keywords are found" and "When at least one keyword is found" in "When keywords are found in a page", and between "Only when none of the keywords are found" and "When at least one keyword is not found" in "When keywords are not found in a page".
That way, it would be easy to understand what the filter logic is going to do and how it will apply the rules.
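Here's roughly what I mean, written out as a small Python sketch. The enum and function names are made up for illustration, and plain case-sensitive substring matching is assumed; this is just the logic I have in mind, not how Offline Explorer is implemented.

```python
from enum import Enum


class FoundMode(Enum):
    # Proposed radio options for "When keywords are found in a page"
    ALL_FOUND = "Only when all keywords are found"
    ANY_FOUND = "When at least one keyword is found"


class NotFoundMode(Enum):
    # Proposed radio options for "When keywords are not found in a page"
    NONE_FOUND = "Only when none of the keywords are found"
    ANY_MISSING = "When at least one keyword is not found"


def found_rule_applies(page: str, keywords: list[str], mode: FoundMode) -> bool:
    """Decide whether the "keywords found" actions apply to this page."""
    hits = [kw in page for kw in keywords]
    return all(hits) if mode is FoundMode.ALL_FOUND else any(hits)


def not_found_rule_applies(page: str, keywords: list[str], mode: NotFoundMode) -> bool:
    """Decide whether the "keywords not found" actions apply to this page."""
    hits = [kw in page for kw in keywords]
    if mode is NotFoundMode.NONE_FOUND:
        return not any(hits)
    return not all(hits)
```

For my case I would pick "Only when none of the keywords are found" together with "Save these pages", so a page containing just the 302 title would not be saved.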
Just a suggestion :)
Thanks again for helping me out.
I added these options. Can you please take a look at the updated version:
Please let me know if it is OK or if anything should be improved or fixed. Thank you!