How can I download all "front pages" found by Google?

Author Message
Andrzej 06/08/2004 01:06 pm
Hi
Is it possible to download all front pages found by Google? I enter a search term, e.g.:

http://www.google.com/search?q=metaproducts&ie=UTF-8&hl=pl&lr=

and I would like to download all the pages that are found, but only the "front pages", i.e. only the pages linked directly from Google (not the whole site), e.g.:

http://www.metaproducts.com/MD.html

and not the pages linked from that page. Is it possible? I thought so, and my settings were:
Project: Level Limit: unchecked
URL Filters: Server:
Load files only within the starting Server
Load up to 1 links on other servers


Is something like this possible in Offline Explorer Pro? If yes, what's wrong with my settings?

Thanks.
Andrzej 06/08/2004 02:22 pm
OK, I hope I have found the mistake in my config: I had unchecked "Other" in the file filters. When I check it again, it seems to work OK. Sorry for wasting your web space ;)
Andrzej 06/08/2004 06:08 pm
Oh, no good, now it downloads too many pages :(
Where is my mistake? I only want:
a) all result pages from Google
b) all pages found by Google (only those referred to directly by Google, one document each, not the whole site)
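
To make the rule in a) and b) concrete, here is a minimal Python sketch of the decision for a single link. It only illustrates the logic and is not how Offline Explorer Pro works internally; the function name `plan_link` and the three labels are made up for this example.

```python
from urllib.parse import urlsplit


def plan_link(parent_url: str, link_url: str, start_host: str) -> str:
    """Decide what to do with link_url, which was found on parent_url.

    Returns one of three hypothetical labels:
      "follow"     - load the page and also parse the links on it
      "fetch_only" - load the page, but do not follow its links
      "skip"       - do not load the page at all
    """
    parent_host = urlsplit(parent_url).hostname
    link_host = urlsplit(link_url).hostname

    if link_host == start_host:
        # a) another result page on the starting server
        return "follow"
    if parent_host == start_host:
        # b) a page referred to directly by a result page: one document only
        return "fetch_only"
    # Anything reached from an external page is one level too deep.
    return "skip"
```

With `start_host` set to `www.google.com`, a link to `http://www.metaproducts.com/MD.html` found on the search page is fetched once, while links found on that page are skipped. Links that stay on the starting server but are unwanted, such as "more from this site", would still need a separate exclusion.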
Andrzej 06/08/2004 07:30 pm
OK, I hope the problem is finally resolved (I forgot to exclude the "more from this site" link from the download...)

BTW: can somebody tell me whether the main "URL Filters" section (with the description "Disable URLs from downloading") accepts wildcards ("*", "?") or not? Thx.
Oleg Chernavin 06/09/2004 09:12 am
I hope that your Project loads exactly what you need now. If not, please let me know which unwanted links it follows.

Regarding Disabled URLs - no. It doesn't accept any wildcards. The reason is simple - I added this section for high performance only. Processing keywords takes quite a lot of time, and the Disabled URLs section may include thousands of links. To use keywords, wildcards, etc., please use the URL Filters | Filename section - Custom Configuration.
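
The performance difference can be illustrated with a small Python sketch. It is a general illustration only, not Offline Explorer's actual code; the function names and the example.com URLs are hypothetical. An exact-match list can be kept in a hash set, so one lookup costs about the same whether the list holds ten links or thousands, while wildcard rules have to be tried one by one against every URL.

```python
from fnmatch import fnmatchcase


def is_disabled_exact(url: str, disabled_urls: set[str]) -> bool:
    # Exact match only: a single hash lookup, regardless of list size.
    return url in disabled_urls


def matches_any_pattern(url: str, patterns: list[str]) -> bool:
    # Wildcard match: every pattern may have to be tested against the URL,
    # so the cost grows with the number of patterns.
    return any(fnmatchcase(url, pattern) for pattern in patterns)


# Hypothetical data for illustration.
disabled = {"http://www.example.com/a.html", "http://www.example.com/b.html"}
rules = ["http://www.example.com/private/*", "*.exe"]

print(is_disabled_exact("http://www.example.com/a.html", disabled))
print(matches_any_pattern("http://www.example.com/private/x.html", rules))
```

Note that in `fnmatchcase` patterns the characters `*`, `?` and `[` are wildcards, so a literal `?` in a URL query string would have to be written as `[?]` in a pattern.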

Best regards,
Oleg Chernavin
MP Staff