My organization is trying to download opinion/editorials from the Washington Post such as this:
http://www.washingtonpost.com/ac2/wp-dyn?pagename=article&node=&contentId=A33312-2003Dec26&notFound=true
This is one level down from a Yahoo! Opinion & Editorial archive with a link like this:
http://us.rd.yahoo.com/dailynews/fc/World/mideast_conflict/opinion___editorials/SIG=1253nf48s/*http://www.washingtonpost.com/wp-dyn/articles/A33312-2003Dec26.html
As you can see, the Washington Post's web server converts the Yahoo! link to a dynamically generated page (I think), which doesn't seem to get followed by OE 2.9. My map for www.washingtonpost.com looks like this:
[ac2] (empty)
+[wp-adv] (advertisement stuff)
- [wp-dyn]
-[articles]
A10067-2002Oct10.html
A11135-2002Apr7.html
...
...
A9254-2002Jun6.html
+[opinion]
+[wp-srv]
As you can see, [ac2] never gets populated, even though that is ultimately the folder on the Washington Post's server where the HTML file resides.
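For reference, the Yahoo! link quoted earlier in this post appears to carry the real article address after a `/*` separator. The following is only a minimal sketch in Python of pulling that target out, assuming the separator convention holds for other links in the archive; the function name `extract_target` is hypothetical and is not part of OE or of any Yahoo! API.

```python
# Sketch only: assumes the redirect link embeds the target URL after "/*".
YAHOO_LINK = (
    "http://us.rd.yahoo.com/dailynews/fc/World/mideast_conflict/"
    "opinion___editorials/SIG=1253nf48s/*"
    "http://www.washingtonpost.com/wp-dyn/articles/A33312-2003Dec26.html"
)


def extract_target(redirect_url):
    """Return the URL embedded after the first "/*", or None if absent."""
    _head, separator, tail = redirect_url.partition("/*")
    # Only treat the tail as a target if it looks like an absolute URL.
    if separator and tail.startswith("http"):
        return tail
    return None


if __name__ == "__main__":
    print(extract_target(YAHOO_LINK))
```

This only recovers the static `/wp-dyn/articles/` address; the further hop to the `/ac2/wp-dyn?...` page happens on the Washington Post's server and cannot be derived from the link text alone.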
Any help is much appreciated.
Regards,
Marc
I followed the second link, but Yahoo told me that there is no such page. Can you tell me a link to all Yahoo.com Opinions and Editorials?
Best regards,
Oleg Chernavin
MP Staff
http://story.news.yahoo.com/fc?tmpl=fc&cid=34&lp=1&ll=b1&pg=1&mod=opinion___editorials&in=world&cat=mideast_conflict_archive
and click on one of the Washington Post links.
> Marc,
>
> I followed the second link, but Yahoo told me that there is no such page. Can you tell me a link to all Yahoo.com Opinions and Editorials?
>
> Best regards,
> Oleg Chernavin
> MP Staff
I would also suggest using URL Filters | Filename | Custom configuration to add two keywords to the Included filename keywords list:
http://*yahoo.com/*www.washingtonpost.com*/*
http://www.washingtonpost.com/*/*
This will restrict the download to exactly the pages you want.
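To see which addresses the two keywords would let through, here is a rough sketch in Python. It assumes `*` matches any run of characters, including `/`, which is how Python's `fnmatch` module treats it; the program's own matching rules may differ, and the helper name `is_included` is made up for this illustration.

```python
from fnmatch import fnmatchcase

# The two keywords suggested above, treated as shell-style wildcards.
INCLUDE_PATTERNS = [
    "http://*yahoo.com/*www.washingtonpost.com*/*",
    "http://www.washingtonpost.com/*/*",
]


def is_included(url):
    """Return True if the URL matches at least one include pattern."""
    return any(fnmatchcase(url, pattern) for pattern in INCLUDE_PATTERNS)


if __name__ == "__main__":
    samples = [
        # Yahoo! redirect link pointing at a Washington Post article.
        "http://us.rd.yahoo.com/dailynews/fc/World/mideast_conflict/"
        "opinion___editorials/SIG=1253nf48s/*"
        "http://www.washingtonpost.com/wp-dyn/articles/A33312-2003Dec26.html",
        # Static article address on the Washington Post server.
        "http://www.washingtonpost.com/wp-dyn/articles/A33312-2003Dec26.html",
        # Top-level page with no folder below the host name.
        "http://www.washingtonpost.com/index.html",
    ]
    for url in samples:
        print(is_included(url), url)
```

Under these assumptions the second keyword accepts anything at least one folder below the host name, so it covers both the `/wp-dyn/articles/` pages and the `/ac2/` pages.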
Oleg.