Downloading posts in a forum

Howard
04/08/2010 04:40 am

Hello Oleg and community,

I have used OE for 10 years and have not requested technical support for a long time. It has been a very versatile and indispensable tool for me.

For the first time in a long time, I am having difficulty solving an issue and so I am seeking technical help from you.


I will be very appreciative of your help.

I am trying to download the archives of a forum in politics.

The address is
http://sandiego.craigslist.org/forums/?forumID=20

I do not want to download the complete posts. I only want to download each of the pages that list the daily posts.

They are numbered 1, 2, 3, 4, etc. at the top of the page at "http://sandiego.craigslist.org/forums/?forumID=20"

The URL for each page appears in the status bar of the browser.
Page 2 for example is:
http://sandiego.craigslist.org/forums/?act=DF&forumID=20&node=&batch=2402458&old=yes

Page 11 is:
http://sandiego.craigslist.org/forums/?act=DF&forumID=20&node=10&batch=2402188&old=yes

The part of the URL that is always the same is
"http://sandiego.craigslist.org/forums/?act=DF&forumID=20&node="

After that comes a number that starts at 10 and increases in steps of 10: 20, 30, 40, 50, etc.

The part of the URL that increases by 10 is:
"10&batch", "20&batch","30&batch","40&batch","50&batch" etc.

Each batch has 10 pages which contain lists of postings for 10 days.

The part of the URL that identifies a particular day comes after
"http://sandiego.craigslist.org/forums/?act=DF&forumID=20&node=10&batch="

and before

"&old=yes"

Can you help me with the correct macro or regular expressions I need to use to download the archive of every day of posts?
I only want to download the daily lists and not the individual posts.
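
For anyone who wants to check the URL pattern described above, here is a minimal Python sketch that splits the two example page URLs into their query parameters. The function name describe_page_url is hypothetical and only for illustration; it relies solely on the standard urllib.parse module and has nothing to do with OE itself.

```python
from urllib.parse import urlsplit, parse_qs

def describe_page_url(url):
    """Return the forumID, node and batch values found in a list-page URL."""
    query = urlsplit(url).query
    # keep_blank_values=True keeps "node=" (empty value), as seen on page 2
    params = parse_qs(query, keep_blank_values=True)
    return {
        "forumID": params.get("forumID", [""])[0],
        "node": params.get("node", [""])[0],
        "batch": params.get("batch", [""])[0],
    }

# The two example URLs quoted in the post above
page2 = ("http://sandiego.craigslist.org/forums/"
         "?act=DF&forumID=20&node=&batch=2402458&old=yes")
page11 = ("http://sandiego.craigslist.org/forums/"
          "?act=DF&forumID=20&node=10&batch=2402188&old=yes")

print(describe_page_url(page2))
print(describe_page_url(page11))
```

In both URLs forumID stays at 20, while node and batch change from page to page, which matches the description above.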

Thank you in advance and I look forward to hearing back from you when you are able to find the time for me.

Oleg Chernavin
04/08/2010 08:58 am
I think the easiest way would be to create a Project with the URL:

http://sandiego.craigslist.org/forums/?act=DF&forumID=20
Level=100

URL Filters - Filename - add to the Included list:

forumID=20&node=

This should be enough to do the download.
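
To illustrate why that filter text separates the list pages from everything else, here is a rough Python sketch. It assumes the Included list behaves like a plain substring test, which is a simplification and not a description of how the program works internally. The third URL is a made-up example of an individual post link, and passes_filter is a hypothetical helper name.

```python
# Filter text from the Included list above
INCLUDE_SUBSTRING = "forumID=20&node="

def passes_filter(url, include=INCLUDE_SUBSTRING):
    """Assumed behaviour: keep a URL only if it contains the filter text."""
    return include in url

candidates = [
    # Page 2 and page 11, as quoted in the question
    "http://sandiego.craigslist.org/forums/?act=DF&forumID=20&node=&batch=2402458&old=yes",
    "http://sandiego.craigslist.org/forums/?act=DF&forumID=20&node=10&batch=2402188&old=yes",
    # Hypothetical individual-post URL, invented for this example
    "http://sandiego.craigslist.org/forums/?act=Q&ID=12345",
]

kept = [url for url in candidates if passes_filter(url)]
for url in kept:
    print(url)
```

Under that assumption only the two list-page URLs are kept, and the individual post link is skipped.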

Best regards,
Oleg Chernavin
MP Staff