How to follow linked pages?

Author Message
ssieloff 03/29/2006 07:01 am
I am trying to automatically download all the pages (1 - 19) and the detailed links contained on each page relating to the business listed -- but not all the other links on each page.

Any ideas? I just want to end up treating the 19 pages as 1 continuous HTML page and then download only the business detail links ignoring all other links on each page.

Here is page 1 of the site:
http://www.palmcity.org/memberlisting.asp

Thanks for any assistance the forum can provide!

Steve
Alexander 03/29/2006 07:01 am
Dear Steve,

Thank you for writing us.
Please use the following link for your Project:

http://www.palmcity.org/memberlisting.asp?offset={:0..360|20}

Also, please add the "memberlisting.asp" keyword to Project properties | URL filters | Filename section, Custom filename configuration | View included files keyword. This avoids downloading unnecessary files.

Sincerely,
Alexander.
| Alexander Bednyakov
| Senior Developer
| MetaProducts Corporation
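
For readers who want to see which addresses the {:0..360|20} range stands for, here is a minimal Python sketch. It is not part of the tool; it assumes the range includes both endpoints and steps by 20, which matches the 19 pages mentioned in the question. The helper name expand_offsets is hypothetical.

```python
def expand_offsets(base_url, start, stop, step):
    """Return one URL per offset value, with the stop value included.

    Assumption: the {:start..stop|step} range is inclusive at both ends.
    """
    return [f"{base_url}?offset={n}" for n in range(start, stop + 1, step)]


# Offsets 0, 20, 40, ... 360: one URL per listing page.
urls = expand_offsets("http://www.palmcity.org/memberlisting.asp", 0, 360, 20)

for url in urls:
    print(url)
```

Under that assumption the list holds 19 URLs, one for each listing page.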
ssieloff 03/29/2006 07:01 am
Alexander -

A thousand thanks -- I think I'm getting the hang of the tool!

One last question -- the following link is a POST for a license query. I need to vary the query over the terms aa% through zz% to get around the limit of 1000 returned records per query. Can you show me how to vary the search query over an alpha range beginning at p_fname=aa% and ending with p_fname=zz%? (aa%, ab%, ac% ... zx%, zy%, zz%).

Also, how do I set up the referrer and cookie? I tried the standard syntax shown in the manual and could not get it to work.

http://www.dora.state.co.us/pls/real/ARMS_Individual.Search_Results
POST=p_board=&p_license_number=&p_lname=&p_fname=aa%25&p_bus_city=&p_board_prefix=
Referer=http://www.dora.state.co.us/pls/real/ARMS_Search.Set_Up
SetCookie=SITESERVER=ID=1518aec01b60ba33419022ddf7de3306; SITESERVER=ID=1518aec01b60ba33419022ddf7de3306

Thanks again for your assistance in these questions!

Steve

ssieloff 03/29/2006 07:01 am
Alexander --

Can I do the following to solve the dual alpha range?

p_fname={:a..z}{:a..z}%

I am hoping it will iterate the 2nd {} from a to z for the initial {a}, then change the initial {} to {b} and loop the 2nd {} from a to z, etc.

Still unsure on how to best handle the setcookie and referrer issues!

Thanks,

Steve

Oleg Chernavin 03/29/2006 07:01 am
Hello,

Yes, p_fname={:a..z}{:a..z}% will iterate through all combinations: aa, ab, ac ... zy, zz.
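
As an illustration only, the same iteration can be sketched outside the tool in Python. The sketch assumes the second letter varies fastest, as described in the question, and that the % sign is sent percent-encoded as %25, as in the POST line quoted earlier. The names FIELDS and post_bodies are hypothetical.

```python
from itertools import product
from string import ascii_lowercase
from urllib.parse import urlencode

# Form fields in the order they appear in the POST line from the question.
FIELDS = [
    "p_board",
    "p_license_number",
    "p_lname",
    "p_fname",
    "p_bus_city",
    "p_board_prefix",
]


def post_bodies():
    """Yield one URL-encoded POST body per two-letter prefix, aa% to zz%."""
    # product() varies the rightmost letter fastest: aa, ab, ... az, ba, ...
    for first, second in product(ascii_lowercase, repeat=2):
        form = {name: "" for name in FIELDS}
        form["p_fname"] = f"{first}{second}%"
        yield urlencode(form)  # urlencode turns "%" into "%25"


bodies = list(post_bodies())
print(len(bodies))
print(bodies[0])
print(bodies[-1])
```

Two letters give 26 x 26 = 676 queries in total.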

Regarding Referer and Cookies: try leaving them out. If the server does not accept that, the best way is to get fresh values from the server by adding a new Project via the Internal browser (Ctrl-Alt when clicking on a link).
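
The idea of getting fresh values instead of pasting an old cookie can be sketched in plain Python. This is a standalone illustration under assumptions, not how the tool works internally: it assumes the server issues its cookie when the search form page is opened, and the function names build_search_request and fetch_results are hypothetical.

```python
import urllib.request
from http.cookiejar import CookieJar
from urllib.parse import urlencode

# Both URLs are taken from the question above.
SETUP_URL = "http://www.dora.state.co.us/pls/real/ARMS_Search.Set_Up"
RESULTS_URL = "http://www.dora.state.co.us/pls/real/ARMS_Individual.Search_Results"


def build_search_request(first_name_pattern):
    """Build the POST request with a Referer header; nothing is sent yet."""
    form = {
        "p_board": "",
        "p_license_number": "",
        "p_lname": "",
        "p_fname": first_name_pattern,
        "p_bus_city": "",
        "p_board_prefix": "",
    }
    data = urlencode(form).encode("ascii")
    # Supplying data makes urllib send the request as a POST.
    return urllib.request.Request(RESULTS_URL, data=data, headers={"Referer": SETUP_URL})


def fetch_results(first_name_pattern):
    """Open the form page first so the cookie jar holds a fresh cookie, then search."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    # Assumption: the server sets its session cookie on this first request.
    opener.open(SETUP_URL).close()
    with opener.open(build_search_request(first_name_pattern)) as response:
        return response.read()


request = build_search_request("aa%")
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

Only build_search_request runs here; fetch_results needs network access and a server that still answers at these addresses.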

Best regards,
Oleg Chernavin
MetaProducts corp.