Following links question

ssieloff
03/29/2006 07:01 am
Hello -- your help is tremendous and I would appreciate your help on this download.

I want to download an entire site's database of notaries public. A generic name search of "%%%" exposes the entire database 10 records at a time. Unfortunately, the "Next 10" link has the last individual's name key embedded in it as a positional anchor, which the server uses to display the next 10 records from the search result set.

Here is the initial search screen query:

http://www.sec.state.la.us/cgibin/?rqstyp=NTRA&rqsdta=%25%25%25

And the result page looks like this:

Louisiana Secretary of State
Notary Name Search Results
J0HN P. RESTOVICH , Non-Attorney,CADDO , Commissioned: 04/30/1979

ROBERT J. AALBERTS , Unknown,CADDO , Commissioned: 02/04/1985

ANNETTE D. AARON , Non-Attorney,RAPIDES , Commissioned: 08/16/2000

BELINDA H. AARON , Non-Attorney,EAST BATON ROUGE , Commissioned: 05/12/1995

DEBORAH D. AARON , Unknown,RAPIDES , Commissioned: 05/15/1987

DON AARON, JR. , Attorney,ACADIA , Commissioned: 05/18/1959

DONALD J. AARON , Unknown,ACADIA , Commissioned: 01/26/1940

H. MICHAEL AARON , Attorney,ASCENSION , Commissioned: 12/16/1976

JULIUS G. AARON , Non-Attorney,NATCHITOCHES , Commissioned: 04/26/1983

JULIUS G. AARON , Non-Attorney,CADDO , Commissioned: 02/09/1998



Previous 10 | Next 10 | New Search | Index

You can see the "Next 10" link is as follows:

http://www.sec.state.la.us/cgibin?rqstyp=ntranxt&rqsdta=AARON+JULIUS+G+++++++++++++

Each page's "Next 10" link looks similar, but the rqsdta= parameter is the last name key from each group of 10. I can only assume they are using this key as a row pointer into the master result table to get the next group of records.
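To illustrate what the key looks like once decoded, here is a small Python sketch using only the standard library. It takes the "Next 10" link quoted above and splits out its two parameters; the variable names are mine, and the reading of rqsdta as a padded name key is only my assumption about how the server works.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The "Next 10" link exactly as it appears on the first result page.
next_link = ("http://www.sec.state.la.us/cgibin"
             "?rqstyp=ntranxt&rqsdta=AARON+JULIUS+G+++++++++++++")

# Split the query string into its parameters; "+" decodes to a space.
params = parse_qs(urlsplit(next_link).query)
request_type = params["rqstyp"][0]
name_key = params["rqsdta"][0]  # space-padded key (assumed: last name shown)

print(request_type)              # ntranxt
print(repr(name_key.rstrip()))   # 'AARON JULIUS G'

# Encoding the decoded values again reproduces the original query string.
rebuilt = urlencode({"rqstyp": request_type, "rqsdta": name_key})
print(rebuilt == urlsplit(next_link).query)  # True
```

Since the key changes on every page, the link cannot be generated ahead of time; each page has to be loaded to discover the next one.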

How can I automate the traversing of these links using OE Pro so I can extract the links on all pages in a single process?

Thanks!

Steve
Oleg Chernavin
03/29/2006 07:01 am
This should be easy - create a new Project with an unlimited Level and the following starting URL:

http://www.sec.state.la.us/cgibin/?rqstyp=NTRA&rqsdta=%%%

Then go to URL Filters | Server and select Load from the starting server. In URL Filters | Directory, select Load from all directories. In URL Filters | Filename, select Custom configuration and add the following to the Included keywords list:

cgibin

Click the OK button, and you can start loading the Project.
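For anyone who would rather script the same idea, the settings above amount roughly to "follow every link, to any depth, as long as it stays on the starting server and its URL contains cgibin". Below is a minimal Python sketch of that logic. It is not how Offline Explorer works internally; the names LinkCollector, crawl and fetch are made up for the illustration, and fetch stands for whatever function you use to download a page as text.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def crawl(start_url, fetch, keyword="cgibin"):
    """Follow links to unlimited depth from start_url.

    fetch is a hypothetical callable: it takes a URL and returns the
    page's HTML as a string. Returns a dict mapping URL -> HTML.
    """
    host = urlsplit(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.hrefs:
            link = urljoin(url, href)  # resolve relative links
            if urlsplit(link).netloc != host:
                continue  # stay on the starting server
            if keyword not in link:
                continue  # keep only URLs containing the keyword
            if link not in seen:  # never load the same URL twice
                seen.add(link)
                queue.append(link)
    return pages
```

The seen set is what keeps the "Previous 10" links from sending the crawl in circles: every results page is loaded once, and the walk ends when a page offers no new "Next 10" URL.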

I hope this helps.

Best regards,
Oleg Chernavin
MetaProducts corp.