Problem downloading only part of the forum
|wai||11/09/2008 08:44 pm|
|Hi, I have a question about whether OE can do the following.
I would like to download only a certain part of the forum, and any threads within this particular part of the forum.
This is the URL of the forum: www.ck101.com (this site is in Chinese, so I am not sure if you can navigate it properly).
The way this forum is structured, everything is stored in the root folder, and everything has a fid and a tid.
The section of the forum I am interested in is
http://www.ck101.com/forums/forum-237-1.html, where the final -1 means page 1 of this section; each page lists the threads within the section. The section has 72 pages, so http://www.ck101.com/forums/forum-237-1.html to http://www.ck101.com/forums/forum-237-72.html, all 72 pages, is basically all I am looking for. I would like to have all the threads (including all the pages inside the threads) downloaded.
The problem is that the actual thread URLs look like this:
http://www.ck101.com/forums/thread-1327170-1-1.html ( page 1 of a thread)
http://www.ck101.com/forums/viewthread.php?tid=1327170&extra=&page=2 ( page 2 of the same thread)
Is there a way to limit OE so that it will only download, say, all pages matching forum-237-*.html, all the direct links from those pages, and their sub-pages (i.e. all pages containing 1327170)?
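To make the rule above concrete, here is a minimal Python sketch of such a filter. It is not an OE feature or setting, only an illustration of the matching logic. The function names `thread_id` and `is_wanted` are made up for this example, and the patterns are inferred solely from the example URLs in this post, so the real site may use other variants.

```python
import re
from urllib.parse import urlparse, parse_qs

# Index pages of the wanted section: forum-237-<page>.html
INDEX_RE = re.compile(r"/forums/forum-237-(\d+)\.html$")
# Static thread form: thread-<tid>-<n>-<n>.html
# (only the first number is treated as the thread ID; the meaning of the
# trailing numbers is an assumption and is not relied on here)
THREAD_RE = re.compile(r"/forums/thread-(\d+)-\d+-\d+\.html$")


def thread_id(url):
    """Return the thread ID (tid) found in a thread URL, or None."""
    parts = urlparse(url)
    match = THREAD_RE.search(parts.path)
    if match:
        return match.group(1)
    # Dynamic form: viewthread.php?tid=<tid>&extra=&page=<n>
    if parts.path.endswith("/forums/viewthread.php"):
        tids = parse_qs(parts.query).get("tid")
        if tids:
            return tids[0]
    return None


def is_wanted(url, wanted_tids):
    """True for section 237 index pages and for pages of wanted threads."""
    if INDEX_RE.search(urlparse(url).path):
        return True
    return thread_id(url) in wanted_tids
```

Both URL forms of the same thread resolve to the same ID, so a single set of thread IDs is enough to keep page 1 and the later pages together while rejecting links to other sections.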
The reason is that I currently have a problem with it linking back to the main page and then going to all other parts of the forum. I can't really limit it to the part I want, since all pages are stored on the same level of the domain, www.ck101.com/forums/
After reading some posts here in the forum, I figured that data mining the thread numbers from all the forum-237-*.html pages, and then downloading only the pages with those numbers, might work. But after reading a bunch of help files and so on, I couldn't figure this out myself.
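The data mining idea can be sketched outside OE in a few lines of Python. This is only an illustration under assumptions: `index_urls`, `extract_thread_ids` and `thread_page_url` are hypothetical helper names, the link patterns are taken from the two example thread URLs above, and the sketch does not fetch anything itself. It only builds the list of index pages and pulls thread IDs out of HTML that has already been downloaded.

```python
import re

BASE = "http://www.ck101.com/forums/"

# Matches both link forms seen in the example URLs and captures the tid.
TID_RE = re.compile(
    r"thread-(\d+)-\d+-\d+\.html"
    r"|viewthread\.php\?[^\"'\s>]*\btid=(\d+)"
)


def index_urls(fid=237, pages=72):
    """List the index pages forum-<fid>-1.html .. forum-<fid>-<pages>.html."""
    return [f"{BASE}forum-{fid}-{n}.html" for n in range(1, pages + 1)]


def extract_thread_ids(html):
    """Return the unique thread IDs linked from one index page, in order."""
    ids = []
    for match in TID_RE.finditer(html):
        tid = match.group(1) or match.group(2)
        if tid not in ids:
            ids.append(tid)
    return ids


def thread_page_url(tid, page):
    """Build a thread page URL in the viewthread.php form shown above."""
    return f"{BASE}viewthread.php?tid={tid}&extra=&page={page}"
```

The number of pages in each thread is not known in advance, so it would have to be read from the thread's first page; that step is left out here.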
Hope you can help.
|Oleg Chernavin||11/10/2008 06:26 am|
|I think you need to make a Project with the following URL:
and Level=1 - this will load all pages of the thread and all pages that are linked to them.