Does not download links in a page

Author Message
Jam Bull 10/30/2005 08:09 am
I'm trying to download a book from a free site:
http://www.hti.umich.edu/cgi/t/text/text-idx?c=philamer;iel=1;view=toc;idno=ars2519.0001.001

This page has a table of contents. I used the default project settings, yet the pages of the book were not saved. What's the problem? What settings should be used?
Jam Bull 10/30/2005 08:19 am
Here are the project settings I used:

[Object]
OEVersion=Enterprise 3.8.0.2048
Type=0
IID=7069
Caption=http://www.hti.umich.edu/cgi/t/text/text-idx?c=philamer;iel=1;view=toc;idno=afu8696.0001.001
URL=http://www.hti.umich.edu/cgi/t/text/text-idx?c=philamer;iel=1;view=toc;idno=afu8696.0001.001
SingleURL=http://www.hti.umich.edu/cgi/t/text/text-idx?c=philamer;cc=philamer;rgn=full text;page=viewtextnote;idno=AFU8696.0001.001
Lev=2
Weekday=257
LimTSize=10000
LimNumber=5000
LimTime=100
FTText.Exts=htmlhtmaspjspstmstmlidcshtmlhtxtxttextxspxmlrxmlcfm
FTImages.Exts=gifjpgjpegtiftiffxbmfifbmppngipxjp2j2cj2k xoooooooooooo
FTVideo.Exts=mpgavianimpegmovfliflcvivrmrvasfasxwmvm1vm2vvob
FTAudio.Exts=wavriffmp3midmp2m3uravocwma
FTArchive.Exts=ziparcgzzarjlhalayleirarcabtarpakacejar
FTUDef.Exts=jscssssivbsdtd
FTText.B=xoxooo
FTImages.B=ooxooo
FTVideo.B=xoxooo
FTAudio.B=xoxooo
FTArchive.B=xoxooo
FTUDef.B=xoxooo
FTOther.B=xoxooo
FTSizes=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0
RSrvsBx=1
RProt=63
LastStart=254:216:174:180:251:223:226:64:
LastEnd=70:87:213:219:251:223:226:64:
S200=382
S304=3
S400=7
SPar=315
SSav=68
SLast=200
SSiz=652995
SMdf=68
LFiles=392
LSize=3094337
ImgDim=0,0,0,0
PrevURL=http://www.hti.umich.edu/cgi/t/text/text-idx?c=philamer;iel=1;view=toc;idno=afu8696.0001.001
Oleg Chernavin 10/30/2005 12:29 pm
This is simple - please open the Project Properties dialog and check the File Filters | Text, User Defined and Other categories. Then go to URL Filters | Directory and select "Load from the starting directory and below". Click the OK button and download the Project again.

Best regards,
Oleg Chernavin
MP Staff
Jam Bull 10/31/2005 07:16 am
I did as you said, but the pages (GIF files) were still not saved. The pages are in the following directory:
http://www.hti.umich.edu/cache/afu8696.0001.001/

Maybe there is some code or something on the site that blocks mass downloaders. Is there any way around this? I tried to use the address above in Offline Explorer, but I get a "Forbidden" message.
Oleg Chernavin 10/31/2005 07:24 am
Please go to the File Filters | Images section and check all extensions there - you have GIF files disabled now. This is why Offline Explorer doesn't save them.

Oleg.
Jam Bull 10/31/2005 10:36 am
I reset the project settings to the defaults but still could not download the images. Help!

Caption=Iloko
URL=http://www.hti.umich.edu/cgi/t/text/text-idx?c=philamer;cc=philamer;q1=igorot;rgn=full%20text;view=toc;idno=ADL4452.0001.001
Lev=6
Weekday=257
LimTSize=10000
LimNumber=5000
LimTime=100
FTText.Exts=htmlhtmaspaspxjspstmstmlidcshtmlhtxtxttextxspxmlrxmlcfmwmlphpphp3
FTImages.Exts=gifjpgjpegtiftiffxbmfifbmppngipxjp2j2cj2kwbmplwf
FTVideo.Exts=mpgavianimpegmovfliflcvivrmramrvasfasxwmvm1vm2vvob
FTAudio.Exts=wavriffmp3midmp2m3uravocwmaape
FTArchive.Exts=ziparcgzzarjlhalayleirarcabtarpakacejarpdf
FTUDef.Exts=jscssssivbsdtdxslswfclass
FTText.B=ooxooo
FTImages.B=ooxooo
FTVideo.B=ooxooo
FTAudio.B=ooxooo
FTArchive.B=ooxooo
FTUDef.B=ooxooo
FTOther.B=ooxooo
FTSizes=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,3,3,0,3,0
RProt=127
LastStart=247:233:120:36:31:224:226:64:
LastEnd=120:43:95:47:31:224:226:64:
S200=89
S400=1
SAbr=100
SPar=89
SSav=89
SLast=200
SSiz=133972
SMdf=1
LFiles=90
LSize=134252
Stopped=True
ImgDim=0,0,0,0
PrevURL=http://www.hti.umich.edu/cgi/t/text/text-idx?c=philamer;cc=philamer;q1=igorot;rgn=full%20text;view=toc;idno=ADL4452.0001.001
Oleg Chernavin 10/31/2005 01:13 pm
I tried to load it, but the site has a strange way of rendering pages. It is easier to load all images from it with a single URL:

http://www.hti.umich.edu/cache/adl4452.0001.001/00000{:001..200}.tifs.gif
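
To see which addresses the URL above stands for, the numeric range can be expanded by hand. This is a minimal Python sketch, assuming the {:001..200} macro means every integer from 001 to 200 inclusive, zero-padded to the width of the first number. That reading is inferred from this thread, not taken from the Offline Explorer documentation, and expand_range_macro is a hypothetical helper name.

```python
import re

# Assumed macro form: {:FIRST..LAST}, e.g. {:001..200}
MACRO = re.compile(r"\{:(\d+)\.\.(\d+)\}")


def expand_range_macro(url):
    """Expand the first {:FIRST..LAST} macro in url into a list of URLs.

    Assumption: the range is inclusive and each number is zero-padded
    to the width of FIRST. A URL without a macro is returned unchanged.
    """
    match = MACRO.search(url)
    if match is None:
        return [url]
    first, last = match.group(1), match.group(2)
    width = len(first)
    prefix, suffix = url[:match.start()], url[match.end():]
    return [
        prefix + str(n).zfill(width) + suffix
        for n in range(int(first), int(last) + 1)
    ]


urls = expand_range_macro(
    "http://www.hti.umich.edu/cache/adl4452.0001.001/00000{:001..200}.tifs.gif"
)
```

Under that assumption the list holds 200 addresses, one per page image, running from 00000001.tifs.gif to 00000200.tifs.gif.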

Does this help?

Oleg.
Jam Bull 11/01/2005 04:08 am
It worked! You're a genius! Thanks much!
Jam Bull 11/01/2005 05:17 am
I've discovered another thing. I can only download the files if I have already viewed them in my browser!
I think the cache directories are actually empty. They only get filled up when somebody accesses the files. I tried the trick you gave on another book and no images were saved. But when I browsed a few pages, OE saved those pages. Perhaps the images are compressed and only get decompressed into the cache when requested. So it seems I still have to manually browse ALL the images before I can download them with OE.
Oleg Chernavin 11/01/2005 10:52 am
Yes, it looks like the site depends on what link you clicked and cookies define which image to output, but the above trick should still load all of the images regardless of whether you browsed them before or not.

Oleg.