Pro changing its mind

Rykk Adams 12/01/2010 12:51 pm
Hello,

I'm having dozens of problems with Pro. The pertinent one for this post, however, is that it will load a site, then I run an update, and the site is broken, with lots of 404 errors, sometimes even on the root page. It is very frustrating.

I'm trying to download www.erfworld.com. Apparently, the back end of the site is very poorly designed, because I'll get recursive links that go on into infinity. But that's a problem with the site, not the program. So why does the page give me 404s after the 3rd time downloading it, when it worked fine the first couple of times? Then the only way I can get the page to load correctly again is to delete the entire project and start over.

Why do I have to re-download the same page over 50 times, and still have an incomplete site? I click Download Missing Files, and even though hundreds of files are obviously missing (it's a book that is hundreds of pages long), it will only queue 68 files and then stop. So I increase the Level Limit, but that makes no difference: it downloads a few more files, but the pages still don't load. Worse, where I could previously view an index page 1 level deep, now I get a 404 error.

For example, on the main page there is a drop-down menu that directs to 3 index pages. The first few times I downloaded, the offline page led to the index pages just fine, and even showed the pictures. But when I tried to click on the thumbnails I got a page not available error. So I increased the level limit, which did nothing. After I increased the level limit to several times what should be necessary, the index page stopped loading.

Any ideas how to fix?
Oleg Chernavin 12/01/2010 02:31 pm
I loaded this site with Level=2 and the drop-down box and links on the thumbnails worked fine. How do you update the downloaded site - using the Download All Files or Download Only Modified and New Files in the Project Properties dialog?

I downloaded it 3 times and most of the downloads were correct (200 OK and 304 Not Modified).

Best regards,
Oleg Chernavin
MP Staff
Rykk Adams 12/01/2010 02:59 pm
Download only modified and new files.

Start>Update Project and
Start>Download missing files
Oleg Chernavin 12/01/2010 03:10 pm
If you Download Missing Files, then 404 errors are normal - because there are some links (not many) that were extracted from scripts. These links are incorrect, so the server replies with the error.

However, Update Project should load most of the links with correct 200 OK and 304 Not Modified responses.

Oleg.
Rykk Adams 12/02/2010 01:12 pm
Part of the problem here is that on reloads the thumbnails don't work even if I don't try to navigate within the page, and instead navigate by the map. If I click on the thumbnails they lead nowhere.


I'm at sea for months at a time, and sometimes only get an hour online for a month or two, so I'm trying to make my computer as productive as possible. "Downloading all" is counterproductive since most of the internet connections are quite slow, often less than 50Kbps. So I want a strictly "What's new" update without having to re-download the entire site. If I wanted to just DL the entire site, I'd have used one of the free offline managers. Right now I'm back in the States, so I can experiment to try to get a good script.

When I hit start, what is it doing? Does it default to re-dl the entire site, or update the site? If it defaults to entire site, is there a way to make it default to update? I just clicked on start and it has downloaded 3000 files already after downloading the site already. It's too easy to just click on start, especially if I've not used the program for a while because I've been weeks at sea.

I tried what you said, and it seemed to work fine. But if I clicked another page of thumbnails, say by clicking on 8 at the bottom of the index page, I started running into problems again. Basically, only the initial thumbnail page worked. So I increased the level to 3, and it works: the entire site downloads now. The index pages can't be navigated to directly, but I can use the map to pull up the index pages. Unfortunately, when I try to update, it attempts to update the entire site.

These are my options:
address:www.erfworld.com
File Modification Check: Download only new and modified files/Check size/Don't dl existing media
File Filters: Text, Images, Archive, Other, User defined all checked, all Load using URL filters settings
Ignore logout links.
This time I didn't uncheck any of the extensions in any of the filters.


Url Filters:
*wp-login.php*
Load only within the starting server
Excluded directories /*wiki*/, /*store*/, /*forum*/, /*toolbox*/, /*support*/, /*login*/
Load only within the starting directory

Everything else is at defaults.
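
To reason about which URLs the filters listed above would let through, here is a minimal Python sketch. It is not how Offline Explorer evaluates its filters; the glob semantics (via `fnmatch`) and the helper name `is_allowed` are assumptions made for illustration. Only the patterns and the host name come from the post.

```python
from fnmatch import fnmatch
from urllib.parse import urlsplit

# Patterns copied from the post; the matching rules below are an assumption.
EXCLUDED_DIRS = ["/*wiki*/", "/*store*/", "/*forum*/",
                 "/*toolbox*/", "/*support*/", "/*login*/"]
SKIP_URLS = ["*wp-login.php*"]
START_HOST = "www.erfworld.com"


def is_allowed(url: str) -> bool:
    """Return True if the URL passes this simplified filter set."""
    parts = urlsplit(url)
    if parts.hostname != START_HOST:  # "Load only within the starting server"
        return False
    if any(fnmatch(url, pat) for pat in SKIP_URLS):
        return False
    # Treat each excluded-directory pattern as a glob over the start of the path.
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    return not any(fnmatch(path, pat + "*") for pat in EXCLUDED_DIRS)


print(is_allowed("http://www.erfworld.com/book-1-archive/"))   # True
print(is_allowed("http://www.erfworld.com/wiki/index.php"))    # False
```

Running candidate URLs through a check like this is a quick way to see whether a missing page was excluded by a filter or simply never queued.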

When I click on Update Project, it still attempts to re-download 2737 files. It even re-downloads all the .jpgs, even though "Don't dl existing media" is checked and the names didn't change.

Also, I'm not getting this DL status bar. When it finishes it will say, "Download complete, 24 files" size 382Kb, but it downloaded over 2000 files and sucked over 100kbps of bandwidth for over 15 minutes. Where is the directory where this is held, so I can determine how much memory it is taking?

Thank you,
Oleg Chernavin 12/05/2010 12:20 pm
To see the exact directory, open the Project Properties dialog and press the Ctrl key. The directory will be displayed at the bottom of the dialog.

The download directory is specified in the Files section of the Options dialog, or in the Folder Properties dialog.

Regarding the update, please give the exact Project settings, because with what you described it should not even try to update the image files.

Please select the Project, use Export - Project Settings - Copy To Clipboard menu. Then paste it into the forum message. I will try to reproduce this.

With your file modification check settings, Offline Explorer should ask the server, for every web page, whether it has changed since the last download. If it has, the page is loaded and its links are followed.
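
The "ask whether it changed" step corresponds to a standard HTTP conditional request. The Python sketch below shows that generic mechanism only; it is not Offline Explorer's implementation, and the function names are hypothetical. It assumes the client saved the `Last-Modified` value from the previous download; if the server never sent one, there is nothing to put in `If-Modified-Since` and the page has to be fetched in full.

```python
import urllib.error
import urllib.request
from typing import Optional


def build_conditional_request(url: str, last_modified: Optional[str]) -> urllib.request.Request:
    """Build a GET that asks the server to reply 304 if nothing changed."""
    req = urllib.request.Request(url)
    if last_modified:
        # Value saved from the Last-Modified header of the previous download.
        req.add_header("If-Modified-Since", last_modified)
    return req


def action_for_status(status: int) -> str:
    """Map an HTTP status code to what an updater would do with the page."""
    if status == 304:
        return "keep local copy"
    if status == 200:
        return "save new copy and follow its links"
    return "report error"


def check_page(url: str, last_modified: Optional[str]) -> str:
    """Perform the request (needs network access) and return the action."""
    try:
        with urllib.request.urlopen(build_conditional_request(url, last_modified)) as resp:
            return action_for_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 304 as well as for 4xx/5xx replies.
        return action_for_status(err.code)
```

A 304 reply carries no body, which is why an update dominated by 304 responses costs little bandwidth even when thousands of URLs are checked.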

Oleg.
Rykk Adams 12/07/2010 03:34 pm
Thank you for all your help. I'm sure it's going to turn out to be something ridiculously simple.

Stream 1.2 File
[Object]
OEVersion=Pro 5.9.0.3228
Type=0
IID=7012
Caption=Erfworld
URL=http://www.erfworld.com/
MVer=5
Lev=3
Weekday=257
LimTSize=10000
LimNumber=5000
LimTime=100
CheckSize=True
SkipMedia=True
FTText.Exts=htmlhtmaspaspxjspstmstmlidcshtmlhtxtxttextxspxmlrxmlcfmwmlphpphp3
FTImages.Exts=gifjpgjpegtiftiffxbmfifbmppngipxjp2j2cj2kwbmplwf
FTVideo.Exts=mpgavianimpegmovflvfliflcvivrmramrvasfasxwmvm1vm2vvobsmilmp4
FTAudio.Exts=wavriffmp3midmp2m3uravocwmaape
FTArchive.Exts=ziparcgzzarjlhalayleirarcabtarpakacejarpdftgzexe
FTUDef.Exts=jsaxdcssssivbsdtdxslswfclassent
FTText.B=ooxooo
FTImages.B=ooxooo
FTVideo.B=xoxooo
FTAudio.B=xoxooo
FTArchive.B=ooxooo
FTUDef.B=ooxooo
FTOther.B=ooxooo
FTSizes=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0
NotIgnoreLogout=False
RSrvsBx=1
RPathBx=1
RPathEx=/*wiki*//*forum*//*login*//*support*//*store*//*toolbox*/ xxxxxx
RProt=255
LastStart=36:89:36:188:146:200:227:64:
LastEnd=38:236:200:206:146:200:227:64:
LastStarted=12/4/2010 2:03:04 PM
LastEnded=12/4/2010 2:06:20 PM
S200=53
S304=2339
S400=276
SPar=1624
SSav=53
SLast=304
SSiz=1480025
SMdf=53
SHTML=48
LFiles=2668
LSize=19581730
ImgDim=0,0,0,0
PrevURL=http://www.erfworld.com/
SkipURLs=/*wp-login.php*/
ConvertRSS=True
LIndexed=False
IndexFiles=False
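
The counters in the export above can be tallied with a short Python sketch. The field meanings are inferred from their names (S200 and S304 as counts of 200 OK and 304 Not Modified replies, S400 as error replies, LFiles as the number of files in the last run); that reading is an assumption, not documented behavior. The arithmetic itself is straightforward.

```python
def parse_settings(text: str) -> dict:
    """Parse 'Key=Value' lines from a pasted settings export into a dict."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines such as "[Object]" that have no '='
            settings[key.strip()] = value.strip()
    return settings


# A few counters copied from the export above.
EXPORT = """\
S200=53
S304=2339
S400=276
LFiles=2668
"""

stats = parse_settings(EXPORT)
total = sum(int(stats[k]) for k in ("S200", "S304", "S400"))
share_304 = int(stats["S304"]) / total
print(total, int(stats["LFiles"]), f"{share_304:.0%}")  # 2668 2668 88%
```

Under that reading, the three counters add up exactly to LFiles, and most of the requests in this run were answered with 304 rather than a full download.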
Oleg Chernavin 12/07/2010 04:01 pm
I tested this Project, and it updates pages correctly. It does load the web pages again, but only because the server doesn't report their modification dates or sizes.

No image is really loaded during the update. Perhaps, you saw URLs like:

http://www.erfworld.com/book-1-archive/?px=%2F019.jpg

These are not images, but Web pages that show the image in the center.
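
The distinction is easy to see by splitting the URL into its parts. The Python sketch below classifies a URL by the extension of its path only, which is an assumption about how a downloader might sort files by type, not a description of Offline Explorer's logic. The extension list is a shortened, illustrative subset.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit

# Illustrative subset of image extensions.
IMAGE_EXTS = {".gif", ".jpg", ".jpeg", ".png", ".bmp"}


def looks_like_image(url: str) -> bool:
    """Classify by the extension of the URL path only, ignoring the query."""
    ext = posixpath.splitext(urlsplit(url).path)[1].lower()
    return ext in IMAGE_EXTS


url = "http://www.erfworld.com/book-1-archive/?px=%2F019.jpg"
parts = urlsplit(url)
print(parts.path)                      # /book-1-archive/
print(parse_qs(parts.query)["px"][0])  # /019.jpg
print(looks_like_image(url))           # False
```

The `.jpg` appears only inside the `px` query parameter; the resource being requested is the `/book-1-archive/` page itself.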

To make sure we have the same version, please download the updated oe.exe file here:

http://www.metaproducts.com/download/betas/oep3295.zip

Unzip the file and replace the old oe.exe file with the new one. Please let me know how it works. Thank you!

Oleg.