Different downloaded pages

Author Message
Chris 10/04/2006 07:39 am
Hi!

I noticed some days ago some strange downloaded pages and I don't know why this happened. I tried something and I know now why this is so, but I don't understand the reason why it is different.

Short description (example):
I'd like to download
http://www.pspwallpapers.com/download.php?image_id={:20286..20294}

The files that are saved on my harddrive are JPG-files:
download.php@image_id=20286
download.php@image_id=20287
...

When I suspend the download to a file and later on I want to resume the download from that file, following files will be saved on my harddrive:
download.php@image_id=20290
24_series.jpg
download.php@image_id=20291
20_4.jpg
...

The JPG-files are now extra files and in the download.php-files there is a "302 File moved" and a refresh-call with the JPG-picture.

Could you tell me, why there is a different saving of the files after resuming download or could you correct it, if it is a bug, because it's not very comfortable to handle this mixture of files.

Thanks in advance,
Chris
Oleg Chernavin 10/04/2006 09:14 am
The server supplies an alternative filename for these URLs. Offline Explorer saves the files under that new name, and the old files (with 302) are kept for linking purposes only. You can suppress this behavior by adding the following line to the Project's URLs field:

Additional=SkipDisposition

This will force Offline Explorer not to use the alternative filename.
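
For readers who want to see the mechanism, here is a minimal sketch in Python. It assumes the alternative filename comes from an HTTP Content-Disposition header, which the option name suggests but the thread does not confirm. The helper names `alternate_filename` and `local_name` and the `skip_disposition` flag are hypothetical illustrations, not part of Offline Explorer.

```python
from email.message import Message


def alternate_filename(content_disposition):
    """Return the filename suggested by a Content-Disposition value, or None."""
    msg = Message()
    msg["Content-Disposition"] = content_disposition
    return msg.get_filename()


def local_name(url_name, content_disposition, skip_disposition=False):
    """Choose the name to save under.

    skip_disposition is a hypothetical flag that mimics the effect
    described for Additional=SkipDisposition: ignore the server's suggestion.
    """
    if skip_disposition or not content_disposition:
        return url_name
    return alternate_filename(content_disposition) or url_name


# Hypothetical header value for one of the images in this thread.
header = 'attachment; filename="24_series.jpg"'
print(local_name("download.php@image_id=20290", header))
print(local_name("download.php@image_id=20290", header, skip_disposition=True))
```

With the flag set, the name derived from the URL is always used, which matches the result reported in the next post.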

Best regards,
Oleg Chernavin
MP Staff
Chris 10/04/2006 10:59 am
Ah, superb! :-)
Now there are only the .php-files, no matter if I start the download normally or resume it from a file. But why, without this command (by the way: I can't find it in the help), is the result different when I start the download normally than when I resume it from a file? Is there a special reason for this?

Just for my interest: is there also a command that does the opposite (save the .php-files with 302 and the pictures as picture-files with the alternative filename)?
Oleg Chernavin 10/05/2006 06:24 am
The problem is that when you use URL Macros and the starting URLs get these alternate filenames, Offline Explorer doesn't create alternate-named files. This was done because there are sites that return the same alternate filename for all URLs requested. This is not the case for your site, but in some cases we faced the problem that this alternate file was overwritten by every URL downloaded.

This is why we decided to change that logic.
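
The overwrite problem described above can be illustrated with a small sketch. It only simulates a disk with a dictionary; the function `save_all` and the sample names are hypothetical and do not reflect Offline Explorer's actual code.

```python
def save_all(downloads, use_alternate_names):
    """Simulate saving files; downloads is a list of (url_name, alternate_name, body)."""
    disk = {}
    for url_name, alternate_name, body in downloads:
        name = alternate_name if use_alternate_names else url_name
        disk[name] = body  # a repeated name silently replaces the earlier file
    return disk


# Hypothetical site that suggests the same alternate name for every URL.
downloads = [
    ("download.php@image_id=1", "image.jpg", b"first"),
    ("download.php@image_id=2", "image.jpg", b"second"),
    ("download.php@image_id=3", "image.jpg", b"third"),
]

print(sorted(save_all(downloads, use_alternate_names=True)))
print(sorted(save_all(downloads, use_alternate_names=False)))
```

With alternate names only one file survives, while the URL-derived names keep all three.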

Oleg.
Chris 10/05/2006 07:47 am
OK, then I have to accept that for my site a normally started download saves only the php-files, while a download that is suspended to a file and later resumed from it saves both the php-files with 302 AND the alternate files, and that I must use the Additional=SkipDisposition command if I don't want this.

As I'm now in contact with you, I'd like to tell/ask you some things that have confused me a little over the last weeks:

1) Suspend-button in Toolbar
It's everything ok when I click at the small black down-arrow on the right of the suspend-button and choose the project to suspend. Then the small icon on the left of the project-name shows, that the project is suspended at the moment.
But when I press F9 or click on the suspend-button to suspend a download, the download suspends but the small icon on the left of the project shows the green arrow instead of the yellow pause-symbol. So I don't know on a short look if it is now suspended or not and I have to look at the message panel if there is a download progress for this project.
Do you also have this situation or is it only on my computer?

2) File copies
The "URL Substitutes"-function is very useful for me. At some point I asked myself whether something similar is possible here: the "Advanced"-"File Copies"-function renames files if a file with the same name already exists in a directory, but the links in the downloaded pages are not changed. So, for example, a new file whose name already exists in a directory will be saved under the new name, but the link still points to the old (wrong) file and not to the new one, because the link is not renamed. Does OEP support renaming not only the file but also the link? Or are you still working on it? Or is it not possible to realize this?

3) Content filters
Example:
I want to download these URLs (level limit 0):
http://80.240.229.144/sixcms/list.php?page=articleteaser_at&article_id=15923&_level1=1562&_level2=15923&_lang=at
http://80.240.229.144/sixcms/detail.php?template=article_solbrownie&id=8468&_level4=8468&_artid=8468&_dsid=&_level2=1744&_lang=at
http://80.240.229.144/sixcms/testhtmlpage.php

Content filters:
Keywords: "no template given"
checked: "Do not save any pages that contain the above keywords"
all others unchecked

As I understand it, OEP will look if the phrase "no template given" exists in the downloaded files. If the phrase exists, the file will not be downloaded, if the phrase doesn't exist it will download the file.
So the testhtmlpage.php should not be saved, the other two files should be saved.
But when I start the download, none of the three files will be saved. What's going wrong?

And another thing: what's the difference between "Save all pages that do not contain the above keywords" and "Do not save any pages that contain the above keywords"? Doesn't this mean the same?

My questions are not urgent, but I'm always happy on your quick response :-)

Thanks, Chris
Oleg Chernavin 10/05/2006 11:34 am
1. Yes, I have to work more on this logic of the Suspend button/submenu. Sorry for the confusion.

2. File Copies works this way - when a new file gets loaded and the same filename already exists, the old file gets renamed and the new file gets written to the disk. So, there is no need to change the link, because the link will always point to the newest file.
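
As a rough illustration of that rotation, here is a Python sketch. The `_1`, `_2` suffix scheme and the rule that the oldest copy carries the highest number are taken from the follow-up post in this thread. The function `save_with_copies` is hypothetical and not Offline Explorer's implementation.

```python
from pathlib import Path


def save_with_copies(directory, filename, data):
    """Write data under filename, renaming existing copies out of the way first."""
    target = Path(directory) / filename
    if target.exists():
        stem, suffix = target.stem, target.suffix
        # Find the first free copy number.
        n = 1
        while target.with_name(f"{stem}_{n}{suffix}").exists():
            n += 1
        # Shift existing copies up by one, oldest first, so nothing is overwritten.
        for i in range(n - 1, 0, -1):
            old = target.with_name(f"{stem}_{i}{suffix}")
            old.rename(target.with_name(f"{stem}_{i + 1}{suffix}"))
        # The current file becomes copy number 1.
        target.rename(target.with_name(f"{stem}_1{suffix}"))
    # The newest download always gets the original name.
    target.write_bytes(data)
    return target
```

Because the newest file always keeps the original name, a link to that name never needs rewriting.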

3. In this case, all files will be downloaded, but if the keyword is present in the text of a file, that file will not be saved to the disk.

The difference between these two is that the first one also keeps saving pages that contain the keywords (the default), in addition to the pages that do not. Checking only the second will actually stop saving any pages (if the first is unchecked).

The meaning is that by default pages with keywords get saved and pages with no keywords get discarded. By checking both of these items, you reverse the behavior: pages with no keywords get saved and pages with keywords get discarded.

This is useful when you don't want to save pages with useless information, like "Page not found".
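
The four checkbox combinations can be summarized in a small Python sketch. It is a reading of the explanation above, not Offline Explorer's code; the function `should_save` and its parameter names are hypothetical.

```python
def should_save(contains_keyword, save_pages_without, do_not_save_pages_with):
    """Decide whether a downloaded page is written to disk.

    save_pages_without     -> "Save all pages that do not contain the above keywords"
    do_not_save_pages_with -> "Do not save any pages that contain the above keywords"
    """
    if contains_keyword:
        # Pages with keywords are saved unless the second box is checked.
        return not do_not_save_pages_with
    # Pages without keywords are saved only if the first box is checked.
    return save_pages_without


# Print the decision for every combination of the two checkboxes.
for first in (False, True):
    for second in (False, True):
        with_kw = should_save(True, first, second)
        without_kw = should_save(False, first, second)
        print(f"first={first} second={second}: "
              f"with keyword saved={with_kw}, without keyword saved={without_kw}")
```

Under this reading, checking only the second box saves nothing at all, which would explain why none of the three test pages were saved.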

Oleg.
Chris 10/06/2006 06:29 am
1) No problem.

2) Ok, the oldest file has the highest number, the newest file has the original filename (I should look more often to the help, I thought it would be otherwise).
But there's still a problem with this if I use the "File copies"-function together with the "URL Substitutes"-function in a special way.

Let me explain as short as I can with a self-made example: if I have a HTML-file with a list of links like this:
link.asp?type=product&heading=Productname1&Session=sdjk6zb7823rnf22fg2g
link.asp?type=product&heading=Productname2&Session=sdjk6zb7823rnf22fg2g
link.asp?type=product&heading=Productname3&Session=sdjk6zb7823rnf22fg2g
...
and I want to rename all links with "URL Substitutes"-function to "link.htm" so that I have a very short filename.
When I download this and I activate "File copies"-function, the renamed files will be saved as
link.htm
link_1.htm
link_2.htm
...
but the links in the HTML-file with the list of links will only point to "link.htm" and not to "link.htm", "link_1.htm", "link_2.htm"...

Sorry that I couldn't find a better example at the short time, but I hope you know now what I mean.
And I'm afraid that this is a very complex thing that is not possible to realize?!

3) I understand now (I thought, that nothing happens if nothing is checked, so I always checked something).
Checking nothing means that only files with keywords are saved. Checking both of the actions "Save all pages that do not contain the above keywords"/"Do not save any pages that contain the above keywords" means, that all files with keywords are discarded.

Thanks, Chris
Oleg Chernavin 10/06/2006 09:00 am
I see. I would suggest using another URL Substitutes rule instead:

URL:
link.asp
Replace:
link.asp***Productname**&*
With:
link**.htm
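
The effect this rule appears to aim for can be approximated with a regular expression in Python. The exact meaning of `*` and `**` in URL Substitutes is not spelled out in this thread, so the sketch simply assumes the rule keeps the text between "Productname" and the next "&" and drops the rest. `PATTERN` and `substitute` are hypothetical names.

```python
import re

# Assumed intent of the rule: keep only what follows "Productname" up to the next "&".
PATTERN = re.compile(r"link\.asp.*?Productname(.*?)&.*")


def substitute(url):
    """Map a long link.asp URL to a short, still unique file name."""
    return PATTERN.sub(r"link\1.htm", url)


urls = [
    "link.asp?type=product&heading=Productname1&Session=sdjk6zb7823rnf22fg2g",
    "link.asp?type=product&heading=Productname2&Session=sdjk6zb7823rnf22fg2g",
    "link.asp?type=product&heading=Productname3&Session=sdjk6zb7823rnf22fg2g",
]
for url in urls:
    print(substitute(url))
```

Since each product keeps its own distinguishing part, the short names no longer collide and File Copies has nothing to rename.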

Oleg.
Chris 10/06/2006 09:28 am
Yes, this is probably the only way for this. And it's also very comfortable.

So there is almost always a way to do what I want to do, and it doesn't matter if not everything can be done, because you're very close with OEP. It's one of the best programs I know.
And your forum answers are always quick and useful, the monthly program updates keep the user up to date and the functionality of OEP is already huge - and this is wonderful :-)

A big THANK YOU for the great software you make and keep happy going on with it!

Thanks for your support,
Chris
Oleg Chernavin 10/09/2006 08:15 am
Thank you for your kind words!

Oleg.