Downloading different pages with the same address.

Author Message
User 07/13/2004 07:01 pm
I am trying to create an offline copy of a site which uses PHP scripts. The problem I'm having is that the PHP files don't get everything they need from the address. Some of it comes from somewhere else. This results in multiple pages having the same address, even though they are different based on what link was clicked to get there. How can I create a successful copy of such a site?

The only way I can think of is to download every link found regardless of what it points to and save each one with an extra value added on to the name so that the offline copy will point to the proper links. If that is the only way, how would I go about doing it?
Oleg Chernavin 07/14/2004 01:38 am
Can you please tell me exactly which site you are trying to download? I will see how it is organized and suggest what to do. Please also let me know which particular links do not work offline on that site.

Best regards,
Oleg Chernavin
MP Staff
User 07/14/2004 05:45 pm
This is the member section of mac and bumble, so I don't know if you'll be able to see the pages themselves. I would be happy to upload or send some of the pages downloaded by OE or IE if that would help. If so, just let me know where to upload/send them.

Here's an example of what happens:

On the page "mb/gallery.php?mg=769&size=large", there is a link to "mb/gallery.php?v=0".
On the page "mb/gallery.php?mg=508&size=large", there is also a link to "mb/gallery.php?v=0".
When clicked, these two links will not display the same page.

In IE, if I open the first page, then open the second page in a separate window, then click on the link on the first page, I will be sent to the page that the second page should have sent me to instead of the first. So I believe it is based on the most recent page that was downloaded to the local machine.
Oleg Chernavin 07/15/2004 04:23 am
It looks like the site uses referer to create page contents. Unfortunately, there is no solution right now. We will think about it, anyway.

Oleg.
Franz Muell 07/27/2004 08:18 am
> It looks like the site uses referer to create page contents. Unfortunately, there is no solution right now. We will think about it, anyway.
This looks more like different cookies, but the problem remains the same.
OE would have to include referer and cookie value into the name of the
local file copy to overcome this problem, which would be total nonsense
for most other downloads.
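
A minimal sketch of what "including referer and cookie value in the name" could look like, written in Python purely for illustration. It is not how Offline Explorer works; the helper name `local_name` and the hash-suffix scheme are assumptions made for this example.

```python
import hashlib
import posixpath
from urllib.parse import urlsplit


def local_name(url, referer="", cookie=""):
    """Hypothetical helper: build a local file name that changes whenever
    the query string, the referer, or the cookie value changes."""
    parts = urlsplit(url)
    base = posixpath.basename(parts.path) or "index"
    stem, ext = posixpath.splitext(base)
    # Hash everything that can influence the page contents.
    key = "\n".join([parts.query, referer, cookie])
    tag = hashlib.sha1(key.encode("utf-8")).hexdigest()[:8]
    return f"{stem}_{tag}{ext}"


# The two gallery links from the example above get different local names:
first = local_name("mb/gallery.php?v=0", referer="mb/gallery.php?mg=769&size=large")
second = local_name("mb/gallery.php?v=0", referer="mb/gallery.php?mg=508&size=large")
```

The drawback raised in this thread still applies: any page reached from many different referers would be stored once per referer, even when its contents are identical.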
Oleg Chernavin 07/27/2004 08:32 am
I can do it, but the result would be that OE tries to download every combination of every page with every referer whenever it finds such links on pages. Many of the site's pages would then be saved with identical contents in far too many copies, especially the "Home" page. I am afraid that this would bring more problems than solutions.

Oleg.
Oleg Chernavin 07/27/2004 10:26 am
It looks like you simply copied the previous message.

Oleg.
Franz Muell 02/11/2005 07:43 am
> It looks like you simply copied the previous message.
Sorry, when I first pressed the send button the reply was
"document contains no data" so I clicked again. Then you
were so fast with your answer that my second try was
registered later.
There is a similar problem also with filenames which are
different on the server but the same on the pc.
A page on a unix server could contain four different
images aa.jpg, aA.jpg, Aa.jpg, and AA.jpg which would all
be mapped to the same name on a windows machine.
Did you ever think about a name mapping for this type
of problem?
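
One way such a name mapping could work, sketched in Python under stated assumptions: the function name `map_names` and the `~2`, `~3` suffix convention are invented for this example and are not an Offline Explorer feature.

```python
import os


def map_names(server_names):
    """Hypothetical mapping from server file names to local names that
    stay distinct on a case-insensitive file system."""
    used = set()   # lower-cased local names that are already taken
    mapping = {}
    for name in server_names:
        stem, ext = os.path.splitext(name)
        candidate = name
        n = 1
        # Add a numeric suffix until the name no longer collides.
        while candidate.lower() in used:
            n += 1
            candidate = f"{stem}~{n}{ext}"
        used.add(candidate.lower())
        mapping[name] = candidate
    return mapping


names = map_names(["aa.jpg", "aA.jpg", "Aa.jpg", "AA.jpg"])
```

The first name is kept unchanged and the other three receive suffixes. Links inside the saved pages would also have to be rewritten to the mapped names for the offline copy to work.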
Oleg Chernavin 02/11/2005 08:00 am
Yes, I thought about it, but I do not have a solution for it yet. Sorry.

Oleg.
LegoDruid 08/07/2005 04:57 pm
> On the page "mb/gallery.php?mg=769&size=large", there is a link to "mb/gallery.php?v=0".
> On the page "mb/gallery.php?mg=508&size=large", there is also a link to "mb/gallery.php?v=0".
> When clicked, these two links will not display the same page.

</GASP!> After spending an hour in the MPSC forums I finally found someone reporting the same problem. Unfortunately, Oleg has made it clear that there's no solution. These two links result in a file download with the same name: Laetitia.jpg

http://www.sketchbooksessions.com/thedrawingboard/download.php?id=6937
http://www.sketchbooksessions.com/thedrawingboard/download.php?id=13987

The files are linked from different servers, but because of the download.php widget, OE thinks that they're coming from the main server and saves them in the same directory, which results in all but the last-downloaded version of the file being clobbered.

Can I rename the downloaded files? If I could affix the "id=6937" to the Laetitia.jpg filename, or the file's datestamp, or even just a random number, it would fix the problem.

Unfortunately, without being able to differentiate files (from different sources) with the same name, I cannot use OE to download these PHP forums; having the downloaded image files scrambled is worse than having them not download at all.
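
For illustration only, the renaming asked for here could be sketched in Python as follows. The helper `name_with_id` is hypothetical and would have to run as a separate script outside Offline Explorer; it assumes the download URL carries an `id` query parameter, as the two links above do.

```python
import os
from urllib.parse import parse_qs, urlsplit


def name_with_id(url, served_name):
    """Hypothetical helper: append the URL's id parameter to the file name
    the server suggested, so equal names from different ids stay apart."""
    query = parse_qs(urlsplit(url).query)
    ident = query.get("id", ["noid"])[0]   # fall back if no id is present
    stem, ext = os.path.splitext(served_name)
    return f"{stem}_id{ident}{ext}"


a = name_with_id(
    "http://www.sketchbooksessions.com/thedrawingboard/download.php?id=6937",
    "Laetitia.jpg",
)
b = name_with_id(
    "http://www.sketchbooksessions.com/thedrawingboard/download.php?id=13987",
    "Laetitia.jpg",
)
```

With these inputs the two files would be stored as Laetitia_id6937.jpg and Laetitia_id13987.jpg instead of overwriting each other.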
Oleg Chernavin 08/08/2005 07:48 am
One thing to try is to enable File Copies in the Project Properties and run Offline Explorer this way:

oe.exe /NoURLs

However, this way Offline Explorer may download the same links again and again, so you have to watch the Queue all the time.

Oleg.