Duplicate files

Author Message
Chase 03/22/2006 10:36 pm
Hello,

I'm having OE pull images from a website, but the website generates the filenames dynamically with the date down to the minute. I'm getting duplicate files because of this.

I don't know how OE works, but I'm guessing it pulls up the page with the file more than once. The first time it says "ok, xxfile, add that to the download queue", then it goes back to the same page, sees a "new" file, and adds that one to the queue! That's just what I'm guessing. I don't see why else I would be getting duplicates.

Any idea? Thanks!
Oleg Chernavin 03/23/2006 02:30 am
Well, there is a way - you can remove the date from the URLs you download so that all duplicates become identical and the same file is not downloaded twice. You can use the URL Substitutes feature for that (Project Properties dialog - Advanced section).

If you have problems setting it up, please send me examples of URLs with such dates.

Best regards,
Oleg Chernavin
MP Staff
Chase 03/23/2006 05:54 am
Thanks for the response.

I'm not sure if I'm not understanding you, or maybe I didn't explain my problem correctly?

The only thing I am downloading is images.

The pages have lots of thumbnails for the images, and I have OE follow each thumbnail to the window that contains that single image. When that window opens, it displays the full-resolution image, and the website dynamically generates the filename with a time/date stamp (maybe down to the second). If you were to re-open that page, it would have a different timestamp, so it looks like "a unique file" to OE.

This is what it looks like:
8005840carl144323200653118AM.jpg

8005840<-PhotoID
carl144<-username
323200653118AM.jpg<-M/DD/YYYY/H:MM:SS
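
To illustrate why every page load yields a new name, here is a minimal Python sketch of how such a filename could be assembled. It is not the website's actual code: the function name stamped_name and the exact field layout (unpadded month and hour, two-digit day, minute and second, AM/PM suffix) are assumptions inferred from the example above.

```python
from datetime import datetime


def stamped_name(photo_id, username, when):
    """Build '<PhotoID><username><timestamp>.jpg' (layout is an assumption)."""
    hour12 = when.hour % 12 or 12              # 12-hour clock, no zero padding
    suffix = "AM" if when.hour < 12 else "PM"
    stamp = (
        f"{when.month}{when.day:02d}{when.year}"
        f"{hour12}{when.minute:02d}{when.second:02d}{suffix}"
    )
    return f"{photo_id}{username}{stamp}.jpg"


# Two loads of the same photo page, one second apart.
first = stamped_name(8005840, "carl144", datetime(2006, 3, 23, 5, 31, 18))
second = stamped_name(8005840, "carl144", datetime(2006, 3, 23, 5, 31, 19))
print(first)             # 8005840carl144323200653118AM.jpg
print(first == second)   # False
```

Under these assumptions, only the leading PhotoID stays the same between loads, so any tool that compares full names sees two different files.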

So, does OE load the pages more than once, and add the supposed "new" file to the queue?

Is it possible to cut everything AFTER the PhotoID in the filename? The PhotoID is unique and static.


Thanks!!!!
Oleg Chernavin 03/23/2006 06:38 am
Well, if all images contain the carl144 username (no other username is used), then the rule can be:

URL:
*.jpg
Replace:
carl144*.jpg
With:
.jpg

If there could be other usernames, then use:

URL:
*.jpg
Replace:
2006*.jpg
With:
.jpg
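
The effect of the two rules above can be approximated in Python. This is only a sketch: it assumes that * matches any run of characters and that the first match is replaced, which may differ from how OE actually evaluates URL Substitutes. The helper name apply_rule is hypothetical.

```python
import re


def apply_rule(replace, with_text, name):
    """Apply one wildcard rule to a name; '*' is assumed to match any characters."""
    # Escape the pattern literally, then turn the escaped '*' back into a wildcard.
    regex = re.escape(replace).replace(r"\*", ".*")
    # A function replacement keeps with_text literal (no backslash expansion).
    return re.sub(regex, lambda match: with_text, name, count=1)


# Rule 1: a single, known username.
print(apply_rule("carl144*.jpg", ".jpg", "8005840carl144323200653118AM.jpg"))
# 8005840.jpg

# Rule 2: cut from the year onwards, for any username.
print(apply_rule("2006*.jpg", ".jpg", "8003620insaneses323200670008AM.jpg"))
# 8003620insaneses323.jpg
```

Under this reading, the second rule keeps the username and the month/day digits in front of the year, so the shortened name is stable within one day but not across days.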

Oleg.
ZS 03/23/2006 07:21 am
Thanks for your help, again.

But this seems to change the URL that it's trying to download?

That won't work. Is there a way to change the filename it saves as? That way it will just skip any existing duplicate.

For example,
8003620insaneses323200670008AM.jpg

Can we instead save that as 8003620.jpg?


Thanks a million!!!! :)))))



Oleg Chernavin 03/23/2006 07:25 am
Yes, simply uncheck the rule and it will be applied to the filenames, not URLs. You will need to select "Do not load existing files" in the Properties dialog.

Oleg.
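
For readers who want the same result outside OE, the idea in the last two posts (save under the PhotoID only, and skip files that already exist) can be sketched in Python. This is not how OE implements it. The names photo_id_name and save_once are hypothetical, and the sketch assumes that usernames never start with a digit, so the leading run of digits is exactly the PhotoID.

```python
import re
from pathlib import Path


def photo_id_name(served_name):
    """Return '<PhotoID>.jpg' from the leading digits, or None if there are none."""
    match = re.match(r"\d+", served_name)
    return f"{match.group(0)}.jpg" if match else None


def save_once(folder, served_name, data):
    """Save data under the PhotoID name; return False if that file already exists."""
    name = photo_id_name(served_name) or served_name  # fall back to the served name
    target = Path(folder) / name
    if target.exists():
        return False          # duplicate of an earlier download, skip it
    target.write_bytes(data)
    return True


print(photo_id_name("8003620insaneses323200670008AM.jpg"))   # 8003620.jpg
```

With this naming, a second download of the same photo under a newer timestamp maps to the same file on disk and is skipped.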