Download getting bigger each day

Author Message
Neil 05/22/2007 02:03 pm
I have OE set to download a certain page to the depth of 1 link each day into separate folders by date.
Each day the folder size increases. At first I put it down to more links being added to the page, but now I'm getting worried.

For instance the sizes of the last 5 downloads are 466, 474, 488, 495 and 521 MB respectively. The first download was 65 MB.

What is the reason for this? Is there a setting I should know about to keep the size down?

Thanks.
Oleg Chernavin 05/22/2007 02:21 pm
Do you use the File Copies feature in the Project Properties dialog? If so, it keeps older copies of all files, and this will increase the folder size.

Best regards,
Oleg Chernavin
MP Staff
Neil 05/22/2007 02:33 pm
Would that be under Advanced>File Copies? If so, I've got nothing checked in there.

Thanks.
Oleg Chernavin 05/22/2007 02:56 pm
It could be the site structure. For example, session IDs in URLs will cause older files to be kept on the disk as well. It is hard to tell without looking at the situation. What if you rename the download directory and run a new download? Would the folder size go back to around 65 MB?

Oleg.
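
To illustrate the session-ID point above: when a site embeds a per-visit ID in its links, the same page gets a different URL on each visit, so a downloader that names files after URLs would store a fresh copy every day next to the old one. The following is only a minimal sketch of that effect. The parameter names and the example.com URLs are made up for illustration, and nothing in the thread confirms that TimesOnline uses such IDs or that Offline Explorer normalizes URLs this way.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical session parameter names; real sites use many different ones.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}

def strip_session_id(url: str) -> str:
    """Return the URL with any known session-ID query parameter removed."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in SESSION_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))

# The same article as linked on two different days (made-up URLs).
day1 = "http://www.example.com/article.html?id=42&sid=AAA111"
day2 = "http://www.example.com/article.html?id=42&sid=BBB222"

print(day1 == day2)                                      # False: two files on disk
print(strip_session_id(day1) == strip_session_id(day2))  # True: one page
```

Looking through the download folder for many near-identical file names that differ only in such a token is a quick way to check whether this is what is happening.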
Neil 05/22/2007 05:01 pm
I ran the same download to a new directory, as you suggested, and it was identical.

I've tried to look into this in more detail and it does seem as if all new downloads are being added to previous stuff. The page in question is the TimesOnline lifestyle page. The biggest single folder in the download is named 'www.timesonline.co.uk'; inside there, 'tol' is the biggest, then 'comment', followed by 'obituaries'. The obituaries folder that downloaded on 2 March (as an example) contained 135 objects. The obituaries folder that downloaded today contained 1035 objects. This seems to be happening in all other folders as well.

Hopefully this info will give you some insight into the problem and maybe there is nothing I can do about it.

Thanks.
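
The comparison described above (135 objects in 'obituaries' on 2 March against 1035 today) can be repeated for every subfolder with a short script instead of checking folders by hand. This is a rough sketch that only uses the Python standard library and knows nothing about Offline Explorer; the function names are invented for this example.

```python
import os

def folder_stats(root: str) -> dict:
    """Map each immediate subfolder of root to (file_count, total_bytes)."""
    stats = {}
    for entry in os.scandir(root):
        if not entry.is_dir():
            continue
        count = size = 0
        # Walk the whole subtree so nested folders are included in the totals.
        for dirpath, _dirs, files in os.walk(entry.path):
            for name in files:
                count += 1
                size += os.path.getsize(os.path.join(dirpath, name))
        stats[entry.name] = (count, size)
    return stats

def compare(old_root: str, new_root: str) -> None:
    """Print per-subfolder file counts for two downloads, largest growth first."""
    old, new = folder_stats(old_root), folder_stats(new_root)
    rows = [(name, old.get(name, (0, 0))[0], new[name][0]) for name in new]
    for name, before, after in sorted(rows, key=lambda r: r[2] - r[1], reverse=True):
        print(f"{name}: {before} -> {after} files")
```

It would be called with the 'tol' folder of an early download and of a recent one, for example compare(old_tol, new_tol), where both arguments are placeholders for the actual dated download paths.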
Neil 05/22/2007 05:48 pm
Just a thought, my date expression is - 2007\05May\{:0day};101. That wouldn't affect anything, would it?
Oleg Chernavin 05/23/2007 04:47 am
Yes, it loads files with a new filename each day and they get added to the previous download. Perhaps you need to delete all previously downloaded files that were not affected by the new download?

If so, you may add the following line to the Project's URLs field:

Additional=DeleteOldFiles

Oleg.
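
For readers who want to see what "delete files not affected by the new download" amounts to, here is a small sketch of the same idea done by hand. It is not how DeleteOldFiles is implemented (the thread does not say), and it rests on the assumption that the downloader rewrites every file it fetches, so that untouched files keep an older modification time. The function names are made up, and the default is a dry run that deletes nothing.

```python
import os
import time

def find_stale_files(root: str, run_started: float) -> list:
    """Return paths under root last modified before run_started (epoch seconds)."""
    stale = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Assumption: files fetched in the latest run have a newer mtime.
            if os.path.getmtime(path) < run_started:
                stale.append(path)
    return sorted(stale)

def delete_stale_files(root: str, run_started: float, dry_run: bool = True) -> list:
    """List stale files and remove them unless dry_run is True."""
    stale = find_stale_files(root, run_started)
    if not dry_run:
        for path in stale:
            os.remove(path)
    return stale
```

Running it first with dry_run=True and reading the returned list is the safer order, since a wrong cutoff time would otherwise remove files that are still wanted.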
Neil 05/23/2007 12:03 pm
Thanks, that seems to work. Today's download was back to 95 MB.
Oleg Chernavin 05/23/2007 01:30 pm
OK. Good.

Oleg.