Major Memory leak

Author Message
Michael 03/29/2006 07:01 am
On large sites, with at least 1000-2000 URLs, there is a huge memory leak when parsing already-found URLs.

I'll try to explain. I use OE to fetch a large site (with the option not to download existing files). The first run works great, but on the second run (when it parses files that were already downloaded) it grabs more and more memory until it collapses (virtual memory gone).

Any help?

Thanks
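As a rough illustration of the pattern described above (this is not Offline Explorer's actual code), a crawler that appends every parsed URL to a structure that is never cleared between runs will grow on every re-parse of an already-downloaded site:

```python
# Illustrative sketch only -- not Offline Explorer's real code.
# If each parsing pass records every URL it sees in a structure that
# is never cleared between runs, re-parsing an already-downloaded
# site adds another full copy of the URL list each time, so memory
# only ever grows.

def crawl(pages, visited):
    """Parse `pages` (url -> list of linked URLs) and record every
    URL encountered in the shared `visited` list."""
    for url, links in pages.items():
        visited.append(url)
        visited.extend(links)
    return visited
```

A deduplicating set that is reset between runs would keep memory proportional to the current site instead.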
Lena Klimova 03/29/2006 07:01 am
Can you try the latest version here:
http://www.metaproducts.com/download/betas/oep1035.zip

Unzip the file and replace the old oe.exe file with the new one.

Best regards,
Oleg Chernavin
MetaProducts corp.
Rich Houser 03/29/2006 07:01 am
I downloaded OE Pro 2.6.1040 r1 and replaced my older version of OE Pro 2.6 with it. It did better, but the problem persists. Additionally, there appears to be an endless-loop condition after OE Pro downloads many files; with this latest version that point is around 11-12 thousand.

I'm running XP on a dual-processor machine with 512 MB RAM. This version will use 100%+ of one CPU and grab 1.1-1.5 GB of virtual storage. Memory usage always goes up but never comes down, and after a certain point OE Pro hangs. All activity on the screen freezes and OE Pro quits doing any screen repainting, but Task Manager continues to show 100% of one CPU in use.

This happens consistently on www.wickedweasel.com. I have set the options to stay within the domain and download only graphics, archives, and video files, with no limit. The older versions, pre-2.6, worked fine on this site, though they exhibited the memory problems cited here and in the other folks' emails.

Regards,
Rich Houser

> Can you try the latest version here:
> http://www.metaproducts.com/download/betas/oep1035.zip
>
> Unzip the file and replace the old oe.exe file with the new one.
>
> Best regards,
> Oleg Chernavin
> MetaProducts corp.
Oleg Chernavin 03/29/2006 07:01 am
Probably, this is because of the Project Map. Can you please try to run Offline Explorer Pro from the command line this way:

oe.exe /NoMap

and then start downloading that Project. Would this help with the memory usage?

Oleg.
Michael 03/29/2006 07:01 am
What if I need the Project Map?

> Probably, this is because of the Project Map. Can you please try to run Offline Explorer Pro from the command line this way:
>
> oe.exe /NoMap
>
> and then start downloading that Project. Would this help with the memory usage?
>
> Oleg.
Oleg Chernavin 03/29/2006 07:01 am
This is only a test to see whether disabling the Project Map helps downloading or not. I need to find out what causes such a slowdown so I can work on optimizing the code.

Oleg.
Michael 03/29/2006 07:01 am
Hi Oleg,

When I specify /NoMap, the memory usage seems much more stable (no leak or very high values).
What are the pluses/minuses of using maps?

Thanks

> This is only a test to see whether disabling the Project Map helps downloading or not. I need to find out what causes such a slowdown so I can work on optimizing the code.
>
> Oleg.
Oleg Chernavin 03/29/2006 07:01 am
They are important for Export, Delete Project Files, Import/Export, Data Mining, and many other features.

Do you see a big difference when using the /NoMap parameter, or is it only slight?

Oleg.
Michael 03/29/2006 07:01 am
Oleg,

> They are important for: Export, Delete Project files, Import/Export, Data mining and many other things.

What about speed? Do they lower download speed on large/small sites or increase it?

> Do you see a big difference when using the /NoMap parameter, or is it only slight?

Yes, a major difference. Without the /NoMap parameter there is a memory leak of some sort - Offline Explorer grabs more and more RAM until Windows dies (low virtual memory and a crash). With /NoMap specified, Offline Explorer uses a stable amount of memory while parsing files at about the same speed.

I'm using the option Download only modified files (with the check file size option on).

Hope this helps,

Michael.
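For reference, the "Download only modified files" plus "check file size" combination boils down to a skip rule like the following sketch. The function name and the idea of comparing against a server-reported size are assumptions; OE's real logic is internal:

```python
import os

def needs_download(local_path, remote_size):
    """Sketch of a 'download only modified files, check file size'
    rule: fetch again only when there is no local copy, or its size
    differs from the size the server reports (e.g. Content-Length
    from a HEAD request)."""
    if not os.path.exists(local_path):
        return True  # no local copy yet
    return os.path.getsize(local_path) != remote_size
```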
Oleg Chernavin 03/29/2006 07:01 am
Does this happen with the Enterprise or the Pro version? What if you start downloading without the /NoMap parameter but do not click the Map tab? When the Map is not activated, it should not "eat" that much RAM.

Oleg.
Michael 03/29/2006 07:01 am
Oleg,

> Does this happen with Enterprise or Pro version? What if you start downloading without /NoMap parameter, but do not click the Map tab? When Map is not activated, it should not "eat" that much RAM.

I never used the Map tab. I'm using Enterprise.

thanks,

Michael.
Oleg Chernavin 03/29/2006 07:01 am
OK. I will continue looking for it.

Oleg.
John 03/29/2006 07:01 am
Hi!

I'm still looking for a solution to my problem, and it seems to be the same as this one.
I use WinXP too, but OE Pro.

I've tried to get underdogs.org, which has about 8000-10000 URLs without external links.
I've noticed that OE Pro stops parsing files and appending to the queue list after a while (about 4000 file links in the queue), even though the parsed PHP files contain even more URLs.
The second time, the local partition had less free space (my swap file is on that partition), and this time the queue collected only 2800 links.
Would it be possible to configure OE before starting whether to keep the queue list in memory or in a local file?
If no duplication filtering works actively on the queue list (it is not used continuously), then the list could be stored in a local file.
Oleg Chernavin 03/29/2006 07:01 am
Offline Explorer doesn't keep the whole queue in memory. It keeps only up to 2000 files in RAM; the rest is saved to disk and retrieved from it when the queue drops below 500 URLs again. Can you come up with a simple example of how to reproduce the problem? For example, the link http://........file.php contains 1000 links, but only XXX are placed in the queue? I will check what's wrong there.

Oleg.
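The scheme Oleg describes (up to 2000 entries in RAM, spill to disk, refill once the in-memory part drops below 500) is a classic disk-backed queue. A minimal sketch, with made-up class name and file format -- not OE's actual implementation:

```python
import os
from collections import deque

# Sketch of a disk-backed queue: keep at most HIGH_WATER URLs in RAM,
# spill the overflow to a text file, and refill from disk once the
# in-memory part drops below LOW_WATER. Thresholds match the ones
# mentioned in the thread; everything else is illustrative.

HIGH_WATER = 2000
LOW_WATER = 500

class DiskBackedQueue:
    def __init__(self, spill_path):
        self.mem = deque()        # in-memory head of the queue
        self.spill_path = spill_path
        self.on_disk = 0          # count of spilled URLs

    def push(self, url):
        if len(self.mem) < HIGH_WATER:
            self.mem.append(url)
        else:
            with open(self.spill_path, "a", encoding="utf-8") as f:
                f.write(url + "\n")
            self.on_disk += 1

    def pop(self):
        if len(self.mem) <= LOW_WATER and self.on_disk:
            self._refill()
        return self.mem.popleft() if self.mem else None

    def _refill(self):
        # Move as many spilled URLs back to RAM as fit under HIGH_WATER.
        with open(self.spill_path, "r", encoding="utf-8") as f:
            urls = [line.rstrip("\n") for line in f]
        take = min(HIGH_WATER - len(self.mem), len(urls))
        self.mem.extend(urls[:take])
        rest = urls[take:]
        with open(self.spill_path, "w", encoding="utf-8") as f:
            f.writelines(u + "\n" for u in rest)
        self.on_disk = len(rest)

    def __len__(self):
        return len(self.mem) + self.on_disk
```

With this layout, RAM usage stays bounded no matter how many URLs are queued; only the spill file grows.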
John 03/29/2006 07:01 am
Of course. The situation is as follows.

I've tried to make a local mirror of h**p://www.the-underdogs.org
This is the web server. Every file related to this site is on the server h**p://files.the-underdogs.org.
On the www server there are about 7500 PHP files like "h**p://www.the-underdogs.org/game.php?id=1234" which have a lot of links inside. Every PHP file has at least 1 link inside that points to downloadable files or to other PHP files. (I made filters to get only links that point to the files.the-underdogs.org server or to filenames like game.php?id=.)
Well, the most important thing is: I've noticed that OE Pro started doing it right,
but something goes wrong during the working process.
Every PHP file not excluded by the filters is placed into the queue (of course), and when the engine parses it, every link that is not excluded is appended to the queue (of course); this is the right way.
But after a certain quantity the engine seems to stop salvaging more links from the PHP files, even though they do have links inside. I've checked.

I've tried some other ways, and I noticed the following:
On my PC I have only one partition. The first time it had about 800 MB of free space, and the queue collected about 4500 links.
The second time this partition had less free space (about 600 MB) and OE Pro got only to about 2800 links.
At first I thought OE Pro makes a pre-enumeration of the queued files' sizes, but I decided that would not be correct.
Then I read this topic about OE using a high amount of virtual memory.
My paging file size is fixed; currently its size is 512 MB (min/max).
I thought OE could not request more memory, so at the end of the available memory OE stops adding more queue items.
Now, just for testing, I'll free up about 10 GB of space and start OE Pro in this environment with the same settings. I'm interested in solving this problem.

I hope I was clear...
John 03/29/2006 07:01 am
Hello

I've been testing OE Pro with the /NoMap parameter, and it seems to be working.
6105 files queued and collecting more...

Thank you.
Oleg Chernavin 03/29/2006 07:01 am
I improved certain things in Offline Explorer. Can you please test the updated version with the Map enabled? (Please do not click the Map tab before or during the download, to make a "clean" test, so the Windows tree control doesn't eat memory.)

Here is the updated Offline Explorer Pro version:

http://www.metaproducts.com/download/betas/oep1092.zip

Enterprise edition:

http://www.metaproducts.com/download/betas/oee1092.zip

Please let me know the results.

Oleg.
Sven Wunder 03/29/2006 07:01 am
I think I have the same problem.
One thing I have seen on my system is that, when the problem occurs, the parsing count in the lower left corner goes up until the last page is read, but these pages are never analysed.
I have also seen the dependency of the number of parsed sites on the free memory on the system partition, but my free-memory display (Norton System Doctor) showed me enough free memory.
I have tried the /NoMap parameter and the newest beta.
And the craziest thing: sometimes it simply works...
My system is WinXP with 512 MB and a 1400 Athlon.