I am having a problem which is probably related to a couple of others recently listed.
Running OE Pro V8.3.2048
Trying to download http://www.sing365.com/
Am now starting OEP with /nomap /nodb options and have Additional=DepthFirst in the url field.
Download runs ok for a while then gets slower until it virtually stops
At this point it also becomes impossible to run anything else on the machine.
About once a minute the PC comes back to life for a few seconds and then stops again.
If I reduce the active session count to 1, things improve, but only for a while.
If I try to suspend to disk, there is lots of disk access but the project is not saved.
If I kill OE (I have to use Task Manager), I find all my 1G of RAM is used, along with most of my 2G swap file.
When OE runs ok to start with, the queue stays small, as it should do using Additional=DepthFirst.
But when it slows down, the queue starts to grow, becoming very big.
Also the parsing count begins to increase.
The part of the site that I have downloaded works ok, so the general process is sound.
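As an aside on why Additional=DepthFirst keeps the queue small: a depth-first frontier only ever holds the siblings along the current path, while breadth-first holds whole levels at once. A toy simulation in Python (a uniform link tree, nothing to do with OEP's real crawler) makes the difference visible:

```python
from collections import deque

def crawl(branching, depth, lifo):
    """Walk a uniform tree of 'pages'; return the peak frontier size."""
    frontier = deque([(0,)])          # start from a single root "URL"
    peak = 0
    while frontier:
        peak = max(peak, len(frontier))
        node = frontier.pop() if lifo else frontier.popleft()
        if len(node) <= depth:        # interior page: queue its links
            for i in range(branching):
                frontier.append(node + (i,))
    return peak

dfs_peak = crawl(branching=5, depth=5, lifo=True)   # depth-first
bfs_peak = crawl(branching=5, depth=5, lifo=False)  # breadth-first
print(dfs_peak, bfs_peak)
```

With 5 links per page and depth 5, depth-first peaks at 21 queued pages while breadth-first peaks at 3,125 (the entire bottom level), which is why the queue stays small when DepthFirst is working.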
I seem to remember something like this happening a couple of years ago; there was a memory leak at one point. Maybe you could find a hint in your old development notes.
In my many years with OEP I have only had two problems, both nasty, but you eventually managed to fix them. I hope this one does not prove so bad :-)
Can you please tell me how many files are in the Queue when it gets very slow?
I have now tried again, with the following conditions:
Part of site is already downloaded
Threads set to 10
Removed - Additional=DepthFirst.
Monitoring with Task Manager.
Within 30 seconds the system has stopped responding.
Responds for a couple of seconds every minute or so
Task Manager reports no significant CPU or memory usage
Parsing reports ~500, download ~1000, Queue ~1000
After a few seconds the system responds normally
Network activity stops
Parsing, Download and Queue numbers all continue to increase and show similar sized counts.
When I had to leave for work they had all got to around 100,000 and were still going.
CPU usage average 10%, peaking to 30%
Memory usage not significant.
I have never seen OEP behave like this before :-(
It's taken 12 hours for OEP to sort itself out.
It's finally got down to 0 in the parsing queue but now has 200,000 in the download box and 1,200,000 in the queue.
I have reduced the threads to 2, as you suggested, and am restarting the download.
Ran with 2 threads for half an hour
About 500 in parsing queue
Parsing queue increased to over 10,000 and took another 12 hours to clear
Started download again
OEP runs ok as long as the parsing queue stays small i.e. below 10
If it goes above this it will not clear; it just grows.
The only way to get it down is to pause the download again.
When there are more than a couple of items in the parsing queue, the system stops responding.
You get a couple of seconds of life every minute.
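What this describes (items arriving faster than the parser can clear them, until the backlog drowns everything) is the classic unbounded-queue failure; the usual cure is a bounded queue that blocks the producer. A minimal Python sketch of that backpressure idea, not a claim about how OEP is built internally:

```python
import queue
import threading

def producer(q, total):
    """Feed work in; q.put blocks once the queue is full (backpressure)."""
    for i in range(total):
        q.put(i)

q = queue.Queue(maxsize=10)           # cap the backlog at 10 items
t = threading.Thread(target=producer, args=(q, 1000))
t.start()

peaks, consumed = [], 0
while consumed < 1000:                # slow "parser" draining the queue
    peaks.append(q.qsize())
    q.get()
    consumed += 1
t.join()
print(max(peaks))                     # never exceeds the cap of 10
```

The producer simply stalls whenever the consumer falls behind, so the backlog can never run away the way the parsing queue does here.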
Have now left it running with a single thread.
Downloaded ~ 700,000
Queue ~ 2,300,000
Turned this off as you suggested, but I still get a very large parsing file count, and OEP falls over after a few days of slow running :-(
Then turned everything off under Advanced, except for error suppression.
Set to 1 thread
Filtered out all the advertisement sites which I can now identify :-)
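For anyone following along, the ad filtering amounts to an exclusion list matched against every candidate URL. A rough sketch of the idea in Python; the pattern strings here are made-up examples, and OEP's actual URL Filters syntax is different:

```python
import re

# Hypothetical ad-server patterns -- examples only, not OEP filter syntax.
AD_PATTERNS = [re.compile(p) for p in (r"doubleclick\.", r"/ads?/", r"adserver\.")]

def allowed(url):
    """Return True if the URL matches none of the exclusion patterns."""
    return not any(p.search(url) for p in AD_PATTERNS)

print(allowed("http://www.sing365.com/music/lyric.nsf/somepage"))  # True
print(allowed("http://adserver.example.com/ads/banner.gif"))       # False
```

Every URL excluded this way is one the downloader never fetches and the parser never has to scan, which is why trimming the ad sites helps the queues.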
Started, and immediately got a high parsing count.
Download and parsing queue counts continued to rise
After a day or so, the parsing queue went to zero and I un-paused the run.
OEP has now run for over a week, very slowly, 1 download every few seconds.
Parsing queue stays at zero; downloaded queue went to about 500,000, download queue about 750,000
Last night OEP fell over again
Looked at the queue directory: only 500 files of about 500k each, so not very big.
No db or url files in config directory
I notice you have done some upgrades; are these likely to help?
Only doing this now as a challenge :-)
The site does work from my PC now, but there are still bits missing.
I have just been reading back through your previous answers and the same thought had occurred to me :-)
Is there any way to break up big directories in a way that will still allow the site to access them and OEP to understand them? Or do I just have to start again :-(
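On breaking up a huge flat directory: the usual technique is to shard files into hashed sub-buckets, but every HTML link pointing at the old flat layout would then need rewriting too, which is the part neither the site nor OEP will do automatically. A hedged sketch of just the sharding half (the helper name and layout are my own invention, not an OEP feature):

```python
import hashlib
import os
import shutil

def shard(src_dir, dst_dir, width=2):
    """Move files from one flat directory into hashed sub-buckets.

    Only moves the files; links in downloaded HTML would still point at
    the old flat layout and would need a separate rewriting pass.
    """
    os.makedirs(dst_dir, exist_ok=True)
    for name in os.listdir(src_dir):
        bucket = hashlib.md5(name.encode()).hexdigest()[:width]
        os.makedirs(os.path.join(dst_dir, bucket), exist_ok=True)
        shutil.move(os.path.join(src_dir, name),
                    os.path.join(dst_dir, bucket, name))
```

With width=2 there are 256 possible buckets, so a 25,000-file directory would drop to roughly 100 files per bucket.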
Went through the downloaded site again, deleted the rubbish and added a few more filters.
Found one directory with 25,000 files :-( but, far more interestingly, found another directory containing 500,000 other directories, each containing one file.
What effect does such a number have?