I was spidering anime.desktopnexus.com, downloading all Images and Video only, including SWF, since it is a video too.
OE ran for 5h with 9 connections and then crashed, BOOM. It's the same kind of crash that has been present since v6.0.
Since I still have 20+ jobs to run, do you have any crash analyzing tool? I saved 2 appcompat logs and made 2 screenshots, but I think that may not be enough, or even anywhere near useful... About 90% of the time the crash is the same: kernel32.dll related. And sometimes it's about "thread" something...
I can tell you this much, that I couldn't find a pattern in downloaded size and RAM consumption, so it may still be caused by some parsing module...
I think you'll get the crash yourself, when you try to spider anime.desktopnexus.com...
If I get more crashes, should I report those websites, maybe you'll find a pattern in common in all of those sites?
In worst case scenario.. this might even be hardware related..
Also, what's custom Agent identification's max length? I can't use any of my browser's real, valid IDs, it will make OE do a bad job...
For example, if I input:
Mozilla/5.0 (Windows NT 5.1; rv:10.0.2) Gecko/20100101 Firefox/10.0.2
and start a job, OE will go nuts and finish the job too early, without any good results, while using the default IDs allows OE to run until it... crashes. I have never yet managed to see OE actually complete a job.. it only crashes..
Okay, I ran the same job, 25h, no crashes yet!
I'll do 2 more jobs. If it survives 2x24h more, then it's 99.9% fixed (- because my real goal is still actually 2 weeks in a row for 1 job, but I just don't have any jobs like that in March. Maybe in the next month I'll be able to finally do the ultimate spider test...)
But in the meanwhile, I found something else, interesting...
Did you take a look in "Agent identification" module? Since it seems to be defective too.
I was using my own Firefox v10's identity and I started a spidering job, but it was so short - OE quickly ran out of pages to spider.
I then had a thought, that I should try the default Identify as "FireFox 5.0" identity and voila, OE successfully spidered my target 'til completion. What's even worse, the target website doesn't even require any specific browser!
This has happened to me twice already. First time I was using my IE8's identity.
It doesn't seem to depend on text length, it just dislikes any modification to the current default preset or any custom input in the "Use this identification" box. Weird. I'll run a few more jobs.
Also, the "Downloaded" info (on the Connections Panel) is also incorrect, or is it supposed to show such numbers for "bragging"?
Since I'm using filters and downloading only specific content, the actual download folder's size is smaller and there are fewer downloaded files inside than displayed after "Downloaded".
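To put numbers on that mismatch, the download folder can be counted independently of OE. The following is only a sketch: the folder path is whatever the project was saved to, and `folder_stats` is a made-up helper name, not anything OE provides.

```python
import os
from collections import Counter


def folder_stats(root):
    """Walk root and return (file count, total bytes, files per extension)."""
    count = 0
    total_bytes = 0
    by_ext = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total_bytes += os.path.getsize(path)
            except OSError:
                # File vanished or is locked by the downloader; skip it.
                continue
            count += 1
            by_ext[os.path.splitext(name)[1].lower()] += 1
    return count, total_bytes, by_ext
```

Running it on the project folder right after a job stops gives a file count and byte total to compare against the "Downloaded" figure on the Connections Panel.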
Identification - can you please give me details on how to reproduce this? For example, site URL, Level, other important settings. Agent 1 gives a certain number of pages, Agent 2 another number, etc.
I agree about the Downloaded - I also faced this several times. I will work to understand why it happens and will try to fix.
I set a custom identification there,
My Firefox 10's identity:
Mozilla/5.0 (Windows NT 5.1; rv:10.0.2) Gecko/20100101 Firefox/10.0.2
clicked on Apply, closed the Options window,
but every time I check back there, that setting, "Identify as" radio button is still selected.
Also the "Identify as" box under Internet tab had the previous value of my last default preset selection.
That doesn't seem logical.
So, I selected a default preset under "Identify as" "Internet Explorer", then clicked on
"Use this identification" radio button and then OK.
This time OE accepted it and changed the value under the Internet tab to Internet Explorer's identity info
and also, when I checked back at its settings, "Use this identification" radio button stayed selected.
Then I selected "FireFox 5.0" preset and clicked on "Use this identification" radio button, then OK. It didn't stay, went back to Identify as FireFox 5.0.
Then I inputted my Firefox's identity again and pressed on OK - OE then reverted that setting to "FireFox 5.0" preset.
You know, what's even more bizarre? I then tried with my IE 8's identity, which is:
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; InfoPath.2; .NET4.0E)
and I tested also with my Google Chrome's identity:
Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11
OE accepted them both. So it only dislikes that Firefox's identity.
Besides IE and FF presets, I haven't checked other presets, they may be bugged too.
I still have yet to encounter the issue, where spider rapidly runs out of pages to crawl with a custom, non default preset's identity set.
(Maybe the identity code has something that tells OE to download the mentioned URLs only and skip further spidering, when the website is built with a specific code?)
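One way to rule the website itself in or out is to request the same start page with different identification strings outside OE and compare what comes back. This is a rough sketch, not how OE works internally: the two strings are the ones quoted above, the start URL is passed on the command line, and counting `href=` is only a crude stand-in for "how many links a spider would see".

```python
import sys
import urllib.error
import urllib.request

# Identification strings quoted earlier in this thread.
AGENTS = {
    "Firefox 10": "Mozilla/5.0 (Windows NT 5.1; rv:10.0.2) Gecko/20100101 Firefox/10.0.2",
    "Chrome 17": "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.11 "
                 "(KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11",
}


def fetch_with_agent(url, user_agent, timeout=30):
    """Return (HTTP status, body bytes) for url requested with user_agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.getcode(), resp.read()
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers still tell us how the server treats this agent.
        return err.code, err.read()


def compare_agents(url, agents, fetch=fetch_with_agent):
    """Fetch url once per agent; return {name: (status, body size, href count)}."""
    results = {}
    for name, user_agent in agents.items():
        status, body = fetch(url, user_agent)
        results[name] = (status, len(body), body.lower().count(b"href="))
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    for name, row in compare_agents(sys.argv[1], AGENTS).items():
        print(name, row)
```

If every agent gets the same status, size and link count, the server is probably not the one cutting the job short, which would point back at how OE stores or applies the custom string.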
I can't foresee the future, but I keep on running jobs. I may not even encounter this anytime soon, but I will restart my spider list after I finish the current list, and if it's still not fixed in the next OE version, then I will probably encounter that bug again.
And congrats, after 2 jobs I haven't got a single crash!
I think I don't need HTTrack anymore...
But, I still have a lot of jobs to run this month and maybe in next month I can run jobs longer than 24h, then I can finally do the ultimate survival test with OE.
RAM consumption... will OE survive 2 weeks without running out of memory? Is the memory management module built intelligently enough to survive?
Regarding running the downloads during two weeks - I hope it will work well because of the updated memory manager. Please keep me informed on how it works!
Options > Files > Disk free space limit
Why doesn't it allow any number higher than 10000?
For end-user's comfort, it should allow any number that is logical for currently selected hard disk and its condition.
Target: kisuki.net + Load up to 2 links on other servers
Run time 'til crash: 3h
The same error messages about Kernel32.dll and EThread.
For this job I thought I'd change the Link Translation setting from No Translation to Offline Translation - that may have been the cause. I'm rerunning the job with No Translation this time to see if it survives 'til morning.
Btw, I noticed that even if the crash occurs, OE keeps on spidering and only terminates, when I Close the message, which says that OE wants to terminate. If I ignore that message, then my screen gets a new set of same error messages every time it steps on a "mine" or maybe it still has something to do with reaching memory high peak.
Crashed again. The crash doesn't depend on Link Translation. Kisuki.net or a site it links to has something specifically that makes OE crash. It crashed again after 3h, so the "mine" should be close.
This time, there was no crash message, but it hung and didn't allow me to open it.
I let it run for a bit longer after the hang started, thinking maybe it would finish whatever it was doing, but after 1h it was still hung and displayed no error messages.
Memory usage was 98MB and CPU usage was at max when the hang started (I used Process Hacker, a Task Manager replacement tool, to view the info), but after 1h, it stayed at 99.62MB with no CPU usage.
There were no active connections while it was hanging.
So it's not the fault of running out of memory, I think, but it may still be a logical processing "mine" somewhere at Kisuki's place..
I get different crash results every time I do this job.
Same amount of time, hanged, no error messages, but this time, OE ate up to 1.84GB of memory, then at 1.85GB it hanged.
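Since the memory figure differs so much between runs (98MB one time, 1.85GB another), a timestamped log might show whether usage climbs steadily or jumps right before the hang. The sketch below polls the stock Windows `tasklist` command; the image name `OE.exe` is taken from the debugger output in this thread, the log file name is arbitrary, and note that `tasklist` reports "Mem Usage" (working set), which may not match the number Process Hacker shows. For context, 1.85GB is close to the 2GB of user address space a 32-bit process gets by default on Windows, but whether that ceiling is what OE hit is only a guess.

```python
import csv
import re
import subprocess
import time


def parse_mem_kb(tasklist_csv_line):
    """Extract the 'Mem Usage' value in KB from one `tasklist /FO CSV /NH` line."""
    row = next(csv.reader([tasklist_csv_line]))
    # Last column looks like "98,304 K"; drop separators and the unit.
    return int(re.sub(r"\D", "", row[-1]))


def sample_mem_kb(image_name="OE.exe"):
    """Return current Mem Usage in KB for image_name, or None if not running."""
    out = subprocess.check_output(
        ["tasklist", "/FI", "IMAGENAME eq " + image_name, "/FO", "CSV", "/NH"],
        universal_newlines=True,
    )
    for line in out.splitlines():
        if line.startswith('"'):
            return parse_mem_kb(line)
    return None


def log_memory(path="oe_memory.log", interval_s=60, image_name="OE.exe"):
    """Append 'timestamp<TAB>KB' lines until the process disappears."""
    with open(path, "a") as log:
        while True:
            kb = sample_mem_kb(image_name)
            if kb is None:
                break
            log.write("%s\t%d\n" % (time.strftime("%Y-%m-%d %H:%M:%S"), kb))
            log.flush()
            time.sleep(interval_s)
```

Calling `log_memory()` in a console window before starting the job leaves a file whose last lines show what memory was doing in the minutes before the hang or crash.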
For this, OE's window needs to be Maximized and then closed to tray.
Then whenever I start a full screen app, e.g. a video game, OE pops up, minimizing the full screen app window.
Then click on X to close OE back to tray and then try opening the full screen app again, OE pops up again instead of the full screen app.
When it has done so, OE's Minimize button doesn't function, you can then either close it back to tray or let it stay open.
This doesn't happen, when OE is not in Maximized mode.
It seems to have something to do with full screen app changing current screen resolution.
I will test the full-screen issue to find out what's wrong. Thank you!
No idea if it's any help, but...
VS2010's Debugger says:
Unhandled exception at 0x7c8024f0 (kernel32.dll) in OE.exe: 0xC0000005: Access violation writing location 0x00040e80.
Call stack location:
kernel32.dll!__SEH_prolog() + 0x1a bytes
OE's RAM consumption: 1.75GB
CPU usage at max during the crash.
I've read about 0xC0000005 error code..
Bad RAM? No.
Trouble in registry? Maybe, don't know where to search and how to fix anyway.
Incompatibility with DEP? OE always triggers DEP when it is not in DEP's exclusion list.
Not using SafeSEH? ... .
I've also encountered Runtime error 204 at 079E24D8 several times, when starting the job on previous SWF sites, but then I could just set OE to Download again and that message wouldn't appear again.
But sadly I still have no idea, how to reproduce either. Unless you do the same job, with my settings and just wait. I hope this is not hardware specific problem...
So it's not Kisuki, but an external site it directs to or a site the external site directs to.
But it can't be far, since the hang or crash happens always after 3rd hour and before 4th hour on my 12Mbit internet.
It could be easier to identify any bugs, if your program had special bug analyzing module. I'd be glad to help you in that advanced way.
You can post the Project settings here - select it, press Ctrl+C and paste it to the forum message.
Okay, on my last SWF project, I got a brand new error message:
"Not enough storage is available to process this command."
...and then came the crash message.
"Not enough storage..."? If it's about RAM, I forgot to check the RAM consumption, so I can't tell how much it ate...
But this couldn't be hard disk space, because I have over 180GB of free space.
I checked my Event Viewer...
Faulting application oe.exe, version 18.104.22.16834, faulting module kernel32.dll, version 5.1.2600.5781, fault address 0x00012afb. - fault address was the same on my 2 last projects - running out of handles?
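To check whether the fault address really repeats across projects, Event Viewer lines like the one above can be pasted into a text file and tallied. A minimal sketch, assuming every entry follows the same wording as the line quoted here; `parse_fault` and `count_signatures` are names invented for this example.

```python
import re
from collections import Counter

# Matches the Event Viewer wording quoted in this thread.
FAULT_RE = re.compile(
    r"Faulting application (?P<app>\S+), version (?P<app_ver>[\d.]+), "
    r"faulting module (?P<module>\S+), version (?P<mod_ver>[\d.]+), "
    r"fault address (?P<addr>0x[0-9a-fA-F]+)"
)


def parse_fault(line):
    """Return a dict of the fields in one event line, or None if it doesn't match."""
    match = FAULT_RE.search(line)
    return match.groupdict() if match else None


def count_signatures(lines):
    """Count how often each (module, fault address) pair occurs."""
    counts = Counter()
    for line in lines:
        fault = parse_fault(line)
        if fault:
            counts[(fault["module"].lower(), fault["addr"].lower())] += 1
    return counts
```

If one (module, address) pair dominates the tally across different sites, that supports the idea of a single bug rather than several unrelated ones.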
Anyway, Boyis.com settings - Settings I use to grab SWF files:
I did a series of tests and indeed, the culprit at Kisuki was "Explore Flash video links" - if that option is off, then hang/crash won't occur!
I will now test Boyis to see if it survives 24h without that option.
At Boyis, with the flash parser turned off, I now get 2x "Runtime error 204 at 070324D8" and then hang.
I will do a final test without the Evaluate script calculations to see if that will allow this job to complete with success.
I clicked on Stop
Thread Error: The handle is invalid (6)
Clicked on OK, then on Stop again, the same message appeared.
Impossible to Resume/Start the job - OE's spidering engine hanged.
So I closed the program and I got that message again, but OE closed successfully tho.
So, both flash parser and eval scripts are defective, and not only those, there's also the 3rd thing and maybe even more.
All errors seem to have something in common: thread and handle error.
I really wish I knew, how to debug errors...
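As a starting point for reading those messages, here is a small lookup of the codes quoted in this thread. The first two meanings are standard Windows definitions. The mapping of the "Not enough storage" text to Win32 error 8 and the reading of "Runtime error 204" as a Delphi/Pascal runtime error are assumptions (the second one rests on OE being a Delphi program, which the EThread name hints at but nothing here confirms).

```python
import re

# (pattern found in the message, short explanation)
KNOWN_ERRORS = [
    (re.compile(r"0xC0000005", re.I),
     "STATUS_ACCESS_VIOLATION: the process read or wrote memory it may not access"),
    (re.compile(r"handle is invalid", re.I),
     "ERROR_INVALID_HANDLE (Win32 error 6): a handle was used after being closed "
     "or was never valid"),
    # Assumption: this text corresponds to Win32 error 8 on Windows XP.
    (re.compile(r"Not enough storage is available to process this command", re.I),
     "Likely ERROR_NOT_ENOUGH_MEMORY (Win32 error 8): a memory or resource "
     "allocation failed; not about disk space"),
    # Assumption: OE uses the Delphi runtime, where 204 means invalid pointer operation.
    (re.compile(r"Runtime error 204", re.I),
     "Possibly Delphi runtime error 204: invalid pointer operation"),
]


def explain(message):
    """Return the explanations whose pattern occurs in message (may be empty)."""
    return [text for pattern, text in KNOWN_ERRORS if pattern.search(message)]
```

For example, `explain("Thread Error: The handle is invalid (6)")` returns the ERROR_INVALID_HANDLE entry. All four entries describe memory or handle misuse inside the process, which fits the "thread and handle" pattern noted above.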
Thank you very much for the tests! I will verify the script and video parsing code again.