How to download a page which loads additional content while scrolling.
|Jeff||05/15/2016 08:39 pm|
I would like to download the images linked from the following page: http://imgur.com/r/carporn/new.
The problem is that the page only loads more content as you scroll down while viewing it, so I get an incomplete download. I want to download around the first 10 pages of these car pictures to use as wallpapers. So I looked around this forum and found some other people asking similar things. There were suggestions like this...
"There would be another way to download such sites in 7.0 version - you would be able to browse a web page, load all its elements, scroll it down and then save with all links - exactly what you see online in the Internal browser."
But I cannot figure out how to get that to work. Can you tell me if this is possible and, if so, give me some detailed instructions about how to do it? I certainly don't mind having to open the page and scroll down in the internal browser. I just can't figure out how to get the images to save once that is done.
Thanks in advance,
|Oleg Chernavin||05/15/2016 08:41 pm|
I planned to add the auto-scroll feature. I will work to implement it this week. Is this project urgent for you?
|Jeff||05/16/2016 01:54 pm|
Glad to hear from you. I know the issue is in good hands now. :-)
And I appreciate you asking. This particular project is not urgent at all. Although I will say this is a feature I have wished I had on many previous occasions, as this type of page loading seems to be becoming more widely used.
|Oleg Chernavin||06/04/2016 06:43 pm|
I implemented the auto-scroll feature, and you can now specify the maximum number of scroll attempts.
If the page height stays the same after a scroll attempt, or when the attempts run out, the page will be saved.
Here is the new version to test:
Please let me know how it works. Thank you!
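The stopping rule described above (scroll, compare the page height, stop when it no longer grows or when the attempts run out) can be sketched roughly as follows. This is only an illustration in Python, not the program's actual code: `auto_scroll`, `FakePage`, and the 1000-pixel growth per scroll are hypothetical names and values chosen for the example.

```python
from typing import Callable


def auto_scroll(get_height: Callable[[], int],
                scroll_to_bottom: Callable[[], None],
                max_attempts: int) -> int:
    """Scroll until the page height stops growing or attempts run out.

    Returns the number of scroll attempts that were made.
    """
    attempts = 0
    last_height = get_height()
    while attempts < max_attempts:
        scroll_to_bottom()
        attempts += 1
        new_height = get_height()
        if new_height == last_height:
            # Nothing new was loaded, so the page is ready to be saved.
            break
        last_height = new_height
    return attempts


class FakePage:
    """Hypothetical stand-in for a browser page.

    Each scroll adds 1000 px of content until max_height is reached.
    """

    def __init__(self, height: int, max_height: int) -> None:
        self.height = height
        self.max_height = max_height

    def get_height(self) -> int:
        return self.height

    def scroll_to_bottom(self) -> None:
        self.height = min(self.height + 1000, self.max_height)


if __name__ == "__main__":
    page = FakePage(height=1000, max_height=4000)
    used = auto_scroll(page.get_height, page.scroll_to_bottom, max_attempts=10)
    print(f"scrolled {used} times, final height {page.height}")
```

In a real browser the height check would also need a short wait after each scroll so that newly requested content has time to arrive; the sketch leaves that out.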
|Jeff||06/04/2016 10:15 pm|
Thank you very much. It does seem to work. However, there is an unanticipated side effect: it looks like it tries to open and scroll every single linked page in the download. I was expecting it to open and auto scroll only the URLs specified in the "starting web address" box. Doing it for every linked page seems to create a lot of unnecessary overhead.
Could the open and auto scroll be limited to just the URLs listed in the "starting web address" box? Or even better, could there be a "levels" choice that would limit how many links deep it tries to load and auto scroll the pages?
Again, thank you very much for your work on this.
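The "levels" idea above could look something like the sketch below. It is an assumption about how such a setting might behave, not an existing option: `should_auto_scroll`, `max_scroll_level`, and the linked-page URL are hypothetical, and level 0 is taken to mean the URLs in the "starting web address" box.

```python
def should_auto_scroll(link_depth: int, max_scroll_level: int = 0) -> bool:
    """Decide whether a page should be opened and auto scrolled.

    link_depth is 0 for a starting address, 1 for a page linked from it,
    and so on. Pages deeper than max_scroll_level are downloaded normally.
    """
    return link_depth <= max_scroll_level


if __name__ == "__main__":
    # (url, link depth) pairs; the second URL is a made-up linked page.
    queue = [
        ("http://imgur.com/r/carporn/new", 0),
        ("http://imgur.com/r/carporn/example-image-page", 1),
    ]
    for url, depth in queue:
        mode = "auto scroll" if should_auto_scroll(depth) else "plain download"
        print(f"{url}: {mode}")
```

With the default of 0, only the starting addresses are scrolled, which matches the first suggestion; raising the value gives the deeper control asked for in the second.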
|Oleg Chernavin||06/05/2016 10:34 pm|
OK. I will consider this for future versions.
Can you please tell me what overhead occurs for the other linked pages? If a page doesn't load additional content after scrolling, it should not affect the download.
|Jeff||06/06/2016 11:49 am|
It just loads each linked page multiple times. So if you try a site like this http://imgur.com/r/carporn/new with the level set to 1, it will load each linked page multiple times trying to scroll each one. But the linked pages are simple pages which don't need to be auto scrolled. So the result is the download takes much, much longer.
The difference for that example was 17s using the internal download code versus over 2 minutes using the open page and auto scroll. And that is for a very simple example. For much larger downloads, that difference will grow massively. So it would be very helpful to be able to control how many levels deep it tries to open pages and auto scroll.
This does work, so thanks again. It's just a suggestion that it would be much more useful if there was more control over which pages opened and scrolled.
|Oleg Chernavin||06/06/2016 08:07 pm|
Yes, I see. I plan to add these options. It will require some redesign of the Properties dialog to make room for them.
|Jeff||06/06/2016 08:08 pm|
I just did some more testing and it actually isn't doing what I thought. So it's not actually working. By reloading the pages, it's just starting over again and not getting the additional pages that load when you scroll down.
Try this page instead as this will show the problem better and be easier to work with. http://imgur.com/r/carporn.
First load the page in the internal browser. Then hit CTRL-END. Wait a second and you will see it load more images. Then hit CTRL-END and wait again, a few more times. You will see it add multiple pages of images to that first page.
So this really doesn't need to reload the page multiple times. It just has to do a send keys to go to the bottom of the page and wait a few times and then save the page and links.
And again, if it could just do that for the pages specified in the "starting web address" box, that would be ideal. And this would probably be a lot more efficient and easier to code.
If you have any questions, please feel free to ask.
|Oleg Chernavin||06/06/2016 08:11 pm|
Well, I designed it in another way. No reloading, but just scrolling down.
I just checked it on your site - it scrolled down 10 times and started to save. No redownloading at all.
|Jeff||06/07/2016 01:16 am|
That sounds perfect. Let me know if you have something you want me to test.