Problem with interstitial ad redirect and JavaScript link construction

Author Message
Lars 07/16/2015 02:44 am
I'm attempting to archive the following URL: "http://www.gamebanshee.com/icewinddale/", but I am running into two problems. The first is the redirection from an interstitial (or prestitial) ad, which of course I don't want to have to go through offline. The second is a dynamically constructed link to images in a directory, which OE doesn't seem to be able to parse. The images are actually located in this directory on the server: http://www.gamebanshee.com/icewinddale/equipment/images/. An example image that JavaScript constructs the link for on the fly (together with a served ad, which is the whole point of this dynamically served link) is: http://www.gamebanshee.com/icewinddale/equipment/images/bootsofgrounding.jpg. That whole equipment/images subdirectory is not present in the OE tree. The URL for the popup window that displays the image is actually: "http://www.gamebanshee.com/showshot.php?/icewinddale/equipment/images/bootsofgrounding.jpg"

Is there at least a workaround for downloading and properly linking to those images? There are a large number of them, so downloading them all manually would be difficult. Is there a way to change the link in the parent page so that it points directly to the child image in equipment/images?

As for the interstitial ad, the problem is not only the annoyance of having to view that page offline. If an internet connection is required to get past it, that is exactly what I may not have at all in a few weeks, which is the reason I am considering buying OE and archiving parts of sites like this. I have to be able to navigate the offline site without any internet connection.
Oleg Chernavin 07/16/2015 05:15 pm
I think there are several ways to deal with it. A very simple one is to browse the downloaded site with the Browse With AutoSave button. This way, all missing links and images will be loaded, stored on the disk, and made available offline.

You may also try enabling script calculations (Project Properties dialog - Parsing section); this may help Offline Explorer find more links while downloading the site.

Or we may use the URL Substitutes feature to modify the links or scripts in order to disable them.

Please try one of the first two ways. If they do not work out, please give me more details about where exactly to look for the ads and images, and I will see what else can be done.

Best regards,
Oleg Chernavin
MP Staff
Lars 07/16/2015 11:05 pm
I cannot get the autosave feature to work because I cannot browse any of the site offline. It seems to just get stuck at the initial interstitial ad. On the real web site there is an option to skip the interstitial ad or just wait 30 seconds and after that the link on the site you wanted to go to is allowed to load. The interstitial ad and redirect page is at http://www.gamebanshee.com/interstitial/interstitial.html.

I think a cookie is set after you get through the interstitial page once that allows you to browse the site, but after that expires you cannot browse to any link on the site without being redirected to the interstitial again. I believe this ad has something to do with http://ads.intergi.com.

I tried enabling that parsing option but it still did not seem to parse the dynamically created links and download the images from the directories /icewinddale/equipment/images/ and /icewinddale/spells/images/. Those links are dynamically generated along with some kind of ads by "http://www.gamebanshee.com/javascript/showshot.js" which creates the popup window to display the image in.

I'm not sure how to use URL Substitutes. The idea is to replace one part of a URL with a chosen string? I thought about trying to substitute the dynamic URL of something like "http://www.gamebanshee.com/showshot.php?/icewinddale/equipment/images/bootsofgrounding.jpg" with the static URL of the actual image which would be: "http://www.gamebanshee.com/icewinddale/equipment/images/bootsofgrounding.jpg". Basically the 'showshot.php?' can be deleted and the result will be a valid link to the actual image.
Oleg Chernavin 07/17/2015 06:38 am
OK, it should be easy. Please create a URL Substitutes rule:

URL:
*
Replace:
showshot.php?
With:
(keep this field empty).
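
A rule like this amounts to a plain string replacement applied to each URL. The following Python sketch illustrates the effect only; it is not Offline Explorer's implementation, and the function names `apply_substitute` and `collapse_slashes` are made up for this example. Note that deleting `showshot.php?` leaves a double slash after the host name. Many servers treat that like a single slash, but the optional second function shows one way to normalize it.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def apply_substitute(url: str, replace: str, with_: str = "") -> str:
    """Swap every occurrence of `replace` in `url` for `with_` (empty by default)."""
    return url.replace(replace, with_)


def collapse_slashes(url: str) -> str:
    """Collapse runs of slashes in the path part only, leaving 'http://' intact."""
    parts = urlsplit(url)
    path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


popup_url = (
    "http://www.gamebanshee.com/showshot.php?"
    "/icewinddale/equipment/images/bootsofgrounding.jpg"
)
substituted = apply_substitute(popup_url, "showshot.php?")
image_url = collapse_slashes(substituted)
print(substituted)
print(image_url)
```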

Oleg.
Lars 07/17/2015 08:42 pm
That worked to download the images, but the parent PHP pages still contain JavaScript popup code with relative paths instead of direct HTML links to the relevant images, and the popup code is not finding the images. So in order to actually have the pages link to the images, I'd have to convert all of the JavaScript popup links to standard HTML links. Maybe a mass search-and-replace program could do that, or I could write a program to go through the pages and do it. Is there a way to accomplish this from within OE? It appears that just by using JavaScript links instead of HTML links, sites like this can easily thwart offline browsers like OE from mirroring them effectively. I guess JavaScript parsing is a difficult problem.
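
A mass search and replace of that kind, done outside OE, might look roughly like the Python sketch below. The markup it targets is an assumption: the actual popup code is not shown in this thread, so the `javascript:showshot('...')` link form and the function name `to_plain_links` are hypothetical and would need adjusting to whatever the saved pages really contain.

```python
import re

# Hypothetical link form: href="javascript:showshot('/path/to/image.jpg')"
# The real pages may use a different pattern (for example an onclick handler).
POPUP_HREF = re.compile(
    r"""href=(["'])javascript:\s*showshot\((["'])(?P<path>[^"']+)\2\);?\1""",
    re.IGNORECASE,
)


def to_plain_links(html: str) -> str:
    """Rewrite assumed showshot() popup links into ordinary href links."""
    return POPUP_HREF.sub(lambda m: f'href="{m.group("path")}"', html)


sample = (
    "<a href=\"javascript:showshot('/icewinddale/equipment/images/"
    "bootsofgrounding.jpg')\">Boots of Grounding</a>"
)
converted = to_plain_links(sample)
print(converted)
```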

There is still the issue of the interstitial ad redirection, which seems to stop offline browsing. I tried just deleting the JavaScript file http://www.gamebanshee.com/interstitial/interstitial_gbs.js, which checks for the cookie 'interstitial_gb' and redirects to http://www.gamebanshee.com/interstitial/interstitial.html if it is not present, and that did seem to work for the moment. I'll have to see if it really is that easy. I am wondering what is calling the interstitial_gbs.js function. Just eliminating that call seems like it would be a better way.
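
The check described above can be modelled in a few lines of Python. This is only a sketch of the behavior as described in this post, not the site's actual script: the function name `next_location` is made up, and the cookie value `1` used in the example is an assumption, since the real value and expiry rules are not known here.

```python
from http.cookies import SimpleCookie

INTERSTITIAL_URL = "http://www.gamebanshee.com/interstitial/interstitial.html"


def next_location(requested_url: str, cookie_header: str) -> str:
    """Return the page that would load: the requested one if the
    'interstitial_gb' cookie is present, otherwise the interstitial page."""
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    if "interstitial_gb" in cookies:
        return requested_url
    return INTERSTITIAL_URL


page = "http://www.gamebanshee.com/icewinddale/"
without_cookie = next_location(page, "")
with_cookie = next_location(page, "interstitial_gb=1")  # value is assumed
print(without_cookie)
print(with_cookie)
```

Seen this way, removing the script removes the only place where the redirect decision is made, which is consistent with the deletion appearing to work.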

I would also like to see if I can supply the interstitial_gb cookie that the site looks for when OE fetches pages, so as to skip the interstitial page entirely, through the URL parameter "Cookie=interstitial_gb". But where does the actual cookie have to be stored for the web site to check its date, etc.? Will browsing the site with Internet Explorer place the cookie in the right place?

Lars 07/18/2015 12:24 am
Well, I let OE complete its downloads and then exported the project to an MHT file with the option of disabling popups, and it seems the search and replace is not necessary. The images, at least the ones that were downloaded properly, open in a popup window in Firefox just as on the real site. It also works when I right-click on the highlighted project and browse either in the internal browser or in Firefox.

Simply deleting the ad-serving .js JavaScript files seems to have solved the problem with the interstitial ad. I also blocked them from being downloaded. OE seems to have worked beautifully, in fact, with that difficult site. I'm impressed.
Oleg Chernavin 07/18/2015 07:32 am
I am glad that you found a way to make this site work offline. Anyway, the answers to your previous questions:

There is no such search/replace feature in Offline Explorer for files that are already downloaded. Replacement is only possible during the download, using URL Substitutes.

Yes, heavy scripts that calculate links or use cookies are very hard to handle in an automatic download. Offline Explorer is able to overcome most such issues, but not all of them.

The http://www.gamebanshee.com/interstitial/interstitial_gbs.js file doesn't contain a function. So removing the HTML code like:

<script src="/interstitial/interstitial_gbs.js"></script>

should work. To find all occurrences of this file, use the Search Contents button: type interstitial_gbs.js in the search field, uncheck the "Only in text files" box and check the "Inside HTML tags" box.
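
If the pages are available as plain HTML text (for example after an export), removing that tag could be scripted. The Python sketch below is one possible way under that assumption; the function name `strip_interstitial` is made up, and the pattern only covers a script tag with an empty body, as in the example above.

```python
import re

# Matches <script ... src="...interstitial_gbs.js" ...></script> with either
# quote style and any leading path; other script tags are left untouched.
SCRIPT_TAG = re.compile(
    r"""<script\b[^>]*\bsrc\s*=\s*["'][^"']*interstitial_gbs\.js["'][^>]*>\s*</script>""",
    re.IGNORECASE,
)


def strip_interstitial(html: str) -> str:
    """Remove every script tag that loads interstitial_gbs.js."""
    return SCRIPT_TAG.sub("", html)


sample = (
    '<head><script src="/interstitial/interstitial_gbs.js"></script>'
    '<script src="/javascript/showshot.js"></script></head>'
)
cleaned = strip_interstitial(sample)
print(cleaned)
```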

Offline Explorer uses MS Internet Explorer cookies. The best way to get them is to open the site in the Internal Browser; this sets them all before the download.

Oleg.