Bug- Javascript downloaded- wrong meaning to URL starting with "/"

Author Message
ToolmakerSteve 12/26/2004 01:57 pm
This works correctly when browsing the site directly, by entering this in the OE browser's address bar:
http://www.imageexpress.com/

But bug occurs when browsing with AutoSave:
1. Create a new project.
2. UNCHECK the Levels box.
3. Enter URL:
http://www.imageexpress.com/
4. Click the 'AutoSave' icon in the browser (it stays on).
5. Right-click the project, select 'Browse'.
=> ImageExpress home page appears, with address:
http://127.0.0.1:800/Default/www.imageexpress.com/
6. Enter any search term in Quick Search box; e.g.
christmas
& Go
=> http://127.0.0.1:800/Default/www.imageexpress.com/Search.do@ps=25&k=christmas
with thumbnails of christmas pictures.
7. On this page, enter the same search term in "Keywords"; e.g.
christmas
& Search
=> DOCUMENT NOT FOUND, with address
http://127.0.0.1:800/Search.do?p=0&rs=5&cs=122&pk=christmas&bl=%2FSearch.do%3Fp%3D5%26ps%3D25%26rs%3D5%26cs%3D122%26k%3Dchristmas%26ori%3D0%26cspc%3D0%26clns%3D1%26pk%3Djewish%26sir%3Dfalse&vsImgId=0&k=christmas&clns=1&ty=0&ps=25&x=34&y=5

Whereas doing this directly, at address:
http://www.imageexpress.com/Search.do?ps=25&k=christmas
Keywords "christmas" & Search
=> http://www.imageexpress.com/Search.do?p=0&rs=5&cs=122&pk=christmas&bl=%2FSearch.do%3Fp%3D0%26ps%3D25%26rs%3D5%26cs%3D122%26k%3Dchristmas%26ori%3D0%26cspc%3D0%26clns%3D1%26pk%3Dnull%26sir%3Dfalse&vsImgId=0&k=christmas&clns=1&ty=0&ps=25&x=49&y=12
with those same thumbnails of the page you were already on.

What went wrong:
Javascript contains a URL starting with a '/':
/Search.do
The URL parsing when downloaded interpreted this as starting at the root of the INTERNET folder:
http://127.0.0.1:800/
What it should have done is interpret this as starting at the root of THIS WEBSITE's folder:
http://127.0.0.1:800/Default/www.imageexpress.com/
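
To illustrate the behaviour described above, here is a minimal Python sketch; it is not OE's actual code. It shows that standard URL resolution of "/Search.do" against the mirrored page address lands at the root of the local server, and how a mirror could instead map root-relative links under the site's own folder. The helper name resolve_in_mirror and the idea of passing the site folder explicitly are assumptions made for this example.

```python
from urllib.parse import urljoin

# Addresses taken from the report above.
MIRROR_SITE_ROOT = "http://127.0.0.1:800/Default/www.imageexpress.com/"
MIRROR_PAGE = MIRROR_SITE_ROOT + "Search.do@ps=25&k=christmas"


def resolve_in_mirror(page_url, site_root, link):
    """Hypothetical helper: resolve a link found on a mirrored page."""
    if link.startswith("/") and not link.startswith("//"):
        # Root-relative link: "/" means the root of the mirrored site,
        # not the root of the local server.
        return site_root.rstrip("/") + link
    # Anything else resolves normally against the page address.
    return urljoin(page_url, link)


# Plain resolution reproduces the reported symptom.
print(urljoin(MIRROR_PAGE, "/Search.do"))
# -> http://127.0.0.1:800/Search.do

# The mirror-aware version stays inside the site's folder.
print(resolve_in_mirror(MIRROR_PAGE, MIRROR_SITE_ROOT, "/Search.do"))
# -> http://127.0.0.1:800/Default/www.imageexpress.com/Search.do
```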
ToolmakerSteve 12/26/2004 02:08 pm
Unfortunately, the ONLY way to browse Hemera's Image Express is via these search terms -- so OE is almost useless at this website.

Well, I was able to make SOME use of it, by repeatedly going back to the home page, to enter search terms there, where it works. By doing so I was able to manually build up a list of search pages of terms that I use - thank goodness that the "next" button on the search results didn't hit the same bug!

However, this is extremely inconvenient compared to how one normally uses ImageExpress, which is to do one search term, then pick one photo that looks interesting. That photo will have a list of the search terms it is on - clicking on one of those search terms immediately does THAT search - so one can hyper-navigate around the search space just by clicking. UNFORTUNATELY those keyword links go through the same bugged logic as I describe, so instead I had to copy a keyword & go back to the home page, if I wanted to capture a search on my hard drive.

ToolmakerSteve 12/26/2004 05:16 pm
Once this bug is fixed, OEP will be absolutely invaluable when browsing this website -
which is normally laborious to work with, because it is completely search based:

Just click around [browsing with AutoSave] until you have found all the search terms you care about,
and then edit the list of SingleURLs to remove any unwanted ones
[ImageExpress has a limit of 1000 image downloads per day],
then set Levels appropriately & download.

Work-around:
I am managing okay now, by manually browsing in Internet Explorer; once I decide I want a keyword, I switch over to OEP, enter it on the home page & go there [browsing with AutoSave] -
this builds up a list of SingleURLs with wanted keywords, that can then be easily downloaded.

So having figured out this work-around, I take back my comment about OEP being almost useless w.r.t. this website!
Oleg Chernavin 12/27/2004 06:14 am
Well, this problem with / in the form can be easily avoided by enabling HTML Forms processing in the Properties | Advanced section.

Best regards,
Oleg Chernavin
MP Staff
Steve Shaw 12/27/2004 11:20 pm
No, setting "Explore HTML Forms" has no effect.

Please, I gave you a careful recipe to reproduce the bug. Did you follow those steps, to see it for yourself?

The URL that gets added to the project is:
SingleURL=http://Search.do?ps=25&k=christmas

This is a bug. What the URL should say is:
SingleURL=http://www.imageexpress.com/Search.do?ps=25&k=christmas

What went wrong: the logic for parsing the browsed page is incorrect, for URLs that start with a "/".
Steve Shaw 12/27/2004 11:28 pm
And examining the HTML source for the page, I notice that the URL was formed by Javascript. That may be an important clue to what went wrong.
Oleg Chernavin 12/28/2004 05:05 am
Can you post that piece of the HTML source here? I examined the HTML of that page and I saw that the search form is just an HTML Form with no javascript. Maybe I was looking in the wrong place?

Oleg.
ToolmakerSteve 12/28/2004 01:07 pm
Sorry, I mixed up two different problems in my mind when I said this one had to do with Javascript.

It's simply an action URL that starts with "/".

HTML:
<form name="searchForm" method="GET" action="/Search.do" onsubmit="setpaging();">

BUG: OE didn't prepend the current server ["www.imageexpress.com"], on that action URL starting with a slash ["/Search.do"], to obtain the URL ["http://www.imageexpress.com/Search.do"].
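
As a rough illustration of the expected behaviour, the Python sketch below pulls the action out of the form tag quoted above and resolves it against the page the form was found on, using only the standard library. The class name FormActionFinder and the helper submit_url are made up for this example, and the field values are the ones from the search in the report; this is a sketch of standard URL resolution, not OE's implementation.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

PAGE_URL = "http://www.imageexpress.com/Search.do?ps=25&k=christmas"
FORM_HTML = ('<form name="searchForm" method="GET" action="/Search.do" '
             'onsubmit="setpaging();">')


class FormActionFinder(HTMLParser):
    """Collects the action attribute of every <form> tag."""

    def __init__(self):
        super().__init__()
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.actions.append(dict(attrs).get("action", ""))


def submit_url(page_url, action, fields):
    """Build the GET URL for a form; assumes a non-empty action
    that carries no query string of its own."""
    target = urljoin(page_url, action)  # keeps the page's scheme and host
    return target + "?" + urlencode(fields)


finder = FormActionFinder()
finder.feed(FORM_HTML)
action = finder.actions[0]
print(submit_url(PAGE_URL, action, {"ps": 25, "k": "christmas"}))
# -> http://www.imageexpress.com/Search.do?ps=25&k=christmas
```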
Oleg Chernavin 12/28/2004 01:15 pm
Yes, this form action problem gets fixed by setting the "Explore HTML forms" box in Properties | Advanced. You will need to redownload the page(s) for the links to be changed.

Oleg.
ToolmakerSteve 12/28/2004 01:54 pm
I understand - but did you actually try this, to see whether it worked on your computer?

I set Properties / Download All Files.
I checked "Explore All HTML Forms".

The problem persists.
ToolmakerSteve 12/28/2004 01:57 pm
Oh, and note that this is with "Levels" disabled, and with "AutoSave" active in the browser pane.

I'm going to try deleting the old downloaded files completely to see if that helps [though "download all" should mean this isn't needed, right?]

ToolmakerSteve 12/28/2004 03:15 pm
There's something peculiar about "browsing with AutoSave" and URL parsing.

I've now experimented on three different computers, installing OE Pro on each.
All computers are running Win XP with Service Pack 2, and all current critical patches.
A & B have antivirus software, C does not.
Different file locations were set on the different computers.

I've gotten a total of four different results, by slightly varying what I was doing, or which computer I was using.

On Computer A [my main PC], I was seeing the problem I described [search worked on home page; failed when searched from the search page].

However, this site now works perfectly on Computer A - after deleting the project and its files. (I also closed all MSIE windows, and cleared all COOKIES -- don't know if this helped.)

On Computer B [my laptop], turning on AutoSave, then saying "Browse", showed a list of the downloaded files for that website. That is, there had been a previous project that had downloaded some files. Instead of downloading the "index.jsp" page that is the home page for this site, it acted like Windows Explorer showing a folder. Hmm, maybe I forgot to click on "AutoSave".

After deleting that old downloaded website on Computer B, I can now see the home page, but entering any search term yields that incorrect "/Search.do" URL -- something I previously hadn't seen happen until the second page. Nothing I do improves the situation.

On Computer C [used for backup & testing - has a fresh install of Win XP + SP 2], again I get the incorrect search "/Search.do" on the home page.

ToolmakerSteve 12/28/2004 03:31 pm
Even weirder:

Now on Computer A, the search on the home page works, but attempting to search from the search page yields "Page Not Found". It's as if AutoSave is NOT active -- but it looks like it's active (the blue floppy icon is pressed).

Then I looked at the project, copied the URL from the Single URL line that was added to do this search, pasted that into a new OE Browser window & went there.

So there's nothing wrong with the URL when used in the address bar. And I double-checked that "Explore HTML Forms" is set. (Actually, I already knew it was set, because the Single URL that was added to the project had prepended www.imageexpress.com).

http://www.imageexpress.com/Search.do?p=0&rs=5&cs=122&pk=kiss&bl=/Search.do?p=0&ps=25&rs=5&cs=122&k=kiss&ori=0&cspc=0&clns=1&pk=null&sir=false&vsImgId=0&k=kiss&clns=1&ty=0&ps=25&x=45&y=12

- - - - -
I have been careful to verify that "Explore HTML Forms" is set each time on each computer.

Oleg Chernavin 12/29/2004 04:53 am
I worked on that and I have found that setting "Explore HTML Forms" correctly changes the form action URL when using AutoSave. But OE sometimes starts loading weird URLs. I used Stop and then browsed from the Map as a workaround. I will investigate this further and try to fix it.

Oleg.
ToolmakerSteve 12/29/2004 02:13 pm
Try AutoSave browsing of a project pointing at
http://www.hemera.com/

[WITHOUT having previously downloaded that page.]

I see that "blank image" I mentioned in another thread.

Watching URLs go by with a delay set, I see that again this involves URLs starting with "/", being requested without the prefix "www.hemera.com".

NOTE: This is without setting "Explore HTML Forms".
ToolmakerSteve 12/29/2004 02:20 pm
> [WITHOUT having previously downloaded that page.]

OOPS: WITHOUT having previously downloaded the index page, what gets shown is a "Windows Explorer" view of the files in that folder.

WITH having downloaded, so there is a "default.htm", the symptom I describe is shown.

Maybe what's going on is that the "www.hemera.com" got stripped, because of confusion between references to "www.hemera.com" and references to "www.hemera.com/default.htm". Perhaps OE was stripping the page name, but there was no page name to strip? Hmm. Or since I saw this in an MSIE downloaded file, maybe the bug is in MSIE or javascript.
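
The "no page name to strip" guess can be made concrete with a small Python sketch. It is purely illustrative of the hypothesis and is not known to reflect what OE does; both helper names are invented. A base-folder helper that just cuts at the last "/" works when a page name is present, but eats the host name when the URL has no path at all, whereas splitting the URL into its components first avoids that.

```python
from urllib.parse import urlsplit, urlunsplit


def naive_base(url):
    """Hypothetical buggy helper: cut everything after the last "/"."""
    return url.rsplit("/", 1)[0] + "/"


def safe_base(url):
    """Strip the page name from the path only, never from the host."""
    parts = urlsplit(url)
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return urlunsplit((parts.scheme, parts.netloc, directory, "", ""))


print(naive_base("http://www.hemera.com/default.htm"))
# -> http://www.hemera.com/
print(naive_base("http://www.hemera.com"))
# -> http://   (the host name is gone)
print(safe_base("http://www.hemera.com"))
# -> http://www.hemera.com/
```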

ToolmakerSteve 12/29/2004 02:57 pm
Discovered true name of home page is "index.jsp".
Tried a new project with URL
http://www.hemera.com/index.jsp

It behaved the same; so it doesn't matter whether the home page name is given or is omitted [http://www.hemera.com/] [or even http://www.hemera.com -- no slash at end]

Here is the precise behaviour, on Computer A.

In MSIE address bar:
WORKS: both large images present.

In OE address bar:
FIRST large image MISSING [this is rotated via javascript],
second large image present [this is static].

In OE, via Autosave browsing:
BOTH images MISSING.

And in OE, downloading - well, set Level 0, and after downloading the first few URLs for the page with a delay of 1, pause, and look at the URLs that get listed.

A few weird URLs. Examples:
http://www.hemera.com/://media101.sitebrand.com/v3/://media101.sitebrand.com/v3/envoy.sb?sbaid=765811&sbbn=

http://www.hemera.com/images/home/promos/://media101.sitebrand.com/v3/envoy.sb?sbaid=765811&sbbn=

And I don't see ANY mention of a reasonable URL for that first image; unless it was one of the weird ones, and I just didn't recognize it.

Hmm. And I don't think those ..."status=Done;separatorsize=1" URLs that eventually appear should be asked for at all: I think those should be "Level 1", not "Level 0" - but I verified that I have "Level: Limit 0". Or maybe it's sloppy coding by the author of the common code used by this page, and the home page is making references that would only make sense inside the membership area...
Oleg Chernavin 01/05/2005 10:47 am
Both weird URLs are from a script that calculates them from several parts. I will try to prevent these URLs from downloading.

Oleg.
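
The reply above says the script computes these URLs from several parts. One way this kind of address could arise, offered here only as an assumption, is that a link scanner picks up a partial string from the script and appends it to the page's folder. The short Python sketch below reproduces the second weird URL that way and shows a simple check that could flag such results. The function names, and the idea that a fragment beginning with "://" was extracted, are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical fragment: the tail of a URL whose scheme the script
# would have supplied from another variable at run time.
FRAGMENT = "://media101.sitebrand.com/v3/envoy.sb?sbaid=765811&sbbn="


def naive_join(base_dir, fragment):
    """Treat any extracted string as a path relative to base_dir."""
    return base_dir + fragment


def looks_malformed(url):
    """Flag URLs whose path contains a stray "://"."""
    return "://" in urlsplit(url).path


weird = naive_join("http://www.hemera.com/images/home/promos/", FRAGMENT)
print(weird)
print(looks_malformed(weird))                              # True
print(looks_malformed("http://www.hemera.com/index.jsp"))  # False
```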