javascript links

Author Message
Joane 05/20/2010 06:41 pm
It has become more of a rule than an exception for webmasters to thwart crawlers with javascript links. OEE has always been great at parsing javascript links. But recently I have started to come across more and more sites that use these malignant javascripts, where everything fails.
I'll give the following example.
1. The script behind the button for turning pages in a book is simple and looks like this: onclick="return nextpage('previouspage');".
2. OEE parses it easily on the starting page.
3. URLs for all the pages are simple (i.e. http://somesite.com/page{:1..300}.html).
4. But you can't access these pages until you have clicked the button (obviously some kind of counter, maybe a cookie, changes when you click it).
5. Simply downloading the starting page again does not work because it resets the counter to the first page.
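
The situation in steps 3 to 5 can be sketched in Python roughly as follows. This is only an illustration, assuming the site keeps its page counter in a session cookie (which is a guess, as step 4 says). The names `page_urls` and `make_session_opener` are hypothetical, and none of this is OEE functionality. Whether fetching the pages in order with one shared cookie jar is enough depends on how the script actually advances the counter.

```python
import http.cookiejar
import urllib.request


def page_urls(first=1, last=300):
    """Expand the page{:1..300}.html pattern into explicit URLs."""
    return [f"http://somesite.com/page{n}.html" for n in range(first, last + 1)]


def make_session_opener():
    """Return an opener that keeps cookies between requests.

    Assumption: the server stores the page counter in a cookie, so the
    same cookie jar has to be reused for every request in the sequence.
    """
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


urls = page_urls()
opener, jar = make_session_opener()
print(len(urls), urls[0], urls[-1])
# Fetching would then look like: for url in urls: opener.open(url).read()
# (not executed here; somesite.com is the placeholder host from this post)
```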

So, is it possible to make OEE parse a set of arbitrary javascript links?
Say a list:
onclick="return nextpage('1');"
onclick="return nextpage('2');"
onclick="return nextpage('3');"
...
all on the same page that is open in the internal browser.
Maybe it could be implemented through OLE automation in the future?

Thanks. Joane.
Michel 05/21/2010 04:42 am
It would be awesome if scripts could be invoked from the project. OE already knows how to handle javascript so it seems like a natural extension of its capabilities.
Oleg Chernavin 05/21/2010 06:10 am
First of all, please allow script calculations in the Project Properties - Parsing section. There is a chance that this code will work.

However, the following situations cannot be handled yet: when a script changes an HTML form and submits it (except __doPostBack links, which OE can handle pretty well), and AJAX requests.

Regarding invoking the scripts - what do you mean?

Best regards,
Oleg Chernavin
MP Staff
Joane 05/21/2010 10:15 am
Thank you for your reply, Oleg.

It turns out that javascript is responsible for making AJAX requests. After investigating a little more I found out that in the POST requests the values are delimited by a 0x0D or 0x0A character (not sure which one).
Is there a way to make a POST request with no parameters but only values delimited by 0x0D or 0x0A? (I always used POST only like this: POST param1=value1;param2=value2).
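
To make the question concrete, here is a minimal Python sketch of the kind of request I mean: a POST whose body holds bare values joined by a delimiter instead of param=value pairs. The helper name `build_raw_post`, the `/ajax` path and the `text/plain` content type are assumptions for illustration only; the real site may expect something different, and the delimiter could be 0x0D rather than 0x0A.

```python
import urllib.request


def build_raw_post(url, values, delimiter=b"\n"):
    """Build (but do not send) a POST whose body is bare values joined by a delimiter."""
    body = delimiter.join(v.encode("utf-8") for v in values)
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # Assumed content type; the real endpoint may require another one.
        headers={"Content-Type": "text/plain; charset=utf-8"},
    )


# Hypothetical endpoint; 0x0A is the default delimiter, pass b"\r" for 0x0D.
req = build_raw_post("http://somesite.com/ajax", ["value1", "value2"])
print(req.get_method(), req.data)
```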

Thanks. Joane.
Joane 05/21/2010 12:11 pm
Basically I'm talking about sending an Ajax POST request with OEE.
Oleg Chernavin 05/21/2010 12:38 pm
You may supply the request in the URLs field of the Project this way:

http://www.server.com/file.asp
POST=param1=value&param2=value2

It doesn't support the 0x0A and 0x0D symbols yet, but if necessary, I can add this.
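
For readers who want to see what such an entry amounts to on the wire, the two lines above can be read as an ordinary form-encoded POST. The Python sketch below is one interpretation of that format, not a description of how OE processes it internally; `request_from_entry` is a hypothetical helper, and the request is only built, never sent.

```python
from urllib.parse import parse_qsl, urlencode
import urllib.request


def request_from_entry(entry):
    """Turn a URL line followed by a 'POST=...' line into a Request object."""
    url, post_line = [line.strip() for line in entry.strip().splitlines()]
    if not post_line.startswith("POST="):
        raise ValueError("second line must start with POST=")
    # Split the part after POST= into (name, value) pairs.
    pairs = parse_qsl(post_line[len("POST="):], keep_blank_values=True)
    # Re-encode the pairs as a standard form body.
    body = urlencode(pairs).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST"), pairs


entry = """
http://www.server.com/file.asp
POST=param1=value&param2=value2
"""
req, pairs = request_from_entry(entry)
print(pairs)
print(req.data)
```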

Oleg.
Joane 05/21/2010 01:40 pm
Thanks for the advice.

Adding support for 0x0A and 0x0D characters would be really great.
But I think that there are some other things that should be considered so that OEE could make different kinds of POST requests. If you are interested I could send you an email with my ideas and the types of Ajax requests that I have encountered.

Regards. Joane.
Oleg Chernavin 05/21/2010 02:04 pm
Yes, I am really curious. This AJAX thing has bothered me for a few years already and I still have no idea how to handle it. Maybe your e-mail will give me some hints. I will now write an e-mail to the address you left in this forum.

Oleg.