Links containing an ampersand

Author Message
Daniel Lacroix 06/21/2004 08:02 am
Did you know that links within a page that had "&amp;" in them are being stripped to only "&"?

Since I am using your software for validating our pages, these pages no longer validate...

Daniel.
Oleg Chernavin 06/21/2004 08:08 am
Daniel,

Can you give me a few examples of such links? In most links these &amp; combinations should be converted to & symbols. Perhaps there are some rare examples where this conversion should not be done.

Thank you!

Best regards,
Oleg Chernavin
MP Staff
Daniel Lacroix 06/21/2004 08:36 am
Oleg,
I am talking about any link that passes more than one parameter. We code them with &amp; for page validation purposes.
Here is an example: http://www.cio-dpi.gc.ca/cio-dpi/index_e.asp.

When this page is downloaded, the &amp; in links is replaced with &, which no longer validates.

Daniel.
Oleg Chernavin 06/21/2004 10:03 am
I just loaded that page with its links. The two links under "Find Information" contain & symbols. After downloading the site, the &amp; symbols were converted to & symbols in the links, but the links work well offline, both within Offline Explorer and directly from disk.

Can you tell me what kind of problem you are experiencing?

Oleg.
Daniel Lacroix 06/21/2004 10:43 am
Oleg,
I am running page validation in batch mode on the files downloaded with OEP.

Looking at the page source of the pages downloaded with OEP, the &amp; is converted to & in links with multiple parameters.

I am wondering: if the original page source had links coded with &amp;, can the page source downloaded by OEP keep them?

Daniel.
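
A minimal sketch of the kind of batch check described above, assuming the goal is to list links whose raw source contains a bare & instead of &amp;. The function name, the regular expressions, and the sample URLs are illustrative assumptions; this is not the validator actually used, nor anything built into Offline Explorer.

```python
import re

# Matches href="..." or href='...' and captures the raw, still-escaped value.
HREF_RE = re.compile(r'href\s*=\s*(["\'])(.*?)\1', re.IGNORECASE | re.DOTALL)

# A bare ampersand: "&" that does not start a named or numeric character reference.
BARE_AMP_RE = re.compile(
    r'&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)'
)


def find_unescaped_links(html_source):
    """Return the raw href values that contain at least one bare '&'."""
    return [
        match.group(2)
        for match in HREF_RE.finditer(html_source)
        if BARE_AMP_RE.search(match.group(2))
    ]


# Hypothetical page fragment: one correctly escaped link, one bare link.
sample = (
    '<a href="search.asp?lang=e&amp;q=test">escaped</a>\n'
    '<a href="search.asp?lang=e&q=test">bare</a>\n'
)
print(find_unescaped_links(sample))
```

The check works on the raw source on purpose: an HTML parser would decode &amp; to & before returning the attribute value, which would hide exactly the difference being looked for.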
Oleg Chernavin 06/22/2004 07:31 am
I see. Is it urgent for you to have this corrected? Would a month be too long?

Oleg.
Daniel Lacroix 06/22/2004 04:03 pm
Great.
I can wait a month.

In the meantime, I have looked at a previous web site download and noticed that OEP used to put the &amp; in links. Something must have changed with the newer releases.

Daniel.
Oleg Chernavin 06/23/2004 02:08 am
Thank you! I am travelling now. I will return from my trip in two weeks and will work on it then. I will e-mail you directly once I have results.

Oleg.
Daniel Lacroix 08/19/2004 03:21 pm
Oleg,

Is this request still on your todo list?

Thanks.

Daniel.
Oleg Chernavin 08/20/2004 08:21 am
Daniel,

I reviewed the source code related to changing links when translating them for offline use. It looks like it will be quite hard to preserve &amp; symbols, at least in cases where some symbols in a URL are & and others are &amp;.

If it is really important, I can add an option to change all & symbols in offline links to &amp;, even those that were not coded this way online. It should work for this site.

Oleg.
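
A minimal sketch of what an "escape every & in a link" option could look like, assuming that ampersands already written as character references must be left alone so they are not escaped twice. The function name and the sample link are hypothetical; this is not Offline Explorer's code.

```python
import re

# "&" that is not already the start of a named or numeric character reference.
BARE_AMP_RE = re.compile(
    r'&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)'
)


def escape_bare_ampersands(href):
    """Rewrite every bare '&' in a link as '&amp;' without double-escaping."""
    return BARE_AMP_RE.sub('&amp;', href)


# Hypothetical link that mixes both spellings, the hard case mentioned above.
mixed = 'page.asp?a=1&amp;b=2&c=3'
print(escape_bare_ampersands(mixed))  # page.asp?a=1&amp;b=2&amp;c=3
```

Running the function on its own output changes nothing. One consequence of this approach is that the result no longer shows which links were bare in the original page.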
Daniel Lacroix 08/20/2004 09:28 am
This would not help, since I am trying to find where in links the &amp; is coded as &.

OEP 2.9 used to keep the &amp; within an offline or online link in the downloaded page source.

Daniel.
Oleg Chernavin 08/20/2004 09:35 am
Yes, but many things have been changed and optimized since version 2.9. It would take a lot of time and effort to revert.

Oleg.
Daniel Lacroix 01/20/2011 10:06 am
Hi Oleg,

I wonder if you could have another look at this.

thanks.

Daniel.
Oleg Chernavin 02/08/2011 02:13 pm
Sorry for the late answer. I tried the link above to have it as a test page, but the site doesn't respond. Do you have some other page, so I could work on this option?

Thank you!

Oleg.
Daniel Lacroix 01/26/2012 11:44 am
Hi Oleg,

When EXPORTING a project with KeepAMP, I noticed the following:

From this page
http://www.expenditurereview-examendesdepenses.gc.ca/as-se-fra.asp

The link under the black Recherche button keeps the &amp; characters, but in the second link in the sidebar they are removed while exporting.

It seems that if the href contains the "http://" characters, the amp; characters are kept, but if it is a relative link like the one in the sidebar, they are removed.

Can you have another look?

Thanks.

Daniel.
Oleg Chernavin 01/26/2012 01:20 pm
Actually, I didn't make the keep amp option for the export. Do you need it?

Oleg.
Daniel Lacroix 01/26/2012 01:29 pm
Yes, this would be helpful in our page validation process.

Daniel.
Oleg Chernavin 01/28/2012 11:53 am
OK. I made this. Here is the updated oe.exe file version:

http://www.metaproducts.com/download/betas/OEP3720.zip

Oleg.
Daniel Lacroix 01/30/2012 09:45 am
Thanks Oleg,

I tried the updated version on a small site and it's working on the downloaded files as well as on the exported files.

Thanks again.

Daniel.
Oleg Chernavin 01/31/2012 05:27 am
OK. Very good!

Oleg.
Daniel Lacroix 04/12/2012 11:01 am
Hi Oleg,

I just noticed that when using KeepAMP as a command parameter, the exported pages keep the amp; but the links that contain them no longer get .htm appended for a downloaded page.

I can send you directly a link to a small web site for testing.

Daniel
Oleg Chernavin 04/12/2012 12:40 pm
Yes, please!

Oleg.
Daniel Lacroix 04/13/2012 08:40 am
Oleg,

I have received and tested the newer version you sent me, and it's working.

Thanks again for your great support.

Daniel.
Oleg Chernavin 04/13/2012 12:39 pm
You are welcome!

Oleg.
Daniel Lacroix 04/18/2012 10:57 am
Oleg,

I now notice that if a URL with a query string is longer than about 175 characters, the downloaded page's filename gets truncated and a series of 10 letters or digits is appended.

This can be tested with the same domain I forwarded to you before.

Can you have a look?

Thanks.

Daniel.
Oleg Chernavin 04/18/2012 11:04 am
Daniel,

Yes, this is normal. Windows doesn't allow very long filenames to be created. So, Offline Explorer truncates them and adds a unique code to make sure the truncated names are different.

This feature has existed in Offline Explorer for many years.

Oleg.
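
A minimal sketch of the truncate-and-add-a-code idea described above. The length limit, the code length, and the use of an MD5 digest are assumptions made for illustration (the numbers simply echo the "about 175 characters" and "10 characters or digits" reported earlier); Offline Explorer's actual scheme is not documented in this thread.

```python
import hashlib

MAX_NAME_LEN = 175  # assumed limit, echoing the figure reported above
CODE_LEN = 10       # assumed code length, echoing the figure reported above


def short_filename(name):
    """Return name unchanged if short enough; otherwise truncate it and
    append a code derived from the full name so results stay distinct."""
    if len(name) <= MAX_NAME_LEN:
        return name
    code = hashlib.md5(name.encode('utf-8')).hexdigest()[:CODE_LEN]
    return name[:MAX_NAME_LEN - CODE_LEN] + code


# Two hypothetical long names that differ only at the very end.
long_a = 'x' * 200 + 'a'
long_b = 'x' * 200 + 'b'
print(len(short_filename(long_a)))
print(short_filename(long_a) == short_filename(long_b))
```

Because the code is computed from the full, untruncated name, two long names that share the same first characters still end up as different files.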
Daniel Lacroix 04/18/2012 11:18 am
Oleg,

It seems the links within the project pages have a different unique code for those long URLs, possibly caused by the extra "amp;" characters in the URL itself.

Daniel.
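
A minimal sketch of how such a mismatch could arise, assuming the unique code is derived from the URL text: the escaped and unescaped spellings of the same URL are different strings, so they yield different codes unless the link is decoded first. The function name, the sample URLs, and the MD5-based code are hypothetical; they only illustrate the suspected cause, not Offline Explorer's implementation.

```python
import hashlib
import html


def unique_code(url):
    """Derive a 10-character code from the URL text (assumed scheme)."""
    return hashlib.md5(url.encode('utf-8')).hexdigest()[:10]


# The same hypothetical URL as written in the page source and as requested.
url_in_link = 'report.asp?lang=e&amp;id=42'
url_requested = 'report.asp?lang=e&id=42'

# Codes computed from the two spellings directly.
print(unique_code(url_in_link), unique_code(url_requested))

# Decoding the character reference first makes both sides hash the same text.
print(unique_code(html.unescape(url_in_link)) == unique_code(url_requested))
```

Under this assumption, the saved file and the link pointing to it only agree when both codes are computed from the same spelling of the URL, whichever spelling is then written into the page.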
Oleg Chernavin 04/18/2012 12:17 pm
OK. Fixed that. Here is the update:

http://www.metaproducts.com/download/betas/OEP3767.zip

Thank you!

Oleg.
Daniel Lacroix 04/18/2012 01:14 pm
You are good! It works.

Thanks again.

Daniel.
Oleg Chernavin 04/18/2012 02:27 pm
You are welcome!

Oleg.
Daniel Lacroix 04/19/2012 02:59 pm
Oleg,

I notice the following:

A page downloads without being recoded with a unique code when its filename is short enough. But when a link to that page contains amp;, the link gets recoded with a unique code, which makes the page unbrowsable.

I was able to reproduce this behavior on the same web site you are using for testing this.

Can you have a look?

Thanks.

Daniel.
Oleg Chernavin 04/20/2012 04:43 pm
Oh, sorry for that!

http://www.metaproducts.com/download/betas/OEP3769.zip

Oleg.
Daniel Lacroix 04/24/2012 10:22 am
Thanks Oleg.

I have found other issues, but I wonder if they are caused by the way our application works.

I will keep you posted.

Daniel.
Oleg Chernavin 04/24/2012 11:05 am
Yes, please keep me informed.

Oleg.