Unicode URLs

Author Message
waterliner 08/18/2008 07:30 pm
Hello.

How can I download a website whose URLs and filenames contain Unicode symbols AS IS?

%D0%92%D0%B5% - no!

E.g. Wikipedia projects or others.
Oleg Chernavin 08/19/2008 10:16 am
Yes, I tried to download the following page:

http://ru.wikipedia.org/wiki/%D0%9E%D1%84%D1%84%D0%BB%D0%B0%D0%B9%D0%BD-%D0%B1%D1%80%D0%B0%D1%83%D0%B7%D0%B5%D1%80%D1%8B

Offline Explorer Pro 5.1 has no problems downloading it and other links from this page (also in Unicode).

Best regards,
Oleg Chernavin
MP Staff
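
For reference, a minimal Python sketch of turning such a percent-encoded URL back into readable text. It assumes the escapes are UTF-8, as they are on Wikipedia; `readable_name` is a hypothetical helper for illustration and not part of Offline Explorer.

```python
from urllib.parse import unquote, urlsplit


def readable_name(url: str) -> str:
    """Return the last path segment of `url` with percent-escapes decoded as UTF-8."""
    path = urlsplit(url).path
    last_segment = path.rsplit("/", 1)[-1]
    # errors="replace" keeps the function from failing on escapes that are not valid UTF-8
    return unquote(last_segment, encoding="utf-8", errors="replace")


url = (
    "http://ru.wikipedia.org/wiki/"
    "%D0%9E%D1%84%D1%84%D0%BB%D0%B0%D0%B9%D0%BD-"
    "%D0%B1%D1%80%D0%B0%D1%83%D0%B7%D0%B5%D1%80%D1%8B"
)
print(readable_name(url))  # Оффлайн-браузеры
```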
waterliner 08/19/2008 05:21 pm
Perhaps text on pages & navigation is OK.

But URLs themselves are terrible!
I need all page names and URLs in the original language rather than as codes.

As I understand it, that's impossible?
waterliner 08/19/2008 05:22 pm
I mean the original directory structure.
Oleg Chernavin 08/20/2008 04:50 am
Yes, I understand. The problem is that some filesystems may not be compatible with Unicode filenames. This is why we decided to use only ASCII symbols in generated filenames for better compatibility.

Oleg.
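
To illustrate the trade-off described above, here is a rough Python sketch of one way to map a Unicode name to an ASCII-only filename and back. This is an assumption for illustration only, not how Offline Explorer actually generates filenames; `ascii_safe_name` and `original_name` are hypothetical helpers.

```python
from urllib.parse import quote, unquote


def ascii_safe_name(name: str) -> str:
    """Percent-encode everything except ASCII letters, digits and '_.-~'."""
    # safe="" also escapes "/", so the result is a single path component
    return quote(name, safe="")


def original_name(ascii_name: str) -> str:
    """Reverse ascii_safe_name, assuming UTF-8 escapes."""
    return unquote(ascii_name)


encoded = ascii_safe_name("Оффлайн-браузеры")
print(encoded.isascii())                               # True
print(original_name(encoded) == "Оффлайн-браузеры")    # True
```

The ASCII form works on any filesystem, at the cost of names that are unreadable in a file manager, which is exactly the complaint in this thread.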
waterliner 08/20/2008 09:51 pm
Very strange. I have read this:
...
OFFLINE EXPLORER PRO DEVELOPMENT HISTORY
19.11.2007 - Offline Explorer Pro 4.9.2670 Release
...
* Improved unicode symbols support in URLs
...
http://www.metaproducts.com/mp/Offline_Explorer_Pro-history.htm?page=2

Is this feature no longer available?
Oleg Chernavin 08/21/2008 06:05 am
Yes, there was a problem making correct links to such files.

Oleg.
Siggi 08/18/2016 10:42 pm
Hello,

8 years later, it seems we still have the same problems/bugs.
Yesterday I downloaded some old-fashioned websites (these sites put, among other things, the page identifier AFTER the html tag, like this):

<code>
<html>
<head>
<title>Gallery 1 </title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
</code>

And these old-fashioned pages use plain characters like "|" or "*" in the <a> tag.

The newest OE Pro (7.2.0.4514) doesn't download all pages correctly! More than 50% of the content has bugs upon bugs (mojibake in the HTML files)! I tried with and without the option to use Unicode (or UTF-8) for URLs. Same result.

Now I use OE Pro 6.9.0.4208 and everything seems to work well :-)

What is the matter with this spooky mojibake?
Oleg Chernavin 08/18/2016 10:43 pm
I will try to investigate this. Can you please tell me a URL of such web page?

Also, can you try to right-click the page while browsing offline and choose a different encoding? Would this fix the issue?

Thank you!

Oleg.
Siggi 08/19/2016 04:48 am
Hello,

It's, among other things, a member site with photo and video content: http://www.fetishlady-anja.de/
I found an example, but unfortunately for you, this HTML file is in the member section.

Original file name: a-f-lack0090.html (all of the following code listings are versions of this same file)

#1) Code as downloaded by OE Pro 7.2.0.4514:
!!!!!!!!!!<CODE>!!!!!!!!!!!!!!!!!
<html>
<head>
<title>Lacquer Gallery =nbCeatiseta http-equiv="Content-Typel content="text/\a-<; charset=iso-8859-1"ati/l>
<he
<body bgcolor="#FFFFFF" text="#000096" background="images/l_3E_3Cer.jpg" bgproperties="fixedl leftmargin="0" topmargin="0" marginwidth="0" marginheight="0" link="#000096" vlink="#000096" alink="#000096"headable width="100%" bor<er="0" cellspacing="0" cellpadding="0">6 i%">6 6 id>&nbsp;ry d">6 /i%"> /idabl"headable width="100%" bor<er="0" cellspacing="0" cellpadding="0">6 i%">6 6 id" algin=ctener"><b%">6 6 6 idable bor<er="0" cellspacing="0" cellpadding="0">6 6 6 6 ir" algin=ctener"> >6 6 6 6 6 id><fcon face="Arial, Helvetica, sans-serifl size="2"><ta ref="a-f-lbac0081./\a-">first</a> | <ta ref="a-f-lbac0089./\a-">previous</a> | <ta ref="a-f-lbac0091./\a-">ntex</a> | <ta ref="a-f-lbac0112./\a-">last</a> | <ta ref="index./\a-">home</a></fcon><b%"<b%"<y d">6 6 6 6 /tr">6 6 6 6 ir> >6 6 6 6 6 id" algin=ctener"><img srcd="imagesa-f-lbac0090r.jpg"<y d">6 6 6 6 /tr">6 6 6 6 ir> >6 6 6 6 6 id" algin=ctener"><b%"<fcon face="Arial, Helvetica, sans-serifl size="2"></fcon>&nbsp;&nbsp;rfcon face="Arial, Helvetica, sans-serifl size="-1"10 of 32</fcon>&nbsp;&nbsp;rfcon face="Arial, Helvetica, sans-serifl size="-1"</fcon><y d">6 6 6 6 /tr">6 6 6 /idabl"<b%"<y d">6 /i%"> /idabl"headable width="100%" bor<er="0" cellspacing="0" cellpadding="0">6 i%">6 6 id>&nbsp;ry d">6 /i%"> /idabl"he/<bod"ati/\a-<h
!!!!!!!!!!<CODE />!!!!!!!!!!!!!!!!!

As you can see: mojibake, because the type identifier (in this case: html) comes after the opening html tag. I think this is the reason.

#2) Code downloaded with Firefox 48 (first: show source code, then save file as ...; you will find German annotations and notes in the HTML file, not from me but from the browser)
!!!!!!!!!!<CODE>!!!!!!!!!!!!!!!!!
<!DOCTYPE html>
<html><head>
<meta http-equiv="content-type" content="text/html; charset=windows-1252"><title>http://www.fetishlady-anja.de/member/galleries/LacquerGallery1/a-f-lack0090.html</title><link rel="stylesheet" type="text/css" href="a-f-lack0090_new-Dateien/viewsource.css"></head><body contextmenu="actions" id="viewsource" class="highlight" style="-moz-tab-size: 4"><pre id="line1"><span></span><span title="Start-Tag wurde entdeckt, ohne dass ein Doctype zuerst gesehen wurde. “<!DOCTYPE html>” erwartet." class="error"><<span class="start-tag">html</span>></span><span>
<span id="line2"></span></span><span><<span class="start-tag">head</span>></span><span>
<span id="line3"></span></span><span><<span class="start-tag">title</span>></span><span>Lacquer Gallery 1 </span><span></<span class="end-tag">title</span>></span><span>
<span id="line4"></span></span><span><<span class="start-tag">meta</span> <span class="attribute-name">http-equiv</span>="<a class="attribute-value">Content-Type</a>" <span class="attribute-name">content</span>="<a class="attribute-value">text/html; charset=iso-8859-1</a>"></span><span>
<span id="line5"></span></span><span></<span class="end-tag">head</span>></span><span>
<span id="line6"></span>
<span id="line7"></span></span><span><<span class="start-tag">body</span> <span class="attribute-name">bgcolor</span>="<a class="attribute-value">#FFFFFF</a>" <span class="attribute-name">text</span>="<a class="attribute-value">#000096</a>" <span class="attribute-name">background</span>="<a href="view-source:http://www.fetishlady-anja.de/member/galleries/LacquerGallery1/images/header.jpg" class="attribute-value">images/header.jpg</a>" <span class="attribute-name">bgproperties</span>="<a class="attribute-value">fixed</a>" <span class="attribute-name">leftmargin</span>="<a class="attribute-value">0</a>" <span class="attribute-name">topmargin</span>="<a class="attribute-value">0</a>" <span class="attribute-name">marginwidth</span>="<a class="attribute-value">0</a>" <span class="attribute-name">marginheight</span>="<a class="attribute-value">0</a>" <span class="attribute-name">link</span>="<a class="attribute-value">#000096</a>" <span class="attribute-name">vlink</span>="<a class="attribute-value">#000096</a>" <span class="attribute-name">alink</span>="<a class="attribute-value">#000096</a>"></span><span>
<span id="line8"></span></span><span><<span class="start-tag">table</span> <span class="attribute-name">width</span>="<a class="attribute-value">100%</a>" <span class="attribute-name">border</span>="<a class="attribute-value">0</a>" <span class="attribute-name">cellspacing</span>="<a class="attribute-value">0</a>" <span class="attribute-name">cellpadding</span>="<a class="attribute-value">0</a>"></span><span>
<span id="line9"></span> </span><span><<span class="start-tag">tr</span>></span><span>
<span id="line10"></span> </span><span><<span class="start-tag">td</span>></span><span><span class="entity"><span>&amp;</span>nbsp;</span></span><span></<span class="end-tag">td</span>></span><span>
<span id="line11"></span> </span><span></<span class="end-tag">tr</span>></span><span>
<span id="line12"></span></span><span></<span class="end-tag">table</span>></span><span>
<span id="line13"></span></span><span><<span class="start-tag">table</span> <span class="attribute-name">width</span>="<a class="attribute-value">100%</a>" <span class="attribute-name">border</span>="<a class="attribute-value">0</a>" <span class="attribute-name">cellspacing</span>="<a class="attribute-value">0</a>" <span class="attribute-name">cellpadding</span>="<a class="attribute-value">0</a>"></span><span>
<span id="line14"></span> </span><span><<span class="start-tag">tr</span>></span><span>
<span id="line15"></span> </span><span><<span class="start-tag">td</span> <span class="attribute-name">align</span>="<a class="attribute-value">center</a>"></span><span></span><span><<span class="start-tag">br</span>></span><span>
<span id="line16"></span> </span><span><<span class="start-tag">table</span> <span class="attribute-name">border</span>="<a class="attribute-value">0</a>" <span class="attribute-name">cellspacing</span>="<a class="attribute-value">0</a>" <span class="attribute-name">cellpadding</span>="<a class="attribute-value">0</a>"></span><span>
<span id="line17"></span> </span><span><<span class="start-tag">tr</span> <span class="attribute-name">align</span>="<a class="attribute-value">center</a>"></span><span>
<span id="line18"></span> </span><span><<span class="start-tag">td</span>></span><span></span><span><<span class="start-tag">font</span> <span class="attribute-name">face</span>="<a class="attribute-value">Arial, Helvetica, sans-serif</a>" <span class="attribute-name">size</span>="<a class="attribute-value">2</a>"></span><span></span><span><<span class="start-tag">a</span> <span class="attribute-name">href</span>="<a href="view-source:http://www.fetishlady-anja.de/member/galleries/LacquerGallery1/a-f-lack0081.html" class="attribute-value">a-f-lack0081.html</a>"></span><span>first</span><span></<span class="end-tag">a</span>></span><span> | </span><span><<span class="start-tag">a</span> <span class="attribute-name">href</span>="<a href="view-source:http://www.fetishlady-anja.de/member/galleries/LacquerGallery1/a-f-lack0089.html" class="attribute-value">a-f-lack0089.html</a>"></span><span>previous</span><span></<span class="end-tag">a</span>></span><span> | </span><span><<span class="start-tag">a</span> <span class="attribute-name">href</span>="<a href="view-source:http://www.fetishlady-anja.de/member/galleries/LacquerGallery1/a-f-lack0091.html" class="attribute-value">a-f-lack0091.html</a>"></span><span>next</span><span></<span class="end-tag">a</span>></span><span> | </span><span><<span class="start-tag">a</span> <span class="attribute-name">href</span>="<a href="view-source:http://www.fetishlady-anja.de/member/galleries/LacquerGallery1/a-f-lack0112.html" class="attribute-value">a-f-lack0112.html</a>"></span><span>last</span><span></<span class="end-tag">a</span>></span><span> | </span><span><<span class="start-tag">a</span> <span class="attribute-name">href</span>="<a href="view-source:http://www.fetishlady-anja.de/member/galleries/LacquerGallery1/index.html" class="attribute-value">index.html</a>"></span><span>home</span><span></<span class="end-tag">a</span>></span><span></span><span></<span class="end-tag">font</span>></span><span></span><span><<span 
class="start-tag">br</span>></span><span></span><span><<span class="start-tag">br</span>></span><span></span><span></<span class="end-tag">td</span>></span><span>
<span id="line19"></span> </span><span></<span class="end-tag">tr</span>></span><span>
<span id="line20"></span> </span><span><<span class="start-tag">tr</span>></span><span>
<span id="line21"></span> </span><span><<span class="start-tag">td</span> <span class="attribute-name">align</span>="<a class="attribute-value">center</a>"></span><span></span><span><<span class="start-tag">img</span> <span class="attribute-name">src</span>="<a href="view-source:http://www.fetishlady-anja.de/member/galleries/LacquerGallery1/images/a-f-lack0090.jpg" class="attribute-value">images/a-f-lack0090.jpg</a>"></span><span></span><span></<span class="end-tag">td</span>></span><span>
<span id="line22"></span> </span><span></<span class="end-tag">tr</span>></span><span>
<span id="line23"></span> </span><span><<span class="start-tag">tr</span>></span><span>
<span id="line24"></span> </span><span><<span class="start-tag">td</span> <span class="attribute-name">align</span>="<a class="attribute-value">center</a>"></span><span></span><span><<span class="start-tag">br</span>></span><span></span><span><<span class="start-tag">font</span> <span class="attribute-name">face</span>="<a class="attribute-value">Arial, Helvetica, sans-serif</a>" <span class="attribute-name">size</span>="<a class="attribute-value">2</a>"></span><span></span><span></<span class="end-tag">font</span>></span><span><span class="entity"><span>&amp;</span>nbsp;</span><span class="entity"><span>&amp;</span>nbsp;</span></span><span><<span class="start-tag">font</span> <span class="attribute-name">face</span>="<a class="attribute-value">Arial, Helvetica, sans-serif</a>" <span class="attribute-name">size</span>="<a class="attribute-value">1</a>"></span><span>10 of 32</span><span></<span class="end-tag">font</span>></span><span><span class="entity"><span>&amp;</span>nbsp;</span><span class="entity"><span>&amp;</span>nbsp;</span></span><span><<span class="start-tag">font</span> <span class="attribute-name">face</span>="<a class="attribute-value">Arial, Helvetica, sans-serif</a>" <span class="attribute-name">size</span>="<a class="attribute-value">1</a>"></span><span></span><span></<span class="end-tag">font</span>></span><span></span><span></<span class="end-tag">td</span>></span><span>
<span id="line25"></span> </span><span></<span class="end-tag">tr</span>></span><span>
<span id="line26"></span> </span><span></<span class="end-tag">table</span>></span><span></span><span><<span class="start-tag">br</span>></span><span></span><span></<span class="end-tag">td</span>></span><span>
<span id="line27"></span> </span><span></<span class="end-tag">tr</span>></span><span>
<span id="line28"></span></span><span></<span class="end-tag">table</span>></span><span>
<span id="line29"></span></span><span><<span class="start-tag">table</span> <span class="attribute-name">width</span>="<a class="attribute-value">100%</a>" <span class="attribute-name">border</span>="<a class="attribute-value">0</a>" <span class="attribute-name">cellspacing</span>="<a class="attribute-value">0</a>" <span class="attribute-name">cellpadding</span>="<a class="attribute-value">0</a>"></span><span>
<span id="line30"></span> </span><span><<span class="start-tag">tr</span>></span><span>
<span id="line31"></span> </span><span><<span class="start-tag">td</span>></span><span><span class="entity"><span>&amp;</span>nbsp;</span></span><span></<span class="end-tag">td</span>></span><span>
<span id="line32"></span> </span><span></<span class="end-tag">tr</span>></span><span>
<span id="line33"></span></span><span></<span class="end-tag">table</span>></span><span>
<span id="line34"></span></span><span></<span class="end-tag">body</span>></span><span>
<span id="line35"></span></span><span></<span class="end-tag">html</span>></span><span>
<span id="line36"></span></span></pre><menu id="actions" type="context"><menuitem accesskey="L" label="Zu Zeile springen…" id="goToLine"></menuitem><menuitem type="checkbox" label="Lange Zeilen umbrechen" id="wrapLongLines"></menuitem><menuitem checked="true" type="checkbox" label="Syntax-Hervorhebung" id="highlightSyntax"></menuitem></menu></body></html>
!!!!!!!!!!<CODE />!!!!!!!!!!!!!!!!!

At the top of this code you will find: <cite> [...]<span title="Start-Tag wurde entdeckt, ohne dass ein Doctype zuerst gesehen wurde. “<!DOCTYPE html>” erwartet." class="error"><<span class="start-tag">html</span>[...] <cite /> (in English: "A start tag was seen without a doctype being seen first. “<!DOCTYPE html>” expected.") I hope you can read a little bit of German ;-)


#3) Code copied and pasted from Firefox 48 (first: show source code, then copy the code and paste it into a new blank file)

!!!!!!!!!!<CODE>!!!!!!!!!!!!!!!!!
<html>
<head>
<title>Lacquer Gallery 1 </title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>

<body bgcolor="#FFFFFF" text="#000096" background="images/header.jpg" bgproperties="fixed" leftmargin="0" topmargin="0" marginwidth="0" marginheight="0" link="#000096" vlink="#000096" alink="#000096">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td>&nbsp;</td>
</tr>
</table>
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td align="center"><br>
<table border="0" cellspacing="0" cellpadding="0">
<tr align="center">
<td><font face="Arial, Helvetica, sans-serif" size="2"><a href="a-f-lack0081.html">first</a> | <a href="a-f-lack0089.html">previous</a> | <a href="a-f-lack0091.html">next</a> | <a href="a-f-lack0112.html">last</a> | <a href="index.html">home</a></font><br><br></td>
</tr>
<tr>
<td align="center"><img src="images/a-f-lack0090.jpg"></td>
</tr>
<tr>
<td align="center"><br><font face="Arial, Helvetica, sans-serif" size="2"></font>&nbsp;&nbsp;<font face="Arial, Helvetica, sans-serif" size="1">10 of 32</font>&nbsp;&nbsp;<font face="Arial, Helvetica, sans-serif" size="1"></font></td>
</tr>
</table><br></td>
</tr>
</table>
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td>&nbsp;</td>
</tr>
</table>
</body>
</html>
!!!!!!!!!!<CODE />!!!!!!!!!!!!!!!!!

This is a very good interpretation by Mozilla, although you will find the same bug: the code identifier/doctype (in this case: html) comes after the opening html tag (which means that the people who wrote this page made a mistake). But Mozilla shows the page without bugs.
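
As a rough illustration of tolerant handling, the Python sketch below looks for the charset declared in a `<meta>` tag near the start of the file and decodes the raw bytes with it, whether or not a doctype is present. It is only an assumption about how a lenient reader might work, not a description of what Firefox or OE Pro actually do; `declared_charset`, `decode_page`, the 1024-byte window and the windows-1252 fallback are all hypothetical choices.

```python
import re

# Matches charset=... inside a <meta> tag, with or without quotes
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def declared_charset(raw: bytes, default: str = "windows-1252") -> str:
    """Return the charset named in a <meta> tag within the first 1024 bytes."""
    match = META_CHARSET.search(raw[:1024])
    return match.group(1).decode("ascii").lower() if match else default


def decode_page(raw: bytes) -> str:
    """Decode raw page bytes using the declared charset, with a lenient fallback."""
    charset = declared_charset(raw)
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset label: fall back instead of failing
        return raw.decode("windows-1252", errors="replace")


sample = (
    b"<html>\n<head>\n<title>Lacquer Gallery 1 </title>\n"
    b'<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">\n'
    b"</head>\n<body>caf\xe9</body></html>"
)
print(declared_charset(sample))  # iso-8859-1
```

Comparing the output of `decode_page` with what a downloader saved can help narrow down whether the damage comes from the encoding step or from somewhere else.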

With OE Pro 6.9.0.4208 I got the same result as code #3. I checked the box "Use Unicode symbols". (It doesn't work with OE Pro 7.2.0.4514.)

Do you have any idea?

Many thanks and kind regards,
Siggi

Oleg Chernavin 08/19/2016 04:50 am
Can you send me more details about the site to support@metaproducts.com ? I want to download and reproduce it myself. This would allow me to fix it.

Thank you!

Oleg.