URLs containing "@" symbol resulting in corrupted files

Michael 04/21/2011 07:48 pm
Certain websites, once downloaded, appear to contain corrupted files that the OS (I've tried both WinXP Pro 32-bit and Win7 Ent 64-bit) invariably reads as being ~4 GB in size. I've followed several theories about what the problematic websites might have in common, but none of them has panned out. This is a significant problem, since my 1 TB drive fills up from a set of projects that should only occupy a few hundred MB at most.

Below are four examples of websites that consistently end up corrupted on disk. They all (for me, at least) contain at least one 4 GB file after they're downloaded to disk, even though OEP reports their downloaded size as a tiny fraction of that. I'm running OEP 5.9.3347 Service Release 5 (given to me by Oleg a few days ago in response to a different issue). For all of the URLs below, I have OEP set to accept all file extensions (but Other is unchecked), and a level limit of 0. If anyone can help me, I'd greatly appreciate it!


http://www.amnesty.org/en/news-and-updates/philippines-move-protect-womens-rights-during-armed-conflict-2010-03-31

http://www.nytimes.com/2010/10/24/opinion/24friedman.html

http://meta.wikimedia.org/wiki/Main_Page

http://www.probeinternational.org/foreign-aid/foreign-aid-and-bad-government
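
For anyone trying to reproduce this, one way to locate the affected files is to walk the download folder and list every file whose reported size is close to 4 GB. The sketch below is only an illustration: the function name `find_suspicious_files`, the 5% tolerance, and the choice of 2^32 bytes as the target size are assumptions, not anything OEP provides.

```python
import os

FOUR_GIB = 2 ** 32  # 4,294,967,296 bytes; assumed target for "~4 GB"


def find_suspicious_files(root, threshold=FOUR_GIB, tolerance=0.05):
    """Return (path, size) pairs whose reported size is within
    `tolerance` (as a fraction) of `threshold` bytes."""
    low = threshold * (1 - tolerance)
    high = threshold * (1 + tolerance)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)  # size as reported by the OS
            except OSError:
                continue  # unreadable entry; skip it rather than abort
            if low <= size <= high:
                hits.append((path, size))
    return hits
```

Calling it on the project's download folder (for example a hypothetical `D:\Downloads\TestURLs`) returns the list of files to inspect. Comparing that list with the sizes OEP reports should show which downloaded files the OS is misreading.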
Michael 04/21/2011 07:49 pm
Arg, sorry. The subject line is a remnant of one of the theories I investigated. Please disregard it.
Michael 04/21/2011 08:25 pm
Here are my project settings, if that's helpful.


Stream 1.2 File
[Object]
OEVersion=Pro 5.9.0.3347
Type=0
IID=7010
Caption=TestURLs
URL=http://www.probeinternational.org/foreign-aid/foreign-aid-and-bad-government
MVer=5
Weekday=257
LimTSize=10000
LimNumber=5000
LimTime=100
CheckSize=True
FTText.Exts=htmlhtmaspaspxjspstmstmlidcshtmlhtxtxttextxspxmlrxmlcfmwmlphpphp3
FTImages.Exts=bmpfifgifipxj2cj2kjp2jpegjpglwfpngtiftiffwbmpwebpxbm xxxxxxxxxxxxxxxx
FTVideo.Exts=aniasfasxaviflcfliflvm1vm2vm4vmovmp4mpegmpgramrmrvsmilvivvobwmv xxxxxxxxxxxxxxxxxxxxx
FTAudio.Exts=apem3um4amidmp2mp3oggrariffvocwavwma xxxxxxxxxxxx
FTArchive.Exts=acearcarjcabexegzjarlayleilhapakpdfrartartgzzzip xxxxxxxxxxxxxxxxx
FTUDef.Exts=.axdclasscssdtdenthtcjsssiswfvbsxsl xxxxxxxxxxxx
FTText.B=ooxooo
FTImages.B=ooxooo
FTVideo.B=ooxooo
FTAudio.B=ooxooo
FTArchive.B=ooxooo
FTUDef.B=ooxxoo
FTOther.B=xoxxoo
FTSizes=0,0,0,0,0,2,0,2,0,2,0,2,0,2,0,0,0,0,0,0,0
NotIgnoreLogout=False
RSrvsEx=zedo.comzdnet.comdoubleclick.netmsads.net xxxx
RProtBx=2
RProt=11
LastStart=71:238:214:215:215:217:227:64:
LastEnd=193:140:159:218:215:217:227:64:
LastStarted=4/21/2011 5:52:56 PM
LastEnded=4/21/2011 5:53:25 PM
S200=122
SAbr=25
SPar=42
SSav=122
SLast=200
SSiz=691930
SMdf=122
SHTML=21
SSuccDowns=52
LFiles=122
LSize=664485
Copies=8
Flags=1
SubstsB=Kgk/CS0NCg==
ImgDim=0,0,0,0
PrevURL=http://www.probeinternational.org/foreign-aid/foreign-aid-and-bad-government
ExploreSSMaps=True
SkipURLs=http://$callback_host/static/css/*http://)(.*/?)/.*/i;var b=/(^.*\?)(.*)/ig;var g=http://ad.*http://adadvisor.net*http://adlog*http://ads.*http://adserver*http://adsfeed*http://adsvr*http://adsyndication.msn.com/*http://b.scorecardresearch.com/*http://banners.*http://c|/http://d.yimg.com/*http://dfid.gov.uk/ukgwacnf.htm*http://graphics8.nytimes.c/http://http://*http://images-eu.ssl-images-amazon.com/images/g/01/nav/amazon/amzn-logo-118w.gifhttp://images-fe.ssl-images-amazon.com/images/g/01/nav/amazon/amzn-logo-118w.gifhttp://images-na.ssl-images-amazon.com/images/g/01/nav/amazon/amzn-logo-118w.gifhttp://js_5.41.0.17542/http://jssettings_5.41.0.17542/http://lib.store.yahoo.net/lib/storeid/*http://newswm.bbc.co.uk/cgi-bin/change_edition.plhttp://pathtoplayerdirectory/*http://sitelife.guardian.co.uk/ver1.0/content/images/no-user-image.gifhttp://te.nytimes.com/tte/blank.gifhttp://thecable.foreignpolicy.com/posts/2010/04/08/fp - blog - the cable - fp%20*http://www.cceia.org/resources/video/data/000034http://www.foreignpolicy.com/articles/2010/04/15/fp - article - 
fp%20*http://www.irinnews.org/app_themes/all/app_themes/all/*http://www.irinnews.org/app_themes/all/inc/app_themes/all/*http://www.irinnews.org/app_themes/all/inc/inc/app_themes/all/*http://www.irinnews.org/app_themes/asi/app_themes/all/*http://www.irinnews.org/app_themes/asi/inc/app_themes/all/*http://www.irinnews.org/app_themes/asi/inc/inc/*http://www.irinnews.org/inc/images/app_themes/all/inc/*http://www.irinnews.org/inc/images/inc/app_themes/all/inc/*http://www.irinnews.org/inc/images/inc/inc/*http://www.irinnews.org/sharedresources/images/app_themes/all/inc/*http://www.irinnews.org/sharedresources/images/inc/app_themes/all/*http://www.irinnews.org/sharedresources/images/inc/inc/*http://www.pbs.org/engage/sites/all/themes/pbstheme09/css/images/www.pbs.org/*http://www.probeinternational.org/foreign-aid/sites/all/modules/ajaxify/*http://www.probeinternational.org/foreign-aid/static/r05/sites/all/modules/ajaxify/*http://www.probeinternational.org/foreign-aid/static/r07/sites/all/modules/ajaxify/*http://www.probeinternational.org/foreign-aid/static/t00/sites/all/modules/ajaxify/*http://www.probeinternational.org/themes/pixture/sites/all/modules/ajaxify/*http://www.somaliweyn.org/pages/news/june_08/bilder/gif/pages/pages/topmeny/*http://www.somaliweyn.org/pages/news/june_08/bilder/gif/pages/topmeny/*http://www.somaliweyn.org/pages/news/june_08/bilder/gif/topmeny/menu/pages/pages/topmeny/*http://www.somaliweyn.org/pages/news/june_08/bilder/gif/topmeny/menu/pages/topmeny/*http://www.somaliweyn.org/pages/news/june_08/bilder/gif/topmeny/menu/topmeny/*http://www.somaliweyn.org/pages/news/june_08/pages/topmeny/menu/pages/pages/topmeny/*http://www.somaliweyn.org/pages/news/june_08/pages/topmeny/menu/pages/topmeny/*http://www.somaliweyn.org/pages/news/june_08/pages/topmeny/menu/topmeny/*http://www.somaliweyn.org/pages/poems/sep03/topmeny/menu/pages/pages/topmeny/*http://www.somaliweyn.org/pages/poems/sep03/topmeny/menu/pages/topmeny/*http://www.somaliweyn.org/pages/poems/sep
03/topmeny/menu/topmeny/*http://www.somaliweyn.org/topmeny/menu/pages/pages/topmeny/*http://www.somaliweyn.org/topmeny/menu/pages/topmeny/*http://www.somaliweyn.org/topmeny/menu/topmeny/*https://commerce./*https://www.cqpress.com/pages/custom
ConvertRSS=True
LIndexed=True
IndexFiles=False
Michael 04/22/2011 10:49 pm
Aha! OEP is working correctly; I'm sorry for the false alarm.

The problem disappears when I save downloaded files to my local hard drive. (I was saving them to a network-attached storage drive across a LAN. For some reason, saving the files that way results in file corruption. I will have to investigate that further.)
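
A rough way to narrow this down is to download the same project once to the local drive and once to the NAS, then compare the two folders file by file. The sketch below assumes both downloads produced the same set of files (a live site may change between runs, so some differences can be legitimate). The names `compare_trees` and `sha256_of` are made up for this example.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(reference_root, suspect_root):
    """Compare every file under reference_root (e.g. the local copy)
    with the file at the same relative path under suspect_root
    (e.g. the NAS copy). Returns a sorted list of (relative_path, reason)."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(reference_root):
        for name in filenames:
            ref_path = os.path.join(dirpath, name)
            rel = os.path.relpath(ref_path, reference_root)
            sus_path = os.path.join(suspect_root, rel)
            if not os.path.isfile(sus_path):
                problems.append((rel, "missing"))
            elif os.path.getsize(ref_path) != os.path.getsize(sus_path):
                # Cheap check first; a bogus ~4 GB size is caught here
                # without reading the file over the network.
                problems.append((rel, "size differs"))
            elif sha256_of(ref_path) != sha256_of(sus_path):
                problems.append((rel, "content differs"))
    return sorted(problems)
```

Sizes are compared before hashes on purpose: a file that the OS reports as 4 GB is flagged immediately, and only files with matching sizes are actually read and hashed.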
Oleg Chernavin 04/23/2011 06:51 am
I am sorry for the late reply. One thing to check is the maximum filename length limit: maybe some files exceed it? Offline Explorer has protection against such cases, but it would be good to check anyway.

Regarding symbols in names - all of them should be compatible with Windows filesystems.
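
The filename length idea above can be checked with a short script that walks the download folder and reports overly long paths. This is a minimal sketch, not part of Offline Explorer: the function name `find_long_paths` is hypothetical, and the limits used (260 characters for a full path, counting the terminating NUL, and 255 characters for a single name) are the classic Windows defaults, which a NAS or its file-sharing protocol may not share.

```python
import os

MAX_PATH = 260       # classic Windows full-path limit (includes the NUL)
MAX_COMPONENT = 255  # typical limit for a single file or folder name


def find_long_paths(root, max_path=MAX_PATH, max_component=MAX_COMPONENT):
    """Return (length, full_path) pairs, longest first, for files whose
    full path or file name reaches the given limits."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.abspath(os.path.join(dirpath, name))
            # len(full) >= max_path because the limit counts the NUL too
            if len(full) >= max_path or len(name) > max_component:
                hits.append((len(full), full))
    return sorted(hits, reverse=True)
```

An empty result suggests path length is not the cause. Any entries returned are the first files to compare between the local and network copies.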

Please keep me informed about your investigation. Thank you!

Best regards,
Oleg Chernavin
MP Staff