Redirect

Steven 12/18/2009 12:24 am
When I add the URL http://nf.nfdaily.cn/rwzk, Offline Explorer Pro does not download the page it redirects to, which is http://nf.nfdaily.cn/rwzk/20091214/
Oleg Chernavin 12/18/2009 02:07 am
I made a simple Project with Level=1 and downloaded it. Redirection went well. Here is the log:

HTTP0 - 18.12.2009 10:05:05 - Connecting to host nf.nfdaily.cn...
HTTP0 - 18.12.2009 10:05:08 - GET /rwzk/ HTTP/1.1
- - Accept: */*
- - Accept-Encoding: gzip
- - Host: nf.nfdaily.cn
- - Connection: Keep-Alive
HTTP0 - 18.12.2009 10:05:08 - Host nf.nfdaily.cn connected. Waiting for http://nf.nfdaily.cn/rwzk.
HTTP0 - 18.12.2009 10:05:08 - HTTP/1.0 302 Moved
- - Location: http://nf.nfdaily.cn/rwzk/
HTTP0 - 18.12.2009 10:05:08 - Download complete. Status: 302 Object Moved.
HTTP0 - 18.12.2009 10:05:09 - Delay 1 seconds before http://nf.nfdaily.cn/rwzk/.
HTTP0 - 18.12.2009 10:05:10 - Connecting to host nf.nfdaily.cn...
HTTP0 - 18.12.2009 10:05:10 - GET /rwzk/ HTTP/1.1
- - Accept: */*
- - Cookie: ASPSESSIONIDAATTRBDD=OPAOEHDBGGADBIMMBFHJHILK
- - Accept-Encoding: gzip
- - Host: nf.nfdaily.cn
HTTP0 - 18.12.2009 10:05:10 - Host nf.nfdaily.cn connected. Waiting for http://nf.nfdaily.cn/rwzk/.
HTTP0 - 18.12.2009 10:05:10 - HTTP/1.1 200 OK
- - Date: Fri, 18 Dec 2009 07:02:56 GMT
- - Server: Microsoft-IIS/6.0
- - X-Powered-By: ASP.NET
- - Content-Length: 194
- - Content-Type: text/html
- - Cache-control: private
HTTP0 - 18.12.2009 10:05:10 - Reading data
HTTP0 - 18.12.2009 10:05:10 - Download complete.
QUEUE - 18.12.2009 10:05:10 - Parsing (0). http://nf.nfdaily.cn/rwzk/
QUEUE - 18.12.2009 10:05:11 - Parsing end.
HTTP0 - 18.12.2009 10:05:11 - Delay 1 seconds before http://nf.nfdaily.cn/rwzk/20091214/.
QUEUE - 18.12.2009 10:05:11 - Parsing files added.
HTTP0 - 18.12.2009 10:05:12 - Connecting to host nf.nfdaily.cn...
HTTP0 - 18.12.2009 10:05:12 - GET /rwzk/20091214/ HTTP/1.1
- - Referer: http://nf.nfdaily.cn/rwzk/
- - Accept: */*
- - Accept-Encoding: gzip
- - Host: nf.nfdaily.cn
- - Cookie: ASPSESSIONIDAATTRBDD=OPAOEHDBGGADBIMMBFHJHILK
HTTP0 - 18.12.2009 10:05:12 - Host nf.nfdaily.cn connected. Waiting for http://nf.nfdaily.cn/rwzk/20091214/.
HTTP0 - 18.12.2009 10:05:12 - HTTP/1.1 200 OK
- - Date: Fri, 18 Dec 2009 07:02:58 GMT
- - Server: Microsoft-IIS/6.0
- - X-Powered-By: ASP.NET
- - Content-Length: 32386
- - Content-Type: text/html
- - Cache-control: private
HTTP0 - 18.12.2009 10:05:12 - Reading data
HTTP0 - 18.12.2009 10:05:21 - Download complete.
QUEUE - 18.12.2009 10:05:21 - Parsing (0). http://nf.nfdaily.cn/rwzk/20091214/

I used the latest Offline Explorer Pro 5.7 Service Release 2. Do you have this version? One way to diagnose this is to enable logging (Ctrl+Q) and see which error or Project setting prevents the redirection from being followed.

Best regards,
Oleg Chernavin
MP Staff
Steven 12/18/2009 04:14 am
Here is the log:

HTTP0 - 2009-12-18 16:47:30 - Connecting to host nf.nfdaily.cn...
HTTP0 - 2009-12-18 16:47:30 - GET /rwzk/ HTTP/1.1
- - Accept: */*
- - User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
- - Accept-Encoding: gzip
- - Host: nf.nfdaily.cn
- - Cookie: __utma=43321850.2638502452139413500.1237728670.1249879269.1249979365.4; __utmz=43321850.1249979365.4.4.utmcsr=news.baidu.com|utmccn=(referral)|utmcmd=referral|utmcct=/ns; __gads=ID=433aa7e326a62853:T=1247469117:S=ALNI_MZjuzDwSjrJegB3OXtO19oAFzZ9WQ; __utma=107208413.500130419.1253272208.1253272208.1253272208.1; __utmz=107208413.1253272208.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); ASPSESSIONIDAATTRBDD=KJLPEHDBOEIICNCMIKIEOMEF
HTTP0 - 2009-12-18 16:47:30 - Host nf.nfdaily.cn connected. Waiting for http://nf.nfdaily.cn/rwzk/.
HTTP0 - 2009-12-18 16:47:30 - HTTP/1.1 200 OK
- - Date: Fri, 18 Dec 2009 08:47:04 GMT
- - Server: Microsoft-IIS/6.0
- - X-Powered-By: ASP.NET
- - Content-Length: 194
- - Content-Type: text/html
- - Cache-control: private
HTTP0 - 2009-12-18 16:47:30 - Reading data
HTTP0 - 2009-12-18 16:47:30 - 100% of 194 bytes of http://nf.nfdaily.cn/rwzk/.
HTTP0 - 2009-12-18 16:47:30 - Download complete.
QUEUE - 2009-12-18 16:47:30 - Parsing (0). http://nf.nfdaily.cn/rwzk/
QUEUE - 2009-12-18 16:47:30 - Parsing end.
QUEUE - 2009-12-18 16:47:30 - Parsing files added
Oleg Chernavin 12/18/2009 04:15 am
Can you enable showing rejected links using the Filter button on the Log toolbar? Then repeat the download and see what reason it gives.

Oleg.
Steven 12/18/2009 05:48 am
It says PARSER - 2009-12-18 18:39:39 - Rejected URL (URL Filters | Filename | Included files keywords): http://nf.nfdaily.cn/rwzk/20091214/

Ah, I may have found the reason. I set "asp" as the included file keyword because all the links on the second level contain it. But why was the URL http://nf.nfdaily.cn/rwzk/20091214/ not filtered out when I set it as the project URL?

Is it because, when starting from http://nf.nfdaily.cn/rwzk/, the URL http://nf.nfdaily.cn/rwzk/20091214/ becomes a second-level link and is therefore subject to filtering? If so, can I avoid that?
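To find every link dropped by the filters without reading the log by eye, the "Rejected URL" lines can be extracted with a short script. This is only a sketch: the pattern is inferred from the single PARSER line quoted above, the function name rejected_links is made up for illustration, and other versions or date settings may format the log differently.

```python
import re

# Pattern inferred from the one "Rejected URL" line quoted in this thread;
# treat it as an assumption about the log format, not a specification.
REJECTED = re.compile(
    r"^PARSER - (?P<when>\S+ \S+) - Rejected URL "
    r"\((?P<reason>[^)]*)\): (?P<url>\S+)$"
)

def rejected_links(log_lines):
    """Yield (url, reason_parts) for each 'Rejected URL' line in the log."""
    for line in log_lines:
        match = REJECTED.match(line.strip())
        if match:
            parts = [p.strip() for p in match.group("reason").split("|")]
            yield match.group("url"), parts

# Two lines copied from the logs in this thread.
sample = [
    "QUEUE - 2009-12-18 16:47:30 - Parsing end.",
    "PARSER - 2009-12-18 18:39:39 - Rejected URL (URL Filters | Filename | "
    "Included files keywords): http://nf.nfdaily.cn/rwzk/20091214/",
]

for url, reason in rejected_links(sample):
    print(url, "rejected by:", " > ".join(reason))
```

The reason in parentheses names the setting that rejected the link, here the Included files keywords list under the Filename section of URL Filters.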
Oleg Chernavin 12/18/2009 06:47 am
Yes, this is so. The starting address will be downloaded no matter what filters or keywords you use, which I think is logical. In the case above, however, http://nf.nfdaily.cn/rwzk/20091214/ was a second-level link, so the filters applied to it. To allow such URLs, add default.htm to the Included filename keywords list.

Oleg.
Steven 12/18/2009 07:28 am
But this would prevent me from downloading third-level links such as http://nf.nfdaily.cn/rwzk/20091214/gj/200912160014.asp
Oleg Chernavin 12/18/2009 08:47 am
Just put two keywords in the Included filename keywords list:

default.htm
asp

Oleg.
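The effect of the two keywords can be illustrated with a small model of the filter. This is a rough sketch, not Offline Explorer's actual code. It assumes that a URL ending in "/" is matched as if its filename were default.htm (which is what the advice above implies), that keywords are matched as case-insensitive substrings of the filename, and that the starting URL bypasses the filter. The names filename_of and is_allowed are invented for the example.

```python
from urllib.parse import urlsplit

# Assumption: directory URLs (ending in "/") are matched as "default.htm".
DEFAULT_NAME = "default.htm"

def filename_of(url):
    """Return the last path segment, or DEFAULT_NAME for directory URLs."""
    path = urlsplit(url).path
    name = path.rsplit("/", 1)[-1]
    return name or DEFAULT_NAME

def is_allowed(url, keywords, start_url):
    """The starting URL always passes; other URLs need a keyword match."""
    if url == start_url:
        return True
    name = filename_of(url).lower()
    return any(keyword.lower() in name for keyword in keywords)

start = "http://nf.nfdaily.cn/rwzk/"
redirected = "http://nf.nfdaily.cn/rwzk/20091214/"
article = "http://nf.nfdaily.cn/rwzk/20091214/gj/200912160014.asp"

print(is_allowed(redirected, ["asp"], start))                 # False
print(is_allowed(redirected, ["default.htm", "asp"], start))  # True
print(is_allowed(article, ["default.htm", "asp"], start))     # True
```

With only "asp" in the list, the directory URL is rejected because its assumed filename does not contain the keyword. Adding default.htm lets it through while the .asp pages on the third level still match.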
Steven 12/19/2009 02:36 am
Thank you so much~
Oleg Chernavin 12/19/2009 08:27 am
You are welcome!

Oleg.