I am trying to download only the files on that page for a specific date. Example of a URL:
Here are the settings:
Level limit: 1
Download all files
File Filters properties:
Only the Other box is checked:
and Location is: Load using URL filters settings, and there are no file size restrictions.
Protocol, server, directory are set to all.
Now for the filenames include list: &date=2005-10-01
There are no exclusions.
Everything looks OK when you watch the queue of files being downloaded: in this example 38 files, matching the date string.
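To show what I expect the include list to do, here is a minimal sketch. It assumes the filter is a plain substring match on the URL, which may not be how Offline Explorer actually evaluates it. The function name `select_urls` and the example.com URLs are made up for illustration.

```python
from typing import Iterable, List


def select_urls(urls: Iterable[str],
                include: Iterable[str],
                exclude: Iterable[str] = ()) -> List[str]:
    """Keep URLs that contain any include token and no exclude token.

    Assumption: plain substring matching, not Offline Explorer's real logic.
    """
    include = list(include)
    exclude = list(exclude)
    selected = []
    for url in urls:
        # An empty include list means "accept everything".
        if include and not any(token in url for token in include):
            continue
        if any(token in url for token in exclude):
            continue
        selected.append(url)
    return selected


# Hypothetical queue; the real one had 38 matching files.
queue = [
    "http://example.com/cgi-bin/page.cgi?track=bm&date=2005-10-01",
    "http://example.com/cgi-bin/page.cgi?track=bm&date=2005-10-02",
    "http://example.com/cgi-bin/page.cgi?track=gg&date=2005-10-01",
]
matching = select_urls(queue, include=["&date=2005-10-01"])
for url in matching:
    print(url)
```

With an empty exclusion list, only the include token decides which links stay in the queue.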
But when you try to export the data, NOTHING!
There is nothing in the download directory to export; nothing was created!
In an older version, when I exported the data, all the links were in HTML format, and I could then convert the files to text.
Why is there no data to export ?!?
But I cannot parse the text with an HTML-to-text converter, because when you look at the source of the page downloaded with Offline Explorer, the HTML code is one big paragraph.
When you open the same page with Microsoft IE and save it as htm/html only, the source keeps the original layout of the HTML code, so an HTML-to-text parser works on it.
Example of a URL:
What I need to do is save that page with all the HTML tags in the right place (with all the text info), so that my HTML-to-text program can parse it afterwards.
Any solution ?!?
Checking only the htm/html box in the File Filters | Text category improved the results.
I checked the source of the downloaded file, and surprise! Offline Explorer adds these two lines of code:
<!-- saved from url=(0089)http://www.tsnhorse.com/cgi-bin/instant.cgi type=inc&country=usa&track=bm&date=2005-10-01 -->
These lines in the downloaded source file(s) are responsible for the HTML-to-text parsing problem.
Remove those two lines, and presto! No more problems when parsing the file(s).
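Until there is a setting for this, the removal can be scripted. Below is a rough sketch that strips the "saved from url" comment from an HTML string. It only handles the comment line quoted above; the pattern and the function name `strip_saved_from_comment` are my own assumptions, so check the result on a copy of the files first.

```python
import re

# Matches a comment of the form <!-- saved from url=(NNNN)... --> plus the
# line break that follows it. Assumption: the comment looks like the one
# quoted in this post.
SAVED_FROM_RE = re.compile(
    r"<!--\s*saved from url=\(\d+\).*?-->[ \t]*\r?\n?",
    re.IGNORECASE | re.DOTALL,
)


def strip_saved_from_comment(html: str) -> str:
    """Return the HTML with any 'saved from url' comment removed."""
    return SAVED_FROM_RE.sub("", html)


sample = (
    "<!-- saved from url=(0089)http://www.tsnhorse.com/cgi-bin/instant.cgi "
    "type=inc&country=usa&track=bm&date=2005-10-01 -->\n"
    "<html><body>entries</body></html>"
)
cleaned = strip_saved_from_comment(sample)
print(cleaned)
```

To clean a whole download directory, read each file, pass its contents through the function, and write the result back before running the HTML-to-text step.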
Is there a way to stop Offline Explorer from adding those two lines?
Your software is simply the best available!
Thanks for your help!