I am trying to download specific content (JPGs and MOVs) from the members area of amateurallure.com. I have configured the project as best I can, but I am receiving unwanted files and duplicates.
I am trying to download only *.mov and *.jpg files, but NOT certain .mov and .jpg files (via excluded entries).
Example: I want video01.mov and video01.jpg but NOT videosm01.mov or video_th.jpg or any other video or image type.
I am also getting 2 of every file downloaded, which wastes bandwidth and space.
Lastly, files that aren't video or image files are being downloaded. I don't know what these are, but they are the same size as the video files and have the video file name in them. I have tried to exclude them but they are being downloaded anyway.
Example: This is one of many files that have been downloaded that are neither .mov nor .jpg - http://www.amateurallure.com/members/_girls/maelynn/_video/maelynn02.mov?PSSO=ei9Vb2RraGgramVtUUlxTEpjc1Q4aDRiVXlITDI3d3E1Mm84S0tJeUpsTG5iK2E5cTNpTmJiY1ZmODVNVVVWZQpyTUN5dS94VlQ3MzM2SXlra2MwZzZjQzlLdWYvZVlmdAo*
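For reference, a URL like the one above can be split into its path and query string to see what it really points to. The following is only a minimal Python sketch, not part of the download tool; the function names and the shortened example URL with its token are made up for illustration.

```python
from urllib.parse import urlsplit
import posixpath


def real_extension(url):
    """Return the lowercase extension of the URL path, ignoring any query string."""
    path = urlsplit(url).path
    return posixpath.splitext(path)[1].lower()


def has_query(url):
    """Return True if the URL carries a query string (for example a ?PSSO=... token)."""
    return bool(urlsplit(url).query)


# Hypothetical, shortened stand-in for the real URL quoted above.
example_url = "http://www.example.com/members/_video/video02.mov?PSSO=abc123"

print(real_extension(example_url), has_query(example_url))
```

Under this reading, the extra files would be the same .mov videos requested again with a token appended, which would explain the identical sizes.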
My project settings are... (I am providing only settings that are selected and/or not blank)
Project > Addresses: http://www.amateurallure.com/members/
Level Limit: unchecked
Do Not download existing files: selected
File Filters > Images (jpeg and jpg) and Videos (mov) checked
Load Using URL filter settings selected for both
URL Filters > Directory: Load files only within the starting directory and below selected
FileName: Load files only with the starting filename is NOT selected
Excluded Keywords: *_th.jpg *sm0*.mov *.mov@psso* *.mov?psso*
Content Filters > When keywords are found in a page - save these pages is checked
All settings beneath advanced are the defaults - only thing I have added is the download location and userid password.
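The Excluded Keywords patterns listed above can be sanity-checked outside the program. This is a rough Python sketch that uses shell-style wildcards from the standard `fnmatch` module; it assumes case-insensitive matching against the whole URL, which may not be how the download tool actually evaluates its keywords. The example URLs and the `is_excluded` helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the Excluded Keywords setting.
EXCLUDED = ["*_th.jpg", "*sm0*.mov", "*.mov@psso*", "*.mov?psso*"]


def is_excluded(url, patterns=EXCLUDED):
    """Return True if the URL matches any exclusion pattern.

    Assumption: matching is case-insensitive and applies to the full URL.
    Note that in fnmatch syntax '?' matches any single character,
    which includes a literal '?'.
    """
    lowered = url.lower()
    return any(fnmatchcase(lowered, p.lower()) for p in patterns)


# Made-up URLs that mirror the file names mentioned in the question.
samples = [
    "http://www.example.com/members/video01.mov",
    "http://www.example.com/members/video01.jpg",
    "http://www.example.com/members/videosm01.mov",
    "http://www.example.com/members/video_th.jpg",
    "http://www.example.com/members/video01.mov?PSSO=abc123",
]

for sample in samples:
    print("skip" if is_excluded(sample) else "keep", sample)
```

If the tool matches keywords differently (for example as plain substrings rather than wildcards), the results will differ, so this only helps to confirm what the patterns are intended to catch.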
The directory structure off of the members folder is...
(I think some of these have the same content beneath them but under different folder names and locations, hence the duplicates.)
Any help you could offer would be greatly appreciated. I realize there is a lot of information here but I figure the more you know the better you'll be equipped to assist.
LastStarted=11/8/2011 7:04:53 AM
LastEnded=11/8/2011 7:38:13 AM
The log clearly shows that the _tn.jpg files and the video files with .mov?psso=.... are rejected by the parser and are not downloaded. It works correctly.
Regarding the duplicates: did you try unchecking the "Check files integrity" box? Does it help? If not, please describe what kind of duplicates you have and what their URLs are. Also, on which pages can I find these links?
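To describe the duplicates precisely, it may help to list which downloaded files have identical content. Below is a minimal Python sketch that groups files by SHA-256 hash; it is independent of the download tool, and the function names and the folder path in the usage note are assumptions.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Map each content digest to the sorted list of files under root that share it.

    Only digests shared by two or more files are returned.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            by_digest[file_digest(full_path)].append(full_path)
    return {d: sorted(paths) for d, paths in by_digest.items() if len(paths) > 1}
```

Calling `find_duplicates` on the project's download folder (for example a hypothetical `C:\Download\members`) returns the groups of identical files; their paths show whether the copies come from different site folders or from the same URL fetched twice.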