limitations on file sizes

Author Message
Melissa Petrilla 03/29/2006 07:02 am
Hi Oleg,

I'm trying to download files from a particular site and restrict the downloads based on file size. On this site, files that are 10K or smaller are empty pages, and those that contain valid data are about 12-13K or larger. I've selected the "load only the selected file sizes" filters and set the minimum size to 11K, yet I'm still pulling down files with smaller sizes. Any suggestions?

Here is the URL I'm using:

http://docapp8.doc.state.ok.us/servlet/page?_pageid=394&_dad=portal30&_schema=PORTAL30&doc_num={:100000..110000}


I was successful with restricting the file sizes once before with another site, but even though I've used the same settings, it hasn't worked with any other website I've tried.

Thanks for your help!

Melissa
ssieloff 03/29/2006 07:02 am
Melissa --

I have noticed this also -- seems to not always work -- I try to eliminate "no matches found" pages by restricting download sizes > some value but get all the files anyway. I sort them in Explorer by ascending file size and delete them from there -- leaving me with the desired files.
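For a large download, the same cleanup could be scripted rather than done by hand in Explorer. What follows is only a minimal sketch in Python, not a feature of the downloader: the function name `remove_small_files`, its parameters, and the idea of running it over the download folder afterwards are all assumptions made for illustration. It defaults to a dry run so the list of candidates can be checked before anything is deleted.

```python
from pathlib import Path


def remove_small_files(folder, min_bytes, dry_run=True):
    """Find files under `folder` smaller than `min_bytes`.

    Hypothetical helper: returns the sorted list of matching paths and
    deletes them only when dry_run is False.
    """
    small = [
        path
        for path in Path(folder).rglob("*")  # walks subfolders too
        if path.is_file() and path.stat().st_size < min_bytes
    ]
    if not dry_run:
        for path in small:
            path.unlink()  # permanently deletes the file
    return sorted(small)
```

With the sizes mentioned in this thread, a call might look like `remove_small_files(download_folder, 11 * 1024)`, where `download_folder` is wherever the files were saved and 11K is taken to mean 11 * 1024 bytes. Pass `dry_run=False` only after reviewing the returned list.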

Steve

Melissa Petrilla 03/29/2006 07:02 am
Steve--

Yes, I've done that as well, and it's no problem with a small volume of files from a site. In this case, I'm estimating the total download volume to be upwards of 800K files, and I'd REALLY rather not have to take the time to sort and remove the empty ones... :-(

Thanks for your comments though -- it's nice to know I'm not the only one who has struggled with this. :)

Melissa



Oleg Chernavin 03/29/2006 07:02 am
Melissa,

I think you should set the size limitation in the File Filters | Other section of the Project Properties dialog. The URL you gave me doesn't have a file extension, so files like this are handled by the Other section.

Let me know if this helps or not.

Thank you.

Oleg.
Melissa Petrilla 03/29/2006 07:02 am
Oleg,

I had actually set the limitation in the File Filters | Text, User Defined, and Other sections. I'm also experiencing a problem where some of the pages come in incomplete, as if there were a disconnect or lag during retrieval of the page.

I appreciate any help you may be able to provide.

Melissa


Oleg Chernavin 03/29/2006 07:02 am
Melissa,

Can you please send me the Project settings at support@metaproducts.com? Please select the Project, click the Copy button on the toolbar, and then paste it into the e-mail message. I will see what is wrong there.

Regarding broken files - do you mean broken HTML files or others?

Thank you.

Oleg.