Content filter question
|Dmitry||01/22/2007 07:22 am|
Thank you for the great support.
I have another question:
Is it possible for the content filter to combine both inclusions and exclusions?
For example, I want to download pages that contain the words "bass" and "trout" but do not contain the word "guitar".
|Oleg Chernavin||01/22/2007 07:44 am|
|Sorry, this is not possible yet. Only one filter rule is supported at the moment. We may redesign this feature and add support for more rules in a future version, but we have no immediate plans for that.
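
In the meantime, one possible workaround outside the program is to check the saved pages after the download. The following is only a minimal Python sketch of the include/exclude logic from the question; the function name `page_matches` and its parameters are made up for this illustration and are not part of Offline Explorer.

```python
import re


def page_matches(text, include, exclude):
    """Return True if text contains every word in include and none in exclude.

    Matching is case-insensitive and on whole words only, so "guitars"
    does not count as "guitar". This is an assumption of the sketch.
    """
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    has_all_included = all(w.lower() in words for w in include)
    has_any_excluded = any(w.lower() in words for w in exclude)
    return has_all_included and not has_any_excluded


if __name__ == "__main__":
    sample = "Fishing report: bass and trout are biting this week."
    print(page_matches(sample, ["bass", "trout"], ["guitar"]))
```

Pages for which the function returns False could then be deleted or moved aside by a small script that walks the download folder.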
|Dmitry||01/23/2007 03:41 am|
|Thanks for the answer!
I have one more question, about something I could not find out how to do.
So it may turn out to be a feature suggestion rather than a question.
Is there any way to prevent directory overloading?
If I turn on file overloading, it creates a new directory for every 1k files in the same directory,
but it does not do the same when there are more than 1k directories in one directory (including the root one).
If such an option exists, how do I turn it on?
And if it does not exist, is there any way to work around this and avoid OE freezing when a download produces more than 10-20k root domains, even if the number of files is below 50k?
If it does not exist, it would be very useful to implement it at some point, for two reasons.
First, if you do some primitive data mining on a subnet of an open domain, even 10k files can easily give you 5k directories/sites. And when the number of files is greater or the restrictions on subdomains are weaker, you can easily get >10k root sites in the same directory, which significantly slows the system and causes freezes in OE.
Second, some sites store information in enumerated directories, which can't easily be converted to file names.
On such sites you can easily get 100k directories with 1-3 files in each.
Thanks a lot for the answers!
|Oleg Chernavin||01/23/2007 08:39 am|
|There is no solution for this issue right now. For subdomains with 1-3 files each, it is possible to use URL Substitutes to change the filenames and combine directories and filenames in some way. However, this depends on each particular case.
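
As a rough illustration of the kind of grouping asked about (this is not an Offline Explorer feature), the directories could be regrouped after the download so that no single directory holds more than about 1k subdirectories. The Python sketch below only computes the new locations; the function name `plan_buckets`, the `bucket_NNNN` naming, and the limit of 1000 are assumptions made for this example.

```python
def plan_buckets(dir_names, per_bucket=1000):
    """Map each directory name to a 'bucket_NNNN/name' path.

    Names are sorted first, then assigned in order, so each bucket
    receives at most per_bucket directories.
    """
    if per_bucket < 1:
        raise ValueError("per_bucket must be at least 1")
    plan = {}
    for index, name in enumerate(sorted(dir_names)):
        # Integer division gives the bucket number for this position.
        plan[name] = f"bucket_{index // per_bucket:04d}/{name}"
    return plan


if __name__ == "__main__":
    # Hypothetical site directory names, for illustration only.
    sites = [f"site{i}.example" for i in range(2500)]
    plan = plan_buckets(sites)
    buckets = {path.split("/")[0] for path in plan.values()}
    print(len(buckets), "buckets for", len(sites), "directories")
```

The resulting plan could be applied with something like `shutil.move`, but moving folders changes their relative paths, so links between the downloaded sites may need adjusting afterwards.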