How do you download only those pages with "TITLE:" in the text?
Author | Message | |
---|---|---|
Len Lydik | 10/21/2004 06:17 pm | |
How do you download only those pages with "TITLE:" in the text? | ||
10/21/2004 10:40 pm | ||
> How do you download only those pages with "TITLE:" in the text?
How would you use this filter? OEP can save to your hard disk only those pages that contain "TITLE:" in their content: use the "Content Filters". But AFAIK you can *not* prevent OE from downloading and parsing files with other content (OE downloads them, but does not save them to disk). I guess that you are looking for a different filter: "Do not follow any links on pages that don't have "TITLE:" in their content AND save to disk only pages that have "TITLE:" in their content". That would be a really useful filter. Perhaps Oleg can implement such a filter? |
||
Oleg Chernavin | 10/22/2004 04:31 am | |
Content Filters already support this. Place TITLE: in the keywords field and keep all checkboxes in the section unchecked. This will load all pages and save only those that contain the keyword.
Best regards, Oleg Chernavin MP Staff |
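A minimal Python sketch of the behaviour described in this reply (every page is loaded, only matching pages are kept). It is not OEP's actual code: `pages_to_save`, the `pages` mapping, and the case-sensitive "any keyword matches" rule are all assumptions made for illustration.

```python
def pages_to_save(pages, keywords=("TITLE:",)):
    """Return the URLs whose already-downloaded text contains a keyword.

    `pages` maps a URL to its page text (hypothetical input).
    Matching here is a plain case-sensitive substring test, which is
    an assumption and not a statement about how OEP matches keywords.
    """
    return [
        url
        for url, text in pages.items()
        if any(keyword in text for keyword in keywords)
    ]


if __name__ == "__main__":
    downloaded = {
        "a.htm": "TITLE: first page",
        "b.htm": "no matching text here",
    }
    print(pages_to_save(downloaded))
```

Note that the filter only runs on text that has already been downloaded, which is why it cannot reduce the number of downloads by itself.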
||
10/22/2004 06:47 am | ||
> Content Filters already support this.
What exactly? Len said that he wants to *download* only the pages that contain "TITLE:" in the text. He didn't say anything about *saving* files on disk; Len could clarify this...
> Place TITLE: in the keywords field and keep all checkboxes in the section unchecked.
> This will load all pages and save only those that contain the keyword.
This is the same as what I said before. "This will load all pages..." -> this should be avoided (in most cases). The filter doesn't work like this: "Do not follow any links on pages that don't have "TITLE:" in their content AND save to disk only pages that have "TITLE:" in their content". Or am I wrong? I think that this filter could be very useful. Do you agree? Could you add such a filter type to the "Content Filters"? What do you think? Thank you! |
||
Oleg Chernavin | 10/22/2004 06:52 am | |
Do you mean to follow only links whose tag contains TITLE:, like:
<a href="somepage.htm">TITLE: aaa bbb</a>
Is that what you need?
Oleg. |
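The link-based filter asked about here could be sketched roughly as follows, using Python's standard `html.parser`. This is only an illustration of the idea, not how OEP works: the class `LinkTextFilter` and the function `links_to_follow` are hypothetical names, and the sketch assumes the keyword is matched against the visible text between `<a>` and `</a>`.

```python
from html.parser import HTMLParser


class LinkTextFilter(HTMLParser):
    """Collect hrefs of <a> tags whose link text contains a keyword."""

    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword
        self.matches = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if self.keyword in "".join(self._text):
                self.matches.append(self._href)
            self._href = None


def links_to_follow(html, keyword="TITLE:"):
    """Return the hrefs that a link-text filter would allow following."""
    parser = LinkTextFilter(keyword)
    parser.feed(html)
    parser.close()
    return parser.matches


if __name__ == "__main__":
    page = (
        '<a href="somepage.htm">TITLE: aaa bbb</a>'
        '<a href="other.htm">something else</a>'
    )
    print(links_to_follow(page))
```

A filter of this kind decides from the linking page alone, so the target page does not have to be downloaded before it is rejected.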
||
Len Lydik | 10/22/2004 12:10 pm | |
Oleg's solution seems to be working. I'm just looking to save only the files with the string "TITLE:" in the content.
I don't know how OEP would accomplish this without downloading all files (how could it analyze a file it hasn't downloaded?). |
||
10/22/2004 06:32 pm | ||
OK. Maybe I wasn't clear enough. I will describe it once more.
> (how could it analyze a file it hasn't downloaded).
Of course OE can't analyze a file without downloading it.
> I don't know how OEP would accomplish this without downloading all files
"all files" -> That's the thing: OE shouldn't download *all* unwanted files (in most cases), only the files that it absolutely needs to analyze. And this is what my filter should do. OE wouldn't follow (wouldn't download) any links on pages that don't have "TITLE" in their content. In this way OE would download unwanted files only *one* level deeper than the wanted files. But currently OE downloads *every* page at every level. I'll try to explain it with an example:
-------------------------------
Level 0, page A: has TITLE in its content
Links in page A: B C D
Level 1, page B: has TITLE in its content
Links in page B: E F G
Level 1, page C: does *not* have TITLE in its content
Links in page C: H I J
Level 1, page D: does *not* have TITLE in its content
Links in page D: K L M
Pages on Level 2 which have TITLE in their content: E, F, G
Pages on Level 2 which do *not* have TITLE in their content: H, I, J, K, L, M
-------------------------------
The result with the current filter:
OE downloads: A, B, C, D, E, F, G, H, I, J, K, L, M
OE saves on disk: A, B, E, F, G
---
The result with my filter:
OE downloads: A, B, C, D, E, F, G
OE saves on disk: A, B, E, F, G
---
Compare the results: with my filter OE does *not* download the pages H, I, J, K, L, M.
I think that both filters are useful (the current one and my suggestion). For example, my filter would be useful when you want to follow a specific chapter on a site (like the example above).
@ Oleg
I hope that you agree and that you can implement such a filter.
> Do you mean to follow links, which tag contains TITLE:, like:
>
> <a href="somepage.htm">TITLE: aaa bbb</a>
>
> Is it what you need?
It's not the filter that I mentioned. But indeed, I have looked for such a filter before, without success.
I'm pretty sure that many people would love to have such a filter, me too. ;-) So, this would be the third filter in this topic. I hope that you can implement the 2 new filters. :-) Thanks in advance! |
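The example in this post can be replayed with a small Python simulation. It is a sketch only: the link graph and the set of pages containing TITLE are copied from the example, while the function `crawl`, its `prune_unmatched` flag, and the breadth-first visiting order are assumptions for illustration and say nothing about how OE schedules downloads internally.

```python
from collections import deque

# Link graph and keyword flags copied from the example in the post.
LINKS = {
    "A": ["B", "C", "D"],
    "B": ["E", "F", "G"],
    "C": ["H", "I", "J"],
    "D": ["K", "L", "M"],
}
HAS_TITLE = {"A", "B", "E", "F", "G"}


def crawl(start, prune_unmatched):
    """Breadth-first crawl; returns (downloaded, saved) page lists.

    With prune_unmatched=False every link is followed and only the
    saving step is filtered (the current filter). With
    prune_unmatched=True, links on pages without the keyword are not
    followed (the proposed filter).
    """
    downloaded, saved = [], []
    queue = deque([start])
    seen = {start}
    while queue:
        page = queue.popleft()
        downloaded.append(page)      # a page must be fetched to be checked
        matched = page in HAS_TITLE
        if matched:
            saved.append(page)
        if prune_unmatched and not matched:
            continue                 # do not follow links on this page
        for target in LINKS.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return downloaded, saved


if __name__ == "__main__":
    for label, prune in (("current filter", False), ("proposed filter", True)):
        downloaded, saved = crawl("A", prune)
        print(label)
        print("  downloads:", ", ".join(downloaded))
        print("  saves:    ", ", ".join(saved))
```

Under these assumptions the current filter downloads all 13 pages and the proposed one downloads 7 (A to G), while both save A, B, E, F, G, which matches the lists given in the post.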
||
Oleg Chernavin | 10/25/2004 12:35 pm | |
OK. I see, so you need a new checkbox, like "Do not follow links in pages that do not contain the above keywords". Is that right?
Oleg. |