Hi,
I create OEP Projects to download job announcements from the websites of employers -- usually small-to-medium nonprofits. A typical Project features one starting webpage, and one level of links. The starting webpage may contain 0-12 links to current job announcements, plus 6-60 irrelevant and seldom-changing links (like Contact, Home, Press, Mission, Privacy, etc.).
My default template relies on an "inclusive" filtering approach, which specifies what URLs must include in order to be downloaded as links. For example, a URL must include the starting server and directory, or directories with certain keywords.
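To make the "inclusive" rule concrete, here is a minimal Python sketch of the idea. It is not OEP's implementation or its filter syntax; the function name is_included, its parameters, and the example.org URLs are all hypothetical and only illustrate the logic described above.

```python
from urllib.parse import urlparse

def is_included(url, start_url, keywords=()):
    """Hypothetical inclusive filter (not OEP's actual logic).

    A URL passes if it sits under the starting page's server and directory,
    or if the directory part of its path contains one of the keywords.
    """
    start = urlparse(start_url)
    target = urlparse(url)
    # Directory of the starting page, e.g. "/about/" for "/about/jobs.html"
    start_dir = start.path.rsplit("/", 1)[0] + "/"
    same_place = (target.netloc.lower() == start.netloc.lower()
                  and target.path.startswith(start_dir))
    # Directory portion of the target path, without the filename
    target_dirs = target.path.rsplit("/", 1)[0].lower()
    has_keyword = any(k.lower() in target_dirs for k in keywords)
    return same_place or has_keyword
```

With a starting page of http://example.org/about/jobs.html and the keyword "careers", a link to http://example.org/contact.html would be rejected, while a link into /about/ on the same server, or into any /careers/ directory, would pass.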
But sometimes the Project webpage displays no job announcement links at the time of my visit. Or the webpage displays many job links, and the different job links refer to different directories, sometimes even different domains. In either case, I cannot rely upon an "inclusive" filtering approach. Instead, I must use an "exclusive" approach, specifying what is not to be downloaded -- and such an approach works very well. But it can be a real pain in yin-yang to set up. So to speak.
And that is what I am writing about.
Sometimes there are large numbers of links on a single webpage that I must disable from loading. And I cannot do this through reverse filters, because the job links I wish to download are likely to share the same domains, directories, etc., as the links I wish to block. So entire URLs, down to the filenames, must be disabled from loading.
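The "exclusive" rule amounts to an exact-match blocklist of full URLs. A minimal Python sketch of that idea follows; it is not how OEP stores its "disable from loading" state, and the names normalize, make_blocklist, and should_download are hypothetical. The normalization step (lowercasing the scheme and host, dropping the #fragment) is an assumption of this sketch.

```python
from urllib.parse import urlsplit, urlunsplit

def normalize(url):
    """Lowercase the scheme and host and drop any #fragment (sketch assumption)."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path, parts.query, ""))

def make_blocklist(urls):
    """Build a set of full URLs, down to the filename, that must not load."""
    return {normalize(u) for u in urls}

def should_download(url, blocklist):
    """Exclusive rule: download everything except exact blocklist matches."""
    return normalize(url) not in blocklist
```

Because the match is on the whole URL, a job link such as /about/director.html still downloads even when /about/contact.html in the same directory is blocked.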
First, I typically configure such a Project to download all links for all text file types (+ pdf, doc, rtf), from all directories. Once downloaded, I use the Map tab to view and select each unwanted filename, then I run the "disable from loading" command via the context menu.
I repeat the process as necessary -- and it can be tedious because sometimes there are nearly as many directories as there are files, so I cannot select multiple files, and I must expand the directory tree one directory at a time. Even disabling a single file can require 4 separate mouse clicks plus scrolling.
I would like this process to be faster. So I have a few questions and suggestions:
[1] Is there a hotkey or command or display setting that will automatically expand a Project's directory trees within the Map tab? If not, would you add such a feature?
[2] Is there a single hotkey for the "disable from loading" command? Some other context-menu commands have hotkeys listed, but not this command. (I am asking about a single hotkey, one that would save me from calling up the context menu at all -- just like the F4 hotkey does for the "edit" command.)
[3] Is there a better way to create such a Project? If not, would you consider improving OEP in this respect?
For example, I would like to open the starting webpage in a browser, then highlight the desired job links, then select from the context menu an OEP command to "disable all other links from loading."
And for the starting webpages containing no job announcements at the time of my visit, I would like to select everything on the webpage (Ctrl+A), then select from the context menu an OEP command to "disable all selected links from loading".
Obviously, the last command could be useful in many other scenarios too.
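Both suggested commands come down to the same operation: collect every link on the page, then keep all of them except the wanted ones. A rough Python sketch of that, using only the standard library, is below. It is not an OEP feature or API; LinkCollector and links_to_disable are hypothetical names, and the sketch only looks at href attributes of <a> tags.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL
                self.links.append(urljoin(self.base_url, href))

def links_to_disable(page_html, base_url, wanted=()):
    """Return every link on the page except the wanted ones, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(page_html)
    wanted_set = set(wanted)
    seen = set()
    result = []
    for link in collector.links:
        if link not in wanted_set and link not in seen:
            seen.add(link)
            result.append(link)
    return result
```

Passing the highlighted job links as wanted corresponds to "disable all other links from loading"; passing nothing corresponds to "disable all selected links from loading" for a page with no current job announcements.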
I apologize for the length of this message. I'll keep any follow-ups brief.
Thanks!
Steve K.
There is a faster way to create such Projects: browse to the desired page online in the Internal browser, select the portion of the page that contains the wanted links, and drag the selection to OE. It will create a Project with Included keywords that correspond to the links in the selection.
Best regards,
Oleg Chernavin
MP Staff