Forum attachments again

Author Message
masster 12/01/2013 12:41 am
Hi,
I am stuck with a project, please give me a hand.
This is my starting URL:

http://www.skodacommunity.de/skoda-forum/board15-skoda-felicia-forum/

I am interested in downloading all images and attachments from posts inside the above forum section. So I started with the template "Download only posts from the forum topic", checked only images above 20k and archives, enabled "load from starting server plus 1 external link" and "load only from starting directory and below", and cleared all keywords from filenames.

The problem is that attachment URLs look like this:

http://www.skodacommunity.de/index.php?page=Attachment&attachmentID=28290

so I can't reach them because of the "load only from starting directory and below" setting. But I need that setting, otherwise OE would crawl the whole forum. What I need is a checkbox similar to "load from starting server plus 1 external link", but for directories. Something like "load only from starting directory and below PLUS up to 1 link outside directory", so I can reach and download all attachments.
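
To illustrate the problem, here is a minimal Python sketch of a "starting directory and below" check. It is not how OE implements the setting; the function name `in_start_directory` and the thread path `some-thread/` are made up for the example. It only shows that the attachment URL has the path `/index.php`, which lies outside the starting directory.

```python
from urllib.parse import urlsplit

START = "http://www.skodacommunity.de/skoda-forum/board15-skoda-felicia-forum/"
ATTACHMENT = "http://www.skodacommunity.de/index.php?page=Attachment&attachmentID=28290"

def in_start_directory(url, start=START):
    """Assumed rule: same host, and the URL path begins with the start path."""
    s, u = urlsplit(start), urlsplit(url)
    return u.netloc == s.netloc and u.path.startswith(s.path)

# "some-thread/" is a hypothetical thread path under the starting directory.
print(in_start_directory(START + "some-thread/"))
# The attachment path is "/index.php", so the directory check rejects it.
print(in_start_directory(ATTACHMENT))
```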

What should I do?

Thank you.
Oleg Chernavin 12/01/2013 07:13 am
This should be easy to do. Allow downloading from all directories and use URL Filters - Filename section. Add the following included keywords there:

/skoda-forum/board15-skoda-felicia-forum/*
attachment

Best regards,
Oleg Chernavin
MP Staff
masster 12/01/2013 08:42 am
I did as you suggested and I didn't get any files. Something looks wrong.

I am confused about 2 things:

1) The keyword /skoda-forum/board15-skoda-felicia-forum/* has nothing to do with a filename. It's a PATH, a directory filter, so why add it as a FILEname keyword??

2) Are keywords combined with a boolean OR or with a boolean AND?
Oleg Chernavin 12/01/2013 09:40 am
1. If you specify a filename keyword that begins with /, it will affect both the path and filename parts of a URL. You may also use keywords like:

http://*.server.com/*/*.asp

in this section.

2. They use OR.
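
A rough Python sketch of how such included keywords could be evaluated. This is not Offline Explorer's actual matching code: it assumes that a keyword matches anywhere in the URL, that `*` stands for any run of characters, and that matching is case-insensitive. The helper names `keyword_to_regex` and `url_included` are made up for illustration.

```python
import re

def keyword_to_regex(keyword):
    """Turn a keyword into a regex; '*' matches any run of characters (assumed)."""
    parts = [re.escape(part) for part in keyword.split("*")]
    return re.compile(".*".join(parts), re.IGNORECASE)

def url_included(url, keywords):
    """OR semantics: the URL passes if at least one keyword matches it."""
    return any(keyword_to_regex(k).search(url) for k in keywords)

KEYWORDS = [
    "/skoda-forum/board15-skoda-felicia-forum/*",
    "attachment",
]

urls = [
    "http://www.skodacommunity.de/skoda-forum/board15-skoda-felicia-forum/",
    "http://www.skodacommunity.de/index.php?page=Attachment&attachmentID=28290",
    "http://www.skodacommunity.de/index.php?page=Help",  # hypothetical URL
]

for url in urls:
    print(url_included(url, KEYWORDS), url)
```

Under these assumptions the first two URLs pass because each matches one of the keywords, while the third matches neither and is filtered out.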

Oleg.
masster 04/30/2014 08:55 pm
Oleg,
Can you imagine my frustration seeing how NOT user-friendly OE is?? You probably say to yourself... "well, it's so easy... why don't they get it?". My answer: you're so used to OE that you THINK everybody else should know what you know. Even the existence of this user-to-user community, where an OE staff member answers our never-ending questions, should give you a clue that something must change fundamentally at the PROJECT CREATION stage. Please consider this MUCH BETTER PROJECT WIZARD:
- the user is asked for the starting URL
- instead of going to the next step, OE should immediately start a basic browse of that URL and do a basic recognition of the site's structure, server type, forum software used, etc... you get the idea. Call that stage Smart Prescan or similar
- only now is the user asked for more info, but rather than relying on templates (which NEVER fit 100% and always have to be tweaked), OE asks questions that let the user decide STEP-BY-STEP what they want to add to the project.

OK, until you find the time to discuss my suggestion with the OE team, please save me from cutting my wrists :) and think again about my initial request, because your advice didn't help. Those two keywords, combined with browsing all folders, DIDN'T GET ANY IMAGE ATTACHED, EMBEDDED OR REFERRED EXTERNALLY in posts.
Oleg Chernavin 05/02/2014 04:20 pm
I agree with you. I am also thinking about running some kind of tests before Project creation and setting it up more intelligently.

Forums would be a good example for such an extended Wizard. I think I will start making something like that in version 7.0.

However, this kind of forum is pretty unusual to me. I haven't seen such a structure before. The predefined template in Offline Explorer would not work for this kind of forum. I made some improvements to the template, but it still would not be perfect.

So, a custom Project is still necessary here. I ran several tests, and the following settings started working. Please select the following text starting from the [Object] line, copy it to the Windows clipboard and press Ctrl+V in Offline Explorer.

I hope this works for you.

Oleg.

[Object]
OEVersion=Pro 6.8.4094
Type=0
IID=62396
Caption=http://www.skodacommunity.de/skoda-forum/board15-skoda-felicia-forum/
URL=http://www.skodacommunity.de/skoda-forum/board15-skoda-felicia-forum/
MVer=5
Lev=2
Weekday=257
LTExceptions=
LTExcMode=0
FTText.Exts=htmlhtmaspaspxjspstmstmlidcshtmlhtxtxttextxspxmlrxmlcfmwmlphpphp3
FTImages.Exts=gifjpgjpegtiftiffxbmfifbmppngipxjp2j2cj2kwbmplwfwebp
FTVideo.Exts=mpgavianimpegmovflvfliflcvivrmramrvasfasxwmvm1vm2vvobsmilmp4m4v
FTAudio.Exts=wavriffmp3midmp2m3uravocwmaapeoggm4aaif
FTArchive.Exts=7zziparcgzzarjlhalayleirarcabtarpakacejarpdftgzexeiso
FTUDef.Exts=jsaxdcssssivbsdtdxslswfclassent
FTText.B=ooxooo
FTImages.B=ooxooo
FTVideo.B=ooxooo
FTAudio.B=ooxooo
FTArchive.B=ooxooo
FTUDef.B=ooxooo
FTOther.B=ooxooo
FTSizes=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,3,3,0,3,0,0,0,0,0,0,0,0
NotIgnoreLogout=False
RSrvsBx=1
RFileIn=/skoda-forum/board15-skoda-felicia-forum/*attachment/skoda-forum/skoda-felicia-forum/*/* xxx
RFileEx=?sort&sortlast-post. xxx
RProt=255
LastStart=97:206:148:93:64:100:228:64:
LastEnd=165:251:195:98:64:100:228:64:
PrjStart=6:214:172:141:218:252:227:64:
LastStarted=03.05.2014 0:16:26
LastEnded=03.05.2014 0:17:21
S200=243
S304=965
SAbr=2
SPar=811
SSav=243
SLast=200
SSiz=4739885
SMdf=243
SHTML=212
SSuccDowns=3
LFiles=1208
LSize=4739885
Flags=1
Descr=e1xydGYxXGFuc2lcZGVmZjB7XGZvbnR0Ymx7XGYwXGZuaWwgTVMgU2FucyBTZXJpZjt9fQ0KXHZpZXdraW5kNFx1YzFccGFyZFxsYW5nMTA0OVxmMFxmczE2IA0KXHBhciB9DQo=
ImgDim=0,0,0,0
PrevURL=http://www.skodacommunity.de/skoda-forum/board15-skoda-felicia-forum/
ConvertRSS=True