http://groups.yahoo.com/group/foobar/message/{:1..500}
(using a directory filter of message)
Now the problem comes in when Yahoo inserts a "redirect" when reading messages: it redirects to an ad every so many messages viewed and gives an option to continue on to the message. OEP gets stuck there and tends to just skip retrieving that message. I tried changing the scan depth, but that doesn't seem to help. It does report a 304 error message saying that the object has moved.
Is there any workaround for this?
group/foobar
Now select Level=0 and start loading. This should help.
Best regards,
Oleg Chernavin
MetaProducts corp.
I tried this and still not all messages are downloaded. For example, from http://groups.yahoo.com/group/BardonPraxis/message/{:1..1840} about 100 messages are not downloaded properly, i.e. they are 180-byte files containing the following:
"<HTML><HEAD><META HTTP-EQUIV="Refresh" CONTENT="0; URL=../interrupt@st=1&h=360&m=1&done=_252Fgroup_252FBardonPraxis_252Fmessage_252F360"><TITLE>302 File moved</TITLE></HEAD></HTML>"
So I deleted those and then used "Download missing files", and OE got about 80 messages. The remaining 20 were again redirects, so I repeated the same procedure for those 20 and so on until I got all 100 messages correctly. Though this works, it could be very time-consuming with larger message boards. So is there any other way to get all messages?
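The manual step of finding the redirect stubs could be scripted outside OE. The following is only a rough sketch: it assumes the downloaded messages sit as plain files in one folder, and the function names and the 1024-byte size cutoff are made up for illustration. The pattern is taken from the stub quoted above.

```python
import re
from pathlib import Path

# Matches a META refresh that points at an "interrupt" URL, as in the
# 180-byte stub quoted above. Case-insensitive, works on raw bytes.
REDIRECT_STUB = re.compile(
    rb'http-equiv="refresh"[^>]*url=[^">]*interrupt', re.IGNORECASE
)


def is_redirect_stub(data: bytes, max_size: int = 1024) -> bool:
    """Return True if the content looks like a small interrupt redirect page.

    max_size is an assumed cutoff; real messages are expected to be larger.
    """
    return len(data) <= max_size and REDIRECT_STUB.search(data) is not None


def find_redirect_stubs(folder, max_size: int = 1024):
    """List files in `folder` (hypothetical download directory) that are stubs."""
    stubs = []
    for path in sorted(Path(folder).iterdir()):
        # Check the size first so large message files are never read.
        if path.is_file() and path.stat().st_size <= max_size:
            if is_redirect_stub(path.read_bytes(), max_size):
                stubs.append(path)
    return stubs
```

Deleting the files this returns and then running "Download missing files" would be the same procedure as above, just without checking each file by hand.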
Oleg.
http://login.yahoo.com/config/login
post=.done=http%3A%2F%2Fgroups.yahoo.com%2Fadultconf%3Faccept%3DI%2520Accept%26dest%3D%252F&login=USERID&passwd=PASSWORD&.persistent=y
(level limit 0)
to get all cookies set appropriately to be logged in and be identified as adult.
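For anyone wondering what the POST line actually sends, here is a small sketch that decodes it with Python's standard library. It assumes the leading "post=" is just the project's marker for POST data and is not part of the form itself. USERID and PASSWORD are the placeholders from the post above.

```python
from urllib.parse import parse_qs, urlsplit

# Form data from the project above, without the leading "post=" marker
# (assumed to be project syntax rather than a form field).
POST_DATA = (
    ".done=http%3A%2F%2Fgroups.yahoo.com%2Fadultconf%3Faccept%3DI%2520Accept"
    "%26dest%3D%252F&login=USERID&passwd=PASSWORD&.persistent=y"
)

# First level of decoding: the form fields sent to the login page.
fields = {name: values[0] for name, values in parse_qs(POST_DATA).items()}

# The ".done" field is itself a URL with its own encoded query string,
# which is why parts of it appear double-encoded (%2520, %252F).
done_url = fields[".done"]
done_params = {
    name: values[0] for name, values in parse_qs(urlsplit(done_url).query).items()
}

if __name__ == "__main__":
    for name, value in fields.items():
        print(f"{name} = {value}")
    print("after login, continue to:", done_url)
    print("parameters of that URL:", done_params)
```

So the login form is told to continue to the adultconf page with "I Accept" already filled in, which is how both the login and the adult confirmation get covered by one request.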
A second project
http://groups.yahoo.com/group/GROUP1/message/{:NUM1..NUM2}
http://groups.yahoo.com/group/GROUP2/message/{:NUM3..NUM4}
http://groups.yahoo.com/group/GROUP3/message/{:NUM5..NUM6}
(level limit 0, urlfilter all directories)
on first run then retrieved most of the messages, only leaving out
a few where the advertisement page had been inserted (every fourth
message was redirected to an interrupt... url but most of those
redirected back, and those messages were also loaded).
A second run retrieved all the remaining messages.
Later versions of OEP didn't do it any longer, the automatic login
stopped with "download complete" long before everything necessary
had been touched, every fourth message was missing (those which
had been redirected through the interrupt... url), a second
attempt gave the result "nothing to be done".
Can you find out how the change in behavior was caused?
I use the following URL (with STEP set to max value, i.e. 100):
http://groups.yahoo.com/group/GROUP_NAME/messages/{LAST_MESSAGE_NO..FIRST_MESSAGE_NO|STEP}?threaded=1&viscount=-STEP&expand=1
For example:
http://groups.yahoo.com/group/textpipe-discuss/messages/{496..1|30}?threaded=1&viscount=-30&expand=1
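To make the macro concrete, here is a sketch of the list of page URLs it is presumably meant to produce. This is an assumption about how the {LAST..FIRST|STEP} range expands (counting down from LAST in steps of STEP while staying at or above FIRST), not a statement of how OEP implements it, and expand_pages is a made-up helper name.

```python
def expand_pages(group: str, last: int, first: int, step: int) -> list[str]:
    """Build one threaded-view URL per page, counting down from `last`.

    Assumed semantics: start values are last, last - step, ... and stop
    before dropping below `first`.
    """
    urls = []
    for start in range(last, first - 1, -step):
        urls.append(
            f"http://groups.yahoo.com/group/{group}/messages/{start}"
            f"?threaded=1&viscount=-{step}&expand=1"
        )
    return urls


if __name__ == "__main__":
    for url in expand_pages("textpipe-discuss", 496, 1, 30):
        print(url)
```

Under that assumption the example above gives 17 pages, starting at message 496 and ending at message 16, instead of 496 separate message requests.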
Then you only need to specify a few filters in an external data-mining tool (TextPipe) and you can get what you want, with an even better interface than Yahoo! provides.
--
Radek
Oleg.