Meetup Photos

Daniel D 03/15/2011 02:42 pm
I'm having a hard time figuring out the proper settings to accomplish what I want.

I help organize a motorcycle meetup group on meetup.com. Their service has frankly been poor lately and we are switching our CMS provider - my goal is to save all our photos from past years.

I'm at: http://www.sportbikealliance.com/photos/

I need to go into every one of those albums. Above each photo is an "all sizes" button. This gives you the option to download the original JPEG. I think that should satisfy our needs. I'm noticing all those original JPEGs start with highres*.jpg a couple of folders deep. Also, the domain changes to stuff like img*.meetupstatic.com or photos*.meetupstatic.com for the images.

So essentially, I need to crawl the page and stick to JPEGs beginning with highres from variations of those two domains.

Can someone help me out?
Oleg Chernavin 03/15/2011 03:50 pm
Yes. Let's do the following:

Project Properties - URL:
http://www.sportbikealliance.com/photos/

Level = unlimited.

File Filters - Text, Image and others - select Load using URL Filters in their Location boxes.

URL Filters - Server and Directory - load from all.
URL Filters - Filename - Included list:

http://*.meetupstatic.com/photos/member/*/*.jpeg
http://www.sportbikealliance.com/photos/*/*

Parsing - URL Substitutes - add two rules:

URL:
*.jpeg
Replace:
/600_
With:
/highres_

URL:
*.jpeg
Replace:
/thumb_
With:
/highres_

This should do the task.

Best regards,
Oleg Chernavin
MP Staff
Daniel D 03/16/2011 09:23 am
I'll try it but isn't your URL wrong? This is an example of one original JPEG:

http://photos2.meetupstatic.com/photos/event/a/e/d/9/highres_21224761.jpeg
Oleg Chernavin 03/16/2011 09:29 am
Yes, I saw all these URLs. What is wrong?

Oleg.
Daniel D 03/16/2011 09:48 am
First off, thank you for your help.

It's running right now. Doing some quick browsing through the folders and so far all I'm seeing is member photos.

We are on the same page that I'm trying to download all the original jpegs in the event albums - correct? Not the member/profile/avatar photos.
Oleg Chernavin 03/16/2011 09:50 am
I set up fairly general filtering in this case, so some extra pages may appear, but not many, I think.

Oleg.
Daniel D 03/16/2011 10:24 am
It's done. Didn't quite do it. It downloaded small thumbnails of people's avatar/member photos and that's it. I ended up with 4 folders:

www.sportbikealliance.com - several subdirs - no photos (expected)

photos4.meetupstatic.com
photos3.meetupstatic.com
photos2.meetupstatic.com
photos1.meetupstatic.com


Inside each of those is a single photos directory, then a single member directory that's full of single-character folders that eventually drill down to a small avatar thumbnail.
Oleg Chernavin 03/16/2011 10:37 am
OK. Please correct the URL Substitutes rules:

URL:
*/600_*.jpeg
Replace:
/600_
With:
/highres_

URL:
*/thumb_*.jpeg
Replace:
/thumb_
With:
/highres_
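
As a rough illustration of what these two rules are meant to do, here is a minimal Python sketch. It is not how Offline Explorer implements URL Substitutes: it assumes the URL mask behaves like a shell-style wildcard (approximated with Python's `fnmatch`), and the `600_` and `thumb_` URLs are hypothetical, modeled on the highres example posted earlier in the thread.

```python
from fnmatch import fnmatchcase

# (URL mask, Replace, With) triples mirroring the two corrected rules.
RULES = [
    ("*/600_*.jpeg", "/600_", "/highres_"),
    ("*/thumb_*.jpeg", "/thumb_", "/highres_"),
]


def substitute(url: str) -> str:
    """Apply the first rule whose mask matches; otherwise return the URL unchanged."""
    for mask, old, new in RULES:
        # Assumption: '*' in the mask matches any run of characters, including '/'.
        if fnmatchcase(url, mask):
            return url.replace(old, new)
    return url


if __name__ == "__main__":
    # Hypothetical smaller-size URLs built from the highres example in this thread.
    base = "http://photos2.meetupstatic.com/photos/event/a/e/d/9/"
    for name in ("600_21224761.jpeg", "thumb_21224761.jpeg", "highres_21224761.jpeg"):
        print(substitute(base + name))
```

The idea is that every 600-pixel or thumbnail link found on a page gets rewritten to its highres counterpart before it is queued, so only the original-size file is requested.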

Oleg.
Daniel D 03/16/2011 02:48 pm
OK, running that now. I'm kind of confused on how all the filters come together but I trust ya.

I'm definitely lost on why we have /member in the URL for meetupstatic. Shouldn't it be /event?
Oleg Chernavin 03/16/2011 03:18 pm
Maybe I missed some other links. You may add another Included Filename keyword:

http://*.meetupstatic.com/photos/event/*/*.jpeg

Or even simpler - shorten to one keyword:

http://*.meetupstatic.com/photos/*/*.jpeg
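
To see how the three keywords differ, they can be compared against sample URLs. This is a minimal sketch, assuming the Included list works like shell-style wildcards where `*` may span several path segments (approximated here with Python's `fnmatch`; Offline Explorer's actual matching may differ). The event URL is the one posted above; the member URL is hypothetical.

```python
from fnmatch import fnmatchcase

# The three Included Filename keywords discussed in this thread.
MASKS = {
    "member": "http://*.meetupstatic.com/photos/member/*/*.jpeg",
    "event": "http://*.meetupstatic.com/photos/event/*/*.jpeg",
    "any": "http://*.meetupstatic.com/photos/*/*.jpeg",
}

# Event URL as posted in the thread.
EVENT_URL = "http://photos2.meetupstatic.com/photos/event/a/e/d/9/highres_21224761.jpeg"
# Hypothetical member photo URL, made up for illustration only.
MEMBER_URL = "http://photos1.meetupstatic.com/photos/member/1/2/3/4/thumb_1234567.jpeg"


def matching_masks(url: str) -> list[str]:
    """Return the names of all masks that the URL satisfies."""
    return [name for name, mask in MASKS.items() if fnmatchcase(url, mask)]


if __name__ == "__main__":
    for url in (EVENT_URL, MEMBER_URL):
        print(url, "->", matching_masks(url))
```

Under that assumption, the member-only keyword never matches an event photo, which is consistent with the first run producing only member thumbnails.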

Oleg.

Daniel D 03/17/2011 04:34 pm
OK - running now. It's taking a lot longer, so I think it's working. Quick question in the meantime - how do I know when this is done? The activity log stops, the queue shows 1000 and the project still has a green downward arrow on it.
Oleg Chernavin 03/17/2011 04:36 pm
What does the Download Panel (at the bottom) show?

Oleg.
Daniel D 03/17/2011 05:26 pm
Everything was gray.
Oleg Chernavin 03/17/2011 07:25 pm
There is a Status Bar below the progress. What do you see in its left corner - Ready or Parsing?

Oleg.
Daniel D 03/18/2011 08:21 am
It's just sitting on Downloading, not actually doing anything though.
Oleg Chernavin 03/18/2011 08:47 am
There is a Status Bar below the progress. What do you see in its left corner - Ready or Parsing?

Oleg.
Daniel D 03/18/2011 09:56 am
Neither.
Daniel D 03/18/2011 10:57 am
Oleg, are you Russian btw?
Oleg Chernavin 03/18/2011 11:53 am
Yes, I am.

What happens if you stop the download and continue it with "Do not load existing files" selected (in the Project Properties dialog)?

Oleg.
Daniel D 03/18/2011 12:29 pm
Zdraste. Ia toze. (Hello. Me too.)
Daniel D 03/18/2011 12:30 pm
So you want me to stop the download, set that property, and start again?
Oleg Chernavin 03/18/2011 12:46 pm
Yes. This way it will skip all previously loaded files and will continue from the point where it stopped.

Oleg.
Daniel D 03/20/2011 02:04 pm
It always ends up sitting on Downloading but not doing anything. I'm not sure how to guarantee it crawled everything there is to crawl.
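
One way to sanity-check the result, independent of what the Status Bar says, is to count the original-size files that actually landed on disk and compare that with the number of photos in the albums. Below is a minimal sketch, assuming the download folder has one subfolder per host as described earlier (photos1.meetupstatic.com and so on) and that originals are named highres_*.jpeg. `DOWNLOAD_ROOT` is a hypothetical path; replace it with your own download directory.

```python
from collections import Counter
from pathlib import Path

# Hypothetical location of the project's download folder; adjust to your setup.
DOWNLOAD_ROOT = Path("C:/Download")


def count_highres(root: Path) -> Counter:
    """Count highres_*.jpeg files under root, grouped by top-level (host) folder."""
    counts = Counter()
    for path in root.rglob("highres_*.jpeg"):
        if path.is_file():
            # The first path component below root is the host folder.
            host = path.relative_to(root).parts[0]
            counts[host] += 1
    return counts


if __name__ == "__main__":
    if DOWNLOAD_ROOT.is_dir():
        counts = count_highres(DOWNLOAD_ROOT)
        for host, n in sorted(counts.items()):
            print(f"{host}: {n} highres files")
        print("total:", sum(counts.values()))
    else:
        print(f"{DOWNLOAD_ROOT} not found; set DOWNLOAD_ROOT first")
```

If the total stays the same across two consecutive runs with "Do not load existing files" selected, that is at least a hint that nothing new is being found, though it does not prove the crawl reached every album.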
Oleg Chernavin 03/20/2011 02:34 pm
Can you please write to me at support@metaproducts.com? I may need more details to analyse the problem.

Oleg.