How to reduce the size of a website?

Sebastian
08/10/2011 06:02 am
Hello, I have a few questions. I want to download 100+ websites (mostly WordPress blogs and forums), so I have to watch the size. First of all, my settings:

- Level limit is unchecked
- I don't need external websites ("load files only within the starting server" & "load files only within the starting directory and below" are checked), except "pictures" and "archives" like .pdf, .rar, ... which I load from any site.
- I don't need audio & video (both unchecked), just the links to watch them (I think "mixed" in "link translation" is correct?)

1. Are there any special settings or tips I can use to make the websites smaller? E.g. the WordPress blog which I recently downloaded was about 13 GB! What matters to me is getting all the information on the site; I don't need a coloured background or anything like that. Maybe I can exclude something, like particular picture files such as icons. Or is it the search function on the sites that makes it so big? I don't know.

2. It is very important for me to get the whole website without missing links, pages or anything else. Maybe I can change some settings in "connection" (e.g. fewer connections?) and "Speed" (e.g. lower speed?) to make it more certain that I get all pages, links, ...?

3. Is it better to put all the websites in one project or each website in a separate project? I have to update all the websites once or twice a month.

4. What is the difference between "load files only within the starting server" and "load files only within the starting directory and below", and which is better for me?


I know, a lot of questions :) Maybe the new Offline Explorer Pro 6.0 has a solution.

Thanks for your work

Regards
Oleg Chernavin
08/11/2011 01:36 pm
0. I would suggest using Online Links Translation. This way, the links to videos (restricted in the settings) will point online and you can watch them easily.

1. You may exclude small images by setting File Filters - Images - minimum file size to, say, 10 KB.

I can't suggest anything else specific without seeing the sites. Give me a few examples of such sites and I will see what can be excluded.

Sometimes sites have different links that actually point to the same pages with the same content. If you run into that, give me some example URLs and I will see how to optimize it.

2. It depends heavily on the site. If it is hosted by a reputable, serious provider, then the default settings of 10 connections and 1 second between downloads are OK.

3. It is better to separate them into Projects. You will be able to update them differently, export them, etc.

4. It depends on how URLs are organized on the server. For example, if the site is hosted under its own server name, like http://someuser.site.com/, then the first option is OK.

But if the same server lists users as directories:
http://www.site.com/user_1/
another site is:
http://www.site.com/user_2/
...

then use the second option. It will not allow user_2 to be loaded if you start with the first URL.
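
To picture the difference between the two restrictions, here is a minimal Python sketch. It is only an illustration under my own assumptions, not how Offline Explorer implements them: the function names same_server, starting_directory and same_directory are made up, and the URLs are the placeholder ones from above.

```python
from urllib.parse import urlsplit
import posixpath


def same_server(start_url, candidate_url):
    """Rough analogue of 'load files only within the starting server'."""
    return urlsplit(start_url).hostname == urlsplit(candidate_url).hostname


def starting_directory(url):
    """Return the directory part of the URL path, always ending in '/'."""
    path = urlsplit(url).path or "/"
    if path.endswith("/"):
        # The path already names a directory.
        return path
    # Otherwise drop the file name and keep its directory.
    return posixpath.dirname(path).rstrip("/") + "/"


def same_directory(start_url, candidate_url):
    """Rough analogue of 'starting server' plus 'starting directory and below'."""
    if not same_server(start_url, candidate_url):
        return False
    candidate_path = urlsplit(candidate_url).path or "/"
    return candidate_path.startswith(starting_directory(start_url))


start = "http://www.site.com/user_1/"
other = "http://www.site.com/user_2/page.html"

# Same server, so the server-only restriction would still let it through...
print(same_server(start, other))
# ...but it is outside /user_1/, so the directory restriction rejects it.
print(same_directory(start, other))
```

With only the server check, a project started at user_1 could wander into user_2; adding the directory check keeps it inside /user_1/.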

Best regards,
Oleg Chernavin
MP Staff
Sebastian
08/16/2011 03:40 am
Hello Oleg, thanks for your reply.

0. For the blog I downloaded I used "mixed". If I click, for example, on a YouTube video, it works, and if I click on an external link it takes me to the website. So it seems to work with "mixed". Maybe I don't understand the difference between "online" & "mixed".

1. Ok I have 2 examples for forums:

- http://tiny.cc/af17g
- http://tiny.cc/qeg5w

and a blog

- http://tiny.cc/84awj

E.g. if I don't need "members/user" or "recent posts", can I exclude them?

3. Yes, true. Also if I use different usernames & passwords for forums.

4. OK, I'm a bit confused. By the first option, do you mean the project wizard?
I haven't used it.

"Server" : "load files only within the starting server / domain"
"Directory" : "load files only within the starting directory and below"
"Filename" : "load files only within the starting filename"

Can we say that the more of the options above I check, the more I limit my download?

Best regards,

Sebastian

Oleg Chernavin
08/16/2011 12:12 pm
0. The result is the same if the original link starts with http://..... But if it starts with / then in mixed mode it will not work offline.
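
As a rough illustration of why the leading / matters: a link that starts with / only has meaning relative to a server, so it has to be turned into a full URL before it can point online. The sketch below uses Python's standard urljoin to show that step. The page URL, the video paths and the function name to_online_link are hypothetical, and this is not Offline Explorer's actual code.

```python
from urllib.parse import urljoin


def to_online_link(page_url, href):
    """Resolve a link found on page_url into a full online URL."""
    return urljoin(page_url, href)


page = "http://www.site.com/blog/post.html"

# A link that already starts with http:// is returned unchanged.
print(to_online_link(page, "http://www.youtube.com/watch?v=abc"))

# A link starting with / is resolved against the server of the page.
print(to_online_link(page, "/videos/clip.flv"))
```

The second call yields http://www.site.com/videos/clip.flv, which is the kind of full address a link needs in order to open online from an offline copy.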

1. Add to the URL Filters - Filename section - Excluded list:

member.php

I didn't find the Recent Posts link - but you may figure out how to exclude it yourself.
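
The effect of the Filename exclusion above can be pictured with a small Python sketch. It assumes a simple exact match on the file name part of the URL, which may differ from how the program really matches keywords; the forum URLs and the names EXCLUDED_FILENAMES and is_excluded are made up for the example.

```python
from urllib.parse import urlsplit
import posixpath

# Hypothetical exclusion list, mirroring the entry suggested above.
EXCLUDED_FILENAMES = {"member.php"}


def is_excluded(url, excluded=EXCLUDED_FILENAMES):
    """Return True if the file name part of the URL is on the excluded list."""
    # The query string (?u=42) is not part of the path, so it is ignored here.
    filename = posixpath.basename(urlsplit(url).path)
    return filename.lower() in excluded


urls = [
    "http://forum.example.com/member.php?u=42",
    "http://forum.example.com/showthread.php?t=7",
]

# Keep only the URLs that are not excluded.
to_download = [u for u in urls if not is_excluded(u)]
print(to_download)
```

Every member profile page uses the same file name with a different query string, so one entry in the list removes all of them from the download.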

4. 1st - Only "Server" : "load files only within the starting server / domain" is checked.
2nd - Both "Server" : "load files only within the starting server / domain"
and
"Directory" : "load files only within the starting directory and below"
are checked.

"Filename" : "load files only within the starting filename" is useless for most purposes. It was added mostly for consistency.

5. Yes.

Oleg.