Single Website, double link confusion
|Jim Smith||10/17/2012 10:50 am|
| Is http://www.d20pfsrd.com/home the same as https://sites.google.com/site/pathfinderogc/ as far as OE is concerned? They seem to be the same website, but is there a difference between the d20pfsrd address and the pathfinderogc one? Will it make a difference in OE?
What should I do with websites that are this similar?
|Oleg Chernavin||10/17/2012 01:33 pm|
|These are two different sites for Offline Explorer, because it distinguishes URLs, not contents.
I am thinking of adding a feature (maybe optional) to handle such sites. An obvious example is www.site.com and site.com with exactly the same contents.
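
To illustrate the www.site.com versus site.com case, here is a minimal Python sketch of comparing URLs by a normalized key instead of the raw string. It is only an illustration, not how Offline Explorer works; the names `canonical_key` and `same_page` are hypothetical. It also cannot detect two unrelated addresses that serve the same contents (such as the two sites in this thread), since that would require comparing the downloaded pages themselves.

```python
from urllib.parse import urlsplit


def canonical_key(url):
    """Return (host, path), ignoring scheme, a leading 'www.' and a trailing slash."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # An empty path and "/" both mean the site root.
    path = parts.path.rstrip("/") or "/"
    return host, path


def same_page(url_a, url_b):
    """True when both URLs reduce to the same (host, path) key."""
    return canonical_key(url_a) == canonical_key(url_b)


if __name__ == "__main__":
    print(same_page("http://www.site.com/", "http://site.com"))
    print(same_page("http://www.d20pfsrd.com/home",
                    "https://sites.google.com/site/pathfinderogc/"))
```

The first comparison treats the two forms of site.com as one page; the second still sees the two addresses from this thread as different, which matches the behavior described above.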
|Jim Smith||10/17/2012 04:34 pm|
| I have tried the firstname.lastname@example.org version and it's currently 3.67 GB with no sign of coming to an end. The map has many sites, though only the main site takes up so much space, and everything is being downloaded from it.
Is there any way to tell whether I should have chosen this one instead? http://www.d20pfsrd.com
|Oleg Chernavin||10/17/2012 04:45 pm|
|You should use URL Filters - Directory section to allow downloading from the starting directory only. Please also go through the File Filters section and select "Load using URL Filters" in their Location boxes.
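
The effect of a "starting directory only" rule can be sketched roughly like this in Python. This is an assumption about the general idea, not Offline Explorer's actual logic; `within_starting_directory` is a hypothetical name, and the sub-pages in the usage lines are invented for illustration.

```python
from urllib.parse import urlsplit


def within_starting_directory(start_url, candidate_url):
    """True if candidate_url is on the same host and at or below the start directory."""
    start = urlsplit(start_url)
    cand = urlsplit(candidate_url)
    if (start.hostname or "").lower() != (cand.hostname or "").lower():
        return False
    # The starting directory is everything up to and including the last slash.
    start_dir = start.path.rsplit("/", 1)[0] + "/"
    cand_path = cand.path or "/"
    return cand_path.startswith(start_dir)


if __name__ == "__main__":
    start = "https://sites.google.com/site/pathfinderogc/"
    # Hypothetical sub-page below the starting directory: allowed.
    print(within_starting_directory(start, start + "some-page"))
    # Hypothetical neighbouring site on the same host: rejected.
    print(within_starting_directory(start, "https://sites.google.com/site/other/"))
    # Different host entirely: rejected.
    print(within_starting_directory(start, "http://www.d20pfsrd.com/home"))
```

Note that with such a rule, a project started at the Google Sites address would never follow links to www.d20pfsrd.com, and vice versa.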
|Jim Smith||10/17/2012 04:50 pm|
| I have already ticked Properties -> URL Filters -> Directory -> "Load files only within the starting directory and below".
Maybe the site really is that huge? I checked the files, and most of the space (64.8 of the total) is taken up by files with no file extension whatsoever, 21,300 of them.
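
A breakdown like this can be reproduced outside the program with a short Python script that walks the download folder and totals file counts and bytes per extension. This is a minimal sketch; `size_by_extension` is a hypothetical helper and the folder path in the usage lines is a placeholder to replace with the project's real download folder.

```python
import os
from collections import defaultdict


def size_by_extension(root):
    """Return {extension: (file_count, total_bytes)} for all files under root."""
    totals = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Files without an extension are grouped under "(none)".
            ext = os.path.splitext(name)[1].lower() or "(none)"
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # skip files that vanish or cannot be read
            totals[ext][0] += 1
            totals[ext][1] += size
    return {ext: (count, size) for ext, (count, size) in totals.items()}


if __name__ == "__main__":
    report = size_by_extension("path/to/download/folder")  # placeholder path
    # Largest groups first.
    for ext, (count, size) in sorted(report.items(), key=lambda kv: -kv[1][1]):
        print(f"{ext:10} {count:8} files {size / 1e6:10.1f} MB")
```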
|Jim Smith||10/17/2012 05:06 pm|
|I applied your suggestion, selecting "Load using URL Filters" for each file type, and I will restart the download now.|
|Oleg Chernavin||10/19/2012 08:24 am|
|I see a number of strange links, like:
Maybe exclude them using URL Filters - Filename - Excluded list:
|Jim Smith||10/20/2012 11:20 am|
| These two websites seem to be one site, or to depend on each other. Some things are only on the second one, while others are only on the first.
If you download only one of them, you miss a ton of things. The problem is that it's too big. Just downloading the first one took 5 GB, and I have no idea how big the second one will be.
I don't get it. 50,000+ files and things are still missing because they are on the second website. Every other site I have downloaded is between 10 MB and 200 MB at most.
|Jim Smith||10/20/2012 11:49 am|
| I am thinking of these settings:
All using "Load using file filter settings"
Load files only within the starting domain
Enter multiple server keywords
But I am having trouble working out what else to exclude. Most pages are text with the occasional image. It should not be so large.
|Oleg Chernavin||10/21/2012 08:57 am|
|Can you watch the Queue tab while downloading? Perhaps there are other kinds of useless URLs that can be excluded to minimize the download.
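
If the queued URLs can be copied into a text list, a small Python sketch like the one below may help spot repetitive patterns worth excluding. It groups URLs by host and first directory and marks whether the last path segment has an extension or a query string. The names `url_group` and `summarize` are hypothetical, and the example.com URLs are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def url_group(url):
    """Label a URL by host, first directory, and whether it has an extension/query."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    first = segments[0] if segments else ""
    last = segments[-1] if segments else ""
    kind = "ext" if "." in last else "no-ext"
    if parts.query:
        kind += "+query"
    return f"{parts.hostname}/{first} [{kind}]"


def summarize(urls):
    """Return (group, count) pairs, most frequent group first."""
    return Counter(url_group(u) for u in urls).most_common()


if __name__ == "__main__":
    queue = [  # made-up URLs standing in for entries from the Queue tab
        "http://example.com/rules/combat",
        "http://example.com/rules/magic",
        "http://example.com/system/login?continue=1",
        "http://example.com/images/map.png",
    ]
    for group, count in summarize(queue):
        print(count, group)
```

Groups with a very high count, or with query strings attached, are usually the first candidates to look at; whether a given group is safe to exclude still has to be checked by hand.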
|Jim Smith||10/21/2012 09:16 am|
|I can, but I have difficulty telling which URLs are useless and which are not. How can I exclude files with no file extension? Can I? Will the site function without them?|
|Oleg Chernavin||10/21/2012 10:44 am|
|Just post a few examples here that seem suspicious to you. I will look at them and advise you.