Is there any way to have .KML/.KMZ files parsed for the URLs they contain?
KML is an XML-based format, so it can be parsed like any other XML file.
KMZ is simply a zipped .KML.
Say, if I start a project with the URL http://mw2.google.com/mw-earth-vectordb/gallery_layers/ngm/zipusa/2009_01_09/ru/root.kmz
What should I do next?
1. Take the next URL from the queue and download the file from the Internet.
2. Unzip it if it is a KMZ. If it is a KML, skip to the next step.
3. Parse the file for URLs (treat it as a standard XML file).
4. Add every URL found to the queue.
5. Repeat steps 1-4 until the queue is exhausted.
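The steps above can be sketched in Python roughly as follows. This is only an illustration of the loop, not how Offline Explorer works internally. It assumes that links live in <href> elements, that only files ending in .kml or .kmz need parsing, and that relative links can be resolved against the URL of the file they were found in (for links inside a KMZ this is a simplification). The function names (kml_documents, find_urls, crawl) are made up for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from collections import deque
from urllib.parse import urljoin
from urllib.request import urlopen


def kml_documents(data):
    """Return every .kml member of a KMZ, or the payload itself if it is not a ZIP."""
    if zipfile.is_zipfile(io.BytesIO(data)):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            return [archive.read(name) for name in archive.namelist()
                    if name.lower().endswith(".kml")]
    return [data]


def find_urls(kml_bytes, base_url):
    """Collect the text of every <href> element, resolved against base_url."""
    root = ET.fromstring(kml_bytes)
    urls = []
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # drop the XML namespace
        if local_name == "href" and element.text and element.text.strip():
            urls.append(urljoin(base_url, element.text.strip()))
    return urls


def fetch_from_internet(url):
    """Step 1: download one file."""
    with urlopen(url) as response:
        return response.read()


def crawl(seed_url, fetch=fetch_from_internet):
    """Steps 1-5: process the queue until it is exhausted; return all URLs seen."""
    queue = deque([seed_url])
    seen = {seed_url}
    while queue:
        url = queue.popleft()
        data = fetch(url)  # a real mirror would also save data to disk here
        if not url.lower().split("?")[0].endswith((".kml", ".kmz")):
            continue  # images and other resources are downloaded but not parsed
        for document in kml_documents(data):       # step 2: unzip if needed
            for link in find_urls(document, url):  # step 3: parse for URLs
                if link not in seen:               # step 4: queue new URLs
                    seen.add(link)
                    queue.append(link)
    return seen
```

Passing the downloader in as the fetch argument keeps the loop easy to try against a few local files before pointing it at a real server.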
How can I do something like this, please? :)
Thanks,
Alex
Is there an example of a KMZ that actually links to other KML files that are not contained in it?
Best regards,
Oleg Chernavin
MP Staff
Others have TONS of internal links (mostly to other KML/KMZ files stored in different folders on the same server). There may also be more than one KML file inside a KMZ package. See this one:
http://mw2.google.com/mw-ocean/ocean/kml/ark/en/0/root.kmz
PS: And yes, I need all those images (as you mentioned) as well. In fact, I need ALL resources referenced by all links inside the KML/KMZ (images and all those KML/KMZ files referring to each other), to keep the whole file structure locally on my server's HDD (for caching purposes).
I will try to work on this, but I am not sure how much time it will take. It is not easy because I have to unpack the file and pack it back after the URLs are changed to offline ones.
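To illustrate the unpack, change and repack step mentioned above, here is a rough Python sketch. It is not Offline Explorer's actual code. It assumes the KML members are UTF-8 and that links appear as plain <href>...</href> elements without attributes or a namespace prefix. rewrite_kmz and rewrite_url are hypothetical names chosen for this example.

```python
import io
import re
import zipfile

HREF = re.compile(r"<href>(.*?)</href>", re.S)


def rewrite_kmz(data, rewrite_url):
    """Return a new KMZ in which every <href> value was passed through rewrite_url."""
    source = zipfile.ZipFile(io.BytesIO(data))
    output = io.BytesIO()
    with source, zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for info in source.infolist():
            payload = source.read(info.filename)
            if info.filename.lower().endswith(".kml"):
                text = payload.decode("utf-8")  # assumption: KML is UTF-8
                text = HREF.sub(
                    lambda m: "<href>" + rewrite_url(m.group(1)) + "</href>", text)
                payload = text.encode("utf-8")
            # Non-KML members (images and so on) are copied unchanged.
            target.writestr(info.filename, payload)
    return output.getvalue()
```

A regular expression is used here instead of an XML parser so that everything outside the <href> values stays exactly as it was in the original file.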
Oleg.
I need to duplicate the file structure of the selected server (say, http://mw2.google.com/).
Most of the URLs in the KML file are relative (like "pics/icon.png", not "http://mw2.google.com/pics/icon.png"), so there is no need to modify the KML itself.
If there is an absolute URL pointing to a different server, just add it to the queue as it is, and OE will download it to a different folder (say, /Download/mw3.google.com/* instead of /Download/mw2.google.com/*) without any modification of the KMLs. I need them in their original state, just to mirror them as they are at their original location. And my Squid will deal with all absolute URLs once they are in the /cache folder, don't worry about that. :)
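As a small illustration of that folder layout, the mapping from a link to a local path could look like this in Python. It is only a sketch: the /Download root is taken from the example above, local_path is a made-up helper name, and query strings are simply ignored.

```python
from pathlib import PurePosixPath
from urllib.parse import urljoin, urlsplit


def local_path(url, download_root="/Download"):
    """Map an absolute URL to <download_root>/<host>/<path on the server>."""
    parts = urlsplit(url)
    return str(PurePosixPath(download_root) / parts.netloc / parts.path.lstrip("/"))


# A relative link is first resolved against the URL of the KML/KMZ it came from.
kmz_url = "http://mw2.google.com/mw-ocean/ocean/kml/ark/en/0/root.kmz"
icon_url = urljoin(kmz_url, "pics/icon.png")
print(local_path(icon_url))
# An absolute link to another server simply lands under that server's folder.
print(local_path("http://mw3.google.com/pics/icon.png"))
```

Because the host name becomes the first folder under the download root, files from different servers never collide and the KMLs themselves stay untouched.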
So, the idea is to mirror the whole KML/KMZ structure (including all cross-linked files) from the server to my HDD, without modifying the files themselves. Just get the KML, save it, [unpack+]open the file, parse it for the next links and drop it. Nothing needs to be modified in the files.
PS: Or maybe you have another idea for how to do the task? Can any scripts be attached to OE to do such a job? I'm familiar with Perl, and part of the job could be scripted (the unzipping of the files, for example). I just have no idea yet how to attach a script to OE, other than through a proxy script... :(
Oleg.
It looks like KML parsing is not a priority task? That's a pity, because it would make it possible to download panoramio. Maybe you could find the time?
http://www.panoramio.com/panoramio.kml?LANG=ru_RU.utf8&BBOX=8.964843750,42.293564192,9.140625000,42.423456518
http://www.metaproducts.com/download/betas/OEP3836.zip
Oleg.