Is it possible to load an old page of a project?
I have several projects that are scheduled to download new/modified files nightly, and I was wondering what happens when a website takes content down and I end up downloading a new page without the old links.
Please advise.
Thanks
Best regards,
Oleg Chernavin
MP Staff
I'm more concerned about what happens when pages get updated. How can I access the previous version of a page?
For instance, a new index.html now redirects to a new site, instead of listing links as before. I'd like to load the old index.html, if possible.
Thanks
Oleg.
I'd like to know:
1) OEP somehow determined there were changes to one or more pages of a website, and it downloaded only the new versions (I think). Is there a way to determine what changed between those versions? If not, that would be a good feature. I suspect a page that displays the current date/time would always look like it had been changed.
2) Is there a way to view a prior website version in its entirety, with older pages displayed instead of the newer versions? Currently we have to view a single old page at a time, and links don't work.
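For question 1, one way to see what changed outside of OEP is to diff two saved copies of the page yourself. Below is a minimal Python sketch, assuming you have the old and new HTML as text. The `TIMESTAMP` pattern and the function names are hypothetical and only illustrate how a date/time stamp could be masked so it doesn't make every download look changed.

```python
import difflib
import re
from typing import List

# Hypothetical pattern for stamps such as "2014-03-01 12:30:05";
# adjust it to whatever the page actually prints.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}(:\d{2})?")


def normalize(html: str) -> List[str]:
    """Split into lines, mask timestamps, and strip trailing whitespace."""
    return [TIMESTAMP.sub("<TIMESTAMP>", line).rstrip() for line in html.splitlines()]


def changed_lines(old_html: str, new_html: str) -> List[str]:
    """Return unified-diff lines for differences other than masked timestamps."""
    return list(
        difflib.unified_diff(
            normalize(old_html),
            normalize(new_html),
            fromfile="old",
            tofile="new",
            lineterm="",
        )
    )
```

An empty result means the two versions differ only in the masked timestamps (or not at all). Lines starting with `-` were in the old version only, and lines starting with `+` are new.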
2. This is also a rarely requested feature. You may simply use a macro to save the site to a different folder on every download:
DD=c:\download\{:year}{:0month}{:0day}\
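To illustrate the folder naming this produces, here is a small Python sketch of the same idea outside OEP. It assumes the macros expand to the four-digit year followed by the zero-padded month and day. The function name `dated_folder` is hypothetical.

```python
from datetime import date
from pathlib import PureWindowsPath


def dated_folder(base: str, day: date) -> str:
    """Return base\\YYYYMMDD\\ for the given day (assumed macro expansion)."""
    return str(PureWindowsPath(base) / day.strftime("%Y%m%d")) + "\\"
```

For example, `dated_folder("c:\\download", date.today())` gives one folder per download day, so each nightly run lands in its own copy of the site and older versions stay browsable with working links.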
Oleg.
I forgot to mention: this is a WIX site, and WIX makes extensive use of AJAX.
OEP recognizes the URLs but apparently doesn't capture the pages.
The only way I know to download all the pages in the site is to provide the SEO URL of *each* page.
The pages aren't formatted properly, and many of the linked-to URLs aren't resolved, but at least I can capture the main text of the website.
I used the AJAX SEO format for each page. The format is:
-- current version:
http://www.WebSiteName.com/?_escaped_fragment_=page-name-A/xy11
http://www.WebSiteName.com/?_escaped_fragment_=page-name-B/fg23
-- prior versions:
http://www.WebSiteName.com/?_escaped_fragment_=page-name-A/xy11_YYMMDDNN
The Google writeup for SEO pages is:
https://developers.google.com/webmasters/ajax-crawling/
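In case it helps anyone generating these URLs in bulk, here is a rough Python sketch of the mapping as I understand that scheme: a `#!` ("hashbang") URL is rewritten with the fragment moved into an `_escaped_fragment_` query parameter. The hashbang form of the page URLs is my assumption, the function name is hypothetical, and the set of characters left unescaped is a conservative choice rather than the exact list from the writeup.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_escaped_fragment(url: str) -> str:
    """Rewrite a #! URL into its ?_escaped_fragment_= form."""
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url  # not a hashbang URL; leave it unchanged
    # Percent-encode the fragment, keeping a conservative set of characters as-is.
    token = "_escaped_fragment_=" + quote(parts.fragment[1:], safe="/=!$'()*,:;@")
    # Append to any existing query string with "&".
    query = parts.query + "&" + token if parts.query else token
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))
```

With this, a hypothetical `http://www.WebSiteName.com/#!page-name-A/xy11` becomes `http://www.WebSiteName.com/?_escaped_fragment_=page-name-A/xy11`, which matches the format listed above.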
It would be great if OEP could generate a set of static HTML pages that are formatted properly. It would be even nicer if the pages also functioned properly, but that's too much to ask for.
A guide to getting HTML snapshots of each page is:
https://developers.google.com/webmasters/ajax-crawling/docs/html-snapshot
Thank you!
Oleg.