MetaProducts Systems

Crawling for images and meta data

G. Jung

10/04/2004 12:45 pm

I own Offline Explorer Pro, but I would like to crawl PUBLIC DOMAIN pages like the following for both the images and meta data:

http://www.metaproducts.com/mp/mpSupport_User_Forums_Topic.asp?topic=7

What products would I need to crawl this page for both the image and meta data?

Thanks much!

Oleg Chernavin

10/04/2004 01:51 pm

Offline Explorer will easily download pages with images. But what do you mean by "meta data"?

Can you please explain it in detail?

Thank you!

Best regards,
Oleg Chernavin
MP Staff

Len Lydik

10/04/2004 02:11 pm

I mean the fielded text on the pages, extracted into a spreadsheet, CSV or some other format.

For example, on the following page, I would like the image AND the fielded text:

http://www.civilwar.nps.gov/cwss/petersburgd.cfm?id=1845

William K. Smith (First_Last)
Company: K
Unit Number: 106th
Rank:
State/Federal: Pennsylvania
Military Organization: Infantry
Date of Death: June 22, 1864
Original Burial Place: Fort Hell
Gravestone Number: 2975
Comments: Killed

Where the fields are:

Name:
Company:
Unit Number:
Rank:
State/Federal:
Military Organization:
Date of Death:
Original Burial Place:
Gravestone Number:
Comments:
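
To make the request concrete, here is a minimal sketch in Python of what "fielded into CSV" could look like for the record above. It assumes the page text has already been reduced to plain "Label: value" lines with the name on the first line; the function names `parse_record` and `records_to_csv` are made up for illustration and are not part of any MetaProducts tool.

```python
import csv
import io

# Field labels exactly as listed in the post. "Name" has no label on the
# page, so it is taken from the first line (an assumption of this sketch).
FIELDS = [
    "Name", "Company", "Unit Number", "Rank", "State/Federal",
    "Military Organization", "Date of Death", "Original Burial Place",
    "Gravestone Number", "Comments",
]

def parse_record(text):
    """Turn 'Label: value' lines into a dict with one key per field."""
    record = dict.fromkeys(FIELDS, "")
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return record
    # Drop a trailing parenthesised note such as "(First_Last)".
    record["Name"] = lines[0].split("(")[0].strip()
    for line in lines[1:]:
        label, sep, value = line.partition(":")
        if sep and label.strip() in record:
            record[label.strip()] = value.strip()
    return record

def records_to_csv(records):
    """Write a list of record dicts as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

SAMPLE = """William K. Smith (First_Last)
Company: K
Unit Number: 106th
Rank:
State/Federal: Pennsylvania
Military Organization: Infantry
Date of Death: June 22, 1864
Original Burial Place: Fort Hell
Gravestone Number: 2975
Comments: Killed
"""

if __name__ == "__main__":
    print(records_to_csv([parse_record(SAMPLE)]))
```

Empty fields such as Rank stay as empty cells, and the `csv` module quotes values that contain commas (the date of death here), so the output should open cleanly in a spreadsheet.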

10/04/2004 04:58 pm

Hmm, I also still don't know what your problem is.

OEP can download the page exactly as you see it online, i.e.:

Level: 0
"Text" and "Images":
Location: Load only from the starting server

If you want to extract some parts of the text, you could use a tool like TextPipe...

Oleg Chernavin

10/05/2004 12:45 am

Yes, you will need to set up TextPipe Pro to extract the data after you download the pages. Please go to the Tools | Data Mining menu to get more details.

Oleg.
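
For readers who want to see what the post-download extraction step involves, below is a rough sketch in Python. It is not TextPipe and does not reproduce its filters; it is a stand-in that uses only Python's standard `html.parser` module to pull image references and visible text out of one saved page. The names `PageExtractor` and `extract_page` are hypothetical.

```python
from html.parser import HTMLParser

class PageExtractor(HTMLParser):
    """Collects <img> sources and visible text chunks from one HTML page."""

    def __init__(self):
        super().__init__()
        self.images = []      # values of src attributes, in page order
        self.text_lines = []  # non-empty text chunks, whitespace-trimmed
        self._skip = 0        # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Ignore script/style bodies and whitespace-only chunks.
        if not self._skip and data.strip():
            self.text_lines.append(data.strip())

def extract_page(html):
    """Return (image_sources, text_lines) for one page of HTML."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return parser.images, parser.text_lines

if __name__ == "__main__":
    page = '<html><body><img src="stone.jpg"><b>Company:</b> K<br></body></html>'
    images, lines = extract_page(page)
    print(images)
    print(lines)
```

Labels and values often sit in separate HTML elements, so they may come out as separate text chunks that need pairing up afterwards. The exact pairing logic depends on the markup of the downloaded pages, which is not shown in this thread.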
