Deep Web Working Group

Goals & objectives

The objective of the Deep Web Working Group is to identify strategies and produce tools for archiving Web content that is inaccessible to crawlers. To this end, the Testbed and Metrics Working Group and the crawlers themselves have a role to play in identifying such sites.

A two-pronged approach has been taken. One sub-group is developing tools for archiving, and providing access to, the content of such sites where a source database has either been deposited or is accessible. The second sub-group will research strategies for automating the extraction of content from sites where there is no relationship with the producer that would make deposit or direct access possible.

Expected deliverables

  • Tools for the ingestion of deposited databases and document archives into a long-term preservation format
  • The provision of access tools to search and navigate these structured data archives (which are stored as XML) via the web
  • Tools for the extraction of content from deep Web sites where no contact with the provider is available

Leader

Monica Berko, National Library of Australia
