Downloads

The following tools are recommended and used by members of the IIPC for setting up a Web archiving chain:

Acquisition

Heritrix, an open-source, extensible, Web-scale, archival-quality Web crawler
Developed by the Internet Archive with the Nordic National Libraries
Current versions: Heritrix 3 (2011-10-21); Heritrix 1.X (2010-05-10) and Heritrix 2 (2008-11-08)
More information: http://crawler.archive.org
Download: http://sourceforge.net/projects/archive-crawler

DeepArc, a portable graphical editor that allows users to map a relational data model to an XML Schema and export the database content to an XML document
Developed by the National Library of France
Current version: 1.0rc1 (January 18, 2005)
More information: http://bibnum.bnf.fr/downloads/deeparc/
Download: http://sourceforge.net/projects/deeparc/
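
To make the idea of mapping a relational model to XML concrete, here is a minimal sketch in Python. It is not DeepArc's implementation or API: the function name `export_table_to_xml`, the `book` table, and the column-to-element mapping are all hypothetical, and the sketch skips XML Schema validation entirely. It only shows how a user-defined mapping can drive the export of table rows into an XML document.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_table_to_xml(conn, table, root_tag, row_tag, column_to_element):
    """Export every row of `table` as <row_tag> elements under <root_tag>.

    `column_to_element` maps column names to XML element names. It stands in
    for the relational-to-XML mapping a user would define (hypothetical).
    """
    root = ET.Element(root_tag)
    columns = list(column_to_element)
    # Table and column names come from the trusted mapping, not from user input.
    query = "SELECT {} FROM {}".format(", ".join(columns), table)
    for record in conn.execute(query):
        row = ET.SubElement(root, row_tag)
        for column, value in zip(columns, record):
            child = ET.SubElement(row, column_to_element[column])
            child.text = "" if value is None else str(value)
    return root


def demo():
    # Illustrative in-memory database; the schema and rows are made up.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE book (id INTEGER, title TEXT)")
    conn.executemany(
        "INSERT INTO book VALUES (?, ?)",
        [(1, "First title"), (2, "Second title")],
    )
    root = export_table_to_xml(
        conn, "book", "books", "book",
        {"id": "identifier", "title": "title"},
    )
    conn.close()
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(demo())
```

Keeping the mapping as plain data, separate from the export loop, mirrors the editor-style workflow described above: the mapping can be changed without touching the export code.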

Curator Tools

Web Curator Tool (WCT), a tool for managing the selective Web harvesting process. It is designed for use in libraries and other collecting organisations, and supports collection by non-technical users while still allowing complete control of the Web harvesting process. The WCT is now available under the terms of the Apache License.
Developed by the National Library of New Zealand and the British Library and initiated by the International Internet Preservation Consortium
Current version: WCT 1.5.2 (2011-08-22)
More information and download: http://webcurator.sourceforge.net/

NetarchiveSuite, a curator tool allowing librarians to define and control harvests of web material. The system scales from small selective harvests to harvests of entire national domains. The system is fully distributable on any number of machines and includes a secure storage module handling multiple copies of the harvested material as well as a quality assurance tool automating the quality assurance process.
Developed by the Royal Library and the State and University Library in the virtual organisation netarchive.dk
Current version: 3.6.0 (July 3, 2008)
More information and download: http://netarchive.dk/suite

Collection storage and maintenance

BAT (BnFArcTools), an API for processing ARC, DAT or CDX files
Developed by the National Library of France
Current version: 0.07 (February 3, 2005)
More information and download: http://bibnum.bnf.fr/downloads/bat/
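
As an illustration of what processing a CDX file involves, the sketch below reads a CDX index in Python. It is not BAT's API. It assumes the common CDX convention in which the first character of the header line is the field delimiter, followed by the token `CDX` and a legend of one-letter field codes (for example `a` for the original URL, `b` for the capture date, `m` for the MIME type, `s` for the response code). The function name `parse_cdx` and the sample record are made up for this example.

```python
def parse_cdx(lines):
    """Yield one dict per CDX record, keyed by the header's field letters."""
    lines = iter(lines)
    header = next(lines).rstrip("\n")
    delimiter = header[0]  # assumed convention: first character is the delimiter
    legend = header[1:].split(delimiter)
    if not legend or legend[0] != "CDX":
        raise ValueError("not a CDX header: %r" % header)
    field_letters = legend[1:]
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        values = line.split(delimiter)
        if len(values) != len(field_letters):
            raise ValueError("unexpected field count: %r" % line)
        yield dict(zip(field_letters, values))


# Hypothetical sample: one header line and one record, invented for illustration.
SAMPLE = [
    " CDX N b a m s k r V g\n",
    "example.org/ 20050203120000 http://example.org/ text/html 200 ABCDEF - 1234 sample.arc.gz\n",
]

if __name__ == "__main__":
    for record in parse_cdx(SAMPLE):
        print(record["b"], record["s"], record["a"])
```

Reading the legend from the header, rather than hard-coding column positions, lets the same loop handle index files whose field sets differ.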

Access and finding aids

Wayback, a tool that allows users to see archived versions of web pages across time.
Developed by the Internet Archive
Current version: 1.6.0 (2011-01-03)
More information and download: http://archive-access.sourceforge.net/projects/wayback/

NutchWAX (Nutch with Web Archive eXtensions), a tool for indexing and searching Web archives, built on the Nutch search engine with extensions for Web archive content
Developed by the Internet Archive and the Nordic National Libraries
Current version: 0.13 (2010-03-19)
More information and download: http://archive-access.sourceforge.net/projects/nutch/

WERA (WEb aRchive Access), a Web archive search and navigation application. WERA was built from the NWA Toolset. It provides access to Web archives similar to the Internet Archive Wayback Machine and allows full-text search.
Developed by the Internet Archive and the National Library of Norway
Current version: 0.4.1 (January 17, 2006)
More information and download: http://archive-access.sourceforge.net/projects/wera/

Xinq (XML INQuire), a search and browse tool for accessing an XML database
Developed by the National Library of Australia
Current version: 0.5 (July 26, 2005)
More information: http://www.nla.gov.au/xinq/
Download: http://sourceforge.net/projects/xinq/


© 2004-2011 IIPC