One of the main objectives of the IIPC has been to develop a set of high-quality, easy-to-use open source tools for setting up a web archiving chain. The working groups have identified the following requirements and focused their work on them:
- An archival-quality crawler capable of web-scale operation.
- A portable database extraction and migration tool for database-driven Deep Web sites.
Focused selection and verification
- Analysis and prioritization tools for dynamic crawl re-focusing.
- User-friendly interfaces for curatorial activities such as selecting, monitoring and verifying archived websites.
Collection storage and maintenance
- File manipulation tools.
Access and finding aids
- An interface for browsing web archive file containers, providing management of the linking environment, URI presentation, and temporal navigation.
- A full-text indexer, scalable to large collections and supporting, at a minimum, boolean operators, proximity queries, and the temporal dimension of the archive.
- A query interface generator for archived databases.
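
To make the full-text indexer requirement above more concrete, the following is a minimal sketch of a positional inverted index that answers boolean, proximity, and time-bounded queries over archived captures. It is not taken from any IIPC tool: the class `TinyArchiveIndex` and its methods are hypothetical names chosen for illustration, the whitespace tokenizer is a simplification, and the 14-digit `YYYYMMDDhhmmss` capture timestamp is an assumed convention. A production indexer for large collections would need a very different storage layer.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class TinyArchiveIndex:
    """Toy positional inverted index over archived captures (illustrative only)."""

    # term -> capture id -> list of token positions within that capture
    postings: dict = field(default_factory=lambda: defaultdict(dict))
    # capture id -> (uri, timestamp); timestamp assumed to be 'YYYYMMDDhhmmss'
    captures: dict = field(default_factory=dict)

    def add(self, uri, timestamp, text):
        """Index one capture of a URI and return its capture id."""
        capture_id = len(self.captures)
        self.captures[capture_id] = (uri, timestamp)
        # Naive tokenizer: lowercase and split on whitespace.
        for position, token in enumerate(text.lower().split()):
            self.postings[token].setdefault(capture_id, []).append(position)
        return capture_id

    def _ids(self, term):
        """Capture ids that contain the term."""
        return set(self.postings.get(term.lower(), {}))

    def all_of(self, *terms):
        """Boolean AND: captures containing every term."""
        sets = [self._ids(t) for t in terms]
        return set.intersection(*sets) if sets else set()

    def any_of(self, *terms):
        """Boolean OR: captures containing at least one term."""
        return set().union(*(self._ids(t) for t in terms))

    def near(self, a, b, max_distance):
        """Proximity: captures where a and b occur within max_distance tokens."""
        hits = set()
        for cid in self.all_of(a, b):
            pos_a = self.postings[a.lower()][cid]
            pos_b = self.postings[b.lower()][cid]
            if any(abs(i - j) <= max_distance for i in pos_a for j in pos_b):
                hits.add(cid)
        return hits

    def between(self, ids, start, end):
        """Temporal filter: keep captures with start <= timestamp <= end.

        Fixed-width digit strings compare correctly as plain strings.
        """
        return {cid for cid in ids if start <= self.captures[cid][1] <= end}
```

A query such as "captures from 2006 that mention both *web* and *archive*" would then be written as `index.between(index.all_of("web", "archive"), "20060101000000", "20061231235959")`. Keeping token positions in the postings is what makes the proximity query possible; an index that stored only document ids could answer the boolean and temporal queries but not `near`.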