Projects

The IIPC funds technical and educational projects based on the goals outlined in an annual Request for Proposals. The consortium also collaborates on research and development projects by sharing data and testing tools. Task forces are formed to study and make recommendations on specific issues or problems. Working groups also sponsor their own projects and work packages.


Current Projects

COLLABORATIVE COLLECTIONS

Content Development 

Project coordinators: Nicola Bingham, The British Library & Alex Thurman, Columbia University Libraries

The IIPC Content Development Group will continue collaborative collections in 2018 via the IIPC Archive-It account. Planned collections include new crawling for the ongoing World War I Commemoration and International Cooperation Organizations collections and the creation of new collaborative collections on the 2018 Winter Olympics and News around the World.

To subscribe to the CDG mailing list, please email communications@iipc.simplelists.com 

PRESERVATION WORKING GROUP’S DATABASES 

Preservation

Project coordinators: Tobias Steinke, German National Library and Grace Thomas, Library of Congress, Preservation Working Group Co-chairs

The Preservation Working Group maintains a PWG wiki, an open forum, and a separate forum for IIPC members.

To subscribe to the PWG mailing list, please email communications@iipc.simplelists.com

TRAINING CURRICULUM DEVELOPMENT

Training

Project coordinators: Tom Cramer, Stanford University Libraries, Abbie Grotke, Library of Congress and Maria Praetzellis, Internet Archive, Training Working Group Co-chairs

In 2018 the following activities will be supported by the IIPC:

  • Travel support for a face-to-face “curriculum development sprint” by members of the TWG to work out the outline and content of the modules;
  • Retention of a consultant or contractor to produce the training curriculum and provide a consistent voice and polished end product (by mid-year);
  • Pilot delivery and assessment of the training to gauge its effectiveness, with travel and meeting support requested for a small cohort of attendees (summer / early fall 2018).

To subscribe to the TWG mailing list, please email communications@iipc.simplelists.com

MEMBERSHIP SURVEY

Membership Engagement

Project coordinators: Barbara Sierman, National Library of the Netherlands (Membership Engagement Portfolio Lead), Emmanuelle Bermès, National Library of France, Abbie Grotke, Library of Congress and Aija Vahtola, National Library of Finland

The Membership Engagement Survey, “Where can I find my IIPC friends”, is intended to foster collaboration between IIPC members, based on information about their web archiving activities, staff and techniques. The results will be presented at the General Assembly in Wellington and used as input to it. The survey was designed by Barbara Sierman, KB, and Birgit Nordsmark Henriksen, the Royal Danish Library, with input from the IIPC Steering Committee members.

IIPC TECHNICAL SPEAKER SERIES

Tools Development 

Project coordinators: Jefferson Bailey, Internet Archive and IIPC Chair, and Olga Holownia, IIPC PCO

The IIPC will recruit 10 IIPC members (or member organisations) to present 30-60 minute online webinars on new, recent, or innovative technical projects within their organisations. The series is not intended to be training or workshop oriented, but instead to provide an opportunity for members to disseminate information and showcase their work on internal technical projects that have relevance to the broader IIPC community. Speakers will be selected through direct recruitment and a forthcoming open call for proposals. Small stipends will be available for speakers, if needed.

MEMENTO

The goal of the project is to aggregate the metadata of the distributed archives of the IIPC, and

  • to provide Memento-based access to the holdings of open archives;
  • to provide knowledge of the holdings of restricted archives, without providing access to them;
  • to provide knowledge to IIPC members of the holdings of totally closed archives.

The initial demo is intended for project participants, then for the wider IIPC.
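The three access levels described above can be pictured as a small routing rule. The following Python sketch is illustrative only and is not the project's implementation: the `Holding` record, the `aggregator_view` function, and the access labels "open", "restricted" and "closed" are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Holding:
    """Hypothetical metadata record contributed by one archive."""
    archive: str       # name of the contributing archive
    access: str        # assumed labels: "open", "restricted", "closed"
    uri: str           # original URI that was archived
    memento_url: str   # location of the archived copy


def aggregator_view(holding: Holding, is_iipc_member: bool) -> Optional[dict]:
    """Return what an aggregator might reveal about a holding."""
    known = {"archive": holding.archive, "uri": holding.uri}
    if holding.access == "open":
        # Open archives: knowledge of the holding plus a link to it.
        return {**known, "memento": holding.memento_url}
    if holding.access == "restricted":
        # Restricted archives: knowledge of the holding, but no access.
        return known
    if holding.access == "closed" and is_iipc_member:
        # Closed archives: knowledge of the holding for IIPC members only.
        return known
    return None
```

Under these assumptions, a non-member asking about a closed archive's holding gets `None`, while an IIPC member learns that the holding exists but receives no link to it.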

Past Projects

 

CROWDSOURCING WORKSHOP & USE CASES

The project aimed to investigate how crowdsourcing web archiving activities may begin to redress that balance and increase the amount of manpower available throughout all stages of the web archiving workflow in member institutions.

DOMAIN CRAWL REPORT

The IIPC Harvesting Practices Survey was developed to understand, analyze, and collate the Internet archiving processes and experiences of IIPC members. The objective was to encourage and support memory institutions everywhere in addressing the archiving and preservation of web resources by providing a benchmark and an overview of current web archiving practices.

EVALUATING TWITTERVANE

The primary goal of the project was to evaluate Twittervane, a prototype application that analyzes Twitter feeds and determines which websites are shared most frequently around a given theme over a given time period.
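As a rough illustration of the kind of analysis described, the Python sketch below counts the hosts linked from posts that mention a theme within a time window. It is not Twittervane's code: the input format (a list of timestamp and text pairs), the simple keyword match, and the name `most_shared_sites` are assumptions made for this example.

```python
import re
from collections import Counter
from datetime import datetime
from urllib.parse import urlparse

# Simplified URL matcher; real tweets would need more careful parsing.
URL_PATTERN = re.compile(r"https?://\S+")


def most_shared_sites(posts, theme, start, end, top_n=3):
    """Rank hosts linked from posts mentioning `theme` in [start, end)."""
    counts = Counter()
    for posted_at, text in posts:
        if not (start <= posted_at < end):
            continue  # outside the time period of interest
        if theme.lower() not in text.lower():
            continue  # post does not mention the theme
        for url in URL_PATTERN.findall(text):
            host = urlparse(url).netloc.lower()
            if host:
                counts[host] += 1
    return counts.most_common(top_n)


# Hypothetical sample data, for illustration only.
sample_posts = [
    (datetime(2012, 7, 1), "Olympics opening http://example.org/a"),
    (datetime(2012, 7, 2), "olympics again http://example.org/b and http://example.com/x"),
    (datetime(2012, 7, 3), "unrelated http://example.net/"),
    (datetime(2012, 9, 1), "Olympics late http://example.com/y"),
]
ranking = most_shared_sites(
    sample_posts, "olympics", datetime(2012, 7, 1), datetime(2012, 8, 1)
)
```

The resulting ranking could then be used as a list of candidate sites for a curator to review before adding them to a collection.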

HOW TO FIT IN? INTEGRATE A WEB ARCHIVING PROGRAM IN YOUR ORGANIZATION

This IIPC-sponsored workshop was held at the Bibliothèque nationale de France (26-30 Nov. 2012). The aim was to investigate the challenges and methods involved in implementing web archiving in all mainstream activities of a heritage institution: general institution strategy, acquisition practices, IT operations, preservation, access. 

JHONAS

The overall goal of the project was to enhance existing tools in order to ease the adoption of WARC as the preferred archiving format for digital preservation. To accomplish this, two applications were chosen that together cover the entire digital preservation workflow: JHOVE2 and NetarchiveSuite.

LIVE ARCHIVING HTTP PROXY

The Live Archiving Proxy (LAP) project was a collaboration between Ina and Netarkivet.dk to build an HTTP proxy that would be able to capture the traffic that flows through it and delegate the handling of the captured data to a writer using a simple network protocol. The goal was to be able to write the captured traffic into any kind of archive format using any computer language.

PHD SPONSORSHIP

The University of North Texas College of Information sponsored a 3-year award to support doctoral studies in its Interdisciplinary Information Science Ph.D. Program.

STAFF EXCHANGE

The purpose of the project was to gather expert advice, assistance and guidance on migrating from Heritrix 1 to Heritrix 3 and setting up distributed crawls with Heritrix 3.

STATISTICS AND QUALITY INDICATORS FOR WEB ARCHIVING

In 2009, the ISO Technical Committee 46 (Information and Documentation) decided to set up a working group on “Statistics and Quality Indicators for Web Archiving”. The group delivered a Draft Technical Report (PDF) in 2013.

TWITTERVANE

Prototype/investigatory project by the British Library to use Twitter to build a web archive collection.

WARC TOOLS PROJECT

The main goal of the WARC Tools project was to facilitate and promote the adoption of the WARC file format for storing web archives by the mainstream web development community by providing an open source software library, a set of command line tools, web server plug-ins and technical documentation for manipulation and management of web archive files, or WARC files.

WAYBACK, HERITRIX AND NUTCHWAX DOCUMENTATION

2009 project led by the Internet Archive that documented NutchWAX, Heritrix, and Wayback.

WEB ARCHIVE PROFILING VIA SAMPLING

Research project looking at how archives respond to queries for archived content and over time build up a profile of the top-level domains (TLDs), Uniform Resource Identifiers (URIs), content language, and temporal spread of the archive’s holdings.
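To make the idea of a profile more concrete, the Python sketch below tallies top-level domains and capture years from a small sample. It is only a sketch under stated assumptions, not the project's method: it assumes that an archive's responses have already been reduced to pairs of a URI and a 14-digit timestamp (YYYYMMDDhhmmss), and the name `build_profile` is hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def build_profile(captures):
    """Summarise sampled captures by top-level domain and by year.

    `captures` is an iterable of (uri, timestamp) pairs, where the
    timestamp is assumed to be a 14-digit string such as
    "20110505120000".
    """
    tlds = Counter()
    years = Counter()
    for uri, timestamp in captures:
        host = urlparse(uri).hostname or ""
        if "." in host:
            # Take the last label of the host name as the TLD.
            tlds[host.rsplit(".", 1)[-1]] += 1
        years[timestamp[:4]] += 1
    return {"tlds": dict(tlds), "years": dict(years)}


# Hypothetical sample, for illustration only.
sample_captures = [
    ("http://example.org/", "20100101000000"),
    ("http://www.example.co.uk/a", "20110505120000"),
    ("https://example.org/b", "20110101000000"),
]
profile = build_profile(sample_captures)
```

A fuller profile would add content language and individual URIs, as the project description mentions; those are omitted here to keep the sketch short.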