International Internet Preservation Consortium
Monday April 30, 2012 - "The Broad Value of Web Archives: Demonstrated Use" - Open to the public. | |||
Time | Notes | ||
8:30 am | Breakfast | Mumford | |
9:00 am | Welcome & Framing remarks | ||
9:20 am | Researcher Use Cases | ||
A decade and a half of archiving the web for data mining: Lessons learned and how users use web archives | Kalev Leetaru, University of Illinois | ||
How web archives are used in the Text REtrieval Conference (TREC) | Ian Soboroff, National Institute of Standards and Technology (NIST) | ||
10:25 am | Break | Mumford | |
11:00 am | Using web materials in researching contemporary terrorism | Bruce Hoffman, Georgetown University | |
The Challenges of Researching the Social Web | Stuart W. Shulman, University of Massachusetts | ||
11:40 am | Discussion | ||
12:00 pm | Lunch | Montpelier | |
1:00 pm | Trends in Archive Use | ||
Data Mining in News Data from Multiple Media | Claude Mussou & David Rapin, Institut national de l'audiovisuel (InA) | ||
Trends in Pandora | Monica Omodei, National Library of Australia (NLA) | ||
Actual and potential users of the BnF web archives: experiences and expectations | Clément Oury and Peter Stirling, Bibliothèque nationale de France (BnF) | ||
2:10 pm | Discussion | ||
2:30 pm | Break | Mumford | |
2:45 pm | Business Use | ||
Web archives for investigation, e-discovery and compliance for the legal industry | Rod Wittenberg, Reed Technology and Information Services | ||
Web archives to meet regulatory, management, e-discovery and cultural heritage needs | Mark Williamson, Hanzo Archives | ||
3:25 pm | Discussion | ||
3:40 pm | Break | Mumford | |
3:55 pm | Use in the Public Sphere | ||
Harvesting from the harvest: Automatic extraction of state government publications from web archives | Kathleen Kenney, State Library of North Carolina | ||
How can Web Archives become a critical component of today's Internet? | Leïla Medjkoune, Internet Memory | ||
Web Archiving as part of a Research Library Special Collection: the Latin American Government Documents project | Kent Norsworthy, University of Texas at Austin | ||
Discussion | |||
5:00 pm | Adjourn |
Tuesday May 1, 2012 - General Assembly - IIPC members only | ||
Start time | Notes | |
8:30 am | Breakfast | Mumford |
9:00 am | Chair speech | Martha Anderson, Library of Congress |
Program Officer update | Aaron Binns, Internet Archive | |
Treasurer update | Clément Oury, Bibliothèque nationale de France (BnF) | |
10:00 am | Break | Mumford |
Communications & Membership update | Abbey Potter, Library of Congress | |
Website redesign presentation | ||
New member presentations | Mumford | |
Web Archiving at Columbia University: Collecting Web Content for Research | Robert Wolven, Columbia University Libraries | |
12:00 pm | Lunch | Montpelier |
Web Archives at George Washington University | Daniel Chudnov, George Washington University | |
Estonian Web Archive: Preserving the Estonian Mind | Jaanus Kõuts, National Library of Estonia | |
Los Alamos National Laboratory | Herbert Van de Sompel, Los Alamos National Laboratory | |
Project Updates | ||
IIPC Memento Aggregator | Robert Sanderson, Los Alamos National Laboratory | |
How to fit in? Integrate a web archiving program in your organization | Clément Oury, Bibliothèque nationale de France (BnF) | |
JhoNAS, WARC support in JHove2 and NetarchiveSuite | Nicholas Clarke, Netarchive.dk | |
Twittervane | Helen Hockx-Yu, British Library | |
2:30 pm | Break | Mumford |
Member Updates & Announcements | ||
Library of Congress Web Archives Update | Abbie Grotke & Nicholas Taylor, Library of Congress | |
HIVE for LC Web Archives: Web Archives and Automatic Subject Indexing | Rick Fitzgerald, Library of Congress; Craig Wills, UNC | |
International Digital Exchange Assessment (IDEA) | Megan Caverly, Library of Congress | |
Leveraging Web archives Research | Leïla Medjkoune, Internet Memory | |
Web Archiving in 2012 at National Diet Library | Masaki Shibata, National Diet Library | |
3:45 pm | Break | Mumford |
Challenges and Opportunities in the Absence of Legal Deposit: Web Harvesting for the US Government Printing Office and the US Federal Depository Library Program | David Walls, Government Printing Office | |
British Library Update | Helen Hockx-Yu, British Library | |
SCAPE Update | Barbara Sierman, National Library of the Netherlands | |
The Spanish Legal Deposit Law: knitting the web for digital resources | Mar Pérez Morillo, National Library of Spain | |
Havel Collection Update | Czech National Library | |
5:00 pm | Adjourn | |
6:30 pm | Reception | Great Hall |
Wednesday May 2, 2012 - Working Group meetings - IIPC members only | ||
Working Group | Presentations | |
Access Working Group | ||
Preservation Working Group | ||
Harvesting Working Group |
Biblioteca Alexandrina | |
Heritrix User Group | ||
Steering Committee meeting |
Thursday May 3, 2012 - Workshops & Cross Working Group meetings - IIPC members and invited guests | ||
Working Group | Presentations | |
Web Lifecycle Management | Web Archiving ‘Lifecycles’ Workshop | |
Netarchive Workshop | Agenda | |
Legal Roundtable |
Agenda | |
Harvesting and Preserving the Future Web | Future of the Web Workshop
IIPC Future of the Web Workshop – Introduction & Overview (Draft) Los Alamos National Laboratory Research Library |
Friday May 4, 2012 - IIPC GA - Workshops - IIPC members and invited guests | ||
Workshops | Presentations | |
Crowdsourcing workshop | The Crowd & the Library | |
UDFR | Unified Digital Format Registry (UDFR) Understanding the System and Service | |
ISO workshop on metrics and quality |
Workshop on quality indicators New collections, new measures: metrics and quality indicators for web archives |
This agenda is a draft and subject to changes and updates.
Access Working Group
9am - 10:30am: Updates from AWG member institutions, research initiatives, product/tool enhancements/demos, etc. (e.g. QA module for WCT, Access2Preserve, launches, recent policy decisions, legislation or other legal impacts, etc.)
10:30am BREAK
11am - 11:45am: Memento Project discussion
11:45am - 12:30pm: Olympics 2012 Curation/Planning/Crawl
LUNCH 12:30-1:30pm
1:30pm - 2:30pm: AWG 'Birds of a Feather' discussion sessions
Access Working Group members will meet as a single group or in small groups to discuss the key challenges and initiatives they need or plan to tackle in the next year, and to identify any areas where they would like IIPC help or involvement. The session will be self-organized, as Helen & Kris will be in the SC meeting.
2:30pm BREAK
AWG meetings conclude for the day following the break.
Crowdsourcing workshop
Led by Trevor Owens, Library of Congress
The web is a social platform, built by people and organizations for people and organizations. Web archives are, to all intents and purposes, no different. Yet the disparity between the number of people involved in developing the web and the number of people involved in archiving it is enormous. This proposal seeks to investigate how crowdsourcing web archiving activities may begin to redress that balance and increase the amount of manpower available throughout all stages of the web archiving workflow in member institutions.
Participation will be open to all members, though capped at 24.
Harvesting and Preserving the Future Web
Facilitated by Kris Carpenter, Internet Archive
The Web's initial content model was the document. Great progress has been made in collecting and preserving this Web for future scholars. The Web is evolving toward a content model that is a programming environment with services. This meeting's topic is the much more difficult problem of collecting and preserving this future Web. To frame ideas and thoughts, we are bringing together people working on various aspects of the problem, such as collecting AJAX and HTML5, synchronizing Web resources, and preserving Web services such as scientific workflows, together with institutions with an interest in preserving the future Web. The goal of this meeting is to begin to scope the challenges.
Harvesting Working Group
Status of the HWG (Kristinn Sigurðsson)
Follow up on items from last meeting (Kristinn Sigurðsson)
Browsers as crawlers (David Rapin)
DeDuplication (Youssef Eldakar)
WARCs (Kristinn Sigurðsson)
Other topics, general discussion
Heritrix User Group
ISO workshop on metrics and quality
Led by Dr. Clément Oury, Bibliothèque nationale de France (National Library of France)
The goal of the workshop is to present and discuss the main outcomes of the report: the statistics and quality indicators chosen to evaluate collection development, collection characterization, collection access and usage, collection preservation, and web archiving costs. The workshop will be illustrated with real institutional examples: members of the WG will present how they gather and use the proposed indicators. This workshop is open to all kinds of institutions. Attendees are encouraged to present their own experiences and bring examples of indicators commonly used within their institution.
Legal Roundtable
Web archivists meet lawyers: how can legislation (or lack thereof) encourage or limit your web archiving program?
This roundtable will facilitate an open discussion between web archivists, expert practitioners and lawyers to compare the impact of international and national legislation and policies on web archiving activities. The discussion is open to all participants who have some background or interest in legal matters and want to raise practical or prospective questions about the impact of legislation on their day-to-day work and on advocacy efforts to promote web archiving. What are the current challenges and how can IIPC help?
Netarchive Workshop
NetarchiveSuite is a complete open source web archiving software package. It gives the ability to prepare, schedule, run and monitor harvests of websites. It also supports quality assurance and preservation of harvested content. For more information, see http://netarchive.dk/suite/ and https://sbforge.org/jira/browse/NAS.
NetarchiveSuite is currently developed for production purposes and maintained by the NetarchiveSuite community, which includes the following institutions: Netarkivet.dk (State and University Library in Aarhus and The Royal Library in Copenhagen), Denmark, the National Library of France (BnF), and the National Library of Austria (ONB).
Preservation Working Group
Status updates on the current work packages and activities. Will also include a demo of JHONAS (to be recorded, probably using WebEx).
- Discussion on strategies for preservation for web archiving
- What is the current status on this topic in each institution?
- Is there an existing example of a relevant obsolete file format?
- How can we deal with preservation issues on the basis of currently used tools and systems?
- Do we need a common pilot project?
New work packages and activities
- What are the planned activities in each institution in this area?
- In which areas would it be helpful to work together?
Steering Committee
Will be disseminated to members by separate correspondence.
Web Lifecycle Management
Will discuss the web archiving lifecycle in terms of tools and workflow, how it is evolving and what that means for our infrastructure and architectures. We will also discuss how spontaneous archiving of world events affects the "typical" lifecycle for web archiving.
UDFR
The meeting is open to all interested members of the preservation community; space is limited, however, and prior registration is required. The meeting will include technical presentations on the UDFR architecture; code walkthroughs of the major components of its open source technology stack, including OntoWiki, RDFAuthor, Virtuoso, Zend, PHP, Apache, and Noid; and a review of the four main ontological models: OntoWiki system configuration, UDFR user profiles, UDFR class and property ontology, and UDFR data.
Abigail Potter, IIPC Communications Officer
© 2004-2011 IIPC