OPENING KEYNOTE PANEL

Sustainability for open source web archiving tools

Lauren Ko1, Tessa Walsh3, Neil Jefferies4, Clare Stanton5, Yves Maurer6, Olga Holownia2

1University of North Texas Libraries, United States of America; 2International Internet Preservation Consortium, United States of America; 3Webrecorder; 4Open Preservation Foundation, United Kingdom; 5Library Innovation Lab at Harvard Law School, United States of America; 6National Library of Luxembourg

For over 25 years, the critical infrastructure of web archiving – including core functions such as capture, indexing, and replay – has depended on open source tools. While the premise of open source code is vital for collaboration and the only option for many users to achieve their web archiving initiatives, maintaining and sustaining these tools remains a persistent challenge for the community, regardless of the scale of their archiving operations.

An ongoing survey of web archiving tool usage shows a majority of institutions dependent on open source products, including a replay tool initially released a dozen years ago and a crawler that has surpassed twenty years in age, despite the original project developers having shifted resources elsewhere. Aging codebases and unmaintained external dependencies, lead developers moving on from projects, shifts in web technologies that render older software less effective, and fewer funding opportunities available to open source projects are only some of the threats to the web archiving infrastructure employed by many institutions.

This panel will address the inherent sustainability challenges facing the foundational open source tools used by the web archiving and other communities. The goal is to move beyond stating the facts about resource limitations and instead focus on diagnosing the status quo, identifying shared issues, and proposing achievable collaborative solutions that can be implemented beginning in the near term.

To that end, the panel will bring together a diverse range of stakeholders who offer different perspectives and potential solutions:

  • Consortia representatives who support open source ecosystems.
  • University and national libraries that rely on and contribute to these tools for large-scale web archiving.
  • Individual developers and service providers who build and maintain open source tools.

PERSPECTIVES

SUSTAINABLE DIGITAL PRESERVATION (CONSORTIA)

  • Importance of building technical communities rather than relying on single organizations
  • Organizations acting as facilitators and enablers for Open Source Communities
  • Governance and communications are as important as technical capabilities for community health
  • Facilitators can bridge technical deficits, but this is not a standard operating procedure
  • Role for membership and project-based funding
  • The European Cyber Resilience Act will impose some additional costs

UNIVERSITY AND NATIONAL LIBRARIES 

  • Inclusion of funding for open source tools in budget proposals
  • Allocation of time to contribute code to community-driven open source tools
  • Development of new projects with an open source release in mind
  • Collaborating on a model of shared stewardship
  • Bolstering community engagement to support software development
  • Use of open source software by national libraries
  • National library contributions to the open source landscape
  • Overcoming barriers that prevent national libraries from funding open source software

OPEN SOURCE TOOLS SERVICE PROVIDER 

  • A deeper definition of open source software
  • Examples of successful open source funding
  • Bringing external core maintainers to projects lacking support
  • Sources of funding from individual users, institutions, and downstream proprietary software companies to replace shrinking grant funding
  • Need for guidance in crafting public tenders that are inclusive of open source projects
  • Addressing costs associated with meeting institutions' increased security and accessibility requirements

PANELISTS 

Neil Jefferies | Open Preservation Foundation

Neil Jefferies is an Innovation Specialist at the Bodleian Libraries, Oxford, Executive Director of the Open Preservation Foundation and a Director of Data Futures GmbH. He is co-creator of the International Image Interoperability Framework and the Oxford Common File Layout for the preservation of complex digital objects. Most recently he has worked on Text Interoperability APIs, AI tools for cataloguing and the EPUB/A ISO specification.

Lauren Ko | University of North Texas Libraries

Lauren Ko leads the Software Development Unit in the University of North Texas Libraries Digital Libraries division where she has worked as a programmer analyst for several years building, deploying, and maintaining web applications. In parallel, she serves in the areas of creating, providing access to, and preserving web archives with open source tools. She is active in the IIPC tools community, hoping to make a better future for web archiving open source software and its developers.

Yves Maurer | National Library of Luxembourg

Yves Maurer is the Head of IT and Digital Innovation at the National Library of Luxembourg (BnL). As a former head of digitization, Yves was involved in the development and open-sourcing of the BnL’s digital collection viewer in 2011, and in the open publishing of the detailed digitization specifications for METS/ALTO and the quality assurance tool for digitization projects in 2014. The library continues to use open source tools and to provide updates to the tools it works on. In web archiving, these are currently mostly Browsertrix and SolrWayback and the ecosystem around them.

Clare Stanton | Library Innovation Lab at Harvard Law School

Clare Stanton is the Director of Product and Research at the Library Innovation Lab (LIL), a department of the Harvard Law School Library. The user-directed citation preservation service Perma.cc was built and is maintained at LIL, along with its associated open-source web archiving tools. Clare has been part of the Perma.cc team since 2018, when the IMLS awarded Perma a multi-year grant to prototype financial sustainability models. 

Tessa Walsh | Webrecorder

Tessa Walsh is the Senior Applications and Tools Engineer at Webrecorder, where she helps develop and maintain open source web archiving tools such as Browsertrix, Browsertrix Crawler, and pywb. The Webrecorder team has been developing open source web archiving tools for over 10 years, with a focus on making high-fidelity browser-based archiving tools accessible to anyone who needs to collect and preserve online content that is meaningful to them. In addition to being a software developer, Tessa is an archivist, a digital preservationist, and a musician.


CLOSING KEYNOTE PANEL

“Web Archiving for Accountability: New Frontiers in Open Source Intelligence and Digital Evidence”

Basile Simon1, Emily Tripp2, Marvin Milatz3, Friedhelm Weinberg4

1Starling Lab, Stanford / USC; 2Airwars; 3Der Spiegel; 4Mnemonic

This panel convenes leading practitioners at the intersection of web archiving, open-source intelligence (OSINT), and human rights documentation to explore practices and use cases that are often under-represented at the IIPC. While traditional web archiving focuses on preserving cultural heritage, a growing community of investigators, journalists, and legal advocates utilizes web archiving as a critical tool for accountability. This panel will present the methodologies and challenges of this mature field, showcasing how web archives are used to document human rights abuses, investigate war crimes, and combat disinformation.

Our contribution advances the conference theme by broadening the definition of "web heritage" to include the ephemeral, dynamic, and often contentious digital content that serves as evidence of historical events. While discussions at previous conferences have at times centered on the technical and logistical challenges of large-scale crawls and collection development, our panel shifts the focus to the high-stakes application of web archives in legal and journalistic contexts. We will build upon prior work in areas like high-fidelity capture and consider the unique ethical and security challenges that arise when archiving evidence of state-sponsored violence or disinformation campaigns. By bridging the gap between the cultural heritage and accountability-focused communities, we aim to create a more holistic understanding of the web’s role as a primary source for future generations.

The impact of this panel on conference attendees will be threefold. First, archivists and librarians from traditional memory institutions will gain insight into a rapidly evolving use case for their skills and infrastructure, opening potential avenues for collaboration with human rights organizations and newsrooms. Second, tool developers and engineers will hear directly from practitioners in high-risk environments about their specific needs for verifiability, security, and ease of use, informing the next generation of web archiving technology. Finally, researchers and legal experts will be exposed to the practical realities and evidentiary standards required to bring web-archived material into judicial proceedings, fostering a richer dialogue between technologists and the legal community.

We aim to foster a discussion on similarities and differences in cultural heritage archiving and "accountability archiving" practices to share knowledge, identify common challenges, and explore opportunities for collaboration. By highlighting the needs of investigators and OSINT practitioners, we can collectively advance the development of tools and standards that serve a broader range of applications and strengthen the integrity of the web as a historical record.

PANELISTS

Basile Simon | Starling Lab, Stanford / USC

Basile is the director of the law program and special projects at the Starling Lab (Stanford / USC), and a fellow at Stanford. He leads applied research in evidentiary and investigative standards and comes from a background in journalism and human rights documentation. In both settings, he collected, managed, designed, or leveraged digital collections, including web archive material.



Emily Tripp | Airwars

Emily is the executive director of Airwars, a leading NGO documenting civilian harm in global conflicts. With a background in humanitarian response across the Middle East, she leads Airwars in its pioneering use of open-source data to hold military actors accountable. Her work focuses on preserving and leveraging ephemeral conflict data to ensure transparency and justice for victims.



Marvin Milatz | Der Spiegel

Marvin works as a researcher at DER SPIEGEL’s fact-checking unit, specialising in “Open Source Intelligence” (OSINT) and digital forensics. In his daily work he stitches together digitally saved information from a multitude of sources, e.g. for person-of-interest investigations or company data research. His special trick concerning digital archives involves batch downloading scores of snapshots and searching them for unique identifiers with code.



Friedhelm Weinberg | Mnemonic

Friedhelm oversees the programmatic work of Mnemonic, an organisation dedicated to archiving, preserving, and verifying open-source information as documentation of human rights violations and international crimes. After beginning his career as a journalist, he spent the last 15 years working on human rights documentation methodologies, developing open source software tools and co-creating organisational security approaches. Before joining Mnemonic in 2024, he served as Executive Director of HURIDOCS.