Support for transitioning to pywb

 

Project leads:
IIPC Tools Development Portfolio


Project lead & developer:
Ilya Kreymer, Webrecorder.net

Funding:
30,000 EUR

Resources:

GitHub repository
Download
Documentation
Migration guide
Updates (IIPC blog)


Brief description of the project

Support for transitioning to next generation web archive replay tools with Webrecorder pywb.

Package A

Detailed Migration Guide for OpenWayback/IIPC Members

  • Detailed docs on how to transition to pywb, including examples
  • Cover CDX conversion and configuration
  • Cover terminology differences
  • Ensure ZipNum, PathIndex, and WARC configuration options are covered
  • Ensure OutbackCDX is covered
  • Loop back with Tools committee on progress
  • If necessary, address any incompatibilities or missing features to ensure active OpenWayback configurations are supported (e.g., path index and ZipNum fully working as expected)

Migration guide

Package B

Updating pywb architecture to support a more component-based setup:

  • Develop an interoperable index API component, initially implemented in pywb but designed so that OutbackCDX can be swapped in later, in coordination with Alex Osborne (National Library of Australia)
  • Develop an interoperable WARC store API, initially implemented in pywb but designed so that nginx/Apache can be swapped in later
  • Support toggling pywb to use client-side replay built on the index and WARC store APIs

Embargo/Access Control System

  • Ensure the access control system works with the following embargo rules: embargo before date, embargo after date, and embargo newer than X or older than X
  • Document the full access control system (exclude vs. block), including how to customize exclusion error messages
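
The four embargo rules listed above can be illustrated with a minimal Python sketch. This is not pywb's implementation or configuration format; the function name `is_embargoed`, its parameters, and the reading of each rule (for example, "before date" meaning captures made before that date are withheld) are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_ts(ts: str) -> datetime:
    """Parse a 14-digit capture timestamp (YYYYMMDDhhmmss) as UTC."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def is_embargoed(
    capture_ts: str,
    *,
    before: Optional[str] = None,            # embargo captures made before this date
    after: Optional[str] = None,             # embargo captures made after this date
    newer_than: Optional[timedelta] = None,  # embargo captures younger than this age
    older_than: Optional[timedelta] = None,  # embargo captures older than this age
    now: Optional[datetime] = None,
) -> bool:
    """Hypothetical rule check: True if any configured embargo rule matches."""
    capture = parse_ts(capture_ts)
    now = now or datetime.now(timezone.utc)

    if before is not None and capture < parse_ts(before):
        return True
    if after is not None and capture > parse_ts(after):
        return True

    age = now - capture
    if newer_than is not None and age < newer_than:
        return True
    if older_than is not None and age > older_than:
        return True
    return False
```

The date-based rules compare against a fixed point in time, while the "newer than" and "older than" rules are relative to the moment of the request, so a capture can leave (or enter) the embargo without any configuration change.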

Multilingual Support/Guide

  • Ensure localization system is fully working
  • Ensure all localizable text in pywb is marked as localizable
  • Document localization system workflow, including how to add localizable text, what to send to translators, etc.

Location + User Name Access Controls

  • Support access control rules based on a specified request header set by Apache/Nginx
  • Provide documentation examples of configuring Apache/Nginx location or user name restrictions
  • If necessary, implement location check in pywb as well.
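
As a rough sketch of the header-based approach, assume the front-end server (Apache/Nginx) has already decided who the user is or where they are, and passes that decision on as a request header. The header name `X-Access-Group`, the group names, and the function `access_for_request` below are hypothetical and are not pywb settings; only the WSGI convention for exposing headers in `environ` is standard.

```python
from typing import Mapping

# Hypothetical mapping from header value to access decision.
# "allow", "exclude" and "block" mirror the terms used in this plan.
HEADER_RULES = {
    "staff": "allow",
    "reading-room": "allow",
    "public": "exclude",
}


def wsgi_header_key(name: str) -> str:
    """Convert an HTTP header name to its WSGI environ key."""
    return "HTTP_" + name.upper().replace("-", "_")


def access_for_request(
    environ: Mapping[str, str],
    header_name: str = "X-Access-Group",  # assumed name, set by the proxy
    rules: Mapping[str, str] = HEADER_RULES,
    default: str = "block",
) -> str:
    """Return the access decision for a request based on a trusted header."""
    value = environ.get(wsgi_header_key(header_name), "").strip().lower()
    # Unknown or missing header values fall back to the most restrictive rule.
    return rules.get(value, default)
```

A scheme like this is only safe if the front-end server strips any client-supplied copy of the header before setting its own value; otherwise a user could grant themselves access.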

Package C

Styling/Branding Guide

  • Add documentation on all UI templates with examples
  • Ensure all template variables are documented and working

Integration and Embedding Guide

  • Additional documentation for adding metadata and external sources
  • Ensure pywb can be embedded in other applications as a frame, and provide examples

Calendar and Banner UI Improvement

  • Improve navigation in the banner (back/forward)
  • Remember the previous position when returning from the banner
  • Research and implement a calendar display that better shows the distribution and spread of captures
  • Document how to customize the banner, from simple changes (logo) to more complex ones (custom UI)
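
To make "distribution and spread" concrete, a calendar view needs per-period capture counts derived from the index. The sketch below groups 14-digit capture timestamps by year and month; the function name `captures_per_month` is hypothetical, and this is an illustration of the underlying data rather than the calendar implementation planned here.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def captures_per_month(timestamps: Iterable[str]) -> Dict[Tuple[int, int], int]:
    """Count captures per (year, month) from YYYYMMDDhhmmss timestamps."""
    counts: Counter = Counter()
    for ts in timestamps:
        year, month = int(ts[:4]), int(ts[4:6])
        counts[(year, month)] += 1
    # Sort chronologically so the result can drive a calendar or histogram.
    return dict(sorted(counts.items()))
```

For example, `captures_per_month(["20200115120000", "20200120130000", "20210301000000"])` yields two captures for January 2020 and one for March 2021, which also exposes the gap between them.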

Schedule

  • Package A: by September 2020
  • Package B: by June 2021
  • Package C: by August 2021