WAC 2021: Workshop: SolrWayback
Run your own full stack SolrWayback
Thomas Egense & Toke Eskildsen
The Royal Danish Library
Description:
This workshop will
- Explain the ecosystem for SolrWayback 4 (https://github.com/netarchivesuite/solrwayback)
- Walk through installing and running the SolrWayback bundle. Participants are invited to mirror the process on their own computers, and there will be time to solve installation problems
- Leave participants with a fully working stack for indexing, discovery and playback of WARC files
- End with open discussion of SolrWayback configuration and features.
Prerequisites:
- Participants should have a Linux, Mac or Windows computer with Java 8 or Java 11 installed.
- Downloading the latest release from https://github.com/netarchivesuite/solrwayback beforehand is recommended. The releases are listed on the right side of the GitHub page.
- Having institutional WARC files available is a plus, but sample files can be downloaded from https://archive.org/download/testWARCfiles
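
The Java prerequisite above can be checked before the workshop. The following is a minimal sketch, not part of the SolrWayback bundle: the function names are hypothetical, and it assumes the usual `java -version` banner format, where the version appears in double quotes on the first line (for example `1.8.0_292` for Java 8 or `11.0.11` for Java 11).

```python
import re

# Java versions named in the prerequisites of this workshop.
SUPPORTED_MAJOR_VERSIONS = (8, 11)


def extract_version(banner_line: str) -> str:
    """Pull the quoted version string out of the first line of `java -version`."""
    match = re.search(r'version "([^"]+)"', banner_line)
    if match is None:
        raise ValueError(f"No quoted version found in: {banner_line!r}")
    return match.group(1)


def java_major_version(version: str) -> int:
    """Map a version string to its major version.

    Java 8 and older report themselves as 1.x (e.g. 1.8.0_292 is Java 8);
    Java 9 and newer put the major version first (e.g. 11.0.11 is Java 11).
    """
    numbers = [int(n) for n in re.findall(r"\d+", version)]
    if not numbers:
        raise ValueError(f"No digits found in: {version!r}")
    if numbers[0] == 1 and len(numbers) > 1:
        return numbers[1]
    return numbers[0]


def is_supported(version: str) -> bool:
    """True if the version is one of the Java versions listed in the prerequisites."""
    return java_major_version(version) in SUPPORTED_MAJOR_VERSIONS
```

To use it, run `java -version` on the command line and pass the first line of its output to `extract_version`, then pass the result to `is_supported`.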
Target audience:
Researchers with intermediate knowledge of web archiving and of tools for exploring web archives. Only basic technical knowledge of starting a program from the command line is required, as the SolrWayback bundle is designed for easy deployment.
Background:
SolrWayback 4 (https://github.com/netarchivesuite/solrwayback) is a major rewrite with a strong focus on improving usability. It provides real-time full-text search, discovery, statistics extraction and visualisation, data export and playback of web archive material. SolrWayback uses Solr (https://solr.apache.org/) as the underlying search engine. The index is populated using Web Archive Discovery (https://github.com/ukwa/webarchive-discovery). The full stack is open source and freely available. A live demo is available at http://webadmin.oszk.hu/solrwayback/
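
Because the index lives in Solr, it can also be queried directly through Solr's standard `/select` request handler, alongside the SolrWayback interface. The sketch below only builds such a query URL. The base URL (host, port and collection name) and the field names `url` and `crawl_date` are assumptions about a local installation and its schema, and should be adjusted to match the actual setup.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance and collection name; adjust to your installation.
SOLR_BASE = "http://localhost:8983/solr/netarchivebuilder"


def build_select_url(query, rows=10, fields=("url", "crawl_date"), base=SOLR_BASE):
    """Build a URL for Solr's /select handler.

    q, rows, fl and wt are standard Solr query parameters.
    The field names passed in `fields` are assumptions about the index schema.
    """
    params = {
        "q": query,               # the search expression
        "rows": rows,             # maximum number of documents to return
        "fl": ",".join(fields),   # fields to include in each returned document
        "wt": "json",             # response format
    }
    return f"{base}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Print the URL; open it in a browser or fetch it with any HTTP client.
    print(build_select_url("climate change", rows=5))
```

Building the URL separately from fetching it keeps the sketch usable without a running Solr instance, and the printed URL can be pasted into a browser once the stack is up.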
During the conference there will be focused support for SolrWayback in a dedicated Slack channel by Thomas Egense and Toke Eskildsen.