IIPC RSS Webinar: Researching web archives with SolrWayback 4

The IIPC Research Speaker Series (RSS) focuses on the research use of web archives and features presentations of use cases, collaborative projects and new tools for researchers. During this webinar, Thomas Egense, Toke Eskildsen, Anders Klindt Myrvoll, Jesper Lauridsen, and Jørn Thøgersen of the Royal Danish Library will walk you through SolrWayback (version 4.0 was released on 20 December). SolrWayback, a fusion of discovery (Solr) and playback (Wayback) functionality, makes it easier for researchers, curators and other web archive users to explore harvested web content on their own laptops.

SolrWayback 4 runs faster and has a redesigned interface with easier navigation, content-sensitive help, and more functionality (e.g. you can see the WARC header for a single post). The search field has been reworked to make large and complex queries much easier to manage. Other key features include searching with an uploaded file or through the Ngram interface, as well as a word cloud generator and link graph exports.

Netarkivet, the national Danish web archive, started using SolrWayback as its new and improved frontend in mid-February 2021. SolrWayback has also been used by the Hungarian Web Archive at the National Széchényi Library. The Web Archiving Team at the Bibliotheca Alexandrina has been testing SolrWayback on the CDG Covid-19 collection, and the preliminary results will be shared during the webinar.


Anders Klindt Myrvoll is the Programme Manager at the national Danish web archive, Netarkivet, at the Royal Danish Library. Together with colleagues, he is collecting, preserving and providing access to the Danish web. Prior to web archiving, Anders worked for more than 13 years in management and production in the film and media industry, collaborating globally on everything from high-end localization to original content, and along the way also gaining extensive experience in digitization and preservation of cultural heritage. You can find him at https://www.linkedin.com/in/andersklindt/ or @andersklindt on Twitter.

Jesper Lauridsen is a frontend developer, focused mainly on usability and, to some degree, the user experience of the products of the Royal Danish Library. In this particular project, he has worked with Jørn Thøgersen to create the frontend architecture for the underlying SolrWayback services. You can find him on Twitter as @justjspr, ranting about everything from football to coding bugs.

Jørn Thøgersen is lead frontend developer at the Royal Danish Library. During the past 14 years he has worked on many major web applications centered around various sides of cultural heritage. In the SolrWayback project he laid out the technical tracks for the frontend in close collaboration with Jesper Lauridsen. Aside from developing for the web, he has a great passion for DIY projects and power tools. You can find him at https://www.linkedin.com/in/j%C3%B8rn-th%C3%B8gersen-50b271/ or @jorntx on Twitter.

Thomas Egense is lead developer on SolrWayback. He works as a Java backend programmer on several projects for the Royal Danish Library, where he has worked for the last 9 years. SolrWayback and projects involving AI/NLP are his favorite projects. In his spare time he makes mathematical art with his own software, and he has a few other GitHub projects going as well. If you have any questions about SolrWayback, you are always welcome to email him at thomas.egense@gmail.com

Agenda:

SolrWayback 4: Netarkivet at the Royal Danish Library (www.kb.dk/netarkivet)

  • Anders Klindt Myrvoll: SolrWayback and Netarkivet 
  • Toke Eskildsen: webarchive-discovery & Solr
  • Thomas Egense: Demo 
  • Jesper Lauridsen & Jørn Thøgersen: Frontend development

Publishing the IIPC Covid-19 Collection Using SolrWayback: Bibliotheca Alexandrina

  • Youssef Eldakar & Mohamed Elsayed

The presentations will be followed by a Q&A session chaired by Ben O’Brien (National Library of New Zealand) and Peter Stirling (BnF).



 

The event is finished.

Date

10 Mar 2021

Time

UTC
7:30 AM - 8:30 AM

© Copyright 2024 International Internet Preservation Consortium