14:30 – 14:50

Tobias Beinert, Markus Eckl & Florence Reiter: Archiving and analysing elections: how can web archiving, digital humanities and political science go together?

Tobias Beinert, Bavarian State Library
Markus Eckl, University of Passau, Department of Digital Humanities
Florence Reiter, University of Passau, Jean Monnet Chair for European Politics

The Bavarian State Library has been running selective web archiving activities since 2011; however, the academic use of the archived objects is still constrained to a reading-based approach. Additionally, collection development for the web archive has so far been based on the decisions of the library’s staff.

To overcome these shortcomings, the Bavarian State Library has teamed up with experts from the Chair of Digital Humanities and political science researchers from the Jean Monnet Chair at the University of Passau. In a joint study, methods and software from the digital humanities as well as experimental tools developed in the web archiving community will be applied to datasets of web archive collections. The focus lies on testing innovative and intuitive ways of accessing web-based resources and on implementing approaches for automated and user-based collection development. To show that these methods and instruments are useful in academic settings, a case study on the Bavarian state election (2018) and the European Parliament election (2019) is being conducted. The case study explores whether and how web archives can empirically answer the question of how political actors and parties frame the European Union throughout their election campaigns, i.e. in which respects issues are labelled as “European” or “national”. The framing perspective is particularly relevant for studying the EU. According to Jörg Matthes, frames can be defined as “selective views on issues – views that construct reality in a certain way leading to different evaluations and recommendations”, so the concept is useful when dealing with political communication.

The presentation gives a first insight into the challenges of an election event crawl and the first steps of preparing and analysing the data produced. It will illustrate and evaluate the tools (Web Curator Tool/Heritrix, Webrecorder) used for crawling different sources (websites, social media, news sites) and for data analysis. The perceived gap between crawling the web as a means of standard collection development in a library and producing data sets for specific research purposes will also be addressed, as doing so can help to lay the basis for a scientific theoretical framework for web archiving. The paper therefore not only discusses the results of the case study but also addresses methodological questions that arise along the research process. It focuses on the interplay between specific disciplinary questions and the requirements of libraries, political science, and digital humanities methods.

We thus aim to contribute both to the debate on the scientific value of web archives in general and to the question of which methods are suitable for research in web archives.

Researcher case studies; Use, usability and access to web archives and web archive datasets

14:50 – 15:10

Lynda Clark: Emerging formats: discovering and collecting contemporary British interactive fiction

ORCID: 0000-0001-7253-4587

The British Library’s ongoing Emerging Formats project seeks to identify, collect, preserve and describe complex digital works in order to ensure they remain accessible for future researchers and readers. This case study is concerned with one type of Emerging Format: web-based interactive fiction. The talk considers the suitability of the British Library’s ACT web archiving tool and Rhizome’s Webrecorder as means of capturing the digital interactive works created across the UK, and suggests further steps for ensuring such works are retained.

In order to create a collection of web-based interactive works accessible via the Library’s UK Web Archive, it was first necessary to define parameters for identifying relevant works. This immediately highlighted the challenges of categorising and analysing such varied and complex material. Digital interactive fiction, like many digital technologies, is an area of rapid growth and change. New tools and sharing platforms for writers and readers of interactive fiction emerge, while others become obsolete or are lost altogether. In July 2017, Adobe announced that it would end updates to and distribution of the interactive content creation tool Flash at the end of 2020. In August 2018, Inkle’s Inklewriter was officially shut down (although for now it remains online). The Interactive Fiction Database offers a useful catalogue of works of all kinds, but as it is a community-run resource, the amount of information provided for each entry varies wildly, with some entries offering merely a record of a work’s existence but no means to play or view it. SubQ remains the only major paying online magazine publishing interactive work.

Since ‘[n]ot all groups or individuals creating publications define themselves as “publishers” and may not view their work in terms of a “publication”’[1], there is very little standardisation in the production processes and release of complete works. Creators often use creation tools in unusual ways and, in seeking to subvert genre or format expectations, further complicate matters for archivists, researchers and readers alike. However, an approach that combines the two web archiving tools seems well suited to facing at least some of these challenges.

[1] Caylin Smith and Ian Cooke, ‘Emerging Formats: Complex Digital Media and Its Impact on the UK Legal Deposit Libraries’, Alexandria: The Journal of National and International Library and Information Issues, 27.3 (2017), 175–87 <https://doi.org/10.1177>, p. 176.

15:10 – 15:30

Radovan Vrana & Inge Rudomino: Croatian web portals: from obscurity to maturity

Radovan Vrana, Faculty of Humanities and Social Sciences, University of Zagreb
Inge Rudomino, National and University Library in Zagreb

Over two and a half decades have passed since the introduction of the Internet in Croatia and since the country’s top-level domain .hr came to life. The very first Croatian web sites within the .hr top-level domain were those of several Croatian academic institutions, as very few other institutions or individuals had access to the Internet or had a web site at that time. The number of newly published web sites soon grew as the web became available to more people and began to serve as a publishing platform. This was also the era in which the first Croatian portals appeared. In 1997, at approximately the same time as the first Croatian portals appeared, the Law on Libraries in Croatia was passed. It introduced an amendment to the legal deposit provision so that online publications such as web portals were included in the legal deposit. The Law on Libraries was also the basis for the development of the Croatian Web Archive. The Croatian Web Archive is a joint project of the National and University Library in Zagreb and the University Computing Centre, University of Zagreb. Its tasks are to collect, store and give access to online resources. The content of the Croatian Web Archive is harvested daily, weekly, monthly, annually, etc. Additionally, annual harvests of the top-level domain (.hr) and thematic harvests covering important events in Croatia are conducted throughout the year. The harvests also include web portals, the most dynamic form of web site, which change from one harvesting event to the next.

Our focus will be on web portals as the most dynamic form of web site and on the changes they have gone through over time, as can be observed in the Croatian Web Archive. The analyses will show changes in design, content layout, URLs, titles etc., with the aim of establishing the major development phases that Croatian web portals have undergone.

15:30 – 15:40

Q&A

Collecting themes and formats