#WhyWebArchiving: Preserving Internet Content for Research Use
Monday, May 23rd, 6:00-7:15 pm EDT
The Library of Congress is one of many organizations around the globe that are selecting and preserving web content to enable future research use. While the field of web archiving has been gaining momentum for over a quarter of a century, the creation and use of web archives may not be as familiar to visitors and users of library collections. This panel brings together library subject experts, researchers and historians to discuss the value of web archiving and potential uses for content collected and preserved by cultural heritage institutions. This event is presented as a part of the Library of Congress and the International Internet Preservation Consortium’s 2022 Web Archiving Conference.
Registration is open to all for this public event. Video from this event will be streamed and available on YouTube for 24 hours after the event.
Host/Welcome: Abbie Grotke, Assistant Head, Digital Content Management Section (Web Archiving Program), Library of Congress
Moderator: Ian Milligan, Associate Professor of History and Associate Vice-President, Research Oversight and Analysis, University of Waterloo
- Jennifer (JJ) Harbster, Head, Science Reference Section in the Library of Congress Science, Technology and Business Division
- Elizabeth (Beth) Osborne, Senior Legal Reference Librarian at the Law Library of Congress
- Benjamin Lee, Ph.D. candidate in the Paul G. Allen School for Computer Science & Engineering at the University of Washington
- Amelia Acker, Assistant Professor and Director, Critical Data Studies Lab, School of Information, The University of Texas at Austin
Ian Milligan is Associate Professor of History and Associate Vice-President, Research Oversight and Analysis at the University of Waterloo. Milligan’s primary research focus is on how historians can use web archives, as well as the impact of digital sources on historical practice more generally.
Jennifer (JJ) Harbster is head of the Science Reference Section in the Library of Congress Science, Technology and Business Division. She leads an amazing group of librarians to develop the Library’s print and digital collections, provide reference services, create research products, and develop programs.
Elizabeth (Beth) Osborne is a Senior Legal Reference Librarian at the Law Library of Congress. She provides research, reference, and instructional services, and serves as a Recommending Officer and web archives collection leader in the subject area of United States law.
Ben Lee is a fourth-year Ph.D. candidate in the Paul G. Allen School for Computer Science & Engineering at the University of Washington, where he is a National Science Foundation Graduate Research Fellow in Machine Learning. He recently served as a 2020 Innovator in Residence at the Library of Congress. His research focuses on developing novel exploratory search systems for cultural heritage collections.
Dr. Amelia Acker is an assistant professor at the University of Texas at Austin in the School of Information, where she leads the Critical Data Studies Lab. Her research on data archives and preservation has been funded by the National Science Foundation and the Institute of Museum and Library Services. Acker’s current research focuses on cultures of mobile computing, emerging digital preservation models from platforms, data literacy, social media data for research, and metadata standards for exchange between private and public archives. Previously, Acker worked as a librarian, an archivist, and a mobile app developer. When she’s home in Austin, you can find her biking, bouldering, or swimming around town.
- Science Blogs Web Archive: https://www.loc.gov/collections/science-blogs-web-archive/about-this-collection
- Coronavirus Web Archive: https://www.loc.gov/collections/coronavirus-web-archive/about-this-collection
- Earth Day 2020 Web Archive: https://www.loc.gov/collections/earth-day-2020-web-archive/about-this-collection
- Newspaper Navigator search application: https://news-navigator.labs.loc.gov/search
- Library of Congress 1,000 Government PDF dataset: https://labs.loc.gov/work/experiments/webarchive-datasets/
- Grappling with the scale of born-digital government publications: Toward pipelines for processing and searching millions of PDFs: https://doi.org/10.1007/s42803-022-00042-x
- Meme Generator: https://memegenerator.net/
- OpenRefine data software: https://openrefine.org/
- Meme Generator Web Archive: https://www.loc.gov/item/2018655320/
- Research article on the archived meme project: http://www.ameliaacker.com/wp-content/uploads/2022/05/Neil-Degrasse-Tyson-Problem.pdf
- LOC’s Web Cultures Web Archive: https://www.loc.gov/collections/web-cultures-web-archive/about-this-collection/
Library of Congress: web archiving, digital collections strategy, LC Labs & datasets
Rescue data projects
- Environmental Data and Governance Initiative: https://envirodatagov.org
- Saving Ukrainian Cultural Heritage Online: https://www.sucho.org
Save-page-now & Wayback Machine browser add-ons
- Safari: https://apps.apple.com/us/app/wayback-machine/id1472432422?mt=12
- Chrome: https://chrome.google.com/webstore/detail/wayback-machine/fpnmgdkabkmnadcjpehmlllkndpkmiak?hl=en-US
- Firefox: https://addons.mozilla.org/en-US/firefox/addon/wayback-machine_new
- MS Edge: https://microsoftedge.microsoft.com/addons/detail/wayback-machine/kjmickeoogghaimmomagaghnogelpcpn?hl=en-US
Tools, training materials, projects
- https://github.com/netarchivesuite/solrwayback (Live demo: https://webadmin.oszk.hu/solrwayback)