As large bodies of websites are captured, so are the links and connections between them. These networks of linked sites and data can be mined to observe the relationships between individuals, organizations, and ideas over time. Just as this kind of analysis is done on websites and social networks on the live Web, it can be used with web archive datasets to view changes over time or at points in the past.
This visualisation shows an overview of how a subset of the sites in the JISC UK Web Domain Dataset (1996-2010) are interlinked. For each year, the corresponding chord diagram shows the percentage of links between the different second-level or top-level domains, such as the percentage of links found in *.ac.uk pages that link to *.co.uk pages.
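The percentages behind such a chord diagram could be computed roughly as follows. This is a minimal sketch, not the method used for the JISC visualisation: the input format (a list of `(year, source_host, target_host)` tuples), the sample hosts, and the helper names `domain_suffix` and `link_percentages` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def domain_suffix(host):
    """Group a host by its last two labels under .uk (e.g. 'ac.uk'), else by its top-level domain."""
    labels = host.lower().rstrip(".").split(".")
    if labels[-1] == "uk" and len(labels) >= 2:
        return ".".join(labels[-2:])
    return labels[-1]


def link_percentages(links):
    """Return {year: {(source_suffix, target_suffix): percent of the source suffix's links}}."""
    counts = defaultdict(Counter)
    for year, source_host, target_host in links:
        pair = (domain_suffix(source_host), domain_suffix(target_host))
        counts[year][pair] += 1

    result = {}
    for year, pair_counts in counts.items():
        # Total outgoing links per source suffix, used as the denominator.
        totals = Counter()
        for (source, _target), n in pair_counts.items():
            totals[source] += n
        result[year] = {
            pair: 100.0 * n / totals[pair[0]] for pair, n in pair_counts.items()
        }
    return result


# Hypothetical sample data, not taken from the dataset.
sample_links = [
    (1996, "www.ox.ac.uk", "www.bbc.co.uk"),
    (1996, "www.cam.ac.uk", "www.ox.ac.uk"),
    (1996, "www.ox.ac.uk", "example.com"),
    (1996, "www.ucl.ac.uk", "shop.co.uk"),
    (1997, "www.bbc.co.uk", "www.ox.ac.uk"),
]
percentages = link_percentages(sample_links)
```

Here each percentage is normalised by the number of links leaving the source group, so `percentages[1996][("ac.uk", "co.uk")]` is the share of links found on *.ac.uk pages in 1996 that point to *.co.uk pages.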
A project that uses websites captured by Common Crawl to discover and visualize links between websites in different languages.
An analysis of how websites captured by Common Crawl link to Facebook.
This case study is part of a Web Archiving Use Cases report written by Emily Reynolds for the Library of Congress.