Caption: HTML Version Usage Over Time, UK Web Archive
The formats captured in web archive collections can serve as a timeline for the development of web technologies. Analysis of captured websites can show how the use of file formats, programming languages, markup, and other attributes changes over time. These datasets show the rise and fall of various web formats, which can highlight formats that need preservation attention or reveal trends in markup and formatting.
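As a rough illustration of this kind of analysis, the sketch below tallies MIME types per crawl year from a list of capture records. The record layout `(year, status, mime)`, the function name `format_profile`, and the sample data are all assumptions made for the example; they are not taken from any particular archive's data model.

```python
from collections import Counter, defaultdict


def format_profile(records):
    """Count MIME types per crawl year, keeping only HTTP 200 responses.

    `records` is a hypothetical iterable of (year, status, mime) tuples.
    """
    profile = defaultdict(Counter)
    for year, status, mime in records:
        if status != 200:
            continue
        # Drop parameters such as "; charset=utf-8" and normalise case.
        mime_type = mime.split(";", 1)[0].strip().lower()
        profile[year][mime_type] += 1
    return profile


# Invented sample records, for illustration only.
sample = [
    (1996, 200, "text/html"),
    (1996, 200, "image/gif"),
    (1996, 404, "text/html"),
    (2010, 200, "text/html; charset=UTF-8"),
    (2010, 200, "Text/HTML"),
    (2010, 200, "image/png"),
]

profile = format_profile(sample)
for year in sorted(profile):
    print(year, dict(profile[year]))
```

Comparing the per-year counters is what would reveal a format's rise or fall. A real analysis would read the records from the archive's own index files rather than from an in-memory list.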
“The dataset is a format profile, summarising the data formats (MIME types) contained within all of the HTTP 200 OK responses in the JISC UK Web Domain Dataset (1996-2010).”
“More and more websites have started to embed structured data describing products, people, organizations, places, events into their HTML pages using markup standards such as RDFa, Microdata and Microformats. The Web Data Commons project extracts this data from several billion web pages. The project provides the extracted data for download and publishes statistics about the deployment of the different formats.”
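To give a sense of what detecting these markup standards involves, here is a minimal sketch that counts elements carrying Microdata, RDFa, or Microformats markers in a single HTML page, using Python's standard `html.parser`. It is not the Web Data Commons extraction method. The class name `StructuredDataDetector`, the attribute heuristics, and the short `MICROFORMAT_CLASSES` list are assumptions chosen for illustration, and a real extractor would parse the embedded data rather than only count markers.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical, deliberately short list of classic Microformats root classes.
MICROFORMAT_CLASSES = {"vcard", "vevent", "hreview"}


class StructuredDataDetector(HTMLParser):
    """Count start tags that carry markers of each markup standard."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Microdata: itemscope / itemprop attributes.
        if "itemscope" in attr_map or "itemprop" in attr_map:
            self.counts["microdata"] += 1
        # RDFa: typeof / property attributes (a simplifying heuristic).
        if "typeof" in attr_map or "property" in attr_map:
            self.counts["rdfa"] += 1
        # Microformats: well-known class names on the element.
        classes = (attr_map.get("class") or "").split()
        if any(name in MICROFORMAT_CLASSES for name in classes):
            self.counts["microformats"] += 1


# Invented sample page, for illustration only.
page = """
<div itemscope itemtype="https://schema.org/Person"><span itemprop="name">Ada</span></div>
<div typeof="schema:Person"><span property="schema:name">Ada</span></div>
<div class="vcard"><span class="fn">Ada</span></div>
"""

detector = StructuredDataDetector()
detector.feed(page)
print(dict(detector.counts))
```

Running such a detector over every page in a collection and summing the counters would yield deployment statistics in the same spirit as those the quoted project publishes.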