BIBLIOTECA NACIONAL DE ESPAÑA

National Library of Spain

Organization Type: National Library
Country: Spain
IIPC Contact Email: archivoweb@bne.es
www.bne.es

The Spanish Web Archive is maintained by the National Library of Spain in collaboration with regional libraries. Its purpose is to become the most complete repository of the Spanish web. It contains all websites hosted on the Spanish domain .es, as well as Spanish websites hosted elsewhere (.com, .org, .edu, etc.). The National Library of Spain joined the IIPC in 2010 and has been on the Steering Committee since 2013.


Archivo de la Web Española (Spanish Web Archive)

Start Date: 2009
Archive interface language(s): Spanish, English
Access methods: URL Search
Harvesting methods: National Domain, Selective, Event

The Spanish Web Archive is maintained by the National Library of Spain in collaboration with regional libraries. Its purpose is to become the most complete repository of the Spanish web. It contains all websites hosted on the Spanish domain .es, as well as Spanish websites hosted elsewhere (.com, .org, .edu, etc.). To collect the Spanish web as completely as possible, three different approaches have been followed:

1) Domain .es crawls, harvested annually from 2009 to 2013 in collaboration with the Internet Archive, using Heritrix and the Wayback Machine. From 2016 onward, one domain crawl per year will be run using NetarchiveSuite.

2) Selective crawls since 2012. Since 2015, about 30 news media sites have been crawled every day.

3) Event crawls since 2011 on events of general interest at the national level, such as general and local elections, the royal transition, etc.

Harvesting has been carried out since 2009 under the general Legal Deposit Law. Since 26 October 2015, the Royal Decree regulating the legal deposit of online publications has allowed the National Library of Spain and the regional libraries to collect Spanish websites as part of the legal deposit and to make them available to the public, observing the terms of copyright law.

Our current collection amounts to more than 117 TB (December 2015).

The archive has not yet been launched publicly. On-site access is planned for the short to medium term.