Kansalliskirjasto (National Library of Finland)

Organization Type: National Library
Country: Finland
IIPC Contact Email: vapaakappale@helsinki.fi
www.kansalliskirjasto.fi
Founding member


Kansalliskirjaston verkkoarkisto (Finnish Web Archive)

Start date: 2006
Archive interface languages: Finnish, Swedish, English
Access methods: URL search, Full-text search
Harvesting methods: National domain, Regional domain, Event, Thematic

As part of its legal deposit duties, the National Library of Finland collects web pages from web servers that

  • have *.fi or *.ax domain names,
  • are located in Finland, or
  • contain subject matter that is targeted to the Finnish public.
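
The first criterion in the list above can be checked mechanically from a URL. The following is a minimal sketch only, not the library's actual harvester configuration; the function name `in_national_domain` and the constant `NATIONAL_SUFFIXES` are hypothetical. The other two criteria (server location and intended audience) cannot be decided from the URL alone and are not covered here.

```python
from urllib.parse import urlsplit

# Assumption: only the domain-name criterion (.fi or .ax) is modelled here.
NATIONAL_SUFFIXES = (".fi", ".ax")


def in_national_domain(url: str) -> bool:
    """Return True if the URL's host name ends in .fi or .ax."""
    host = urlsplit(url).hostname  # lower-cased by urlsplit; None if absent
    if host is None:
        return False
    # Drop a trailing dot from fully qualified names such as "example.fi."
    return host.rstrip(".").endswith(NATIONAL_SUFFIXES)
```

For example, `in_national_domain("https://www.kansalliskirjasto.fi/")` is true, while a URL on a `.com` host is false even if its path contains `.fi`.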

The policy is to create a representative sample of web contents over time and subjects.

The web archive was launched in 2006. By 2015, it held over 80 TB of compressed data.

The contents of the web archive may be accessed only from dedicated workstations, which are available in the National Library and in selected other libraries. Anyone may use the archive, but only printed copies of its contents are permitted.

Each year, the National Library of Finland collects a representative sample of web pages from web servers that 1) have .fi or .ax domain names, 2) are physically located in Finland, or 3) contain material targeted at the Finnish public. Domain crawls are supplemented by thematic and event-based harvesting. Content from Finnish newspapers and news sites is harvested daily. When harvesting is not possible, the library may ask a web publisher either to grant its harvester access (for example, to content behind a paywall) or to deposit its web publications.

Access: The contents of the archive can be accessed only from special legal deposit workstations, which are available in selected libraries within Finland (including the National Library of Finland). Anyone can use the archive, but digital copying of material from the archive is prohibited.