The Internet Archive (IA) is a non-profit organization that is compiling a historical database of Web sites and other digital content. IA's web archives now exceed 2 PB of compressed data and encompass over 150 billion captures collected from 1996 to the present, culled from every domain, over 200 million web sites, and more than 40 languages. This archival database expands with the Internet, growing by nearly 100 TB (compressed) every month. Usage of IA's web collections via the Wayback Machine averages 400-500 requests per second.
Recently, the Internet Archive invested in Sun's open storage server and software technologies, specifically a Sun Modular Datacenter (Sun MD) that is installed at Sun's Santa Clara campus and supported by the Sun MD remote monitoring service.
The new Sun MD was installed in March 2009. It is equipped with 60 Sun Fire X4500 (Thumper) Open Storage Systems that run the Solaris 10 OS, including the Solaris ZFS file system. Sun's servers with Solaris ZFS storage pools enabled the Internet Archive to double the storage capacity of its old system while using up to 50 percent less power than other servers would.
Sun engineers monitor power, heating and cooling, fire, smoke, and water detection, and physical access points, and they dispatch repair technicians when necessary. IA engineers manage the repository software, the archival data, and the access services provided to researchers and the general public.
- Start date: 1996
- Archive interface language: English
- Access methods: URL search, Topical collections
- Harvesting methods: National Domain, Regional Domain, Bulk, Selective, Event, Thematic