The Internet Archive (IA) is a non-profit organization that is compiling a historical database of web sites and other digital content. IA's web archives now exceed 5 PB of compressed data and encompass over 370 billion captures collected from 1996 to the present, drawn from every domain and spanning over 200 million web sites and more than 100 languages. The archive expands with the Internet, growing by about 200 TB (compressed) every month. The Wayback Machine, which provides access to IA's web collections, serves up to 10,000 requests per second.
- Start date: 1996
- Archive interface language: English
- Access methods: URL search, Topical collections
- Harvesting methods: Web-wide (exhaustive), Top Level Domain/National/Regional Domain, Bulk, Survey, Selective, Event, Thematic, End of Life, Site-Specific
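
As an illustration of the URL search access method, the sketch below builds the two kinds of address commonly used to look up a capture: a Wayback Machine replay URL and a query URL for the Wayback availability endpoint. The helper names `wayback_url` and `availability_query` are hypothetical, the URL patterns reflect the publicly visible Wayback Machine address scheme rather than anything stated in this entry, and no network request is made.

```python
from urllib.parse import urlencode

# Assumed public URL patterns; not taken from this entry.
WAYBACK_PREFIX = "https://web.archive.org/web"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def _check_timestamp(timestamp: str) -> None:
    """Accept 1 to 14 digits in YYYYMMDDhhmmss order (partial values allowed)."""
    if not (timestamp.isdigit() and 1 <= len(timestamp) <= 14):
        raise ValueError("timestamp must be 1-14 digits, e.g. '20060101'")


def wayback_url(url: str, timestamp: str) -> str:
    """Build a replay URL that requests the capture closest to `timestamp`."""
    _check_timestamp(timestamp)
    return f"{WAYBACK_PREFIX}/{timestamp}/{url}"


def availability_query(url: str, timestamp: str) -> str:
    """Build a query URL asking whether a capture exists near `timestamp`."""
    _check_timestamp(timestamp)
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': url, 'timestamp': timestamp})}"


if __name__ == "__main__":
    print(wayback_url("http://example.com/", "20060101"))
    print(availability_query("example.com", "20060101"))
```

Opening the first address in a browser should lead to the capture nearest the requested date, if one exists; the second is intended to be fetched by a script, which would then inspect the response for a matching snapshot.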