  • Web Archiving Bibliography

    Analysing the Impact of File Formats on Data Integrity. (2008) 

    Authors: Heydegger, V. 
    Publication: Proceedings of Archiving 2008 
    Pages: 50-55 
    Date: 2008
    Location: Bern, Switzerland 
    Type: Conference proceedings 
    URL: http://old.hki.uni-koeln.de/people/herrmann/forschung/heydegger_archiving2008_40.pdf 

    Archival preservation of web resources: HTML to XHTML Migration Test Technical Considerations, Evaluation, and Recommendations. (2001) 

    Authors: Dollar Consulting 
    Date: 2001
    Publisher: Smithsonian Institution Archives 
    Type: Web document 
    URL: http://siarchives.si.edu/pdf/dollarrpt2.pdf 

    Archiving the Czech Web: Issues and Challenges. (2003) 

    Authors: Zabicka, P. 
    Publication: Presented at the 3rd ECDL Workshop on Web Archives. 
    Date: 2003
    Location: Trondheim, Norway 
    Type: Conference proceedings 
    URL: http://bibnum.bnf.fr/ecdl/2003/proceedings.php?f=zabicka 

    Archiving the Deep Web. (2002) 

    Authors: Masanes, J. 
    Publication: 2nd ECDL Workshop on Web Archiving. 
    Date: 2002
    Location: Rome, Italy
    Type: Conference proceedings 
    URL: http://bibnum.bnf.fr/ecdl/2002/BnF/BnF.html 

    Archiving the World Wide Web, Building a National Strategy for Digital Preservation: Issues in Digital Media Publishing. (2002) 

    Authors: Lyman, P. 
    Date: 2002
    Publisher: Council on Library and Information Resources and the Library of Congress. 
    Type: Web document 
    URL: http://www.clir.org/pubs/reports/pub106/web.html 

    The Availability and Persistence of Web References in D-Lib Magazine. (2005) 

    Authors: McCown, F. 
    Publication: 5th International Web Archiving Workshop (IWAW05) 
    Date: 2005
    Location: Vienna 
    Type: Conference proceedings 
    URL: http://iwaw.europarchive.org/05/papers/iwaw05-mccown1.pdf 

    Block-level Link Analysis (2004) 

    Authors: Wen, J., Ma, W., Cai, D. & He, X. 
    Publication: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval 
    Pages: 440-447 
    Date: July 25, 2004
    Location: Sheffield, United Kingdom 
    Type: Conference proceedings 
    Archived URL: http://web.archive.org/web/20130810075517/http://research.microsoft.com/en-us/um/people/jrwen/jrwen_files/publications/block-level%20link%20analysis.pdf 
    DOI: 10.1145/1008992.1009068 

    Catch me if you can: Visual Analysis of Coherence Defects in Web Archiving. (2009) 

    Authors: Spaniol, M., Mazeika, A., Denev, D. & Weikum, G. 
    Publication: Proceedings of 9th International Web Archiving Workshop 
    Pages: 27-37 
    Date: 2009
    Location: Corfu 
    Type: Conference proceedings 
    URL: http://www.iwaw.net/09/IWAW2009.pdf 

    Collecting and preserving the World Wide Web: A feasibility study undertaken for the JISC and Wellcome Trust (2003) 

    Authors: Day, M.
    Date: 2003
    Publisher: Wellcome Library 
    Type: Web document 
    URL: http://www.jisc.ac.uk/uploaded_documents/archiving_feasibility.pdf 

    Considerations for the Preservation of Blogs, Digital Preservation Europe briefing paper. (2009) 

    Authors: Hank, C., Sheble, L. & Choemprayong, S. 
    Date: 2009
    Type: Web document 
    URL: http://www.digitalpreservationeurope.eu/publications/briefs/preservartion_blogs.pdf 

    The Continuing Metamorphosis of the Web (2009) 

    Authors: Spector, A. Z. 
    Publication: Keynote at WWW2009 
    Date: April 24, 2009
    Publisher: Google, Inc. 
    Location: Madrid 
    Type: Conference proceedings 
    URL: http://www2009.eprints.org/214/1/www2009azsv4FinalV3.pdf 

    Data Management Projects at Google (2008) 

    Authors: Cafarella, M., Chang, E., Fikes, A., Halevy, A., Hsieh, W., Lerner, A., Madhavan, J. & Muthukrishnan, S. 
    Publication: SIGMOD Record 
    Volume: 37 
    Issue: 1 
    Pages: 34-38 
    Date: 2008
    Type: Journal 
    URL: http://turing.cs.washington.edu/papers/dataprojects-google-sigmodrecord08.pdf 

    Data Quality in Web Archiving. (2009) 

    Authors: Spaniol, M., Mazeika, A., Denev, D., Weikum, G. & Senellart, P. 
    Publication: Proceedings of WICOW 
    Pages: 19-26
    Date: 2009
    Publisher: ACM Press 
    Type: Conference proceedings 
    URL: http://liwa-project.eu/images/publications/p19-spaniolA.pdf 

    Digital Preservation Strategy (2006) 

    The Discoverability of the Web. (2007) 

    Authors: Dasgupta, A., Ghosh, A., Kumar, R., Olston, C., Pandey, S. & Tomkins, A. 
    Publication: Proceedings of the Sixteenth International World Wide Web Conference. 
    Pages: 421-430 
    Date: 2007
    Location: Alberta, Canada 
    Type: Conference proceedings 
    URL: http://www2007.org/papers/paper592.pdf 

    Harvesting the Swedish web space (2001) 

    Authors: Arvidson, A. 
    Publication: Preserving online content for future generation: ECDL Workshop 
    Date: 2001
    Location: Darmstadt, Germany 
    Type: Conference proceedings 
    URL: http://bibnum.bnf.fr/ecdl/2001/sweden/sld001.htm 

    IRIS Research: VRC Risk Management Resources. (2002) 

    Authors: Kenney, A. & McGovern, N. 
    Date: 2002
    Publisher: Cornell University Library 
    Type: Web page 
    Archived URL: http://web.archive.org/web/20100823132834/http://irisresearch.library.cornell.edu/VRC/riskresources.html 

    Learning Block Importance Models for Web Pages (2004) 

    Authors: Song, R., Liu, H., Wen, J. & Ma, W. 
    Publication: Proc. 13th World Wide Web Conference 
    Date: 2004
    Location: New York 
    Type: Conference proceedings 
    Archived URL: http://web.archive.org/web/20090220074359/http://research.microsoft.com/en-us/um/people/jrwen/jrwen_files/publications/BlockImportance.PDF 

    Legal issues relating to the archiving of Internet resources in the UK, EU, USA and Australia. (2003) 

    Authors: Charlesworth, A. 
    Publication: Version 1.0. Study for JISC and Wellcome Trust. 
    Date: 2003
    Type: Article 
    URL: http://www.jisc.ac.uk/uploaded_documents/archiving_legal.pdf 

    The Long-Term Preservation of Web Content. (2006) 

    Authors: Day, M. & Masanes, J.
    Publication: Web Archiving 
    Pages: 177-199 
    Date: 2006
    Publisher: Springer-Verlag 
    Location: Berlin 
    Type: Book 

    Managing duplicates in a web archive. (2006) 

    Authors: Gomes, D., Santos, A. & Silva, M. 
    Publication: In Liebrock, L. (Ed.), Proceedings of the 21st Annual ACM Symposium on Applied Computing (ACM-SAC-06), Dijon, France.
    Date: 2006
    Type: Conference proceedings 
    URL: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.64.4047&rep=rep1&type=pdf 

    Modeling Object Characteristics of Dynamic Web Content (2003) 

    Authors: Shu, W., Collins, E. & Karamcheti, V. 
    Publication: Journal of Parallel and Distributed Computing 
    Volume: 63 
    Issue: 10 
    Pages: 963-980 
    Date: October 2003
    Type: Article 
    URL: http://www.cs.wayne.edu/~weisong/papers/jpdc03.pdf 

    NutchWAX Multilingualization 

    Authors: ASAHARA, Masayuki (in collaboration with National Diet Library, Japan)
    Type: Web page 
    URLs:
    NutchWAX-0.12.9 Japanization: https://sites.google.com/site/masayua/m/nutch/nutchwax/nutchwax-0129-ja2
    NutchWAX-0.12.9 Chinezation: https://sites.google.com/site/masayua/m/nutch/nutchwax/nutchwax-0129-zh2
    NutchWAX-0.12.9 Koreanization: https://sites.google.com/site/masayua/m/nutch/nutchwax/nutchwax-0129-kr2

    The POWR Handbook: Preservation of Web Resources Handbook. (2008) 

    Authors: JISC 
    Date: 2008
    Publisher: University of London Computer Centre.
    Type: Article 
    URL: http://jiscpowr.jiscinvolve.org/files/2008/11/powrhandbookv1.pdf 

    The Practice and Perception of Web Archiving in Academic Libraries and Archives. (Master's thesis, University of North Carolina) (2009)

    Authors: Gregory, L. 
    Date: 2009
    Type: Dissertation Abstract 
    URL: http://ils.unc.edu/MSpapers/3480.pdf 

    Preservation Risk Management for Web Resources: Virtual Remote Control in Cornell's Project Prism. (2002) 

    Authors: Kenney, A., McGovern, N., Botticelli, P., Entlich, R., Lagoze, C. & Payette, S. 
    Publication: D-Lib Magazine 
    Volume: 8 
    Issue: 1 
    Date: 2002
    Type: Article 
    URL: http://www.dlib.org/dlib/january02/kenney/01kenney.html 

    Preserving Access to Government Websites: Development and Practice in the CyberCemetery. (2008) 

    Authors: Hoffman, S. 
    Date: 2008
    Publisher: University of North Texas Libraries. 
    Type: Web document 
    URL: http://digital.library.unt.edu/ark:/67531/metadc67623/ 

    Preserving Presidential Library Websites: A Case Study with the Franklin D. Roosevelt Library, Museum and Digital Archives. (2001) 

    Authors: Gupta, A. 
    Publication: San Diego Supercomputer Center Technical Report. 
    Date: 2001
    Location: La Jolla, CA
    Type: Government publication 
    URL: http://legacy.sdsc.edu/TR/TR-2001-03.pdf 

    Preserving the Fabric of Our Lives: A Survey of Web Preservation Initiatives. (2003) 

    Authors: Day, M.
    Publication: Proceedings of ECDL 2003 
    Pages: 461-472 
    Date: 2003
    Type: Conference proceedings 
    URL: http://www.ukoln.ac.uk/metadata/presentations/ecdl2003-day/day-paper.pdf 

    Putting Risk Management into Practice. (1997) 

    Authors: Williams, R., Walker, J. & Dorofee, A. 
    Publication: IEEE Software 
    Volume: 14 
    Issue: 3 
    Pages: 75-82 
    Date: 1997
    Type: Article 
    URL: http://scholar.google.com/scholar_url?hl=en&q=ftp://ftp.inf.ufrgs.br/pub/caino/Riscos/s3075sei.pdf&sa=X&scisig=AAGBfm3pBsua4LSwpW6FXpGtY5FvFyHKnQ&oi=scholarr 

    Requirements for Digital Preservation Systems. (2005) 

    Authors: Rosenthal, D., Robertson, T., Lipkis, T., Reich, V. & Morabito, S. 
    Publication: D-Lib Magazine 
    Date: 2005
    Type: Article 
    URL: http://www.dlib.org/dlib/november05/rosenthal/11rosenthal.html 

    Research Challenges in Web Archiving, Recorded at the Archiving the Web - New Perspectives Workshop, Video (2009) 

    Authors: Masanes, J. 
    Date: 2009
    Location: Sweden 
    Type: Web document 
    URL: http://www.kb.se/aktuellt/video/Archiving-the-Web/ 

    The Significance of Storage in the 'Cost of Risk' of Digital Preservation. (2009)

    Authors: Wright, R., Addis, M. & Miller, A. 
    Publication: International Journal of Digital Curation
    Date: 2009
    Volume: 4 
    Issue: 3 
    Pages: 104-122 
    URL: http://www.ijdc.net/index.php/ijdc/article/view/138 

    Summary of WWW Characterizations (1999) 

    Authors: Pitkow, J. 
    Publication: World Wide Web 
    Volume: 2 
    Issue: 1-2 
    Pages: 3-13 
    Date: 1999
    Publisher: Kluwer Academic Publishers 
    Location: Hingham, MA, USA 
    Type: Article 
    URL: http://www2.parc.com/istl/groups/uir/publications/items/UIR-1998-19-Pitkow-WebJournal-Summary.pdf 
    DOI: 10.1023/A:1019284202914 

    Threats and outline of the Roadmap. (2009) 

    Authors: Giaretta, D. 
    Publication: Presented at CASPAR Training Day for the Cultural and Scientific Domains. 
    Date: 2009
    Type: Web document 
    URL: http://www.parse-insight.eu/downloads/PARSEInsight_event200909_roadmap.pdf 

    Towards Web-scale Web Archaeology. (2001) 

    Authors: Leung, S., Perl, S., Stata, R. & Wiener, J. 
    Date: 2001
    Publisher: Compaq Systems Research Center 
    Location: Palo Alto 
    Type: Web document 
    URL: http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-174.pdf 

    Transparent Format Migration of Preserved Web Content. (2005) 

    Authors: Rosenthal, D., Robertson, T., Lipkis, T. & Morabito, S. 
    Publication: D-Lib Magazine 
    Volume: 11 
    Issue: 1 
    Date: 2005
    Type: Article 
    URL: http://www.dlib.org/dlib/january05/rosenthal/01rosenthal.html 

    Trustworthy Repositories Audit & Certification: Criteria and Checklist. Version 1.0. (2007) 

    Authors: OCLC & CRL 
    Date: 2007
    Type: Web document 
    URL: http://www.crl.edu/sites/default/files/attachments/pages/trac_0.pdf 

    The Two Cultures: Mashing up Web 2.0 and the Semantic Web. (2007) 

    Authors: Ankolekar, A., Krotzsch, M., Tran, T. & Vrandecic, D.
    Publication: Proceedings of the Sixteenth International World Wide Web Conference 
    Pages: 825-834 
    Date: 2007
    Location: Alberta, Canada 
    Type: Conference proceedings 
    URL: http://www2007.org/papers/paper777.pdf 

    Virtual Remote Control: Building a Preservation Risk Management Toolbox for Web Resources. (2004) 

    Authors: Kenney, A., McGovern, N., Entlich, R., Kehoe, W. & Buckley, E. 
    Publication: D-Lib Magazine 
    Date: 2004
    Type: Article 
    URL: http://www.dlib.org/dlib/april04/mcgovern/04mcgovern.html 

    The Weakest Link: Managing Risk Through Interdependent Strategies. (2009) 

    Authors: Kunreuther, H., Kleindorfer, P. & Wind, Y. 
    Publication: The Network Challenge: Strategy, Profit and Risk in an Interlinked World. 
    Date: 2009
    Publisher: Wharton School Publishing. 
    Location: Upper Saddle River 
    Type: Article 
    URL: http://opim.wharton.upenn.edu/risk/library/C2009_HK_NetworkChallenge_ch22.pdf 

    Web Object Retrieval (2007) 

    Authors: Wen, J., Ma, W., Nie, Z., Ma, Y. & Shi, S. 
    Publication: Proceedings of the 16th international conference on World Wide Web 
    Pages: 81-90 
    Date: May 2007
    Publisher: ACM 
    Location: New York, NY 
    Type: Conference proceedings 
    URL: http://research.microsoft.com/en-us/um/people/znie/fp626-nie.pdf 
    DOI: 10.1145/1242572.1242584 

    Web Science: An Interdisciplinary Approach to Understanding the Web. (2008) 

    Authors: Hendler, J., Shadbolt, N., Hall, W., Berners-Lee, T. & Weitzner, D. 
    Publication: Communications of the ACM 
    Volume: 51 
    Issue: 7 
    Date: 2008
    Type: Journal 
    URL: http://cacm.acm.org/magazines/2008/7/5366-web-science/fulltext 

    Web site archiving - an approach to recording every materially different response produced by a website. (2003) 

    Authors: Fitch, K. 
    Publication: Presentation at Ausweb 03, the Ninth Australian World Wide Web Conference, Hyatt Sanctuary Cove, Gold Coast.
    Date: 2003
    Type: Web document 
    URL: http://ausweb.scu.edu.au/aw03/papers/fitch/paper.html 

    Web Spam: a Survey with Vision for the Archivist (2008) 

    Authors: Benczur, A., Siklosi, D., Szabo, J., Biro, I., Fekete, Z., Kurucz, M., Pereszlenyi, A., Racz, S. & Szabo, A. 
    Publication: 8th International Web Archiving Workshop 
    Date: 2008
    Location: Aarhus, Denmark
    Type: Conference proceedings 
    URL: http://iwaw.net/08/IWAW2008-Benczur.pdf 

    Webpage Understanding: Beyond Page-level Search (2008) 

    Authors: Wen, J., Ma, W. & Nie, Z. 
    Publication: SIGMOD Record 
    Volume: 37 
    Issue: 4 
    Pages: 48-54 
    Date: December 2008
    Publisher: ACM 
    Location: New York, NY 
    Type: Article 
    URL: http://www.sigmod.org/publications/sigmod-record/0812/p048.special.nie.pdf/at_download/file 
    DOI: 10.1145/1519103.1519111 

    Why Websites are Lost (and How They're Sometimes Found). (2008) 

    Authors: McCown, F., Marshall, C. & Nelson, M. 
    Publication: Preprint for Communications of the ACM 
    Date: 2008
    Type: Journal 
    URL: http://www.harding.edu/fmccown/pubs/lost-website-survey-cacm-all-in-one.pdf 

    Workshop B: Archiving the Web. Presentation handout. Workshop: Archiving the Web. DCC. (2006) 

    Authors: Day, M. & Pennock, M.
    Date: 2006
    Type: Web document 
    URL: http://www.ukoln.ac.uk/preservation/presentations/2006/ark/web-archiving-handout.pdf