Bibliography

2001 – 2009

Analysing the Impact of File Formats on Data Integrity. (2008)

Authors: Heydegger, V.
Publication: Proceedings of Archiving 2008
Pages: 50-55
Date: 0 0 2008
Location: Bern, Switzerland
Type: Conference proceedings
Archived URL: https://web.archive.org/web/20111112183231/http://old.hki.uni-koeln.de/people/herrmann/forschung/heydegger_archiving2008_40.pdf

Archival preservation of web resources: HTML to XHTML Migration Test Technical Considerations, Evaluation, and Recommendations. (2001)

Authors: Dollar Consulting
Date: 0 0 2001
Publisher: Smithsonian Institution Archives
Type: Web document
URL: http://siarchives.si.edu/pdf/dollarrpt2.pdf

Archiving the Czech Web: Issues and Challenges. (2003)

Authors: Zabicka, P.
Publication: Presented at the 3rd ECDL Workshop on Web Archives.
Date: 0 0 2003
Location: Trondheim, Norway
Type: Conference proceedings
URL: http://bibnum.bnf.fr/ecdl/2003/proceedings.php?f=zabicka

Archiving the Deep Web. (2002)

Authors: Masanes, J.
Publication: 2nd ECDL Workshop on Web Archiving.
Date: 0 0 2002
Location: Rome, Italy.
Type: Conference proceedings
URL: http://bibnum.bnf.fr/ecdl/2002/BnF/BnF.html

Archiving the World Wide Web, Building a National Strategy for Digital Preservation: Issues in Digital Media Publishing. (2002)

Authors: Lyman, P.
Date: 0 0 2002
Publisher: Council on Library and Information Resources and the Library of Congress.
Type: Web document
URL: http://www.clir.org/pubs/reports/pub106/web.html

The Availability and Persistence of Web References in D-Lib Magazine. (2005)

Authors: McCown, F.
Publication: 5th International Web Archiving Workshop (IWAW05)
Date: 0 0 2005
Location: Vienna
Type: Conference proceedings
URL: http://iwaw.europarchive.org/05/papers/iwaw05-mccown1.pdf

Block-level Link Analysis (2004)

Authors: Wen, J., Ma, W., Cai, D. & He, X.
Publication: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Pages: 440-447
Date: 7 25 2004
Location: Sheffield, United Kingdom
Type: Conference proceedings
Archived URL: http://web.archive.org/web/20130810075517/http://research.microsoft.com/en-us/um/people/jrwen/jrwen_files/publications/block-level%20link%20analysis.pdf
DOI: 10.1145/1008992.1009068

Catch me if you can: Visual Analysis of Coherence Defects in Web Archiving. (2009)

Authors: Spaniol, M., Mazeika, A., Denev, D. & Weikum, G.
Publication: Proceedings of 9th International Web Archiving Workshop
Pages: 27-37
Date: 0 0 2009
Location: Corfu
Type: Conference proceedings
URL: http://www.iwaw.net/09/IWAW2009.pdf

Collecting and preserving the World Wide Web: A feasibility study undertaken for the JISC and Wellcome Trust (2003)

Authors: Day, M.
Date: 0 0 2003
Publisher: Wellcome Library
Type: Web document
URL: http://www.jisc.ac.uk/uploaded_documents/archiving_feasibility.pdf

Considerations for the Preservation of Blogs, Digital Preservation Europe briefing paper. (2009)

Authors: Hank, C., Sheble, L. & Choemprayong, S.
Date: 0 0 2009
Type: Web document
URL: http://www.digitalpreservationeurope.eu/publications/briefs/preservartion_blogs.pdf

The Continuing Metamorphosis of the Web (2009)

Authors: Spector, A. Z.
Publication: Keynote at WWW2009
Date: 4 24 2009
Publisher: Google, Inc.
Location: Madrid
Type: Conference proceedings
URL: http://www2009.eprints.org/214/1/www2009azsv4FinalV3.pdf

Data Management Projects at Google (2008)

Authors: Cafarella, M., Chang, E., Fikes, A., Halevy, A., Hsieh, W., Lerner, A., Madhavan, J. & Muthukrishnan, S.
Publication: SIGMOD Record
Volume: 37
Issue: 1
Pages: 34-38
Date: 0 0 2008
Type: Journal
URL: http://turing.cs.washington.edu/papers/dataprojects-google-sigmodrecord08.pdf

Data Quality in Web Archiving. (2009)

Authors: Spaniol, M., Mazeika, A., Denev, D., Weikum, G. & Senellart, P.
Publication: Proceedings of WICOW
Pages: 19-26
Date: 0 0 2009
Publisher: ACM Press
Type: Conference proceedings
URL: http://liwa-project.eu/images/publications/p19-spaniolA.pdf

Digital Preservation Strategy (2006)

Authors: British Library
Date: 0 0 2006
Type: Web document
Archived URL: http://web.archive.org/web/20120416221359/http://www.bl.uk/aboutus/stratpolprog/ccare/introduction/digital/digpresstrat.pdf

The Discoverability of the Web. (2007)

Authors: Dasgupta, A., Ghosh, A., Kumar, R., Olston, C., Pandey, S. & Tomkins, A.
Publication: Proceedings of the Sixteenth International World Wide Web Conference.
Pages: 421-430
Date: 0 0 2007
Location: Alberta, Canada
Type: Conference proceedings
URL: http://www2007.org/papers/paper592.pdf

Harvesting the Swedish web space (2001)

Authors: Arvidson, A.
Publication: Preserving online content for future generation: ECDL Workshop
Date: 0 0 2001
Location: Darmstadt, Germany
Type: Conference proceedings
URL: http://bibnum.bnf.fr/ecdl/2001/sweden/sld001.htm

IRIS Research: VRC Risk Management Resources. (2002)

Authors: Kenney, A. & McGovern, N.
Date: 0 0 2002
Publisher: Cornell University Library
Type: Web page
Archived URL: http://web.archive.org/web/20100823132834/http://irisresearch.library.cornell.edu/VRC/riskresources.html

Learning Block Importance Models for Web Pages (2004)

Authors: Song, R., Liu, H., Wen, J. & Ma, W.
Publication: Proc. 13th World Wide Web Conference
Date: 0 0 2004
Location: New York
Type: Conference proceedings
Archived URL: http://web.archive.org/web/20090220074359/http://research.microsoft.com/en-us/um/people/jrwen/jrwen_files/publications/BlockImportance.PDF

Legal issues relating to the archiving of Internet resources in the UK, EU, USA and Australia. (2003)

Authors: Charlesworth, A.
Publication: Version 1.0. Study for JISC and Wellcome Trust.
Date: 0 0 2003
Type: Article
URL: http://www.jisc.ac.uk/uploaded_documents/archiving_legal.pdf

The Long-Term Preservation of Web Content. (2006)

Authors: Day, M. & Masanes, J.
Publication: Web Archiving
Pages: 177-199
Date: 0 0 2006
Publisher: Springer-Verlag
Location: Berlin
Type: Book

Managing duplicates in a web archive. (2006)

Authors: Gomes, D., Santos, A. & Silva, M.
Publication: In Liebrock, L. (Ed.), Proceedings of the 21st Annual ACM Symposium on Applied Computing (ACM-SAC-06), Dijon, France.
Date: 0 0 2006
Type: Conference proceedings
URL: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.64.4047&rep=rep1&type=pdf

Modeling Object Characteristics of Dynamic Web Content (2003)

Authors: Shu, W., Collins, E. & Karamcheti, V.
Publication: Journal of Parallel and Distributed Computing
Volume: 63
Issue: 10
Pages: 963-980
Date: 10 0 2003
Type: Article
URL: http://www.cs.wayne.edu/~weisong/papers/jpdc03.pdf

NutchWAX Multilingualization

Authors: ASAHARA, Masayuki (in collaboration with National Diet Library, Japan)
Type: Web page
URL:
NutchWAX-0.12.9 Japanization
https://sites.google.com/site/masayua/m/nutch/nutchwax/nutchwax-0129-ja2
NutchWAX-0.12.9 Chinezation
https://sites.google.com/site/masayua/m/nutch/nutchwax/nutchwax-0129-zh2
NutchWAX-0.12.9 Koreanization
https://sites.google.com/site/masayua/m/nutch/nutchwax/nutchwax-0129-kr2

The POWR Handbook: Preservation of Web Resources Handbook. (2008)

Authors: JISC
Date: 0 0 2008
Publisher: University of London Computer Centre.
Type: Article
URL: http://jiscpowr.jiscinvolve.org/files/2008/11/powrhandbookv1.pdf

The Practice and Perception of Web Archiving in Academic Libraries and Archives. (Master’s thesis, University of North Carolina) (2009)

Authors: Gregory, L.
Date: 0 0 2009
Type: Dissertation Abstract
URL: http://ils.unc.edu/MSpapers/3480.pdf

Preservation Risk Management for Web Resources: Virtual Remote Control in Cornell’s Project Prism. (2002)

Authors: Kenney, A., McGovern, N., Botticelli, P., Entlich, R., Lagoze, C. & Payette, S.
Publication: D-Lib Magazine
Volume: 8
Issue: 1
Date: 0 0 2002
Type: Article
URL: http://www.dlib.org/dlib/january02/kenney/01kenney.html

Preserving Access to Government Websites: Development and Practice in the CyberCemetery. (2008)

Authors: Hoffman, S.
Date: 0 0 2008
Publisher: University of North Texas Libraries.
Type: Web document
URL: http://digital.library.unt.edu/ark:/67531/metadc67623/

Preserving Presidential Library Websites: A Case Study with the Franklin D. Roosevelt Library, Museum and Digital Archives. (2001)

Authors: Gupta, A.
Publication: San Diego Supercomputer Center Technical Report.
Date: 0 0 2001
Location: La Jolla, CA.
Type: Government publication
URL: http://legacy.sdsc.edu/TR/TR-2001-03.pdf

Preserving the Fabric of Our Lives: A Survey of Web Preservation Initiatives. (2003)

Authors: Day, M.
Publication: Proceedings of ECDL 2003
Pages: 461-472
Date: 0 0 2003
Type: Conference proceedings
URL: http://www.ukoln.ac.uk/metadata/presentations/ecdl2003-day/day-paper.pdf

Putting Risk Management into Practice. (1997)

Authors: Williams, R., Walker, J. & Dorofee, A.
Publication: IEEE Software
Volume: 14
Issue: 3
Pages: 75-82
Date: 0 0 1997
Type: Article
URL: http://scholar.google.com/scholar_url?hl=en&q=ftp://ftp.inf.ufrgs.br/pub/caino/Riscos/s3075sei.pdf&sa=X&scisig=AAGBfm3pBsua4LSwpW6FXpGtY5FvFyHKnQ&oi=scholarr

Requirements for Digital Preservation Systems. (2005)

Authors: Rosenthal, D., Robertson, T., Lipkis, T., Reich, V. & Morabito, S.
Publication: D-Lib Magazine
Date: 0 0 2005
Type: Article
URL: http://www.dlib.org/dlib/november05/rosenthal/11rosenthal.html

Research Challenges in Web Archiving, Recorded at the Archiving the Web – New Perspectives Workshop, Video (2009)

Authors: Masanes, J.
Date: 0 0 2009
Location: Sweden
Type: Web document
URL: http://www.kb.se/aktuellt/video/Archiving-the-Web/

The Significance of Storage in the ‘Cost of Risk’ of Digital Preservation. (2009)

Authors: Wright, R., Addis, M. & Miller, A.
Publication: International Journal of Digital Curation
Date: 0 0 2009
Volume: 4
Issue: 3
Pages: 104-122
URL: http://www.ijdc.net/index.php/ijdc/article/view/138

Summary of WWW Characterizations (1999)

Authors: Pitkow, J.
Publication: World Wide Web
Volume: 2
Issue: 1-2
Pages: 3-13
Date: 0 0 1999
Publisher: Kluwer Academic Publishers
Location: Hingham, MA, USA
Type: Article
URL: http://www2.parc.com/istl/groups/uir/publications/items/UIR-1998-19-Pitkow-WebJournal-Summary.pdf
DOI: 10.1023/A:1019284202914

Threats and outline of the Roadmap. (2009)

Authors: Giaretta, D.
Publication: Presented at CASPAR Training Day for the Cultural and Scientific Domains.
Date: 0 0 2009
Type: Web document
URL: http://www.parse-insight.eu/downloads/PARSEInsight_event200909_roadmap.pdf

Towards Web-scale Web Archaeology. (2001)

Authors: Leung, S., Perl, S., Stata, R. & Wiener, J.
Date: 0 0 2001
Publisher: Compaq Systems Research Center
Location: Palo Alto
Type: Web document
URL: http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-174.pdf

Transparent Format Migration of Preserved Web Content. (2005)

Authors: Rosenthal, D., Robertson, T., Lipkis, T. & Morabito, S.
Publication: D-Lib Magazine
Volume: 11
Issue: 1
Date: 0 0 2005
Type: Article
URL: http://www.dlib.org/dlib/january05/rosenthal/01rosenthal.html

Trustworthy Repositories Audit & Certification: Criteria and Checklist. Version 1.0. (2007)

Authors: OCLC & CRL
Date: 0 0 2007
Type: Web document
URL: http://www.crl.edu/sites/default/files/attachments/pages/trac_0.pdf

The Two Cultures: Mashing up Web 2.0 and the Semantic Web. (2007)

Authors: Ankolekar, A., Krotzsch, M., Tran, T. & Vrandecic, D.
Publication: Proceedings of the Sixteenth International World Wide Web Conference
Pages: 825-834
Date: 0 0 2007
Location: Alberta, Canada
Type: Conference proceedings
URL: http://www2007.org/papers/paper777.pdf

Virtual Remote Control: Building a Preservation Risk Management Toolbox for Web Resources. (2004)

Authors: Kenney, A., McGovern, N., Entlich, R., Kehoe, W. & Buckley, E.
Publication: D-Lib Magazine
Date: 0 0 2004
Type: Article
URL: http://www.dlib.org/dlib/april04/mcgovern/04mcgovern.html

The Weakest Link: Managing Risk Through Interdependent Strategies. (2009)

Authors: Kunreuther, H., Kleindorfer, P. & Wind, Y.
Publication: The Network Challenge: Strategy, Profit and Risk in an Interlinked World.
Date: 0 0 2009
Publisher: Wharton School Publishing.
Location: Upper Saddle River
Type: Article
URL: http://opim.wharton.upenn.edu/risk/library/C2009_HK_NetworkChallenge_ch22.pdf

Web Object Retrieval (2007)

Authors: Wen, J., Ma, W., Nie, Z., Ma, Y. & Shi, S.
Publication: Proceedings of the 16th international conference on World Wide Web
Pages: 81-90
Date: 5 0 2007
Publisher: ACM
Location: New York, NY
Type: Conference proceedings
URL: http://research.microsoft.com/en-us/um/people/znie/fp626-nie.pdf
DOI: 10.1145/1242572.1242584

Web Science: An Interdisciplinary Approach to Understanding the Web. (2008)

Authors: Hendler, J., Shadbolt, N., Hall, W., Berners-Lee, T. & Weitzner, D.
Publication: Communications of the ACM
Volume: 51
Issue: 7
Date: 0 0 2008
Type: Journal
URL: http://cacm.acm.org/magazines/2008/7/5366-web-science/fulltext

Web site archiving – an approach to recording every materially different response produced by a website. (2003)

Authors: Fitch, K.
Publication: Presentation at Ausweb 03, the Ninth Australian World Wide Web Conference, Hyatt Sanctuary Cove, Gold Coast.
Date: 0 0 2003
Type: Web document
URL: http://ausweb.scu.edu.au/aw03/papers/fitch/paper.html

Web Spam: a Survey with Vision for the Archivist (2008)

Authors: Benczur, A., Siklosi, D., Szabo, J., Biro, I., Fekete, Z., Kurucz, M., Pereszlenyi, A., Racz, S. & Szabo, A.
Publication: 8th International Web Archiving Workshop
Date: 0 0 2008
Location: Aarhus, Denmark
Type: Conference proceedings
URL: http://iwaw.net/08/IWAW2008-Benczur.pdf

Webpage Understanding: Beyond Page-level Search (2008)

Authors: Wen, J., Ma, W. & Nie, Z.
Publication: SIGMOD Record
Volume: 37
Issue: 4
Pages: 48-54
Date: 12 0 2008
Publisher: ACM
Location: New York, NY
Type: Article
URL: http://www.sigmod.org/publications/sigmod-record/0812/p048.special.nie.pdf/at_download/file
DOI: 10.1145/1519103.1519111

Why Websites are Lost (and How They’re Sometimes Found). (2008)

Authors: McCown, F., Marshall, C. & Nelson, M.
Publication: Preprint for Communications of the ACM
Date: 0 0 2008
Type: Journal
URL: http://www.harding.edu/fmccown/pubs/lost-website-survey-cacm-all-in-one.pdf

Workshop B: Archiving the Web. Presentation handout. Workshop: Archiving the Web. DCC. (2006)

Authors: Day, M. & Pennock, M.
Date: 0 0 2006
Type: Web document
URL: http://www.ukoln.ac.uk/preservation/presentations/2006/ark/web-archiving-handout.pdf