Digital Records and Digital Archives: Preservation in theory and practice
Richard Davis
Digital Archives Department, University of London Computer Centre
http://www.ulcc.ac.uk
http://ndad.ulcc.ac.uk
[email_address]

What we will cover
  • Preservation options
  • Basic tasks
  • Physical preservation
  • Logical preservation
  • Metadata
  • Organisational issues

Why do they need special attention?
  • Digital records require an intermediary
  • They don’t have a fixed form
  • Their carriers are perishable
  • They fall outside records management regimes
  • Computers need “experts”

What are the advantages of digital preservation?
  • Ease of copying
  • Ease of re-use
  • No worry about which is “original”
  • Takes up less space
  • Easy to search, even without a catalogue

Assumptions
  • You know what digital records exist
  • You know what you want to preserve
  • You have a retention/disposal policy
  • You can separate material for preservation
  • You know what you want to do with it

Preservation options
  • Preserve the bits
  • Preserve the data
  • Preserve the record
  • Preserve the experience

Preserving the bits
  • Keep the data in exactly the same format
  • Interpretation likely to be a problem
  • Works in some contexts
  • Mostly useful as an adjunct to other strategies
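
Preserving the bits only works if you can show that a copy really is identical to the original. A minimal sketch of one common way to check this, comparing SHA-256 digests, follows. The function names `sha256_of` and `is_bitwise_copy` are illustrative assumptions, not something the slides prescribe.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_bitwise_copy(original: Path, copy: Path) -> bool:
    """True when both files have the same SHA-256 digest."""
    return sha256_of(original) == sha256_of(copy)
```

Storing the digest alongside the file lets the same check be repeated later without access to the original carrier.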

Preserving the data
  • Keep the essential data in generic form
  • Don’t worry about presentation and context
  • Better than nothing
  • Often used for databases
  • Reduces long-term utility

Preserving the record
  • Keep the information and context
  • The ideal approach
  • Don’t necessarily preserve appearance
  • Balances utility against costs

Preserving the experience
  • Keep everything - software, information, forms, etc
  • May require emulators or old computers
  • Expensive
  • Doesn’t support/promote re-use
  • Someone may do it - but not us!

Basic tasks in digital preservation
  • Protecting the media
  • Copying to new media
  • Choosing a file format
  • Migrating to new file formats
  • Managing metadata

Physical forms
  • Floppy disks
  • Open-reel tapes
  • Tape cartridges
  • Hard disks
  • CD-ROM
  • ZIP, JAZ, etc. disks
  • Punched cards, paper tape

Media lifespans

Refreshing media
  • The process of copying to new media
  • At end of predicted lifetime
  • At regular intervals
  • After detected failure
  • Lifetime may be number of uses, not interval
  • Maybe the same, maybe different
  • Check all copies
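
The refresh triggers listed above can be sketched as a simple rule check. This is an illustration only: the `MediaCopy` fields, the day-based thresholds and the reason strings are all assumptions made for the example, and real lifetimes depend on the medium and how it is stored.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class MediaCopy:
    """Hypothetical bookkeeping record for one physical copy."""
    label: str
    written_on: date
    predicted_lifetime_days: int
    refresh_interval_days: int
    use_count: int = 0
    max_uses: Optional[int] = None  # None when lifetime is not use-based
    failure_detected: bool = False


def refresh_reasons(copy: MediaCopy, today: date) -> List[str]:
    """Return every reason this copy is due for refreshing (empty if none)."""
    reasons = []
    age_days = (today - copy.written_on).days
    if age_days >= copy.predicted_lifetime_days:
        reasons.append("end of predicted lifetime")
    if age_days >= copy.refresh_interval_days:
        reasons.append("regular interval reached")
    if copy.failure_detected:
        reasons.append("failure detected")
    if copy.max_uses is not None and copy.use_count >= copy.max_uses:
        reasons.append("use limit reached")
    return reasons
```

Running the check over every copy, not just the working one, reflects the "check all copies" point.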

Logical preservation
  • Selecting right file format
  • At time of creation or accession
  • No universal solution
  • Preservation format may be different from access format
  • Should include metadata

Properties of preservation formats
  • Published standard
  • Stability
  • Good conversion from ingest formats
  • Good conversion to access formats
  • Good representation of structure of information

Long-term storage
  • Documents: Plain text, PDF, XML
  • Data tables (DBMS or spreadsheet): CSV, SQL Schema, XML Schema
  • Pictures: TIFF
  • Sound: PCM, AIFF
  • Avoid lossy compression
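
As an illustration of the data-table case, the sketch below exports one table as a SQL schema plus CSV. It assumes the source is an SQLite database, chosen only because Python ships with a driver for it; `export_table` is a hypothetical helper, and other database systems expose their schema differently.

```python
import csv
import io
import sqlite3
from typing import Tuple


def export_table(conn: sqlite3.Connection, table: str) -> Tuple[str, str]:
    """Return (schema_sql, csv_text) for one table of an SQLite database."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if row is None:
        raise ValueError(f"no such table: {table}")
    schema_sql = row[0]

    # The table name was confirmed to exist above; quote it as an identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute(f"SELECT * FROM {quoted}")

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])  # header row
    writer.writerows(cursor)
    return schema_sql, buffer.getvalue()
```

Keeping the schema next to the CSV preserves column types and constraints that CSV alone cannot express.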

Capturing the record
  • Manual
      • Users must choose what is retained
      • User-driven conversion
  • Automatic
      • System forces capture of record copy
      • Triggers conversion to preservation format
  • Retrospective archiving is manual, by definition
  • ERMS should support automated capture

Automated capture
  • Email: central server captures and indexes
  • Documents: EDMS
  • Databases: capture transaction logs and/or regular snapshots
  • Web sites: as databases
  • Custom applications: specify requirements ab initio
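
For the database case, a regular snapshot can be as simple as a scheduled copy of the live database. The sketch below assumes an SQLite database and uses the `Connection.backup` method from Python's standard `sqlite3` module (Python 3.7 or later); the `snapshot` function and the idea of calling it from a scheduler are illustrative assumptions.

```python
import sqlite3


def snapshot(live: sqlite3.Connection, snapshot_path: str) -> None:
    """Copy the current committed state of `live` into a new database file."""
    target = sqlite3.connect(snapshot_path)
    try:
        # backup() copies the whole database page by page.
        live.backup(target)
    finally:
        target.close()
```

Each snapshot is an independent file, so later changes to the live database do not alter what was captured.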

Migration
  • Frequency is not predictable
  • Usually driven by external factors
      • Changes in IS/IT strategy
      • Software/hardware upgrades
  • Should be automated
  • Check migration does not lose information
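
One way to check that a migration has not lost information is to read the data back from the new format and compare it with the source. The sketch below assumes a made-up legacy format (one record per line, fields separated by `|`) migrated to CSV; the format and all function names are hypothetical.

```python
import csv
import io
from typing import List


def parse_pipe_delimited(text: str) -> List[List[str]]:
    """Parse the assumed legacy format: one record per line, '|' between fields."""
    return [line.split("|") for line in text.splitlines() if line]


def migrate_to_csv(text: str) -> str:
    """Convert legacy pipe-delimited text to CSV."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parse_pipe_delimited(text))
    return buffer.getvalue()


def migration_is_lossless(source_text: str, migrated_csv: str) -> bool:
    """Compare the records read from the source and from the migrated copy."""
    before = parse_pipe_delimited(source_text)
    after = list(csv.reader(io.StringIO(migrated_csv)))
    return before == after
```

The comparison is on parsed records rather than raw bytes, because a successful migration changes the bytes by design.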

Metadata
  • Data about data
  • Not specific to digital records
  • Types of metadata:
      • Discovery
      • Access
      • Preservation
      • System
  • Embedded or external
  • Treat with same care as data itself!

Typical metadata
  • Author
  • Subject
  • Keywords
  • Abstract
  • Dates of creation/use/retirement
  • Access conditions
  • Retention period
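
The typical fields above could be held as external metadata in a small sidecar file next to the record. The sketch below uses JSON purely for illustration; the field names, the ISO 8601 date strings and the `RecordMetadata` class are assumptions, not an established schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class RecordMetadata:
    """Hypothetical external metadata record for one digital object."""
    author: str
    subject: str
    keywords: List[str] = field(default_factory=list)
    abstract: str = ""
    date_created: Optional[str] = None  # assumed ISO 8601, e.g. "1999-03-01"
    date_retired: Optional[str] = None
    access_conditions: str = ""
    retention_period_years: Optional[int] = None


def to_sidecar_json(metadata: RecordMetadata) -> str:
    """Serialise the metadata with stable key order for easy comparison."""
    return json.dumps(asdict(metadata), indent=2, sort_keys=True, ensure_ascii=False)


def from_sidecar_json(text: str) -> RecordMetadata:
    """Rebuild the metadata record from its JSON form."""
    return RecordMetadata(**json.loads(text))
```

Because the sidecar is plain text, it can be checksummed, copied and migrated with the same care as the data itself.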

Non-digital metadata
  • Most computer systems depend on paper records to be understood:
      • Specifications
      • Manuals
      • Reports
  • Some essential information may only be in people’s heads
  • Especially true for older systems/records

Non-digital metadata

Preservation and access
  • Preservation systems:
      • Keep information safe and secure
      • Control accessibility
      • Deliver data without interpretation
  • Access systems:
      • Mediate between user and preservation system
      • Format, select and present information
      • Enable user discovery of resources
      • Relate information to context
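
The division of labour between the two kinds of system can be sketched as two small classes. This is a toy model under stated assumptions: the class names, the in-memory storage, the title-only catalogue and the UTF-8 decoding are all invented for the example.

```python
from typing import Dict, List, Tuple


class PreservationStore:
    """Keeps byte streams safe and controls who may retrieve them."""

    def __init__(self) -> None:
        self._objects: Dict[str, Tuple[bytes, bool]] = {}

    def deposit(self, record_id: str, data: bytes, open_access: bool) -> None:
        self._objects[record_id] = (data, open_access)

    def retrieve(self, record_id: str, authorised: bool = False) -> bytes:
        data, open_access = self._objects[record_id]
        if not (open_access or authorised):
            raise PermissionError(f"{record_id} is closed")
        return data  # delivered exactly as stored, with no interpretation


class AccessService:
    """Mediates between the user and the preservation store."""

    def __init__(self, store: PreservationStore, catalogue: Dict[str, str]) -> None:
        self._store = store
        self._catalogue = catalogue  # record_id -> title

    def discover(self, term: str) -> List[str]:
        """Return the ids of records whose title contains the search term."""
        term = term.lower()
        return sorted(
            rid for rid, title in self._catalogue.items() if term in title.lower()
        )

    def present(self, record_id: str, authorised: bool = False) -> str:
        """Interpret the stored bytes as text and show them with their title."""
        raw = self._store.retrieve(record_id, authorised)
        return f"{self._catalogue[record_id]}\n{raw.decode('utf-8')}"
```

Only the access layer knows how to interpret and present the bytes, so presentation can change without touching what is preserved.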

Working with IT departments
  • Style of IT support depends on size/age/type of organisation
  • Central control is easier to work with
  • Try to be involved before records are created
  • Express needs/issues in clear, real-world terms
  • IT developers like simple, reusable formats as well

Hints and tips
  • Databases: may have different views
  • Beware of …
      • Password-protected files
      • Automated dates in documents
      • Dynamic documents
      • Linked documents
      • Embedded objects
      • Hybrid assemblies

Hybrid assemblies and embedded objects

And finally…
  • For now: preserve original bitwise copies and use standard formats
  • Don’t wait for all the answers before you begin
  • Make friends with IT specialists
  • Learn about other initiatives and approaches
  • Remember your Records Management training: digital isn’t that different
