Invited Demo: Prometheus: Managing the Ingest of Media Carriers

Nicholas del Pozo, Douglas Elford and David Pearson
Digital Preservation, National Library of Australia
Parkes Place, ACT 2600 Australia
ndelpozo@nla.gov.au, delford@nla.gov.au, dapearso@nla.gov.au

This work is licensed under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported license. You are free to share this work (copy, distribute and transmit) under the following conditions: attribution, non-commercial, and no derivative works. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/.
DigCCurr2009, April 1-3, 2009, Chapel Hill, NC, USA

ABSTRACT
The National Library of Australia has a relatively small but important collection of digital material stored on common carriers such as floppy disks, CDs and DVDs. This includes both published material and unpublished manuscripts in digital form. In the past, preservation of the Library’s physical format digital collection has been taken care of manually, on a case-by-case basis, but this approach is insufficient to deal effectively with the increasing volume of material requiring preservation.

The Library has produced an application called Prometheus, which provides a semi-automated, scalable process for transferring data from carriers to preservation-managed digital storage. This is helping the Library to mitigate the major risks associated with storing the content on physical carriers: deterioration of the media and obsolescence of the hardware required to access them. Prometheus makes it easier to process the majority of carriers commonly encountered in the Library and to collect and manage metadata about their content. Although not perfect, Prometheus is helping the Library to save digital content before it is too late.

Keywords
Digital preservation, media carriers, National Library of Australia, obsolescence, open source software, Prometheus.

1. INTRODUCTION
The National Library of Australia has a relatively small but important collection of digital material stored on common carriers such as floppy disks, CDs and DVDs. This includes both published material and unpublished manuscripts in digital form. In the past, preservation of the Library’s physical format digital collection has been taken care of manually, on a case-by-case basis, but this approach is insufficient to deal effectively with the increasing volume of material requiring preservation.

The Library collects digital material through multiple acquisition streams and generally has little control over the physical format in which the material arrives. So, while most items fall into a small number of widely used carrier types, any long-term solution has to make provision for almost any kind of carrier, including carrier types which may not have been encountered yet. Moreover, this is a constantly growing problem; if we don’t deal with the digital materials that we have already collected, and ideally process new materials as a part of the acquisition process, accessing these carriers will soon become unmanageable, and eventually impossible.

Factors such as obsolescence and carrier degradation already make it difficult for digital preservation solutions to preserve access to digital content. Additionally, because of the potential volume and diversity of carriers and file formats, unless solutions are robust and semi-automated, digital data that could be preserved today may not be. To avoid exacerbating the problem, it is essential that solutions deal with current common carrier types as efficiently as possible, while providing access to, or a mechanism for preserving, as many older carriers as is practical.

2. PROMETHEUS
To ensure access to digital content on the most common carriers within the Library, the Digital Preservation Workflow Project produced an application called Prometheus. This application provides a semi-automated, scalable process for transferring data from carriers to preservation-managed digital storage. This is helping the Library to mitigate the major risks associated with storing the content on physical carriers: deterioration of the media and obsolescence of the technology required to access them. Prometheus makes it easier to process the majority of carriers commonly encountered in the Library and to collect and manage metadata about their content. It also provides mechanisms to accommodate special cases, such as less common media types. Additionally, the original physical arrangement of a group of media can be recorded, even in those cases where a piece of physical media cannot be processed.

Prometheus allows Library staff to link to catalogue records, create a byte-level image of the digital content, and transfer it to preservation-managed digital storage. Once the content is copied from the carrier, the integrity of the image is verified, and as much metadata as possible is harvested. Attaching a customisable ‘mini-jukebox’ (Figure 1) to a staff member’s workstation allows the accurate duplication of the content from a wider range of carrier types, such as USB thumb drives, memory cards or 3½ inch floppy disks. It also provides more reliable hardware for imaging CDs and DVDs. The digital preservation section can use Prometheus to deal with carrier types that fall outside this range, such as 5¼ inch floppy disks, SyQuest disks or hard drives.
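
The imaging and verification steps described above can be sketched in a few lines of Python. This is a minimal illustration only, not the Prometheus implementation: it copies an ordinary file block by block as a stand-in for reading a carrier, records a checksum of the bytes read, and then re-reads the stored image to confirm that it matches. The function names (image_carrier, checksum_file, verify_image), the block size and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import os
import tempfile

BLOCK_SIZE = 64 * 1024  # bytes per read; an arbitrary choice for this sketch


def image_carrier(source_path, image_path, algorithm="sha256"):
    """Copy source_path to image_path byte for byte.

    Returns the hex digest of the bytes as they were read from the source,
    so the stored image can be checked against it afterwards.
    """
    digest = hashlib.new(algorithm)
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            block = src.read(BLOCK_SIZE)
            if not block:
                break
            digest.update(block)
            dst.write(block)
    return digest.hexdigest()


def checksum_file(path, algorithm="sha256"):
    """Return the hex digest of the file at path."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(BLOCK_SIZE), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_image(image_path, expected_digest, algorithm="sha256"):
    """True if the stored image still matches the digest recorded at imaging time."""
    return checksum_file(image_path, algorithm) == expected_digest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical stand-in for a carrier: a file of random bytes.
        carrier = os.path.join(workdir, "carrier.bin")
        image = os.path.join(workdir, "carrier.img")
        with open(carrier, "wb") as f:
            f.write(os.urandom(200_000))

        recorded = image_carrier(carrier, image)
        print("image verified:", verify_image(image, recorded))
```

Recording the digest while the carrier is being read, rather than afterwards, means a fragile carrier only has to be read once; later checks touch only the stored image.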




Figure 1. Library developer Snezana Mihajlovic uses a customised ‘mini-jukebox’ attached to a standard Library workstation (Photo: Douglas Elford, National Library).
The system incorporates a range of open source tools to undertake processing, including carrier imaging (dd [1], cdrdao [2]); integrity calculation and checking (Jacksum [3]); file identification (DROID [4]); and metadata extraction (JHOVE [5], NLNZ Metadata Extraction Tool [6]). These tools are deployed using Java-based web services. Moreover, Prometheus has been designed in a modular way, so that tools and services can be easily upgraded or replaced as new versions are released or better software becomes available (Figure 2).
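
The modular design described above can be illustrated with a small sketch. It is not taken from the Prometheus code base, which deploys its tools as Java-based web services; it only shows the general idea of registering each processing step under a role so that one tool can be replaced without touching the others. The Pipeline class, the role names and the two placeholder steps are hypothetical.

```python
from typing import Callable, Dict, List

# A step takes the path of a carrier image and returns a dict of findings.
Step = Callable[[str], Dict[str, str]]


class Pipeline:
    """Runs named steps in registration order; any step can be swapped by role."""

    def __init__(self) -> None:
        self._steps: Dict[str, Step] = {}
        self._order: List[str] = []

    def register(self, role: str, step: Step) -> None:
        """Add a step, or replace the step already registered for this role."""
        if role not in self._steps:
            self._order.append(role)
        self._steps[role] = step

    def run(self, image_path: str) -> Dict[str, Dict[str, str]]:
        """Apply every registered step to the image and collect the results by role."""
        return {role: self._steps[role](image_path) for role in self._order}


def identify_by_extension(image_path: str) -> Dict[str, str]:
    # Placeholder for a real file identification tool: guesses from the file name only.
    ext = image_path.rsplit(".", 1)[-1].lower() if "." in image_path else ""
    known = {"iso": "ISO 9660 image", "img": "raw disk image"}
    return {"format": known.get(ext, "unknown")}


def record_source(image_path: str) -> Dict[str, str]:
    # Placeholder for a real metadata extraction tool.
    return {"source": image_path}


if __name__ == "__main__":
    pipeline = Pipeline()
    pipeline.register("identification", identify_by_extension)
    pipeline.register("metadata", record_source)
    print(pipeline.run("disk001.iso"))

    # Swapping in a different identification step leaves the other steps untouched.
    pipeline.register("identification", lambda path: {"format": "unknown"})
    print(pipeline.run("disk001.iso"))
```

Because callers only refer to roles, replacing a tool is a single call to register; the order of processing and the remaining steps are unchanged.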

Figure 2. General Process View.

3. THE SOFTWARE RELEASED
Prometheus was designed for the Library’s specific environment, and therefore is not an ‘out of the box’ solution. However, it may be possible for other parties to use all or some of the requirements, other documentation or components. For this reason, the software has been released under the GNU General Public License V3.0. The latest version of Prometheus and its documentation are available from the project website [7]. A paper was presented on this project at the IFLA World Library and Information Congress in Quebec City, Canada, in August 2008 [8].

If we wait for the perfect system to be built, for the content on many carriers it will already be too late. Experience to date suggests that even though we all share the same fundamental problem, the sheer volume and diversity of carriers, as well as varying individual collecting and business environments, make it unlikely that there will ever be a single software solution that can be used by everyone. At least for the Library, Prometheus provides a starting point to manage the ingest of, and preserve content from, problematic and sometimes idiosyncratic carriers for long-term preservation, in a way that we hope can benefit others.

This paper is based on an earlier paper that appeared in Gateways Dec 2008 [9].

4. ACKNOWLEDGMENTS
Our thanks to Gerard Clifton, Snezana Mihajlovic and Joseph Mok, who worked with us on version 1.0 of Prometheus, and who continue with the development work for version 1.4.

5. REFERENCES
[1] dd for Windows, at http://www.chrysocome.net/dd
[2] cdrdao, at http://cdrdao.sourceforge.net/
[3] Jacksum Java checksum utility, at http://sourceforge.net/projects/jacksum/
[4] DROID automatic file format identification tool, at http://droid.sourceforge.net/wiki/index.php/Introduction
[5] JHOVE object validation environment, at http://hul.harvard.edu/jhove/
[6] National Library of New Zealand Metadata Extraction Tool, at http://meta-extractor.sourceforge.net/
[7] Prometheus Sourceforge Website, at http://prometheus-digi.sourceforge.net/
[8] Elford, D., del Pozo, N., Mihajlovic, S., Pearson, D., Clifton, G. and Webb, C. 2008. Media Matters: developing processes for preserving digital objects on physical carriers at the National Library of Australia. In World Library and Information Congress: 74th IFLA General Conference and Council, 10-14 August 2008, Québec, Canada. www.ifla.org/IV/ifla74/papers/084-Webb-en.pdf.
[9] Pearson, D. 2008. Titans in the Library: Prometheus Unbinds At-risk Data. In Gateways Dec 2008. http://www.nla.gov.au/pub/gateways/issues/96/story02.htm.




