2011 DLF Forum

Digital Preservation Cloud Services for Libraries and Archives

Constellation E: Monday, October 31, 10:30 – 11:15AM

The volume of digital assets, whether born digital or digitized from analog and paper artifacts, is growing rapidly. Unlike companies, which are required to retain their records for a relatively short period of time to comply with the Sarbanes-Oxley Act, national archives and digital libraries face the daunting challenges of long-term preservation. Indeed, to fulfill their mission of providing discovery of and access to digital assets over a long period of time, institutions must develop strategies and mechanisms to preserve these assets effectively. Besides volume, another complicating aspect of digital preservation is data heterogeneity: data may originate from various software products specific to diverse application domains. Moreover, organizations have expanded their portfolios to disseminate a wide range of file formats, from textual documents and geospatial images to audiovisual records, web pages, and database files.

Within this context, the question is whether the Cloud Computing paradigm can help digital archivists and librarians meet the challenges of preservation. In recent years, Cloud Computing has gained momentum in the IT world thanks to the maturity of network protocol infrastructure, virtualization technology, and a price-based Service Level Agreement structure. Early research studies focused mostly on Cloud Storage as a potential service for the digital library and archive community. In this paper, we will study the possibility of using the Cloud paradigm throughout all components specified in the Open Archival Information System (OAIS) reference model: ingest, storage, data management, preservation, and access. The totality of such services can form what we call Long-Term Digital Preservation as a Service (LDPaaS). We will discuss how the major OAIS functions can leverage the strengths of LDPaaS, based on the inherent characteristics of Cloud Services: elasticity, virtualization, and a pay-as-you-go resource utilization model. More interestingly, we will argue that large institutions have the potential to become LDPaaS providers, and that smaller institutions can benefit from the services they make available. Lastly, we will propose a set of levels of service that can serve as a foundation for an LDPaaS service level agreement in terms of ingest processing, preservation processing, and differentiated access capabilities. A Cloud Service that encompasses both storage and preservation of digital objects, based on the user’s policies for retention period, preservation level of service, and data confidentiality, can be an attractive alternative to self-provisioning for digital libraries and archives.

Session Leader

Quyen Nguyen currently works in the Systems Engineering Division of the ERA Program Management Office at the U.S. National Archives and Records Administration. Before joining the National Archives, he worked for telecommunications software companies. His experience is in developing software systems for large-scale deployment. He has a BS in Computer and Information Science and Applied Mathematics from the University of Delaware and an MS in Computer Science from the University of California at Berkeley.