Introduction to digital preservation

Last updated:  04 March 2024

A handy guide on the best strategies and practices to digitally preserve a Library's collection

What is digital preservation?

Digital preservation is the coordinated and ongoing set of processes and activities that ensure long-term, error-free storage of digital information, with means for retrieval and interpretation, for the entire time span the information is required. This section provides a brief background to the principles that underpin the currently accepted strategies.

Cultural institutions are increasingly devoting money and resources to building their digital collections, both by reformatting physical materials and by creating and acquiring digital originals. All characteristics of these digital objects need to remain available in the future: the data stream (bits and bytes), the "findability", the functionality and structural relationships between complex digital objects, and the ability to display correctly and retain the "look and feel" of the original document, image, sound file or web page. Ensuring the sustainability of these digital assets requires more than static storage and backup regimes; it requires active management of the digital information over time to ensure its continued viability and accessibility.
 
Born-digital assets (digital originals with no analogue counterparts) are particularly vulnerable to loss from our cultural and heritage landscape. While we still have physical documents from many hundreds of years ago, we are in danger of losing electronic documents and digital images created in the last decade. Digital content is pervasive and powerful. It is easy to create and to update, but these characteristics also contribute to the challenge of preserving it for the future.

Digital preservation can appear a daunting challenge to collection managers, more so as the size, complexity and history of the collections increase. This section is intended to identify the problem and provide a brief background to the principles that underpin the currently accepted strategies. Links at the end of this section direct the reader to comprehensive information about digital preservation, at both an introductory and an advanced level.

Digital Dark Age

The term 'Digital Dark Age' is often used to describe a scenario where vast amounts of digital information are lost or rendered permanently irretrievable.

Though the potential severity of this is open to debate, it is clear that the global library of knowledge and cultural heritage in digital forms is at risk.  

There are two critical reasons for developing and implementing digital preservation practices:

  1. physical deterioration of carrier media;
  2. technological obsolescence of hardware/software.

Physical deterioration

The media on which digital content is stored are more susceptible to deterioration and catastrophic loss than some analogue media such as paper and microfilm. Digital storage media may deteriorate more rapidly, and by the time deterioration becomes apparent, data may already have been lost. A relatively small amount of media damage can cause file corruption and complete loss of data. This characteristic of digital formats leaves a very short time frame for preservation decisions and actions.
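
To make the point about small damage concrete, the following Python sketch simulates a single flipped bit in a compressed file and shows that a standard decoder then refuses to return any of the content. The sample text and the helper name try_recover are invented for this illustration, and the bit is deliberately flipped in the stream header so that the outcome is predictable; damage elsewhere in a file may behave differently.

```python
import zlib

# A stand-in for a stored document (invented sample text).
original = b"Minutes of the Library Board, 1998. " * 200
stored = bytearray(zlib.compress(original))

# Simulate media damage: flip one bit in the first byte (the stream header).
stored[0] ^= 0x01


def try_recover(blob: bytes):
    """Return the decompressed content, or None if the decoder rejects it."""
    try:
        return zlib.decompress(blob)
    except zlib.error:
        return None


if try_recover(bytes(stored)) is None:
    print("One damaged bit: the whole file is unreadable with standard tools.")
```

An uncompressed plain-text file would typically lose only the affected character, which is one reason format choice matters for preservation.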

Technological obsolescence

Rapid advances in storage, recording and playback technologies mean that hardware, software and file formats may become obsolete in a matter of years.

Digital content created with such technologies is at great risk of loss, simply because it is no longer accessible or cannot be correctly rendered.

Lack of standards, protocols and proven methods for preserving digital information, as well as the prevalence of proprietary technology and file formats, adds to the problem of ensuring content is retrievable and useable in the future.

Guidelines for digital preservation

The following principles/components of a Digital Preservation Strategy have been proposed.

Use sustainable file formats

Successful digital preservation depends on controlling the makeup of your digital repository (i.e. digital asset management database) and on ensuring that your digital assets are of known types. It is essential to create and acquire digital content in recommended file formats only. Sustainable formats are those that comply with standards, are patent free, support metadata and interoperability, and have a critical mass of user acceptance. Each type of media (image, audio, video etc.) has a range of suitable file formats to choose from. File format registries (e.g. PRONOM) have been created to help define, assess and select appropriate formats for a variety of digital content.

Authenticate digital objects

Once you have established your repository, ensure that archival master files match the attributes of recommended file formats. Various authentication modules are available (e.g. JHOVE2) as open-source software. They can analyse files prior to ingestion and compare attributes to known criteria (generally technical metadata specifications).
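
As a rough illustration of the kind of pre-ingest check described above, the following Python sketch identifies a file by its leading bytes (its "magic number") and compares the result with the format the file is expected to be. The signature table and the function names are assumptions made for this example. This is not how JHOVE2 works internally, and a real validator inspects far more than the first few bytes.

```python
from pathlib import Path
from typing import Optional

# Leading byte signatures for a few common formats (illustrative, not exhaustive).
SIGNATURES = {
    "PDF": [b"%PDF-"],
    "PNG": [b"\x89PNG\r\n\x1a\n"],
    "TIFF": [b"II*\x00", b"MM\x00*"],  # little-endian and big-endian variants
    "JPEG": [b"\xff\xd8\xff"],
}


def identify_format(header: bytes) -> Optional[str]:
    """Return the name of the first format whose signature matches, else None."""
    for name, magics in SIGNATURES.items():
        if any(header.startswith(magic) for magic in magics):
            return name
    return None


def check_before_ingest(path: Path, expected: str) -> bool:
    """Hypothetical pre-ingest check: does the file's signature match `expected`?"""
    with path.open("rb") as fh:
        header = fh.read(16)
    return identify_format(header) == expected
```

A check like this catches a file whose extension does not match its content (for example, a PNG renamed to .tif), but it says nothing about whether the rest of the file is well formed.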

Use detailed and standardised metadata

To ensure long-term accessibility of resources, one of the key activities is creating good-quality preservation metadata. Preservation metadata store technical details on the format, structure and use of the digital content; the history of all actions performed on the resource, including changes and decisions; authenticity information such as technical features or custody history; and the responsibilities and rights information applicable to preservation actions. It is essential that metadata be OAIS compliant to allow interoperability, sharing and harvesting by other organisations and systems.
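
The categories of preservation metadata listed above can be pictured as a simple record. The Python sketch below is only an illustration: the class and field names are assumptions made for this example and do not follow any published metadata schema, so a real repository would map them onto whichever standard it has adopted.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class PreservationEvent:
    """One action performed on the resource (e.g. ingest, migration)."""
    event_type: str
    outcome: str
    agent: str
    timestamp: str


@dataclass
class PreservationRecord:
    """Illustrative record; field names are assumptions, not a published schema."""
    identifier: str
    file_format: str  # technical detail: format of the content
    sha256: str       # authenticity: checksum recorded at ingest
    rights: str       # rights applicable to preservation actions
    custody_history: List[str] = field(default_factory=list)
    events: List[PreservationEvent] = field(default_factory=list)

    def log_event(self, event_type: str, outcome: str, agent: str) -> None:
        """Append an event so the history of actions is added to, never overwritten."""
        stamp = datetime.now(timezone.utc).isoformat()
        self.events.append(PreservationEvent(event_type, outcome, agent, stamp))

    def to_json(self) -> str:
        """Serialise the record, including nested events, for storage or exchange."""
        return json.dumps(asdict(self), indent=2)
```

Keeping the event history as an append-only list reflects the requirement to record every action and decision rather than only the current state.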

Replication

Replication is the creation of multiple copies of data at one or more locations and on one or more systems. Digital data is more likely to survive software or hardware failure, intentional or accidental alteration, and environmental catastrophes if it is replicated in several locations. Active management of replicated data is essential to keep versions consistent and to control access across multiple locations.
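
One part of actively managing replicas is checking periodically that the copies still agree. The Python sketch below computes a SHA-256 checksum for each copy and reports any copy that differs from the most common value. The function names are hypothetical, and the majority-vote rule is an assumption of this example: it cannot decide between two copies that disagree, so comparing against a checksum recorded at ingest is more reliable where one exists.

```python
import hashlib
from collections import Counter
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute a file's SHA-256 checksum without loading it all into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_replicas(replicas: List[Path]) -> List[Path]:
    """Return the replicas whose checksum differs from the most common one."""
    digests: Dict[Path, str] = {p: sha256_of(p) for p in replicas}
    majority, _count = Counter(digests.values()).most_common(1)[0]
    return [p for p, d in digests.items() if d != majority]
```

Any path returned by audit_replicas would then be replaced from a copy that still matches.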

Refreshing

Refreshing is the transfer of data between two instances of the same type of storage medium while monitoring and maintaining data integrity. Refreshing will always be necessary because physical media deteriorate.
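
Maintaining integrity during a refresh is usually done with checksums. As a minimal sketch, assuming files are copied one at a time and that SHA-256 is the chosen checksum, the hypothetical function below copies a file to the new medium, re-reads the copy and confirms it is bit-identical to what was read from the old medium.

```python
import hashlib
from pathlib import Path

CHUNK = 1 << 20  # read 1 MiB at a time


def refresh_file(source: Path, target: Path) -> str:
    """Copy source to target and confirm the copy is bit-identical.

    Returns the SHA-256 hex digest on success; raises IOError on a mismatch.
    """
    written = hashlib.sha256()
    with source.open("rb") as src, target.open("wb") as dst:
        for chunk in iter(lambda: src.read(CHUNK), b""):
            written.update(chunk)
            dst.write(chunk)

    # Re-read the new copy rather than trusting that the write succeeded.
    reread = hashlib.sha256()
    with target.open("rb") as dst:
        for chunk in iter(lambda: dst.read(CHUNK), b""):
            reread.update(chunk)

    if written.hexdigest() != reread.hexdigest():
        raise IOError(f"Fixity check failed for {target}")
    return reread.hexdigest()
```

The returned checksum can be stored with the object's preservation metadata and compared again at the next refresh. Note that an immediate re-read may be served from the operating system's cache rather than the medium itself, so a later independent check is still worthwhile.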

Migration

Migration is the transfer of data to a newer hardware and/or software environment. This may include conversion of resources from one file format to another, from one operating system to another or from one programming language to another, so that the resource remains fully accessible and all functional characteristics are retained.

Emulation

The “look and feel” and functionality of legacy datasets, applications or websites can be crucial to their value as digital objects. Emulation uses modern technologies to render the data as originally intended, even when older operating systems and infrastructure are no longer available. An alternative approach is to maintain older infrastructure and systems in a “technology museum.”

Sustainability

Active management is the proactive and continuous data management that encompasses a range of strategies contributing to the longevity of digital information. Digital sustainability focuses on building a flexible approach to data preservation with an emphasis on interoperability, standards, continued maintenance and continuous development.

Which of these principles are implemented in a given digitisation project will depend on its scope, purpose and available resources. Sound examples of digital preservation initiatives can be found via the websites of the British Library, the National Library of New Zealand, the National Library of Australia and the Library of Congress.

Large scale implementations – National Library of New Zealand

If funding and resources are available, partnering with credible vendors can result in comprehensive and nationally distributed solutions.

In 2003, new legislation gave the National Library of New Zealand the mandate to collect digital materials and “preserve the nation’s digital heritage in perpetuity”. In 2006, the National Library of New Zealand successfully sought funding for a digital preservation system to meet the needs of the country’s existing and planned digital asset collections. As part of the national Digital NZ Strategy, the Library developed a comprehensive digital preservation system in collaboration with a commercial partner. The final product, Ex Libris Rosetta, supports the acquisition, validation, ingest, storage, management, preservation and dissemination of different types of digital objects. Digital New Zealand, from the National Library of New Zealand, provides an end-to-end solution for managing and preserving New Zealand's digital assets, both legacy collections and new, born-digital acquisitions, including electronic publishing and community-created content.