Appendix 4: Notes on Adherence to Digital Preservation Standards
The following models and ancillary documents were used to frame the digital preservation program and specific actions within that program:
- National Digital Stewardship Alliance Levels of Digital Preservation
- Digital Preservation Coalition Rapid Assessment Model
- Reference Model for an Open Archival Information System (OAIS)
- Audit and Certification of Trustworthy Digital Repositories
- The Center for Research Libraries Trustworthy Repositories Audit & Certification: Criteria and Checklist
National Digital Stewardship Alliance (NDSA) Levels of Digital Preservation
The NDSA Levels of Digital Preservation is broken down into five functional areas and four levels, with each level indicating specific actions for storage, integrity, control, metadata, and content considerations. Adherence to the levels is indicated by green, yellow, and red highlighted notes. Green indicates adherence, yellow indicates in-progress work toward adherence, and red indicates work toward that item has not yet begun.
Level 1 – Know your content
Functional Area | Action | Notes |
---|---|---|
Storage | Have two complete copies in separate locations | As documented in the Storage & Backups section, backups of the Z: drive are also stored at HSL. |
Storage | Document all storage media where content is stored | This document outlines the processes for documenting incoming storage media and transmitting content to the Z: drive. Documentation and content are stored on the Z: drive. |
Storage | Put content into a stable storage | Content is extracted from all unstable media where possible in-house; media that cannot be transferred is documented and used to advocate for additional equipment. |
Integrity | Verify integrity information if it has been provided with the content | Whenever integrity information is provided with the content, it is verified using Teracopy. |
Integrity | Generate integrity information if not provided with the content | MD5 checksums are generated for content using siegfried upon accession. |
Integrity | Virus check all content; isolate content for quarantine as needed | All content, including content ingested from storage media, is scanned using ClamAV upon accession. Flagged content is quarantined and monitored by the Digital Archivist before being uploaded to secure storage. |
Control | Determine the human and software agents that should be authorized to read, write, move, and delete content | A permissions document consolidates who has read and write (including move and delete) permissions. The Digital Archivist authorizes the use permissions given to new users. |
Metadata | Create inventory of content, also documenting current storage locations | Content inventories are stored with content in a metadata folder and generated using siegfried. Storage locations are isolated to a single location: the Z: drive. |
Metadata | Backup inventory and store at least one copy separately from content | An inventory of storage media, not necessarily a detailed listing of the files contained on that media (pending processing level), is in ArchivesSpace. A detailed inventory of file-level information is currently not stored separately from content, but options are being examined as of 10/2022. |
Content | Document file formats and other essential content characteristics including how and when these were identified | File formats and characteristics are generated using siegfried and are stored in a metadata folder with the content. The date is stored as part of the brunnhilde report generated at the same time. |
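
The Level 1 integrity and metadata actions above amount to building a checksum inventory at accession. The program does this with siegfried; the sketch below is only a stand-alone illustration of the underlying idea using Python's standard library, not the siegfried workflow itself. The function names and the dictionary layout are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_inventory(content_dir):
    """Map each file's path (relative to content_dir) to its MD5 checksum."""
    root = Path(content_dir)
    return {
        path.relative_to(root).as_posix(): md5_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Keeping paths relative to the content folder means the same inventory can be checked against a copy stored elsewhere.
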
Level 2 – Protect your content
Functional Area | Action | Notes |
---|---|---|
Storage | Have three complete copies with at least one copy in a separate geographic location | Two complete copies are stored at the Downtown Library (see the Storage & Backups section). A third copy, to be deployed to Amazon Glacier, is still in the planning stages. |
Storage | Document storage and storage media indicating the resources and dependencies they require to function | Resource and dependency information is maintained by Systems Infrastructure. Changes or issues are conveyed to the Digital Archivist and addressed in tandem. |
Integrity | Verify integrity information when moving or copying content | Integrity information is verified when copying or moving content using Teracopy. A record of the verified checksums is stored in the logs folder for a collection. |
Integrity | Use write-blockers when working with original media | Write blockers are used with media upon content extraction. |
Integrity | Back up integrity information and store copy in a separate location from the content | A detailed inventory of file-level information, including integrity information, is currently not stored separately from content. Options for separate storage are being examined as of 10/2022. |
Control | Document the human and software agents authorized to read, write, move, and delete content and apply these | A permissions document consolidates who has read and write (including move and delete) permissions. The Digital Archivist authorizes the use permissions given to new users. |
Metadata | Store enough metadata to know what the content is (this might include some combination of administrative, technical, descriptive, preservation, and structural) | Metadata stored on Z: includes information not only about the files, but also about the media carriers, including technical, preservation, descriptive, and structural metadata. Additional administrative and descriptive metadata is stored in ArchivesSpace. |
Content | Verify file formats and other essential content characteristics | File formats are verified, and any issues are flagged, by siegfried. Issues are currently addressed at the time of scheduled access by a user; options for addressing file format issues as part of processing are being explored. |
Content | Build relationships with content creators to encourage sustainable file choices | This document includes a section on preferred file formats for donors. Where possible, discussions with content creators occur prior to transfer and accession using these preferred file formats. |
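
The Level 2 action of verifying integrity information when moving or copying content is handled with Teracopy in this program. As a rough illustration of what such a verified copy involves, the following Python sketch copies a file and compares checksums before and after. It is not how Teracopy works internally, and the function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def file_md5(path):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy a file, then confirm the source and the copy share a checksum.

    Returns the shared checksum; raises ValueError on a mismatch.
    """
    source, destination = Path(source), Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    before = file_md5(source)
    shutil.copy2(source, destination)  # copy2 also tries to preserve timestamps
    after = file_md5(destination)
    if before != after:
        raise ValueError(f"Checksum mismatch copying {source} to {destination}")
    return before
```

The returned checksum could be written to a log so that a record of the verification is kept alongside the collection.
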
Level 3 – Monitor your content
Functional Area | Action | Notes |
---|---|---|
Storage | Have at least one copy in a geographic location with a different disaster threat than the other copies | Implementation of Level 2 Storage actions as outlined in the Level 2 table will also meet this Level 3 Storage action. |
Storage | Have at least one copy on a different storage media type | Implementation of Level 2 Storage actions as outlined in the Level 2 table will also meet this Level 3 Storage action. |
Storage | Track the obsolescence of storage and media | Server hardware migrations and management are handled by Systems Infrastructure. Changes, such as migrating to new servers and refreshing server hardware, are conveyed to the Digital Archivist and addressed in tandem according to a predetermined schedule. Collections content is extracted from all unstable storage media where possible in-house; media that cannot be transferred is documented and used to advocate for additional equipment. |
Integrity | Verify integrity information of content at fixed intervals | Action will be implemented in the future—currently, collection processing and generation of checksums must occur before verifying integrity information. |
Integrity | Document integrity information verification processes and outcomes | Action will be implemented in the future—currently, collection processing and generation of checksums must occur before verifying integrity information. |
Integrity | Perform audit of integrity information on demand | Integrity information is verified on demand using Teracopy. A record of the verified checksums is stored in the logs folder for a collection. |
Control | Maintain logs and identify the human and software agents that performed actions on content | Action will be implemented in the future—this information is currently held by Systems Infrastructure. |
Metadata | Determine what metadata standards to apply | Metadata standards and implementations are outlined in the Description section for born digital materials in this document. |
Metadata | Find and fill gaps in your metadata to meet those standards | Currently in progress—legacy processed collections are being reprocessed and new collections processed according to the metadata standards and implementations outlined in the Description section for born digital materials in this document. |
Content | Monitor for obsolescence, and changes in technologies on which content is dependent | Action will be implemented in the future. |
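
The fixed-interval integrity verification listed above is not yet implemented. One possible shape for such a check, offered only as a sketch, is to compare a previously recorded checksum manifest with the files currently on disk and report anything missing, changed, or unrecorded. The manifest layout (relative path mapped to MD5 digest) and the function name are assumptions for this example.

```python
import hashlib
from pathlib import Path


def audit_fixity(content_dir, recorded):
    """Compare recorded checksums with the files currently on disk.

    `recorded` maps relative paths to MD5 hex digests (a hypothetical
    manifest layout). Returns a dict with three lists of relative paths.
    """
    root = Path(content_dir)
    report = {"missing": [], "changed": [], "unrecorded": []}
    for relative_path, expected in sorted(recorded.items()):
        target = root / relative_path
        if not target.is_file():
            report["missing"].append(relative_path)
        # read_bytes loads the whole file; fine for a sketch, not for huge files
        elif hashlib.md5(target.read_bytes()).hexdigest() != expected:
            report["changed"].append(relative_path)
    on_disk = {
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    }
    report["unrecorded"] = sorted(on_disk - set(recorded))
    return report
```

Saving each report with its run date would also cover the related action of documenting verification processes and outcomes.
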
Level 4 – Sustain your content
Functional Area | Action | Notes |
---|---|---|
Storage | Have at least three copies in geographic locations, each with a different disaster threat | Implementation toward Level 2 and Level 3 Storage actions will occur before Level 4. |
Storage | Maximize storage diversification to avoid single points of failure | Implementation toward Level 2 and Level 3 Storage actions will occur before Level 4. |
Storage | Have a plan and execute actions to address obsolescence of storage hardware, software, and media | Server hardware migrations and management are handled by Systems Infrastructure. Changes, such as migrating to new servers and refreshing server hardware, are conveyed to the Digital Archivist and addressed in tandem according to a predetermined schedule. Collections content is extracted from all unstable storage media where possible in-house; media that cannot be transferred is documented and used to advocate for additional equipment. |
Integrity | Verify integrity information in response to specific events or activities | Integrity information is verified on demand using Teracopy. A record of the verified checksums is stored in the logs folder for a collection. |
Integrity | Replace or repair corrupted content as necessary | Action will be implemented in the future. |
Control | Perform periodic review of actions/access logs | Action will be implemented in the future. |
Metadata | Record preservation actions associated with content and when those actions occur | Preservation actions are documented in the PREMIS Spreadsheet for a collection, stored in the administration folder for that collection. |
Metadata | Implement metadata standards chosen | Metadata standards and implementations are outlined in the Description section for born digital materials in this document. |
Content | Perform migrations, normalizations, emulation, and similar activities that ensure content can be accessed | Migrations, normalizations, emulations, and similar activities are primarily addressed at the time of scheduled access by a user. Typically, original files are retained to minimize irreversible interventions and support changing standards in providing access to file formats. |
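
Recording preservation actions and when they occur, as the Level 4 metadata row describes, can be pictured as appending one row per event to a tabular log. The sketch below writes such rows to a CSV file. The column names are loosely modeled on PREMIS event semantic units and are assumptions for illustration; they are not the layout of the PREMIS Spreadsheet used in this program.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical column names, loosely based on PREMIS event semantic units.
FIELDS = ["eventType", "eventDateTime", "eventDetail", "eventOutcome", "agent"]


def record_event(log_path, event_type, detail, outcome, agent):
    """Append one preservation event as a row in a CSV log and return the row."""
    log_path = Path(log_path)
    is_new = not log_path.exists()
    row = {
        "eventType": event_type,
        "eventDateTime": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "eventDetail": detail,
        "eventOutcome": outcome,
        "agent": agent,
    }
    with open(log_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()  # header only when the file is first created
        writer.writerow(row)
    return row
```
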
Digital Preservation Coalition Rapid Assessment Model
While the NDSA Levels of Digital Preservation outlines specific actions, it does not include organizational and service characteristics of an archive. In DPC RAM, a digital preservation program has organizational and service capabilities that are assessed at the following levels: minimal awareness, awareness, basic, managed, and optimized.
The table below outlines the steps needed to reach each subsequent level on the way to the desired level. These steps will be incorporated into the WVRHC Digital Preservation Strategic Priorities document. The first instance of this document will be created in 2023.
Organizational Capabilities
Capability | Current Level | Desired Level | Steps to Get to Desired Level |
---|---|---|---|
A. Organizational viability: Governance, organizational structure, staffing and resourcing of digital preservation activities. | Awareness | Optimized | |
B. Policy and strategy: Policies, strategies, and procedures which govern the operation and management of the digital archive. | Basic | Optimized | |
C. Legal basis: Management of legal rights and responsibilities, compliance with relevant regulation and adherence to ethical codes related to acquiring, preserving and providing access to digital content. | Awareness | Optimized | |
D. IT capability: Information Technology capabilities for supporting digital preservation activities. | Basic | Managed | |
E. Continuous Improvement: Processes for the assessment of current digital preservation capabilities, the definition of goals and the monitoring of progress. | Awareness | Optimized | |
F. Community: Engagement with and contribution to the wider digital preservation community. | Awareness | Managed | |
Service Capabilities
Capability | Current Level | Desired Level | Steps to Get to Desired Level |
---|---|---|---|
G. Acquisition, Transfer and Ingest: Processes to acquire or transfer content and ingest it into a digital archive. | Awareness | Optimized | |
H. Bitstream Preservation: Processes to ensure the storage and integrity of digital content to be preserved. | Awareness | Optimized | |
I. Content Preservation: Processes to preserve the meaning or functionality of the digital content and ensure its continued accessibility and usability over time. | Awareness | Optimized | |
J. Metadata Management: Processes to create and maintain sufficient metadata to support preservation, discovery and use of preserved digital content. | Awareness | Optimized | |
K. Discovery and Access: Processes to enable discovery of digital content and provide access for users. | Awareness | Optimized | |
Reference Model for an Open Archival Information System (OAIS)
To demonstrate OAIS compliance, it is critical to directly map our workflows to the functional entities and archival information package (AIP) as outlined in OAIS. Figures in this section are taken directly from the Reference Model for an Open Archival Information System (OAIS). David Giaretta has also written and created visualizations that link the discrete functional entities outlined below together in a way that is helpful for envisioning OAIS as a whole system. Images of the functional entity or package overview will come before a description of WVRHC adherence. Italicized areas are areas of improvement to meet OAIS standards.
Figure 4-1 has been included to demonstrate a simplified visual representation of how figures 4-2, 4-3, 4-4, 4-5, 4-6, and 4-7 link together.
The Ingest functional entity concerns actions related to formally accepting Submission Information Packages (SIPs) and generating Archival Information Packages (AIPs). Broadly, these areas map to the accession, appraisal, processing, and description portions of the born digital archival processing cycle outlined in this manual. Below is a detailed mapping of actions taken to adhere to these OAIS areas.
- Receive SIPs
- Documented in Imaging and Born Digital Content Acquisition Procedures.
- Quality assurance on SIPs
- Still establishing documentation for well-formed SIPs; currently SIPs are accepted as a simple file transfer of zipped materials to maintain as much file metadata as possible with minimal donor expertise required.
- Generate compliant AIP
- Documented in Processing workflow.
- Extract descriptive information from AIP
- Documented in Description workflow.
- Update information and content in Archival Storage and Data Management
- Documented in ArchivesSpace and in the Digital Media Inventory Template and PREMIS Spreadsheet for all media item and digital transfer SIPs.
The Archival Storage functional entity concerns actions related to storage, maintenance of content and storage infrastructure, and retrieval of AIPs. Broadly, these areas map to the access portion of the born digital archival processing cycle and the Digital Preservation Administration section in this manual. Below is a detailed mapping of actions taken to adhere to these OAIS areas.
- Receive AIPs from the Ingest entity
- AIPs are not auto-generated but are created by a SIP undergoing the Processing workflow.
- Add the AIP to permanent storage
- SIPs and AIPs are automatically added to permanent storage.
- Managing the storage hierarchy
- Conducted as part of Digital Preservation Administration processes.
- Refreshing the media on which AIPs are stored
- Conducted as part of Digital Preservation Administration processes in conjunction with Systems Infrastructure.
- Performing routine and special error checking
- Routine error checking is conducted annually as part of Digital Preservation Administration processes.
- Providing disaster recovery capabilities
- Conducted as part of Digital Preservation Administration processes in conjunction with Systems Infrastructure.
- Providing AIPs to the Access entity
- Completed as part of the Access workflow.
The Data Management functional entity concerns actions related to populating, maintaining, and accessing Descriptive Information which identifies and documents Archive holdings and administrative data used to manage the archive. Broadly, these areas map to the accession, appraisal, processing, and description portions of the born digital archival processing cycle and the Digital Preservation Administration section in this manual. Below is a detailed mapping of actions taken to adhere to these OAIS areas.
- Administering the Archive database functions (maintaining schema and view definitions, and referential integrity)
- Completed by the Digital Archivist as part of the annual review process for this document.
- Performing database updates (loading new descriptive information or Archive administrative data)
- Completed by the Digital Archivist or designated processor as part of the Processing workflow.
- Performing queries on the data management data to generate query responses
- Completed by the Digital Archivist in response to requests/needs using ArchivesSpace and other structured data generated as part of the full born digital workflow.
- Producing reports from these query responses
- Completed by the Digital Archivist in response to requests/needs using ArchivesSpace and other structured data generated as part of the full born digital workflow.
The Administration functional entity concerns actions related to providing the services and functions for the overall operation of the archive system. Broadly, these areas map to the accession and access portions of the born digital archival processing cycle and the Digital Preservation Administration section in this manual. Below is a detailed mapping of actions taken to adhere to these OAIS areas.
- Soliciting and negotiating submission agreements with producers/donors
- Conducted as part of the pre-accessioning process.
- Auditing submissions to ensure that they meet Archive standards
- Still establishing documentation for well-formed SIPs; currently SIPs are accepted as a simple file transfer of zipped materials to maintain as much file metadata as possible with minimal donor expertise required.
- Submissions are audited at the point of accession and appraised for whether they merit inclusion in the archive.
- Maintaining configuration management of system hardware and software
- Completed in coordination with Systems Infrastructure by the Digital Archivist.
- System engineering functions to monitor and improve Archive operations
- Completed by the Digital Archivist and, in terms of hardware, by the Digital Archivist in coordination with Systems Infrastructure.
- To inventory, report on, and migrate/update the contents of the Archive
- Completed by the Digital Archivist in coordination with relevant WVRHC employees.
- Establishing and maintaining Archive standards and policies
- Completed by the Digital Archivist in coordination with other WVRHC employees as needed.
- Providing customer support
- Completed by the Digital Archivist or authorized person.
- Activating stored requests
- Completed by the Digital Archivist or authorized person.
The Preservation Planning functional entity concerns actions related to monitoring the environment of the OAIS, providing recommendations and preservation plans to ensure that the information stored in the OAIS remains accessible to, and understandable by, the Designated Community over time. Broadly, these areas map to the processing portions of the born digital archival processing cycle and the Digital Preservation Administration section in this manual. Below is a detailed mapping of actions taken to adhere to these OAIS areas.
- Evaluating the contents of the archive and periodically recommending archival information updates
- Accomplished by the Digital Archivist as part of daily work.
- Recommending the migration of current archive holdings
- For file formats: accomplished by the Digital Archivist in response to changing needs. For hardware: accomplished by Systems Infrastructure in coordination with the Digital Archivist.
- Developing recommendations for Archive standards and policies
- Accomplished by the Digital Archivist in coordination with relevant WVRHC employees.
- Providing periodic risk analysis reports
- Accomplished by the Digital Archivist as part of Digital Preservation Administration processes.
- Monitoring changes in the technology environment and in the Designated Community’s service requirements and knowledge
- The technology environment is actively monitored by the Digital Archivist; additional work is needed to articulate the Designated Community’s needs and knowledge.
- Designs Information Package templates and provides design assistance and review to specialize these templates into SIPs and AIPs for specific submissions
- Accomplished by the Digital Archivist and documented in this manual and ancillary documentation.
- Develops detailed Migration plans, software prototypes and test plans to enable implementation of Administration migration goals
- Instigated by the Digital Archivist in coordination with Systems Development, Systems Infrastructure, and relevant WVRHC employees.
The Access functional entity concerns actions related to providing the services and functions that support users/consumers in determining the existence, description, location and availability of information stored in the OAIS, and allowing users/consumers to request and receive information products. Broadly, these areas map to the processing, description, and access portions of the born digital archival processing cycle. Below is a detailed mapping of actions taken to adhere to these OAIS areas.
- Communicating with users/consumers to receive requests
- Accomplished by Reference Staff or the Digital Archivist as part of standard processes outlined in Access procedures within each section of this document.
- Applying controls to limit access to specially protected information
- Accomplished as part of Processing procedures.
- Coordinating the execution of requests to successful completion
- Accomplished by Reference Staff or the Digital Archivist as part of standard processes outlined in Access procedures within each section of this document.
- Generating responses (Dissemination Information Packages, query responses, reports) and delivering the responses to users/consumers
- Accomplished by the Digital Archivist as part of standard processes outlined in Access procedures within each section of this document. Access copies are generated based upon the type of AIP.
In addition to the above functional entities, the information objects and packages must include the following aspects and information.
The above diagram maps to our information package structure, outlined in the Accessioning Media Workflow, used internally as follows:
- Package Description: The information intended for use by Access Aids.
  - This is the information uploaded to ArchivesSpace; information may be organized prior to being uploaded to ArchivesSpace using the Born Digital Processing Checklist.
- Packaging Information: The information that is used to bind and identify the components of an Information Package. For example, it may be the ISO 9660 volume and directory information used on a CD-ROM to provide the content of several files containing Content Information and Preservation Description Information.
  - This information is initially documented in the Digital Media Inventory Template and expanded in the Brunnhilde report stored in the Metadata folder in the collection folder.
- Content Information: A set of information that is the original target of preservation or that includes part or all of that information. It is an Information Object composed of its Content Data Object and its Representation Information.
  - Data Object: Either a Physical Object or a Digital Object.
    - This information is stored in the Content folder in the folder containing digital collections content and metadata on the Z: drive.
  - Representation Information: The information that maps a Data Object into more meaningful concepts. One example is JPEG software which is used to render a JPEG file.
    - Structure Information: The Representation Information that imparts meaning about how other information is organized.
      - Documented through description processes or self-documented through strategic file naming where applicable.
    - Semantic Information: The Representation Information that further describes the meaning beyond that provided by the Structure Information.
      - Documented using Siegfried as part of Brunnhilde for file characterization and stored in the Metadata folder for the collection.
- Preservation Description Information: The information which is necessary for adequate preservation of the Content Information.
  - Reference Information: The information that is used as an identifier for the Content Information.
    - This information is initially documented at the media item or transfer level in the Digital Media Inventory Template and included in ArchivesSpace; single files are not yet given individual identifiers beyond checksums.
  - Provenance Information: The information that documents the history of the Content Information.
    - Documented in accession records in ArchivesSpace at the time of transfer of ownership.
  - Context Information: The information that documents the relationships of the Content Information to its environment. This includes why the Content Information was created and how it relates to other Content Information objects.
    - Documented through processing and description processes and stored in ArchivesSpace.
  - Fixity Information: The information which documents the mechanisms that ensure that the Content Information object has not been altered in an undocumented manner.
    - Documented through Siegfried as part of Brunnhilde and stored in the Metadata folder for the collection.
  - Access Rights Information: The information that identifies the access restrictions pertaining to the Content Information, including the legal framework, licensing terms, and access control.
    - Documented in ArchivesSpace as part of processing and description processes.
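
Because the mapping above ties each OAIS information object to a folder within a collection, a simple structural check can flag packages that are missing a component. The Python sketch below illustrates the idea. The folder names and their casing are assumptions (the manual refers to content, metadata, logs, and administration folders without this example knowing their exact names), and the function name is hypothetical.

```python
from pathlib import Path

# Hypothetical folder names; exact names and casing are assumptions.
EXPECTED_FOLDERS = ("Content", "Metadata", "Logs", "Administration")


def missing_package_folders(collection_dir, expected=EXPECTED_FOLDERS):
    """Return the expected subfolders that are absent from a collection folder."""
    root = Path(collection_dir)
    return [name for name in expected if not (root / name).is_dir()]
```

An empty list means every expected folder is present; it says nothing about whether the folders hold the right information.
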
As the digital preservation program is fairly new, only a high-level overview of OAIS compliance is available. The portion of the OAIS standard related to repository responsibilities will be outlined in the Audit and Certification of Trustworthy Digital Repositories section, which is currently a work in progress.
Audit and Certification of Trustworthy Digital Repositories (future area of work)
In the future, this section will include an overview of WVRHC adherence to the Audit and Certification of Trustworthy Digital Repositories (CCSDS 652.0-M-1) set of recommended practices to implement the Open Archival Information System (OAIS) Reference Model (ISO 14721). As a complement to the former document, the Trustworthy Repositories Audit & Certification: Criteria and Checklist document by OCLC and The Center for Research Libraries was also used to determine compliance with CCSDS 652.0-M-1. Adherence to these standards will be used as a tool for institutional accountability.