One of the key challenges to preserving electronic records in a meaningful way is preserving the authenticity and integrity of records during their movement from a recordkeeping system to a preservation system. This Ingest Guide describes the actions needed for a trustworthy ingest process. This process enables an Archive and Producer to move records from a recordkeeping system to a preservation system in a manner that allows a presumption of authenticity.
This Ingest Guide refers to ingest broadly, defining it as the entire process involved in moving records from a recordkeeping system to a preservation system. This process consists of the Producer and Archive defining and agreeing on what records will be transferred and the manner of their transfer, validation, and transformation. Following the Guide should help an Archive and Producer ensure the functional preservation of records, not just the preservation of their byte streams. Not only does the Guide articulate steps for ensuring that records are properly tracked and have maintained their structural integrity during ingest, it also provides a way for the Archive to ensure that records remain renderable, functional, and meaningful. Following the Guide should enable an Archive to have a trustworthy ingest process, which would allow a reasonable person to presume that a record has maintained its level of authenticity during ingest.
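To make the distinction between byte-stream and functional preservation concrete, the following is a minimal sketch of a byte-stream integrity check of the kind an Archive might run after a transfer. It is an illustration only: the Guide does not prescribe checksums, SHA-256, or any of the function names used here (`sha256_of`, `build_manifest`, and `verify_transfer` are hypothetical). The sketch assumes the Producer computes a manifest of digests before transfer and the Archive recomputes it on receipt.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to `root`) to its digest.

    Hypothetical Producer-side step, run before the transfer.
    """
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_transfer(manifest: dict[str, str], received_root: Path) -> dict[str, list[str]]:
    """Compare received files against the Producer's manifest.

    Hypothetical Archive-side step, run after the transfer. Reports files
    that are missing, unexpected, or whose bytes have changed.
    """
    received = build_manifest(received_root)
    return {
        "missing": sorted(set(manifest) - set(received)),
        "unexpected": sorted(set(received) - set(manifest)),
        "altered": sorted(
            name
            for name in manifest.keys() & received.keys()
            if manifest[name] != received[name]
        ),
    }
```

A check like this can only show that the bytes arrived unchanged. It says nothing about whether a record can still be rendered or understood, which is why the Guide treats byte-stream integrity as necessary but not sufficient.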
This guide does not describe the functional or technical requirements for building either a recordkeeping or a preservation system. Instead, this guide presents a detailed description of the complex ingest workflow step by step. For more on authenticity and trustworthy recordkeeping systems and the preservation of records see "Requirements for Trustworthy Recordkeeping Systems and the Preservation of Electronic Records in a University Setting."1
The Ingest Guide contains two main sections. Section A, Negotiate Submission Agreement, details how the Producer and the Archive create and arrange a Submission Agreement that defines the terms and conditions of the transfer of records from the Producer to the Archive, and it details the scope of the records along with the nature of their validation and transformation. Section B, Transfer and Validation, details the actual transfer, validation, and transformation of records. Section A contains eleven parts and Section B has six parts. Each part in Sections A and B is composed of a number of steps.
Each part includes a narrative summary, a flowchart illustrating all of its steps, and a description of each step. Each description includes an Overview, a list of Components, Resources, Products, and Documentation that each step utilizes and/or produces, and a thumbnail flowchart.
The Ingest Guide also includes a separate section on Components, Resources, Products, and Documentation that describes the role of each in the ingest process and refers to the steps that use and produce them. The Ingest Guide also has a Submission Agreement section that explains the Agreement in further detail. Finally, the Guide includes crosswalks between its own steps and those of the Producer-Archive Interface Methodology Abstract Standard.2
Although the Ingest Guide is a prescriptive guide for a trustworthy ingest process, it is not a detailed manual of procedures. The implementation of the Guide can produce a wide variety of procedures and policies from archive to archive. The Guide describes the actions that must be undertaken to trust the ingest process and prescribes how to undertake these steps at a high level, but it does not prescribe how to proceed in full detail. For example, Step A5.5 calls for the Archive to choose a preservation format for records it chooses to transform, but it does not dictate what those preservation formats should be. An Archive following the Ingest Guide will still have to determine what preservation formats best serve its needs. The Guide points out many tasks that Archives must undertake to have a trustworthy ingest process, without discussing those tasks in detail. The most prominent of these tasks include the details of the appraisal process in Part A3, the creation of Submission Information Packages in Part B1, and the creation of Resources. For more on the implementation of the Ingest Guide, see Appendix A: Using the Ingest Guide.
The Guide uses the Open Archival Information System Reference Model (OAIS) definition of Archive: "An organization that intends to preserve information for access and use."3 Therefore, while an Archive may be an archives in the sense used by the archival community, it does not necessarily have to be an archives.4 In the context of the Ingest Guide, an Archive is any type of office or juridical body that has the responsibility of providing long-term preservation and access to records. Like OAIS, the Guide refers to a single ingest or a single set of recurring ingests as an Ingest Project.
The Ingest Guide also uses the OAIS definition of Producer: "The role played by those persons, or client systems, who provide the information to be preserved. This can include other OAISs or internal OAIS persons or systems."5 This means that the Producer will normally be the custodian of the records (or an entity the Producer has authorized to act on its behalf) that has the authority to transfer the records to the Archive. The Producer may or may not be the individual, group, or organization that is responsible for the creation, production, accumulation, or formation of the records it transfers to the Archive.
The Ingest Guide is based upon the work of the Consultative Committee for Space Data Systems (CCSDS) and builds upon its OAIS framework. In particular, Section A of the Ingest Guide is based on the CCSDS's Producer-Archive Interface Methodology Abstract Standard, which is composed of four phases: Preliminary, Formal, Transfer, and Validation. The Producer-Archive Interface is a follow-up document to OAIS. The Preliminary and Formal phases greatly expand on the Negotiate Submission Agreement activity in the Administration function of OAIS. The Transfer and Validation phases reiterate the Receive Submission and Quality Assurance activities respectively, both of which are in the Ingest function of OAIS.
As a product of the Fedora and the Preservation of University Records grant project (NHPRC 2004-083), the Ingest Guide is designed primarily for a university setting. However, its general nature may also make it applicable in other environments. The university orientation of the Ingest Guide differs from the Producer-Archive Interface's orientation. While the PAI treats the creation of a Submission Agreement formally and in two different phases ("Preliminary" and "Formal"), the Ingest Guide does not distinguish preliminary and formal phases, because this level of formality is unnecessary. Such formality would impose unrealistic implementation expectations in a university setting.
The Ingest Guide is based largely on the conceptual underpinnings of the records lifecycle model, presuming that a Producer will create, acquire, utilize, and manage records in a Recordkeeping System to suit its current business needs, and later the Archive will ingest some of those records into a separate Preservation System that the Archive administers. In this model, the Archive acts as a neutral third party to the recordkeeping process, acting on behalf of broader societal needs rather than on behalf of the Producer. As a neutral third party, the Archive has no stake in the content of the records and no reason to alter records under its custody, and it should not allow anybody to alter the records either accidentally or on purpose. Many archivists have rejected the lifecycle model in favor of the records continuum concept, where recordkeeping is seen as a continuous process that is not time-based, not separated into a series of clearly defined steps, and not administered by completely separate juridical entities. Many Producers and Archives operate in a mixed world between these two models. For example, many Archives operate separately from a Producer but are part of the same organization as the Producer and do not act as a neutral third party. The Ingest Guide should be useful to most Archives operating in a mixed lifecycle/continuum environment, particularly ones where separate Recordkeeping and Preservation Applications are maintained.
The Ingest Guide assumes that a Producer is submitting managed records to an Archive. Traditionally, an archives might accept boxes of unorganized paper records from a faculty member, for example, with the idea that the archives could add these records to its processing backlog and later impose some sort of order, or arrangement, long after the transfer. It is the assumption of this Guide, and the corresponding preservation requirements, that such delayed arrangement of electronic records is neither scalable nor sustainable. A box of unlabeled disks sent from a faculty member illustrates this point. The Ingest Guide places the activity of imposing order on electronic records outside of the ingest and preservation activities. The work of preparing organized and managed records for transfer to the Archive is the Producer's responsibility; in the case of the box of unorganized disks, an Archive that imposes order on the records after the fact is doing the Producer's job. In a situation like this, during Section A of the Ingest Guide, the Archive would either require the Producer to organize the disks and the records they hold before the transfer takes place, or artificially organize the records itself after accepting them from the Producer but still before the transfer to the preservation repository. By imposing this artificial arrangement, the Archive becomes a Producer and thus plays both roles in the ingest process.
1 Fedora and the Preservation of University Records (NHPRC 2004-083), "Requirements for Trustworthy Recordkeeping Systems and the Preservation of Electronic Records in a University Setting," <http://dca.tufts.edu/features/nhprc/reports/1_5final.pdf>.
2 Consultative Committee for Space Data Systems, Producer-Archive Interface Methodology Abstract Standard, CCSDS 651.0-B-1, Blue Book, May 2004. <http://www.ccsds.org/CCSDS/documents/651x0b1.pdf>.
3 ISO 14721:2003, Space data and information transfer systems -- Open Archival Information System -- Reference model.
4 For the "archival community" definition of the term archives see Richard Pearce-Moses, A Glossary of Archival and Records Terminology (Archival Fundamentals Series II), (Chicago: Society of American Archivists, 2005).
5 ISO 14721:2003, p. 112.
6 The authors do not wish to express any opinion of the relative merit of either the lifecycle or continuum models, but instead simply disclose their inherent bias towards the lifecycle model based on educational background and work experience.
© Tufts University and Yale University, 2006.