Digital Preservation Policy

Collection Policy

Policy Statement

This document describes the policies governing the acquisition, curation, and management of materials in York University Digital Library. These include digital objects created by York University Libraries (YUL) and published digital objects that require local hosting and rights management. York University Digital Library makes available collections that support research and scholarship according to the needs of the York University Digital Library Designated Community and in accordance with its Digital Preservation Strategic Plan.

Selection Criteria

For inclusion in York University Digital Library, objects must meet all Level One criteria and be assessed against the Level Two criteria.

Level One criteria cover objective standards such as copyright, extent, and format issues.

Each item must:

  • Be in the public domain, or have documented, non-revocable permission granted by the copyright holder.
  • Be intended for non-commercial public viewing and for educational and research use.
  • Be complete, such as an entire publication or article, and not a "part" such as an abstract, foreword, or title page
  • Meet the standards required for long-term digital curation, as described by documentation of York University Digital Library content types
  • Be in a standard format accessible through current file viewers or have a documented conversion path to move the format into a standard format
  • Be intended for permanent storage in York University Digital Library
  • Be unique or novel: a similar or identical digital object should not already exist (self-archived published works are excepted)
  • Strive to meet accessibility standards as described by the Accessibility for Ontarians with Disabilities Act

Level Two criteria cover subjective elements requiring review, assessment, and the professional judgement of the Digital Initiatives Advisory Group in collaboration with subject experts. As an initial consideration and overarching principle, requests should fit within the mandate of YUL's general collection policy, including that of YorkSpace. Materials should also have broad and enduring value, as opposed to, for example, course-specific materials better housed elsewhere or material for which demand does not go beyond specific or one-time usage. Research and/or administrative value is defined both locally and globally, with local (provincial and national) value taking priority. Value-added components that form part of the consideration include degree of integration in an online environment, intellectual control (metadata), improvement of resource sharing, and enhancement of access. Finally, a request may gain merit based on its potential as a strategic opportunity, whether in teaching and learning or in partnership and collaboration.

Retention and Evaluation

All objects included in York University Digital Library are intended to be retained permanently upon acceptance into York University Digital Library. York University Digital Library will not be used as a temporary storage facility for digital items.

Acknowledgements

Adapted from and inspired by:

License

CC0

CC0 1.0 Universal (CC0 1.0) Public Domain Dedication

Critical Processes and OAIS Mandatory Responsibilities

  1. Introduction
    • This document traces the critical processes employed by York University Libraries (YUL) to meet the "mandatory responsibilities" of a digital repository as described in OAIS, and identifies which processes are necessary for the repository to fulfill each responsibility.
  2. OAIS 3.1: "Negotiate for and accept appropriate information from information Producers."
    • YUL has a clearly defined process for negotiating with producers and ensuring that it acquires appropriate information. See the Rights Policy for more information.
  3. OAIS 3.1: "Obtain sufficient control of the information provided to the level needed to ensure Long-Term Preservation."
    • YUL obtains rights from individual producers that give the repository control over all of the information deposited by the producer. The nature and scope of these rights vary by submitter. In cases where the repository takes responsibility for the preservation of information, the rights include provisions for YUL to receive a local copy of the information and host it in perpetuity. In some cases, the repository obtains the right to modify information in order to ensure long-term preservation and accessibility. See the Rights Policy for more information.
  4. OAIS 3.1: "Determine, either by itself or in conjunction with other parties, which communities should become the Designated Community and, therefore, should be able to understand the information provided."
  5. OAIS 3.1: "Ensure that the information to be preserved is Independently Understandable to the Designated Community. In other words, the community should be able to understand the information without needing the assistance of the experts who produced the information."
  6. OAIS 3.1: "Follow documented policies and procedures which ensure that the information is preserved against all reasonable contingencies, and which enable the information to be disseminated as authenticated copies of the original, or as traceable to the original."
    • YUL has policies and procedures for the long-term preservation of information. See the Preservation Implementation Plan and the Definition of AIP for more information about the repository's ingest, data management, and archival storage processes. AIPs are not deleted as a part of the repository's normal operations.
    • The repository maintains backups of all content. See the Backup Plan for more details.
    • The repository is an integral component of disaster recovery planning for YUL.
    • YUL negotiates submission policies and procedures with individual producers. See the Rights Policy and the Definition of SIP for more information.
    • The repository has policies and procedures for the dissemination of information to its Designated Community. See the Definition of DIP for more information. To maintain the understandability and accessibility of disseminated information, YUL carries out extensive usability testing and solicits feedback from its Designated Community.
    • To ensure authenticity, each AIP is linked to a specific object and source file by information in the preservation metadata. This information is not visible to the Designated Community in the DIP, but can be made available if necessary. The repository's DIPs are always generated from a single AIP.
  7. OAIS 3.1: "Make the preserved information available to the Designated Community."
    • YUL disseminates the information to its Designated Community through its own user interfaces. See the Definition of DIP for more information about the repository's dissemination process. Depending on the license, access may be restricted to users affiliated with the York University community. See the Access Policy for more information.

Metadata Specifications

1. Policy Statement

York University Digital Library requires thorough, well-structured metadata in order to preserve the content, relationships, activities, and logical structure of each object.

2. Implementation

  • Each object in YUDL has a MODS (Metadata Object Description Schema) datastream to provide descriptive metadata.
  • Each object in YUDL has a TECHMD_FITS (see Registry of file formats) datastream to provide file identification and characterization information.
  • Each object in YUDL has a RELS-EXT (Relationship External) datastream that describes the object's relationship(s) to other objects in the repository.
  • Each object in YUDL has a POLICY (XACML) datastream for authorization (AuthZ).
  • YUDL utilizes the PREMIS (Preservation Metadata: Implementation Strategies) vocabulary via Islandora PREMIS. The PREMIS data dictionary provides ways of describing objects and processes that are necessary for digital preservation. YUDL makes use of the objects, events, and rights entities described in the PREMIS Data Model.
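
The implementation list above can be read as a completeness check on each object. The following is a minimal sketch in Python, assuming the datastream identifiers are exactly MODS, TECHMD_FITS, RELS-EXT, and POLICY as named above. The function name and the sample input are hypothetical, and this is not YUDL's actual ingest code.

```python
# Datastream IDs named in the implementation list above.
REQUIRED_DATASTREAMS = ("MODS", "TECHMD_FITS", "RELS-EXT", "POLICY")


def missing_datastreams(present):
    """Return the required datastream IDs absent from `present`, in policy order."""
    present = set(present)
    return [dsid for dsid in REQUIRED_DATASTREAMS if dsid not in present]


# Hypothetical object that has not yet received its FITS and POLICY datastreams.
print(missing_datastreams(["MODS", "RELS-EXT", "OBJ", "TN"]))
```

An empty result would indicate that the object carries every datastream the policy requires.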

Review Cycle for Documentation Policy

1. Policy Statement

The policies surrounding the operations of York University Digital Library as a preservation repository are subject to review and revision on two schedules: an ongoing basis and a cycle of regular review.

  • Ongoing
  • Regular Review
    • York University Libraries' preservation policy and all its related documents will be reviewed in their entirety every two years. This review will be led by the Digital Assets Librarian, in consultation with the Digital Initiatives Advisory Group.
  • Documentation history/versioning
    • All York University Libraries digital preservation policy and documentation is written in Markdown.
    • All York University Libraries digital preservation policy and documentation is version controlled using Git, and available in the York University Libraries GitHub Organization.

URI Policy

Policy Statement

URIs created by York University Digital Library

  • York University Digital Library uses a systematic convention to generate unambiguously unique identification for digital objects within its repository. This convention will create a stable name or reference to an object that can be permanently associated with that object, regardless of future changes to organizational structure or to digital access protocols.
  • This is in conformance with section 4.2.4 of Metrics for Digital Repository Audit and Certification (CCSDS, June 2009) which states that a compliant repository "shall have and use a convention that generates persistent, unique identifiers for all AIPs" and "its components."
  • This convention will ensure that "each AIP can be unambiguously found in the future" and that "each AIP can be distinguished from all other AIPs in the repository."

Implementation

Islandora object

York University Digital Library canonical URIs are consistently constructed in the following manner:

  • /islandora/object/PID

These URIs are aliased using Islandora Pathauto to the following pattern:

  • [fedora:pid]/[fedora:label]

Example:

  • Photograph: New Woodbine : racehorses train for opening of season
  • Canonical URI: http://digital.library.yorku.ca/islandora/object/yul:88675
  • Aliased URL: http://digital.library.yorku.ca/yul-88675/new-woodbine-racehorses-train-opening-season
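
The mapping from a PID and label to the two forms above can be sketched in Python. This is an illustration only: the real aliases are generated by Islandora Pathauto, and the word list below is an assumed subset chosen so that the example label loses "for" and "of" as it does in the published alias. The function names are hypothetical.

```python
import re

BASE_URL = "http://digital.library.yorku.ca"

# Assumption: short words dropped from aliases. The actual list is
# configured in Pathauto and may differ.
IGNORED_WORDS = {"a", "an", "as", "at", "by", "for", "from", "in", "of", "on", "the", "to", "with"}


def canonical_uri(pid):
    """Build the canonical URI: /islandora/object/PID."""
    return f"{BASE_URL}/islandora/object/{pid}"


def slugify(text, drop_words=True):
    """Lowercase, split on non-alphanumerics, optionally drop ignored words, join with hyphens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if drop_words:
        words = [word for word in words if word not in IGNORED_WORDS]
    return "-".join(words)


def aliased_url(pid, label):
    """Build the alias: [fedora:pid]/[fedora:label], both slugified."""
    return f"{BASE_URL}/{slugify(pid, drop_words=False)}/{slugify(label)}"


label = "New Woodbine : racehorses train for opening of season"
print(canonical_uri("yul:88675"))
print(aliased_url("yul:88675", label))
```

Because the alias depends on the label, only the canonical form is guaranteed stable if a label is later edited; this is one reason to cite the canonical URI.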

Islandora object datastream

York University Digital Library object datastream URIs are consistently constructed in the following manner, first in canonical form and then in aliased form:

  • /islandora/object/PID/datastream/DATASTREAM_NAME/view
  • /islandora/object/PID/datastream/DATASTREAM_NAME/download
  • [fedora:pid]/[fedora:label]/datastream/DATASTREAM_NAME/view
  • [fedora:pid]/[fedora:label]/datastream/DATASTREAM_NAME/download

Example:

  • Photograph: New Woodbine : racehorses train for opening of season
  • Canonical URI: http://digital.library.yorku.ca/islandora/object/yul:88675/datastream/JPG/view
  • Aliased URL: http://digital.library.yorku.ca/yul-88675/new-woodbine-racehorses-train-opening-season/datastream/JPG/download
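
A minimal Python sketch of the canonical datastream pattern follows. It assumes that "view" and "download" are the only two actions, as listed above; the function name is hypothetical.

```python
BASE_URL = "http://digital.library.yorku.ca"

# The two actions that appear in the canonical patterns above.
ACTIONS = ("view", "download")


def datastream_uri(pid, dsid, action="view"):
    """Build /islandora/object/PID/datastream/DATASTREAM_NAME/ACTION."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return f"{BASE_URL}/islandora/object/{pid}/datastream/{dsid}/{action}"


print(datastream_uri("yul:88675", "JPG"))
print(datastream_uri("yul:88675", "JPG", action="download"))
```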

Publicly available datastream names

Audio:

  • TN (Thumbnail)
  • PROXY_MP3 (Streaming quality MP3)

Book:

  • TN (Thumbnail)
  • ORIGINAL_PDF (Only for Buddhism Across Boundaries: Buddhist Periodicals and Books from Colonial Burma collection)

Images:

  • TN (Thumbnail)
  • JPG (Medium sized JPG)
  • OCR (OCR'd text)

Metadata:

  • MODS (Descriptive metadata)
  • DC (Descriptive metadata)
  • TECHMD_FITS (Technical metadata)
  • RELS-EXT (Fedora Object to Object Relationship)

Video:

  • TN (Thumbnail)
  • MP4 (Streaming quality MP4)

Web ARChive:

  • TN (Thumbnail)
  • JPG (Medium sized JPG)
  • WARC_CSV (WARC Index)
  • WARC_FILTERED (WARC filtered)
  • OBJ (WARC)
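
The lists above can be expressed as a lookup table. The sketch below is in Python; the dictionary keys and function name are hypothetical, and it assumes that the metadata datastreams are public for every content type, which the lists suggest but do not state. The collection-specific restriction on ORIGINAL_PDF is not modelled.

```python
# Publicly available datastream names, grouped as in the lists above.
PUBLIC_DATASTREAMS = {
    "audio": {"TN", "PROXY_MP3"},
    "book": {"TN", "ORIGINAL_PDF"},  # ORIGINAL_PDF applies to one collection only
    "image": {"TN", "JPG", "OCR"},
    "video": {"TN", "MP4"},
    "web_archive": {"TN", "JPG", "WARC_CSV", "WARC_FILTERED", "OBJ"},
}

# Assumption: metadata datastreams are public regardless of content type.
PUBLIC_METADATA = {"MODS", "DC", "TECHMD_FITS", "RELS-EXT"}


def is_public(content_type, dsid):
    """Return True if `dsid` is listed as publicly available for `content_type`."""
    return dsid in PUBLIC_METADATA or dsid in PUBLIC_DATASTREAMS.get(content_type, set())


print(is_public("image", "JPG"))
print(is_public("image", "OBJ"))
print(is_public("web_archive", "OBJ"))
```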

Environmental Monitoring of Preservation Formats

Registry of file formats

Policy Statement

YUDL requires immediate identification of the file format of each submitted file in order to help mitigate the risk posed by format obsolescence. To this end, YUDL uses DROID, JHOVE, file utility, Exiftool, PRONOM, NLNZ Metadata Extractor, ffident, and Tika through the FITS software package.

While YUDL is not dependent on or restricted to any particular format or group of formats, it aims to use well-known, widely accepted formats that support long-term preservation. If a submitter wants to use a specific format not meeting these criteria, an agreement must be reached between the submitter and YUDL.

Implementation Examples

YUDL makes use of FITS for format identification during the ingestion process, in which a file format is associated with each file.

Example characterization and reference to format registry:

<?xml version="1.0" encoding="UTF-8"?>
<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://hul.harvard.edu/ois/xml/ns/fits/fits_output http://hul.harvard.edu/ois/xml/xsd/fits/fits_output.xsd" version="0.7.4 (fits-mcgath fork)" timestamp="02/07/13 4:26 PM">
  <identification>
    <identity format="Tagged Image File Format" mimetype="image/tiff" toolname="FITS" toolversion="0.7.4 (fits-mcgath fork)">
      <tool toolname="Jhove" toolversion="1.9" />
      <tool toolname="file utility" toolversion="5.09" />
      <tool toolname="Exiftool" toolversion="9.13" />
      <tool toolname="NLNZ Metadata Extractor" toolversion="3.4GA" />
      <tool toolname="ffident" toolversion="0.2" />
      <tool toolname="Tika" toolversion="1.3" />
      <version toolname="Jhove" toolversion="1.9">5.0</version>
    </identity>
  </identification>
  <fileinfo>
    <size toolname="Jhove" toolversion="1.9">33543972</size>
    <creatingApplicationName toolname="Exiftool" toolversion="9.13">Adobe Photoshop Elements 2.0</creatingApplicationName>
    <lastmodified toolname="Exiftool" toolversion="9.13" status="CONFLICT">2007:03:09 11:00:49-05:00</lastmodified>
    <lastmodified toolname="Tika" toolversion="1.3" status="CONFLICT">2007-03-09T11:00:48</lastmodified>
    <filepath toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">/mnt/DIY/Archives/ASC/tiffs/02000-02999/ASC02000.tif</filepath>
    <filename toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">/mnt/DIY/Archives/ASC/tiffs/02000-02999/ASC02000.tif</filename>
    <md5checksum toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">b2b263bf5207481e42ac5945538ec985</md5checksum>
    <fslastmodified toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">1173456049000</fslastmodified>
  </fileinfo>
  <filestatus>
    <well-formed toolname="Jhove" toolversion="1.9" status="SINGLE_RESULT">true</well-formed>
    <valid toolname="Jhove" toolversion="1.9" status="SINGLE_RESULT">true</valid>
  </filestatus>
  <metadata>
    <image>
      <byteOrder toolname="Jhove" toolversion="1.9" status="SINGLE_RESULT">little endian</byteOrder>
      <compressionScheme toolname="Jhove" toolversion="1.9">Uncompressed</compressionScheme>
      <imageWidth toolname="Jhove" toolversion="1.9">7108</imageWidth>
      <imageHeight toolname="Exiftool" toolversion="9.13">4716</imageHeight>
      <colorSpace toolname="Jhove" toolversion="1.9">BlackIsZero</colorSpace>
      <orientation toolname="Jhove" toolversion="1.9" status="SINGLE_RESULT">normal*</orientation>
      <samplingFrequencyUnit toolname="Jhove" toolversion="1.9" status="CONFLICT">in.</samplingFrequencyUnit>
      <samplingFrequencyUnit toolname="Tika" toolversion="1.3" status="CONFLICT">Inch</samplingFrequencyUnit>
      <xSamplingFrequency toolname="Jhove" toolversion="1.9" status="CONFLICT">6000000/10000</xSamplingFrequency>
      <xSamplingFrequency toolname="Exiftool" toolversion="9.13" status="CONFLICT">600</xSamplingFrequency>
      <xSamplingFrequency toolname="NLNZ Metadata Extractor" toolversion="3.4GA" status="CONFLICT">600.0</xSamplingFrequency>
      <ySamplingFrequency toolname="Jhove" toolversion="1.9" status="CONFLICT">6000000/10000</ySamplingFrequency>
      <ySamplingFrequency toolname="Exiftool" toolversion="9.13" status="CONFLICT">600</ySamplingFrequency>
      <ySamplingFrequency toolname="NLNZ Metadata Extractor" toolversion="3.4GA" status="CONFLICT">600.0</ySamplingFrequency>
      <bitsPerSample toolname="Jhove" toolversion="1.9" status="CONFLICT">integer</bitsPerSample>
      <bitsPerSample toolname="Exiftool" toolversion="9.13" status="CONFLICT">8</bitsPerSample>
      <samplesPerPixel toolname="Jhove" toolversion="1.9">1</samplesPerPixel>
      <scanningSoftwareName toolname="Jhove" toolversion="1.9">Adobe Photoshop Elements 2.0</scanningSoftwareName>
      <YSamplingFrequency toolname="Tika" toolversion="1.3" status="SINGLE_RESULT">600.0</YSamplingFrequency>
    </image>
  </metadata>
</fits>
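
A record like the one above can be summarized with Python's standard library. The sketch below is illustrative rather than part of YUDL's workflow: the sample is a trimmed copy of the record above, and the function name and the fields chosen for the summary are assumptions.

```python
import xml.etree.ElementTree as ET

FITS_NS = {"fits": "http://hul.harvard.edu/ois/xml/ns/fits/fits_output"}

# Trimmed copy of the record above, kept short for illustration.
SAMPLE = """<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output">
  <identification>
    <identity format="Tagged Image File Format" mimetype="image/tiff" toolname="FITS" toolversion="0.7.4 (fits-mcgath fork)" />
  </identification>
  <fileinfo>
    <lastmodified toolname="Exiftool" toolversion="9.13" status="CONFLICT">2007:03:09 11:00:49-05:00</lastmodified>
    <lastmodified toolname="Tika" toolversion="1.3" status="CONFLICT">2007-03-09T11:00:48</lastmodified>
    <md5checksum toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">b2b263bf5207481e42ac5945538ec985</md5checksum>
  </fileinfo>
  <filestatus>
    <valid toolname="Jhove" toolversion="1.9" status="SINGLE_RESULT">true</valid>
  </filestatus>
</fits>"""


def summarize_fits(xml_text):
    """Extract format identity, checksum, validity, and conflicting fields from FITS output."""
    root = ET.fromstring(xml_text)
    identity = root.find("fits:identification/fits:identity", FITS_NS)
    # Collect the local names of elements on which the tools disagreed.
    conflicts = sorted(
        {el.tag.split("}", 1)[1] for el in root.iter() if el.get("status") == "CONFLICT"}
    )
    return {
        "format": identity.get("format"),
        "mimetype": identity.get("mimetype"),
        "md5": root.findtext("fits:fileinfo/fits:md5checksum", namespaces=FITS_NS),
        "valid": root.findtext("fits:filestatus/fits:valid", namespaces=FITS_NS) == "true",
        "conflicts": conflicts,
    }


print(summarize_fits(SAMPLE))
```

Listing the CONFLICT fields separately makes it easier to see where the bundled tools disagree, for example on the timestamp format in lastmodified.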

Definition of DIP

Dissemination Information Package (DIP)

  • OAIS describes a DIP as "the Information Package, derived from a part, or all, of one or more AIPs, received by the Consumer in response to a request to the OAIS."
  • York University Digital Library's (YUDL) DIPs are always generated from a single AIP.
  • User access to archival objects is provided through the YUDL website (http://digital.library.yorku.ca).
  • Depending on their level of access, the user may see basic object metadata and an access version of the digital object.
  • Context information is provided in the form of links to other items in a given collection.
  • The DIP is retrieved using the URI for the corresponding AIP. In turn, the AIP contains metadata tying it back to the SIP.

Definition of AIP

Archival Information Package (AIP)

  • The information package consisting of the Content Information (CI), Preservation Description Information (PDI), Packaging Information (PI), and Descriptive Information (DI) that is archived by York University Libraries (YUL).
  • The level of content in a York University Digital Library (YUDL) AIP can vary, depending on the amount of content provided by the submitter.
  • This description will use the OAIS Information Model to illustrate completeness of our conceptual model, and will describe, in general terms, what a YUDL AIP looks like.

Content Information (CI)

  • The Content Data Object (CDO) is generally stored with the primary preservation metadata file, which is held in Fedora Commons.
  • Representation Information is maintained, and contains information on the CDO's file format, version, and a reference to a format registry in order to provide information on how to interpret the file. See: registry of file formats

Preservation Description Information (PDI)

  • Reference Information - Identifiers are stored for each object identifying it globally (e.g. YUDL PID) and locally (e.g. URI).
  • Provenance Information - Provenance metadata is maintained for each object that provides a history of preservation events in the object's lifetime, beginning at ingest into the YUDL repository and referencing any preservation activities taken on the object (e.g., replacement due to corruption, format migration, etc.).
  • Context Information - As appropriate, information on how a CDO relates to other CDOs or to other conceptual entities. One example of such a relationship is a newer version of an object that supersedes an older one.
  • Fixity Information - Fixity information is generated at the time of ingest in order to later determine whether or not the item remains in the same state as when it was ingested. This information can be used to determine integrity of an object being copied within the system (as in the case of a change in storage location), or for periodic integrity checks.
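
The fixity check described above can be sketched in Python. The policy does not name a checksum algorithm; MD5 is assumed here only because the FITS output in the Registry of file formats records an md5checksum. The function names and file name are hypothetical.

```python
import hashlib
import os
import tempfile


def md5_checksum(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, recorded_checksum):
    """Return True if the file still matches the checksum recorded at ingest."""
    return md5_checksum(path) == recorded_checksum.lower()


# Usage: record a checksum at "ingest", then re-check after a simulated change.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "object.bin")
    with open(path, "wb") as handle:
        handle.write(b"archival content")
    recorded = md5_checksum(path)
    print(fixity_ok(path, recorded))  # file unchanged since ingest
    with open(path, "ab") as handle:
        handle.write(b"!")
    print(fixity_ok(path, recorded))  # file altered after ingest
```

The same comparison serves both cases named above: verifying a copy after a change of storage location and running periodic integrity checks.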

Packaging Information (PI)

  • YUDL preservation metadata packages both the descriptive and preservation metadata together.

Descriptive Information (DI)

  • Depending on the type of CDO, the format of this descriptive metadata can vary (MODS or Dublin Core), but is selected to maximize findability. In all cases, the descriptive metadata will be recreated within the preservation metadata.
