Fedora is a community-maintained, open-source repository system that supports durable access to digital objects. In addition to its functionality, Fedora has the following characteristics that bear on its use in long-term preservation.
- Community-led: Fedora is a product by and of the global cultural heritage community; libraries, archives, and museums around the world use Fedora for diverse purposes with disparate data types, and they collectively drive, develop, and maintain Fedora to meet their information management needs.
- Open-source: Fedora is made available under the widely used open-source Apache 2.0 license. Open-source code is more visible to more people, and is openly governed; there is no single organization upon which Fedora relies for its long-term vibrancy or survival.
- Standards-based: Fedora leverages existing, widely used standards to ensure long-term sustainability, preferring standards developed by the wider Web world to bespoke or niche ones.
- Interoperable: Fedora allows digital objects to conform to common data models and provides access via RESTful APIs to support interoperability and long-term viability.
This document outlines the functionality provided by Fedora that supports digital preservation practice. Together with the characteristics above, this functionality explains why Fedora is often a key component of a long-term digital preservation strategy.
Fedora provides the following features in support of preservation:
- Persistence¹
  - Deposited files are stored on the filesystem in a predictable location based on their checksums, which allows operating system-level access to files independent of Fedora
  - Metadata describing objects, including structural metadata, is stored in Fedora’s database and can be exported on demand or at runtime (see below)
- Fixity²
  - A range of digest algorithms (MD5, SHA-256, etc.) can be configured for calculating, storing, and recalculating checksums
    - A default algorithm can be configured
    - An algorithm can be specified on resource creation
    - An algorithm can be changed after resource creation
  - A SHA-1 checksum is calculated and stored on file ingest
  - If a SHA-1 checksum is provided with an ingest request, it will be compared with the calculated value on ingest. A mismatch will cancel the ingest
  - The SHA-1 checksum can be recalculated and compared with the stored value at any time
- Audit
  - Preservation metadata, modeled using the RDF-based PREMIS and PROV-O ontologies, can optionally be generated in response to repository events and stored in the repository; this metadata can also be accessed via external triplestores for enhanced querying
- Versioning
  - Versions can be created for both objects and files
  - Versions are created on request
  - Previous versions can be restored
- Import/export
  - Fedora imports and exports resources in standardized RDF serializations. This functionality is aimed at supporting the following objectives:
    - Integration with preservation systems external to Fedora
    - Avoidance of “platform lock-in”
    - Re-import of exported data to support disaster recovery
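The persistence and fixity behavior listed above (checksum-based file locations, checksum comparison on ingest, and later re-verification) can be illustrated with a minimal Python sketch. This is not Fedora's implementation or API: the function names, the `FixityError` exception, and the directory layout derived from the checksum are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path
from typing import Optional


class FixityError(Exception):
    """Raised when a supplied checksum does not match the calculated one."""


def digest(data: bytes, algorithm: str = "sha1") -> str:
    # hashlib.new accepts algorithm names such as "md5", "sha1", "sha256".
    return hashlib.new(algorithm, data).hexdigest()


def content_path(root: Path, checksum: str) -> Path:
    # Hypothetical layout: the first three byte pairs of the checksum become
    # nested directories, so the location is predictable from the checksum.
    return root / checksum[0:2] / checksum[2:4] / checksum[4:6] / checksum


def ingest(root: Path, data: bytes, expected_sha1: Optional[str] = None) -> Path:
    # Calculate a SHA-1 checksum on ingest; if the client supplied one,
    # a mismatch cancels the ingest before anything is written.
    actual = digest(data, "sha1")
    if expected_sha1 is not None and expected_sha1.lower() != actual:
        raise FixityError(f"expected {expected_sha1}, calculated {actual}")
    target = content_path(root, actual)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target


def check_fixity(path: Path, stored_checksum: str, algorithm: str = "sha1") -> bool:
    # Recalculate the checksum from the bytes on disk and compare it with
    # the value recorded earlier.
    return digest(path.read_bytes(), algorithm) == stored_checksum.lower()
```

Because the stored file's location depends only on its checksum in this sketch, the file can be found and verified with ordinary operating-system tools, without going through the repository software.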
Upcoming Fedora features to enhance support for preservation:
- Fixity Enhancement
  - A checksum for an object can be calculated, stored, and recalculated at any time
- Versioning Enhancement³
  - Individual versions of resources can be exported and imported
1 Starting with Fedora 4.7.0, the persistence stack below Fedora’s community code will consist of ModeShape writing files to disk based on their checksum and metadata being written to a user-defined database. PostgreSQL and MySQL have been verified.
2 “Fixity: Property that a Digital Object has not been changed between two points in time”. PREMIS Editorial Committee. PREMIS Data Dictionary Version 2.0, March 2008, p. 212. http://www.loc.gov/standards/premis/v2/premis-2-0.pdf [accessed 26 Oct 2016]
3 There are potentially many variations of versioning functionality that could be exposed to the user. Each variation introduces complexity and a long-term maintenance burden. One of Fedora’s development goals is to encourage a simple and sustainable code base. A fundamental decision point with regard to versioning was whether Fedora should offer on-demand versioning or auto-versioning. It was eventually agreed that on-demand versioning would be offered, because it allows users to choose whether they want versions on every update or only at specific points in a resource’s lifecycle.
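The on-demand versioning model described in note 3 can be illustrated with a small in-memory Python sketch. It is a toy model and not Fedora's API: the `Resource` class and its `update`, `create_version`, and `restore` methods are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Resource:
    """Toy resource whose versions are created only on request."""

    content: bytes
    versions: Dict[str, bytes] = field(default_factory=dict)

    def update(self, content: bytes) -> None:
        # An update does not snapshot anything by itself (no auto-versioning).
        self.content = content

    def create_version(self, label: str) -> None:
        # The caller decides when a version is worth keeping.
        if label in self.versions:
            raise ValueError(f"version {label!r} already exists")
        self.versions[label] = self.content

    def restore(self, label: str) -> None:
        # Restoring makes a previously saved version the current content.
        self.content = self.versions[label]
```

A caller who wants a version on every update can call `create_version` after each `update`; a caller who wants versions only at milestones calls it less often. That choice is what the on-demand model leaves to the user.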