Fedora™ Digital Object Construction Guide

Software Release 1.2.1

Fedora™ Development Team

$Id: do-const.dbx,v 1.12 2004/04/16 12:57:45 rlw Exp $


Table of Contents

The Fedora Digital Object Model
Three Types of Fedora Digital Objects
Application Programming Interface for Object Creation (API-M)
Creation of Data Objects (3 ways):
Creation of Behavior Definition and Mechanism Objects (2 ways)
Rules for Encoding a Fedora Object XML Submission Package

The Fedora Digital Object Model

All Fedora objects conform to the Fedora digital object model which is described in detail elsewhere (Fedora publications). Briefly, the Fedora object model consists of:

  • PID - a persistent, unique identifier for the object

  • Datastream(s) – the component in a Fedora object that represents MIME-typed content. An object can have one or more Datastreams. The content of a Datastream can be either data or metadata, and this content can either be stored internally in the Fedora repository, or stored remotely (in which case Fedora holds a pointer to the content in the form of a URL).

  • System Metadata – a set of system-defined attributes that are necessary to manage and track the object in the repository.

  • Disseminator(s) - the component in a Fedora object that associates a service with the object for the purpose of transforming or presenting the content represented by Datastreams. An object can have zero or more Disseminators. Each Disseminator points to a Behavior Definition and a Behavior Mechanism. A Behavior Definition defines an abstract set of methods to which the object is said to “subscribe.” A Behavior Mechanism defines the concrete bindings to a service that implements the abstract methods of the Behavior Definition.
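
As a rough illustration, the four components above could be modeled as follows. This is a minimal Python sketch and not Fedora's internal representation; every class and field name here is hypothetical, and the single label field merely stands in for the system metadata.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Datastream:
    """MIME-typed content, stored internally or referenced by URL."""
    ds_id: str
    mime_type: str
    location_url: Optional[str] = None  # set when the content is stored remotely


@dataclass
class Disseminator:
    """Associates a service with the object via two PIDs."""
    diss_id: str
    bdef_pid: str   # PID of the Behavior Definition (abstract methods)
    bmech_pid: str  # PID of the Behavior Mechanism (concrete bindings)


@dataclass
class DigitalObject:
    pid: str
    label: str  # stands in for the system metadata in this sketch
    datastreams: List[Datastream] = field(default_factory=list)
    disseminators: List[Disseminator] = field(default_factory=list)

    def is_well_formed(self) -> bool:
        # The model calls for one or more Datastreams; Disseminators are optional.
        return len(self.datastreams) >= 1


obj = DigitalObject(
    pid="demo:100",
    label="Sample image object",
    datastreams=[Datastream("DS1", "image/jpeg", "http://example.org/img.jpg")],
)
```

An object with no Disseminators is still well formed in this sketch, while an object with no Datastreams is not.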

Three Types of Fedora Digital Objects

Although every Fedora digital object conforms to the Fedora object model, as described above, there are three distinct types of Fedora digital objects that can be stored in a Fedora repository. The distinction between these three types is fundamental to how the Fedora repository system works. Basically, in Fedora, there are objects that store digital content entities, objects that store service descriptions, and objects that store service binding information.

Data Objects

In Fedora, a Data Object is the type of object used to represent a digital content entity. Data Objects are what we normally think of when we imagine a repository storing digital collections. Data Objects can represent such varied entities as images, books, electronic texts, learning objects, publications, datasets, and many other entities. One or more Datastreams represent the parts of the digital content entity. One or more Disseminators represent services that can present different views or transformations of the content entity.

Behavior Definition Objects

In Fedora, a Behavior Definition Object is the type of object used to represent an abstract service definition in the form of an abstract set of methods. This is similar to the notion of an interface in Java. From the Fedora perspective, a Behavior Definition Object defines a “behavior contract” that one or more Data Objects may “subscribe” to. A Data Object is said to subscribe to a particular behavior contract when its Disseminator points to the PID of a given Behavior Definition Object.

Behavior Definition Objects are stored in the repository just like other Fedora objects. Although the Fedora repository system is able to identify Behavior Definition Objects as special “utility” objects, it stores and manages them just like Fedora Data Objects. Also, clients can access these objects in the same manner they access Data Objects.

Behavior Mechanism Objects

In Fedora, a Behavior Mechanism Object is the type of object used to represent a concrete service definition. From the Fedora perspective a Behavior Mechanism Object represents a service that fulfills the requirements of a behavior contract defined by a Behavior Definition. The combination of a Behavior Definition and Behavior Mechanism constitutes a Disseminator on a Data Object. Together, they provide the means for associating a set of behaviors with a Fedora Data Object.

For a Disseminator to work, a Data Object must be associated with a particular service that is an “implementation” of the behavior contract to which the object “subscribes.” Thus, a Data Object’s Disseminator not only points to the PID of a given Behavior Definition Object, it also points to the PID of a particular Behavior Mechanism Object.

A Behavior Mechanism Object stores several forms of metadata that describe a set of methods and the runtime bindings for invoking these methods. The most significant of these metadata formats is service binding information encoded in the Web Services Description Language (WSDL). Fedora uses WSDL to “normalize” the view of a service. This enables Fedora to talk to a variety of different services in a predictable and standard manner. A Behavior Mechanism Object also contains metadata that defines a “data contract” between the service and any object that associates with the service. The data contract (also known as the “Datastream Input Specification”) specifies the kind of datastreams that must be available in the data object to serve as input to the various methods of the service. Given that a typical use of a service is to transform or present the datastream content of a Data Object, it is necessary to define datastreams as “input parameters” to the service methods.
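
The idea of a data contract can be sketched in a few lines of Python. This is only an illustration of the concept, not the Datastream Input Specification format itself: the dictionary shapes, the function name, and the input name SOURCE_IMAGE are all hypothetical.

```python
from typing import Dict, List


def unmet_inputs(contract: Dict[str, List[str]],
                 bound: Dict[str, str]) -> List[str]:
    """Return the contract input names that the Data Object fails to satisfy.

    contract: input name -> MIME types the service accepts (hypothetical shape)
    bound:    input name -> MIME type of the datastream bound to that input
    """
    problems = []
    for name, accepted in contract.items():
        # A missing binding yields None, which is never an accepted MIME type.
        if bound.get(name) not in accepted:
            problems.append(name)
    return problems


# Hypothetical contract for an image-transforming service.
contract = {"SOURCE_IMAGE": ["image/jpeg", "image/png"]}

ok = unmet_inputs(contract, {"SOURCE_IMAGE": "image/jpeg"})
missing = unmet_inputs(contract, {})
wrong_type = unmet_inputs(contract, {"SOURCE_IMAGE": "text/xml"})
```

A Data Object satisfies the contract only when every named input is bound to a datastream of an accepted type.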

Behavior Mechanism Objects are stored in the repository just like other Fedora objects. Although the Fedora repository system is able to identify Behavior Mechanism Objects as special “utility” objects, it stores and manages them just like Fedora Data Objects. Also, clients can access these objects in the same manner they access Data Objects.

Application Programming Interface for Object Creation (API-M)

Digital object construction is achieved via the methods of the Fedora Management API (API-M). The Management API exposes methods to ingest objects into the repository, as well as methods to create objects interactively. For a detailed description of each of the methods in API-M, see the specification at http://www.fedora.info/definitions/1/0/api/. The API is also expressed in the Web Services Description Language (WSDL) and is available at http://www.fedora.info/documents/Fedora-API-M.wsdl

Creation of Data Objects (3 ways):

XML Submission (Ingest)

  • a. Process summary: An XML submission package is prepared outside of Fedora. This can be done manually with an XML editor, or programmatically. The submission package is sent to the Fedora repository for “Ingest” via one of the Fedora clients that support the ingest function.

  • b. XML submission format: A Fedora-specific extension of METS 1.0. See rules for encoding Fedora objects in METS below.

  • c. Clients for ingest

    • Admin GUI - From the menu select File/Ingest. You will be prompted to select a file from the file system so make sure your submission package XML file is on a drive that is visible from the machine on which the Admin GUI is running. For more details, see the guide entitled “Java Application: Admin GUI – Object Creation and Repository Management via API-M” in the Client Documentation.

    • Batch Loader via Admin GUI – From the menu, select Tools/Batch/IngestBatch. This will provide a means to select a target directory where one or more METS submission packages reside. Multiple objects can be loaded via this utility. See the guide entitled “Admin GUI -- Batch Utility for Object Creation” in the Client Documentation.

    • AutoIngester Script

Interactive Building

  • a. Process summary: A user can create a digital object through a visual metaphor using the Admin GUI Java application. The Admin GUI provides a New Object command and an Object Editor that can be used to edit objects.

  • b. Clients for interactive building

    • Admin GUI - From the File menu, select New, then Object.

Batch Loading

  • a. Process summary: Batches of similar objects can be created by defining a set of templates and targets. This is fully described in the Client Documentation section of the Fedora documentation.

  • b. Clients for batch loading

    • Admin GUI – From the menu, select Tools/Batch. For more information see the guide entitled “Java Application: Admin GUI -- Batch Utility for Object Creation” in the Client Documentation.

Creation of Behavior Definition and Mechanism Objects (2 ways)

XML Submission (Ingest)

  • a. Process summary: Just like with Data Objects, an XML submission package is prepared outside of Fedora for Behavior Definition and Mechanism Objects. This can be done manually with an XML editor, or programmatically. The submission package is sent to the Fedora repository for “Ingest” via one of the Fedora clients that support the ingest function.

  • b. XML submission format: A Fedora-specific extension of METS 1.0. Within the METS file, inline XML metadata sections will be created in accordance with the following XML schemas:

    • Fedora Method Map [for Behavior Def and Mech]

      Namespace: http://fedora.comm.nsdlib.org/service/methodmap

      Schema Loc: %FEDORA_HOME%\dist\server\xsd\methodmap.xsd

    • OAI Dublin Core [for Behavior Def and Mech]

      Namespace: http://www.openarchives.org/OAI/2.0/oai_dc/

      Schema Loc: http://www.openarchives.org/OAI/openarchivesprotocol.html

    • Web Services Description Language (WSDL) [for Behavior Mech]

      Namespace: http://schemas.xmlsoap.org/wsdl/

      Schema Loc: http://www.w3.org/TR/wsdl

    • Fedora Datastream Input Spec [for Behavior Mech]

      Namespace: http://fedora.comm.nsdlib.org/service/bindspec

      Schema Loc: %FEDORA_HOME%\dist\server\xsd\bindspec.xsd

  • c. Clients for ingest: (see clients listed in Data Object section above)

Interactive Building

  • a. Process summary: A user can create Behavior Definition and Behavior Mechanism objects using the Admin GUI Java application. The Admin GUI provides builders which act as “wizards” for creating objects one at a time. Behind the scenes the builders create METS submission packages containing all of the requisite metadata formats for method definitions and service bindings.

  • b. Clients for interactive building

    • Admin GUI - From the menu, select Builders/BDefBuilder or Builders/BMechBuilder. These will open up a respective wizard for creating new objects. For more details see the guide entitled “Java Application: Admin GUI -- Behavior Definition and Mechanism Builders” in the Client Documentation.

    • AutoIngester Script

Rules for Encoding a Fedora Object XML Submission Package

The XML submission format for Fedora objects is an extension of the Metadata Encoding and Transmission Standard (METS). More information on METS can be found at http://www.loc.gov/standards/mets/. As of Fedora 1.0, the repository will only accept a Fedora-specific extension of the METS 1.0 Schema. We made a few minor additions to METS 1.0 to accommodate the requirements of Fedora. In the future, we plan to accept XML submissions encoded to the METS 1.3 Schema, which will have changes that accommodate the few Fedora-specific extensions to METS. In the meantime, we validate the Fedora Object XML submissions against the Fedora extension of METS which is published at: http://www.fedora.info/definitions/1/0/mets-fedora-ext.xsd.

Since METS was designed to be very generic and support a variety of uses, the rules of the METS Schema are very general-purpose. Fedora objects must conform to other rules that are beyond the scope of what is expressed in the METS schema. Therefore, the Fedora Object XML submissions will also be validated against a set of Fedora-specific rules that are expressed using the Schematron language. Internally, the repository will use Schematron to enforce these rules on incoming XML submission packages. The Schematron rules are expressed in XML and can be found in the Fedora server distribution at:

%FEDORA_HOME%\dist\server\schematron\fedoraRulesExt.xml.

For convenience and ease of understanding we have enumerated the Fedora rules in plain English below:

Data Object Encoding Rules:

Encoding by hand requires a pretty good understanding of METS, although it can be done by following the patterns in the demo objects that come with the Fedora distribution. Demo objects are located at: %FEDORA_HOME%\dist\client\demo.

  • General attributes:

    • a. On METS root element, the OBJID attribute will represent the Fedora object PID. Normally, this should be left empty so that the Fedora repository can generate a new PID. However, you can assign test/demo PIDs by inserting a value in OBJID that begins with “demo:” or “test:” (for example, “demo:100”).

    • b. On METS root element, the value of the TYPE attribute must be “FedoraObject”

    • c. On METS root element, the value of LABEL serves as the official description of the object. If there is no Dublin Core record present in the object, the Fedora repository will use this label to populate the title element of a baseline Dublin Core record for the object.

    • d. On METS root element, the PROFILE attribute can be used by institutions to classify different types of Fedora data objects.

    • e. On the METS:metsHdr element the CREATEDATE attribute is optional since the Fedora repository will assign this at ingest time. However, if a CREATEDATE attribute exists in the submission package it must be populated with a date value in the following format: YYYY-MM-DDThh:mm:ss. The same thing goes for LASTMODDATE.

    • f. On the METS:metsHdr element the RECORDSTATUS should be set to “I” to indicate the METS serves as an “Ingest” package for Fedora.

    • g. It is recommended that the METS:agent element be populated with information about the agent responsible for submitting the METS file as a submission format for Fedora to ingest.
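
The general attribute rules above can be sketched with Python's standard xml.etree.ElementTree module. This is a minimal illustration under stated assumptions, not a complete submission package: the METS namespace URI is the standard one and should be checked against the demo objects, the function names are hypothetical, and the sketch treats “demo:” and “test:” as the only prefixes allowed for hand-assigned PIDs.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

METS_NS = "http://www.loc.gov/METS/"  # standard METS namespace (assumed here)
ET.register_namespace("METS", METS_NS)


def mets_date(moment: datetime) -> str:
    """Format a timestamp as YYYY-MM-DDThh:mm:ss."""
    return moment.strftime("%Y-%m-%dT%H:%M:%S")


def build_root(label: str, pid: str = "", profile: str = "") -> ET.Element:
    """Build the METS root and header for a Data Object submission."""
    # An empty OBJID lets the repository generate the PID at ingest time.
    if pid and not re.match(r"^(demo|test):", pid):
        raise ValueError("OBJID must be empty or start with demo: or test:")
    root = ET.Element(f"{{{METS_NS}}}mets",
                      {"OBJID": pid, "TYPE": "FedoraObject", "LABEL": label})
    if profile:
        root.set("PROFILE", profile)
    # RECORDSTATUS "I" marks the file as an ingest package.
    ET.SubElement(root, f"{{{METS_NS}}}metsHdr",
                  {"CREATEDATE": mets_date(datetime(2004, 4, 16, 12, 57, 45)),
                   "RECORDSTATUS": "I"})
    return root


root = build_root("Sample image object", pid="demo:100")
header = root[0]
```

Calling ET.tostring(root) serializes the result for inspection. The recommended METS:agent element is left out of the sketch.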

  • Datastreams:

    • a. To create a proper section for Datastreams in the METS file, the METS:fileSec must have a single child METS:fileGrp element whose ID attribute has the value “DATASTREAMS”

    • b. Datastreams that are encoded in the METS:fileSec must follow the following pattern to establish proper version groups and datastream IDs. Each datastream has its own METS:fileGrp whose ID attribute is the official datastream ID. The recommended convention is ID=“DSn” where n is a number (for example, ID=“DS1” or ID=“DS2”).

    • c. Within a METS:fileGrp, there can be one or more METS:file elements to represent different versions of a datastream. As of Fedora 1.2, versioning of data objects is supported. The METS:file element for the datastream must have an ID attribute that represents the version number relative to the datastream ID set in the METS:fileGrp. The recommended convention is ID=“DSn.v” where n is the number of the datastream and v is the version number (for example, ID=“DS1.0” or ID=“DS1.1”).

    • d. The METS:file element for a datastream must have a CREATED attribute whose value is populated with a date in the following format: “YYYY-MM-DDThh:mm:ss”.

    • e. The METS:file element for a datastream must have a MIMETYPE.

    • f. The METS:file element for a datastream must have an OWNERID attribute. In Fedora, the OWNERID attribute is used to encode the Datastream Control Group. The following are valid values:

      • 1. “M” – Managed Content. This tells the repository to store the datastream’s content byte stream inside the repository. When the METS:file contains “M” on the OWNERID, the repository will resolve the URL associated with the METS:file element and pull the content into the repository for permanent storage. Fedora will establish its own local identifier for retrieving the content, and disregard the original URL that came in on the METS submission package.

      • 2. “E” - External Referenced Content. This tells the repository to store the URL for the datastream content, not the content byte stream itself. For this type of datastream, Fedora does not actually manage or have custodianship of the content, but it manages the link to the content and some basic metadata about it.

      • 3. “R” – Redirected Content. Like “E” this tells the repository to store the URL for the datastream content, not the content byte stream itself. More importantly, it tells the repository not to mediate or proxy this content at runtime. Instead, the repository will redirect to the URL at run time. This is desirable when a datastream points to a streaming media source, or to a complex web page where some components are lost during proxying.
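
The datastream rules above can be sketched as follows. This is a hedged illustration, not the definitive layout: the function name is hypothetical, the DATASTREAMS container is modeled as a METS:fileGrp (METS:fileSec holds file groups), METS:FLocat is the general METS element for a file URL and is an assumption here, and the xlink namespace URI should be checked against the demo objects.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"       # standard METS namespace (assumed)
XLINK_NS = "http://www.w3.org/1999/xlink"  # assumed; check the demo objects
ET.register_namespace("METS", METS_NS)
ET.register_namespace("xlink", XLINK_NS)

CONTROL_GROUPS = {"M", "E", "R"}  # Managed, External Referenced, Redirected


def add_datastream(container, number, version, mime, control_group, url, created):
    """Add one datastream version under the DATASTREAMS file group."""
    if control_group not in CONTROL_GROUPS:
        raise ValueError("OWNERID must be one of M, E, R")
    ds_id = f"DS{number}"
    # Reuse the version group for this datastream if it already exists.
    group = container.find(f"{{{METS_NS}}}fileGrp[@ID='{ds_id}']")
    if group is None:
        group = ET.SubElement(container, f"{{{METS_NS}}}fileGrp", {"ID": ds_id})
    file_el = ET.SubElement(group, f"{{{METS_NS}}}file", {
        "ID": f"{ds_id}.{version}",
        "CREATED": created,
        "MIMETYPE": mime,
        "OWNERID": control_group,
    })
    # METS:FLocat carrying the content URL is an assumption of this sketch.
    ET.SubElement(file_el, f"{{{METS_NS}}}FLocat",
                  {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url})
    return file_el


file_sec = ET.Element(f"{{{METS_NS}}}fileSec")
datastreams = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", {"ID": "DATASTREAMS"})
add_datastream(datastreams, 1, 0, "image/jpeg", "M",
               "http://example.org/img.jpg", "2004-04-16T12:57:45")
add_datastream(datastreams, 1, 1, "image/jpeg", "M",
               "http://example.org/img-v2.jpg", "2004-04-17T09:00:00")
```

Both calls land in the same DS1 version group, so the group ends up holding versions DS1.0 and DS1.1.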

  • Inline XML Datastreams:

    • a. Datastreams can also be encoded in the METS:dmdSecFedora and METS:amdSec. These are considered “inline XML datastreams” in Fedora. The METS:dmdSecFedora and METS:amdSec elements act as datastream version group containers just like the METS:fileGrp acts for regular datastreams. Within these elements, the METS “metadata section” elements (i.e., METS:techMD, METS:rightsMD, etc.) are used for the specific version instances of the inline metadata datastreams, just like the METS:file acts for regular datastreams. The datastream IDs work the same way, where the ID attribute on the container element acts as the datastream ID, and the ID on the metadata section element acts as the datastream version ID.

    • b. Do not use the schemaLocation attribute in the root element of inline XML datastreams (within METS:mdWrap element).

  • Dublin Core Record Datastream:

    • a. A Dublin Core (DC) record is optional in the Fedora object submission package. If one is not provided the repository will automatically create a minimal DC record in the object by using the LABEL (on METS root) as the DC title element. It will also use the object PID as the DC identifier element.

    • b. If a DC record is provided in the METS submission package it should be encoded within a METS:dmdSecFedora. The dmdSecFedora element will act as the datastream version group container. It MUST have an ID attribute whose value is “DC” to be recognized by Fedora!

    • c. Within the METS:dmdSecFedora, there must be one METS:descMD element. This element is part of the Fedora extension of METS 1.0 and is used to encode a specific version of the DC datastream within the version group container. The ID attribute on the METS:descMD element MUST have the value “DC1.0” to be recognized by Fedora.

    • d. The actual DC metadata should be encoded using the Open Archives Initiative (OAI) Dublin Core schema.
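
A Dublin Core datastream following these rules might be assembled as in the sketch below. The function name is hypothetical, the METS namespace URI and the MIMETYPE and MDTYPE values on METS:mdWrap are assumptions, and the METS:xmlData wrapper follows general METS usage; copy the exact attributes from a demo object before relying on them.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"  # standard METS namespace (assumed)
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
for prefix, uri in (("METS", METS_NS), ("oai_dc", OAI_DC_NS), ("dc", DC_NS)):
    ET.register_namespace(prefix, uri)


def build_dc_section(title: str, pid: str) -> ET.Element:
    """Build a METS:dmdSecFedora holding one version of the DC datastream."""
    # ID="DC" on the container and ID="DC1.0" on the version are required.
    section = ET.Element(f"{{{METS_NS}}}dmdSecFedora", {"ID": "DC"})
    version = ET.SubElement(section, f"{{{METS_NS}}}descMD", {"ID": "DC1.0"})
    # The mdWrap attribute values below are assumptions of this sketch.
    wrap = ET.SubElement(version, f"{{{METS_NS}}}mdWrap",
                         {"MIMETYPE": "text/xml", "MDTYPE": "OTHER"})
    xml_data = ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")
    # No schemaLocation attribute is set on the inline root element.
    record = ET.SubElement(xml_data, f"{{{OAI_DC_NS}}}dc")
    ET.SubElement(record, f"{{{DC_NS}}}title").text = title
    ET.SubElement(record, f"{{{DC_NS}}}identifier").text = pid
    return section


dc_section = build_dc_section("Sample image object", "demo:100")
dc_title = dc_section.find(f".//{{{DC_NS}}}title").text
```

The title and identifier mirror what the repository would generate on its own from the LABEL and the PID when no DC record is supplied.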

  • Disseminators

    • a. Each Disseminator is encoded in its own METS:behaviorSec element. The METS:behaviorSec element acts as a version container for different versions of the Disseminator. As of Fedora 1.2.1, only one version is supported. Each Disseminator must have a disseminator ID which is encoded in the ID attribute of the METS:behaviorSec. The recommended convention is ID=“DISSn” where n is a number (for example, ID=“DISS1” or ID=“DISS2”).

    • b. The METS:serviceBinding element represents a particular version of the disseminator. Again, in Fedora 1.2.1 only one version is supported. The element must have an ID attribute that represents the version number relative to the Disseminator ID that is set in the METS:behaviorSec. The recommended convention is ID=“DISSn.v” where n is the number of the Disseminator and v is the version number (for example, ID=“DISS1.0” or ID=“DISS1.1”).

    • c. The METS:serviceBinding element must have a CREATED attribute whose value is populated with a date in the following format: “YYYY-MM-DDThh:mm:ss”.

    • d. The METS:serviceBinding element must have a STRUCTID attribute. The value of this attribute is the ID of a METS:structMap section in the submission package. The METS:structMap section constitutes the Fedora “Datastream Binding Map” which identifies the Datastreams in the object that will be used by the Disseminator. Specifically, these are the datastreams that fulfill the “data contract” defined by the Behavior Mechanism Object that is pointed to by the Disseminator.

    • e. The METS:structMap, in turn, points to Datastreams in the object, and gives them a special name via the TYPE attribute of the METS:structMap. Again, the METS:structMap encodes the fulfillment of the “data contract” that the Behavior Mechanism object specifies so that datastreams can act as input parameters to service methods (described earlier).

    • f. To make a Disseminator point to a Behavior Definition Object (to make the object subscribe to a “behavior contract”), there must be a single METS:interfaceMD element as a child to the METS:serviceBinding element. The METS:interfaceMD element must have a LOCTYPE attribute whose value is “URN” and an xlink:href attribute whose value is the PID of a Fedora Behavior Definition Object.

    • g. To make a Disseminator point to a Behavior Mechanism Object (to associate a particular service that runs a behavior contract’s methods), there must be a single METS:serviceBindMD element as a child to the METS:serviceBinding element. The METS:serviceBindMD element must have a LOCTYPE attribute whose value is “URN” and an xlink:href attribute whose value is the PID of a Fedora Behavior Mechanism Object.
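
The Disseminator rules above can be sketched as follows. This is an illustration under assumptions rather than a complete encoding: the function name and sample PIDs are hypothetical, the structure map attribute is spelled STRUCTID as in the METS schema, the xlink namespace URI should be checked against the demo objects, and the METS:structMap that the binding refers to is not built here.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"       # standard METS namespace (assumed)
XLINK_NS = "http://www.w3.org/1999/xlink"  # assumed; check the demo objects
HREF = f"{{{XLINK_NS}}}href"
ET.register_namespace("METS", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_disseminator(number, bdef_pid, bmech_pid, struct_map_id, created):
    """Build a METS:behaviorSec holding a single Disseminator version."""
    section = ET.Element(f"{{{METS_NS}}}behaviorSec", {"ID": f"DISS{number}"})
    binding = ET.SubElement(section, f"{{{METS_NS}}}serviceBinding", {
        "ID": f"DISS{number}.0",    # only one version is supported in 1.2.1
        "CREATED": created,
        "STRUCTID": struct_map_id,  # ID of the Datastream Binding Map
    })
    # Subscribes the object to a behavior contract.
    ET.SubElement(binding, f"{{{METS_NS}}}interfaceMD",
                  {"LOCTYPE": "URN", HREF: bdef_pid})
    # Names the service that implements that contract.
    ET.SubElement(binding, f"{{{METS_NS}}}serviceBindMD",
                  {"LOCTYPE": "URN", HREF: bmech_pid})
    return section


diss = build_disseminator(1, "demo:1", "demo:2", "S1", "2004-04-16T12:57:45")
binding = diss[0]
```

The two child elements carry the only references from a Data Object to its Behavior Definition and Behavior Mechanism Objects.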

Behavior Definition Object Rules

The best way to create these objects is via the wizard builder in Admin GUI which requires no knowledge of the XML formats.

Encoding by hand requires a pretty good understanding of METS and the Fedora Method Map schema. However, it can be done by following the patterns in the demo objects that come with the Fedora distribution.

Demo objects are located at: %FEDORA_HOME%\dist\client\demo.

  • General attributes: follow the rules for Fedora Data Objects with the following exceptions:

    • a. On METS root element, the value of the TYPE attribute must be “FedoraBDefObject”

    • b. On METS root element, the value of PROFILE must be “fedora:BDEF”

  • Datastreams

    • a. There must be at least one datastream that represents some sort of documentation about the Behavior Definition. This should be encoded in the METS:fileSec just like datastreams for Data Objects. See the Data Object rules above.

  • Inline XML Datastreams:

    • a. Fedora Method Map: There MUST be an inline XML datastream containing Fedora Method Map metadata. This must be encoded within a METS:amdSec whose ID attribute MUST have “METHODMAP” as its value. Note that there are some elements and attributes in the Fedora Method Map schema that do not apply to its use in Behavior Definition Objects. Do not encode the following in the Method Map of a Behavior Definition because they apply only to a Method Map in the context of a Behavior Mechanism:

      • i. Do not use element DefaultInputParm

      • ii. Do not use element DatastreamInputParm

      • iii. Do not use element MethodReturnType

      • iv. Do not use attributes wsdlMsgName and wsdlMsgOutput on the Method element.

    • b. OAI Dublin Core: Like other data objects there can be an optional Dublin Core record within a METS:dmdSecFedora. See Data Object rules.
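
The Method Map restrictions for Behavior Definition Objects lend themselves to a simple pre-ingest check. The sketch below is hypothetical and is not the repository's Schematron validation: the function names are made up, the root element name MethodMap in the sample is an assumption, and elements are matched by local name so that the check does not depend on the exact namespace.

```python
import xml.etree.ElementTree as ET

BMECH_ONLY_ELEMENTS = {"DefaultInputParm", "DatastreamInputParm", "MethodReturnType"}
BMECH_ONLY_METHOD_ATTRS = {"wsdlMsgName", "wsdlMsgOutput"}


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def bdef_method_map_problems(method_map_xml: str) -> list:
    """List uses of constructs that belong only in a Behavior Mechanism."""
    problems = []
    for element in ET.fromstring(method_map_xml).iter():
        name = local_name(element.tag)
        if name in BMECH_ONLY_ELEMENTS:
            problems.append(f"element {name} is not allowed")
        if name == "Method":
            for attr in sorted(BMECH_ONLY_METHOD_ATTRS & set(element.attrib)):
                problems.append(f"attribute {attr} is not allowed on Method")
    return problems


# Hypothetical Method Map; the root element name is an assumption.
sample = """
<MethodMap xmlns="http://fedora.comm.nsdlib.org/service/methodmap">
  <Method operationName="getThumbnail" wsdlMsgName="getThumbnailRequest">
    <DatastreamInputParm parmName="SOURCE_IMAGE"/>
  </Method>
</MethodMap>
"""
problems = bdef_method_map_problems(sample)
```

For this sample the check reports the wsdlMsgName attribute and the DatastreamInputParm element, both of which belong only in a Behavior Mechanism Method Map.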

  • Disseminators

    • a. Behavior Definition Objects use what is known as the “Bootstrap Disseminator.” The implementation of the Bootstrap Disseminator is built into the Fedora system. Its purpose is to allow the contents of Behavior Definition Objects to be disseminated via the Fedora Access API. In Fedora 1.1.1, the Bootstrap Disseminator functionality is not turned on. However, it must still be encoded in the METS submission package.

    • b. In terms of encoding the bootstrap disseminator, follow the examples in the demo objects. The METS:behaviorSec and the METS:structMap are static, meaning that they will be the same in every Behavior Definition Object. It is recommended that these sections just be copied into new objects from a demo object.
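
Copying the static sections from a demo object can also be done programmatically. The following is a minimal sketch assuming the standard METS namespace URI; the function name and the IDs in the inline sample are hypothetical, and in practice demo_root would be parsed from a file under %FEDORA_HOME%\dist\client\demo.

```python
import copy
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"  # standard METS namespace (assumed)
STATIC_SECTIONS = {f"{{{METS_NS}}}structMap", f"{{{METS_NS}}}behaviorSec"}


def copy_bootstrap_sections(demo_root, new_root):
    """Append deep copies of the static sections, in document order."""
    copied = 0
    for child in demo_root:
        if child.tag in STATIC_SECTIONS:
            new_root.append(copy.deepcopy(child))
            copied += 1
    return copied


# Tiny stand-in for a demo object; real demo objects carry full content.
demo = ET.fromstring(
    f'<METS:mets xmlns:METS="{METS_NS}" TYPE="FedoraBDefObject">'
    '<METS:fileSec/>'
    '<METS:structMap ID="S1"/>'
    '<METS:behaviorSec ID="DISS1"/>'
    '</METS:mets>'
)
new_object = ET.Element(f"{{{METS_NS}}}mets", {"TYPE": "FedoraBDefObject"})
count = copy_bootstrap_sections(demo, new_object)
```

Deep copies keep the demo tree untouched, so the same parsed demo object can seed any number of new objects.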

Behavior Mechanism Object Rules

The best way to create these objects is via the wizard builder in Admin GUI which requires no knowledge of METS or special XML metadata formats.

Encoding by hand requires a pretty good understanding of METS, the Web Services Description Language (WSDL), the Fedora Method Map schema, and the Fedora Datastream Input Specification schema. However, it can be done by following the patterns in the demo objects that come with the Fedora distribution. Demo objects are located at: %FEDORA_HOME%\dist\client\demo.

  • General attributes: follow the rules for Fedora Data Objects with the following exceptions:

    • a. On METS root element, the value of the TYPE attribute must be “FedoraBMechObject”

    • b. On METS root element, the value of PROFILE must be “fedora:BMECH”

  • Datastreams

    • a. There must be at least one datastream that represents some sort of documentation about the Behavior Mechanism. This should be encoded in the METS:fileSec just like datastreams for Data Objects. See the Data Object rules above.

  • Inline XML Datastreams:

    • a. Fedora Method Map: There MUST be an inline XML datastream containing Fedora Method Map metadata. This must be encoded within a METS:amdSec whose ID attribute MUST have “METHODMAP” as its value. The following elements and attributes must have referential integrity with the WSDL datastream described below:

      • i. On Method element, the value of attribute operationName must match the value of the name attribute on a wsdl:operation element.

      • ii. On Method element, the value of the attribute wsdlMsgName must match the value of the name attribute on a wsdl:message element that represents the messaging format in WSDL for a particular service method.

      • iii. On Method element, the value of the attribute wsdlMsgOutput must match the value of the name attribute on a wsdl:message element that represents the messaging format in WSDL for the response of a particular service method.

      • iv. On elements DatastreamInputParm, UserInputParm, and DefaultInputParm, the value of the parmName attribute must match the name attribute on a wsdl:part element (which is a child of a wsdl:message).

    • b. OAI Dublin Core: Like other data objects there can be an optional Dublin Core record within a METS:dmdSecFedora. See Data Object rules.

    • c. Web Services Description Language (WSDL): this is used to encode the service binding metadata for the methods expressed in the Fedora Method Map.

    • d. Fedora Datastream Input Spec: this is used to encode the fulfillment of the “data contract” specifying the kinds of datastreams that must be available in a Data Object that uses this Behavior Mechanism Object. These datastreams will ultimately act as input parameters to the methods described in the Method Map and the WSDL.
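
The referential integrity rules between the Method Map and the WSDL can be checked before ingest with a short script. This is a hypothetical sketch, not part of the Fedora distribution: the function names and sample documents are made up, the MethodMap root element name is an assumption, and Method Map elements are matched by local name.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
PARM_ELEMENTS = {"DatastreamInputParm", "UserInputParm", "DefaultInputParm"}


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def wsdl_names(wsdl_root, element):
    """Collect the name attributes of all wsdl:<element> elements."""
    return {e.get("name") for e in wsdl_root.iter(f"{{{WSDL_NS}}}{element}")}


def integrity_problems(method_map_xml, wsdl_xml):
    """List Method Map references that have no counterpart in the WSDL."""
    wsdl = ET.fromstring(wsdl_xml)
    operations = wsdl_names(wsdl, "operation")
    messages = wsdl_names(wsdl, "message")
    parts = wsdl_names(wsdl, "part")
    problems = []
    for element in ET.fromstring(method_map_xml).iter():
        name = local_name(element.tag)
        if name == "Method":
            if element.get("operationName") not in operations:
                problems.append(f"operationName {element.get('operationName')!r} "
                                "has no matching wsdl:operation")
            for attr in ("wsdlMsgName", "wsdlMsgOutput"):
                if element.get(attr) not in messages:
                    problems.append(f"{attr} {element.get(attr)!r} "
                                    "has no matching wsdl:message")
        elif name in PARM_ELEMENTS and element.get("parmName") not in parts:
            problems.append(f"parmName {element.get('parmName')!r} "
                            "has no matching wsdl:part")
    return problems


# Hypothetical sample documents.
wsdl_doc = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/">
  <message name="getThumbnailRequest"><part name="SOURCE_IMAGE"/></message>
  <message name="thumbnailResponse"><part name="thumbnail"/></message>
  <portType name="ThumbnailPortType"><operation name="getThumbnail"/></portType>
</definitions>
"""
method_map_doc = """
<MethodMap xmlns="http://fedora.comm.nsdlib.org/service/methodmap">
  <Method operationName="getThumbnail" wsdlMsgName="getThumbnailRequest"
          wsdlMsgOutput="thumbnailResponse">
    <DatastreamInputParm parmName="SOURCE_IMAGE"/>
    <UserInputParm parmName="size"/>
  </Method>
</MethodMap>
"""
problems = integrity_problems(method_map_doc, wsdl_doc)
```

In this sample the only mismatch is the user parameter size, which has no wsdl:part of the same name.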

  • Disseminators

    • a. Behavior Mechanism Objects use what is known as the “Bootstrap Disseminator.” The implementation of the Bootstrap Disseminator is built into the Fedora system. Its purpose is to allow the contents of Behavior Mechanism Objects to be disseminated via the Fedora Access API. In Fedora 1.1.1, the Bootstrap Disseminator functionality is not turned on. However, it must still be encoded in the METS submission package.

    • b. In terms of encoding the bootstrap disseminator, follow the examples in the demo objects. The METS:behaviorSec and the METS:structMap are static, meaning that they will be the same in every Behavior Mechanism object. It is recommended that these sections just be copied into new objects from a demo object.