Fedora4 Beta Sprint 14: Enhanced Object and Datastream Versioning, Triplestore, Linked Data, Performance ++
Submitted on Mon, 2014-04-28 12:43
Winchester, MA. The Fedora team has concluded the fourteenth sprint within the "Beta Phase" of Fedora4 development. The work of this and each sprint is planned and completed thanks to the contributions of Fedora stakeholder institutions that allocate developer time. If you would like to be involved with Fedora4 development, please send an email to Andrew Woods email@example.com, or to firstname.lastname@example.org. If you have comments on the work from this sprint, please also send an email or comment directly on the wiki.
Read the Sprint B14 summary:
About Sprint 14, Beta Phase
Kevin S. Clarke - University of California, Los Angeles
Michael Durbin - University of Virginia
Esme Cowles - University of California, San Diego
Longshou Situ - University of California, San Diego
Chris Beer - Stanford University
This sprint enhanced the object and datastream versioning capability in two fundamental ways. First, whereas only the creation of new versions was previously supported, this sprint added the corollary capability of reverting (rolling back) to a previous version.
Second, the deletion of previous versions is now supported.
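To make the three operations concrete, here is a minimal in-memory sketch of creating, reverting to, and deleting versions. It is a toy model, not Fedora 4 code: the class name, method names, and the choice that a revert restores the saved content while keeping the history are all assumptions made for illustration.

```python
class VersionedResource:
    """Toy model of a versioned object or datastream (hypothetical, not Fedora's API)."""

    def __init__(self, content):
        self.content = content
        self.versions = {}  # version label -> snapshot of the content

    def create_version(self, label):
        # Save the current content under a label.
        if label in self.versions:
            raise ValueError(f"version {label!r} already exists")
        self.versions[label] = self.content

    def revert_to(self, label):
        # Assumption: reverting restores the saved content and leaves history intact.
        self.content = self.versions[label]

    def delete_version(self, label):
        # Remove a previous version from the history.
        del self.versions[label]


doc = VersionedResource("draft 1")
doc.create_version("v1")
doc.content = "draft 2"
doc.create_version("v2")
doc.revert_to("v1")       # current content is "draft 1" again
doc.delete_version("v2")  # only "v1" remains in the history
```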
While the internal search index within Fedora 4 natively supports reindexing on startup, the recommended pattern for exposing a search experience to repository users did not, prior to this sprint, support reindexing the external Solr or triplestore indices. This sprint introduced an HTTP endpoint for triggering the reindex of external indices for:
• the entire repository
• a tree of resources within the repository, or
• a single repository resource
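The three scopes above differ only in which resources are selected for reindexing. The sketch below illustrates that selection over a plain list of resource paths. It does not call or describe the actual HTTP endpoint; the function name, its parameters, and the sample paths are hypothetical.

```python
def resources_to_reindex(all_paths, target="/", recursive=True):
    """Select the paths covered by a reindex request (illustrative only)."""
    target = target.rstrip("/") or "/"
    if not recursive:
        # Single resource: only an exact match.
        return [p for p in all_paths if p == target]
    if target == "/":
        # Entire repository.
        return list(all_paths)
    # Tree of resources: the target plus everything beneath it.
    prefix = target + "/"
    return [p for p in all_paths if p == target or p.startswith(prefix)]


paths = ["/a", "/a/b", "/a/b/ds1", "/ab", "/c"]
whole = resources_to_reindex(paths)                            # every path
subtree = resources_to_reindex(paths, "/a")                    # "/a" and its descendants
single = resources_to_reindex(paths, "/a/b", recursive=False)  # just "/a/b"
```

Matching on `target + "/"` rather than on the bare prefix keeps a sibling such as `/ab` out of the `/a` subtree.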
Beyond reindexing, this sprint also demonstrated a configuration in which more than one Fedora 4 repository feeds events into a single external triplestore.
3) Linked data
In the ongoing effort to expose repository resources in a standardized, linked-data-friendly manner, Fedora 4 continues to keep in step with the maturing Linked Data Platform (LDP) draft specification. This sprint added support for the HTTP request headers that let a user indicate a preference for how comprehensive the triples in a response should be. Likewise, HTTP response headers were added that specify paging information and the relationships between parent and child resources in an LDP fashion.
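As a rough illustration of the headers involved, the sketch below builds a Prefer request header value and reads rel values out of a Link response header. It follows the syntax of the LDP specification as eventually published; the draft current at the time of this sprint, and the exact preferences Fedora 4 honors, may differ. The helper names and the example URL are hypothetical.

```python
import re

LDP = "http://www.w3.org/ns/ldp#"


def prefer_header(include=(), omit=()):
    """Build a Prefer header value from sequences of LDP preference names."""
    parts = ["return=representation"]
    if include:
        parts.append('include="%s"' % " ".join(LDP + name for name in include))
    if omit:
        parts.append('omit="%s"' % " ".join(LDP + name for name in omit))
    return "; ".join(parts)


# Matches entries such as: <http://example.org/x>; rel="next"
_LINK = re.compile(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?')


def parse_link_header(value):
    """Return (rel, url) pairs found in a Link header value."""
    return [(rel, url) for url, rel in _LINK.findall(value)]


minimal = prefer_header(include=["PreferMinimalContainer"])
links = parse_link_header(
    '<http://example.org/rest/obj?page=2>; rel="next", '
    '<http://www.w3.org/ns/ldp#Resource>; rel="type"'
)
```

A client would send `minimal` as the value of the Prefer request header and look for a `next` entry in `links` to follow paged results.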
Tests this sprint focused on determining whether object creation speed is affected as the number of child resources (objects or datastreams) under a single parent resource increases. These tests were run with multiple backend storage configurations to also assess what role, if any, the storage backend plays in performance trends.
Thirty thousand objects (first with 1 KB datastreams, then with 2 MB datastreams) were created at the top level of the repository and individually timed using the following backends:
Although the 2 MB tests indicated a slight up-tick in per-object creation time, the trend was not conclusive. The tests will be repeated with a greater number of objects.
The test results can be found on the wiki.
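One simple way to quantify a per-object slowdown from individually timed creations is to fit a least-squares line to the timings and look at its slope. The sketch below shows that calculation; it is not the analysis used for the sprint's tests, and the sample timings are made-up numbers, not results from the wiki.

```python
def creation_time_slope(timings):
    """Least-squares slope of creation time against creation order.

    A slope near zero suggests no slowdown as siblings accumulate;
    a positive slope suggests each new object takes slightly longer.
    """
    n = len(timings)
    if n < 2:
        raise ValueError("need at least two timings")
    mean_x = (n - 1) / 2
    mean_y = sum(timings) / n
    num = sum((i - mean_x) * (t - mean_y) for i, t in enumerate(timings))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Hypothetical timings in milliseconds, for illustration only.
flat = [10.0, 10.0, 10.0, 10.0]
rising = [10.0, 11.0, 12.0, 13.0]
flat_slope = creation_time_slope(flat)      # no trend
rising_slope = creation_time_slope(rising)  # 1 ms slower per additional object
```

With noisy real measurements, a larger sample makes the slope estimate more stable, which is consistent with the plan to repeat the tests with more objects.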
5) Test Coverage
Unit and integration test coverage is a vital factor in maintaining a healthy code base. The following are the code coverage statistics at the end of this sprint, and the change from the previous sprint.
• Unit tests: 73.1% (up 0.9%)
• Integration tests: 71.7% (up 0.2%)
• Overall coverage: 86.0% (up 0.7%)
Several feature enhancements and bugs were addressed during this sprint. Bug fixes and application polishing included:
• Added resource locking for concurrent requests on same resource(s)
** Locking is available at single node and hierarchy tree levels
• Enhanced pluggability of external/internal identifier mappings
• Created a utility for uploading a sample dataset to a running Fedora 4 repository
• Updated jms-indexer-pluggable integration tests to deploy Fedora webapp in addition to triplestore and indexer webapp
• Enhanced REST-API to return timestamp when creating objects and datastreams
• Improved HTTP caching headers
• Corrected multiple HTTP response codes
• Fixed authorization bug which prevented users with both reader and writer roles from reading a resource
• Fixed benchtool bug in which the number of HTTP client threads created did not match the "num threads" option
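The single-node and hierarchy-tree locking levels mentioned in the list above can be pictured with a small lock table keyed by resource path. This is a toy sketch under stated assumptions, not Fedora 4's locking implementation: the class, its methods, and the conflict rules are hypothetical, and the table itself is not thread-safe.

```python
def _is_under(child, parent):
    # True when child is a strict descendant of parent in the path hierarchy.
    return child.startswith(parent.rstrip("/") + "/")


class LockManager:
    """Toy lock table supporting node-level and tree-level (deep) locks."""

    def __init__(self):
        self._locks = {}  # path -> (owner, deep)

    def acquire(self, path, owner, deep=False):
        for locked, (holder, locked_deep) in self._locks.items():
            if holder == owner:
                continue
            if locked == path:
                return False  # same node already locked by someone else
            if locked_deep and _is_under(path, locked):
                return False  # an ancestor holds a tree-level lock
            if deep and _is_under(locked, path):
                return False  # a descendant is locked, so the tree cannot be
        self._locks[path] = (owner, deep)
        return True

    def release(self, path, owner):
        if self._locks.get(path, (None, False))[0] == owner:
            del self._locks[path]


locks = LockManager()
tree_lock = locks.acquire("/a", "alice", deep=True)  # alice locks the /a tree
blocked = locks.acquire("/a/b", "bob")               # refused: inside alice's tree
elsewhere = locks.acquire("/c", "bob")               # allowed: unrelated node
locks.release("/a", "alice")
retry = locks.acquire("/a/b", "bob")                 # allowed after release
```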
7) Developer Capacity
For the short and long-term health of Fedora development, the committer base must continue to expand. This sprint witnessed the addition of two new developers, one from each of the University of California, Los Angeles and the University of California, San Diego.