Fedora4 Team Completes Sixth Beta Phase Sprint: Authorization, Clustering and Large Files
Submitted on Mon, 2013-11-11 11:20
Winchester, MA - The Fedora team has completed the sixth sprint within the "Beta Phase" of Fedora4 development. The work of this and every sprint is planned and completed thanks to the contributions of Fedora stakeholder institutions that allocate developer time. If you would like to be involved with Fedora4 development, please send an email to Andrew Woods firstname.lastname@example.org, or the Fedora Steering Committee email@example.com. If you have comments on the work from this sprint, please also send an email or comment directly on the wiki.
Read the Sprint B6 summary:
About Sprint 6, Beta Phase
Greg Jansen - University of North Carolina, Chapel Hill
Osman Din - Yale University
Eric James - Yale University
Frank Asseg - FIZ Karlsruhe
A. Soroka - University of Virginia
Chris Beer - Stanford University
Scott Prater - University of Wisconsin
1) Authorization
The initial pattern for Fedora 4 authorization is that a given user request will already have been authenticated before it enters the Fedora 4 application. Authenticated user requests are expected to contain an identity and zero or more additional attributes, such as groups. These user attributes (together with any other attributes that may be mapped to the requesting user) and the requested action are compared against configurable rules to determine whether the user has the privilege to perform the action on the resource.
Administrators can apply rules to arbitrary resources (e.g. object, datastream, or node hierarchy) within the repository using the restricted Access Roles REST API [1]. Once the access rules have been defined on repository resources, the Basic Roles-based Policy Enforcement Point [2] (if enabled) will restrict requests as described above and in further detail on the wiki [3].
This feature is available and ready for community Acceptance Testing [4].
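The evaluation described above can be sketched roughly as follows. This is only an illustration of the general roles-based pattern, not the Fedora 4 implementation: the role names (`reader`, `writer`, `admin`), the action names, the `ROLE_PERMISSIONS` table, the `Request` class, and the rule that the role assignments on the nearest ancestor apply to a resource are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical mapping of roles to permitted actions (an assumption for
# illustration; the article does not list the actual roles or actions).
ROLE_PERMISSIONS = {
    "reader": {"read"},
    "writer": {"read", "write"},
    "admin": {"read", "write", "delete"},
}


@dataclass
class Request:
    """An already-authenticated request: identity, groups, action, target."""
    user: str
    groups: frozenset = frozenset()
    action: str = "read"
    path: str = "/"


def effective_assignments(acl, path):
    """Return the role assignments from the nearest ancestor-or-self path
    that has rules defined, or an empty mapping if none is found."""
    current = path.rstrip("/") or "/"
    while True:
        if current in acl:
            return acl[current]
        if current == "/":
            return {}
        # Step up one level in the hierarchy; "" means we reached the root.
        current = current.rsplit("/", 1)[0] or "/"


def is_permitted(acl, request):
    """Compare the user's identity and groups, plus the requested action,
    against the roles assigned on the resource hierarchy."""
    principals = {request.user} | set(request.groups)
    assignments = effective_assignments(acl, request.path)
    for principal in principals:
        for role in assignments.get(principal, ()):
            if request.action in ROLE_PERMISSIONS.get(role, set()):
                return True
    return False


# Example rules: principal -> roles, keyed by resource path (hypothetical).
acl = {
    "/": {"admins": ["admin"]},
    "/collections/maps": {"alice": ["writer"], "staff": ["reader"]},
}
```

With these example rules, a request from `alice` to write `/collections/maps/obj1` would be permitted, while a member of `staff` could read but not write the same resource.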
2) Clustering
Continuing with the theme of benchmarking and optimizing performance with Fedora 4 configured in a cluster, this sprint established clustering resources within the following institutions:
- FIZ Karlsruhe
- Yale University
- University of North Carolina, Chapel Hill
- University of Wisconsin
- Amazon Web Services
Also produced during this sprint was tooling for the consistent benchmarking of different clusters, documentation of cluster configuration, and some preliminary benchmarking results. The work of profiling and analyzing cluster performance remains for a future sprint.
Details of the cluster resources and benchmarking results can be found on the wiki [5].
3) Large Files
The native "projection" or "federation" capability offered by Fedora 4's underlying JCR implementation (ModeShape [6]) allows for content on the filesystem, in a database, web-accessible, etc., to be connected to and exposed through the repository. The results of testing this capability over multi-gigabyte files in Sprints B2 [7] and B4 [8] showed performance bottlenecks.
One of the advantages of leveraging the open source ModeShape under Fedora 4 is that we are able to push improvements upstream to that project. Modifications and enhancements to ModeShape's FileSystemConnector [9] from the Fedora 4 team were incorporated into ModeShape 3.6.0.
While significant performance improvements have been realized by these updates, further testing and a description of recommended patterns of practice are still to be completed.
Additional details of the "large files" approach and documented performance improvements can be found on the wiki [10].
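As a rough illustration of the projection idea, the sketch below exposes files that already live on the filesystem as repository-style nodes without copying them, and reads their content in fixed-size chunks so that a multi-gigabyte file is never held in memory at once. It does not use or mirror ModeShape's connector API: the `ProjectedFile` class, the `project_directory` function, and the 1 MiB chunk size are hypothetical names and values chosen for the example.

```python
import os

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary illustrative value


class ProjectedFile:
    """A node that points at a file on disk instead of holding its bytes."""

    def __init__(self, root, relative_path):
        # Repository-style path, always using forward slashes.
        self.path = "/" + relative_path.replace(os.sep, "/")
        self._fs_path = os.path.join(root, relative_path)

    @property
    def size(self):
        # Looked up on demand, so the node reflects the file's current state.
        return os.path.getsize(self._fs_path)

    def stream(self, chunk_size=CHUNK_SIZE):
        """Yield the file content in chunks rather than loading it whole."""
        with open(self._fs_path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                yield chunk


def project_directory(root):
    """Walk a directory tree and yield one ProjectedFile per file found."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            relative = os.path.relpath(os.path.join(dirpath, name), root)
            yield ProjectedFile(root, relative)
```

The design point is that only metadata is touched when the tree is listed; the bytes are read when a caller iterates over `stream()`.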
4) Repository Refinements
Several miscellaneous, yet important, improvements were made during this sprint to enhance the deployment and operations of a Fedora 4 repository.
- The REST API [11] was expanded to include endpoints for:
  - Copying and/or moving resources or trees of resources within the repository
  - Retrieving and creating repository node types (i.e. content models)
- Default locations are now provided so that the Fedora 4 web application can be deployed to a servlet container with zero required configuration
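A client request for the copy and move operations might be built along the following lines. The article does not describe the request format, so this sketch assumes the WebDAV convention (RFC 4918) of `COPY` and `MOVE` methods with a `Destination` header; the base URL, the resource paths, and the `build_copy_or_move` helper are hypothetical, and the REST API documentation should be consulted for the actual endpoints. The request is only constructed here, not sent.

```python
import urllib.request


def build_copy_or_move(base_url, source, destination, move=False):
    """Build (but do not send) a WebDAV-style COPY or MOVE request.

    Assumption: the repository accepts COPY/MOVE with a Destination
    header, as in RFC 4918. This is not confirmed by the article.
    """
    base = base_url.rstrip("/")
    method = "MOVE" if move else "COPY"
    request = urllib.request.Request(base + source, method=method)
    request.add_header("Destination", base + destination)
    return request


# Hypothetical base URL and resource paths, for illustration only.
copy_request = build_copy_or_move(
    "http://localhost:8080/rest", "/objects/a", "/archive/a"
)
move_request = build_copy_or_move(
    "http://localhost:8080/rest", "/objects/b", "/archive/b", move=True
)
```

Passing either request to `urllib.request.urlopen` would send it to a running server.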
Additionally, the codebase has been systematically refactored to allow iteration over repository resources. So far, this capability is exposed through the REST API for a subset of repository services.
[1] https://wiki.duraspace.org/display/FF/Access+Roles+Module
[2] https://wiki.duraspace.org/display/FF/Basic+Role-based+PEP
[3] https://wiki.duraspace.org/display/FF/Design+Guide+-+Policy+Enforcement+...
[4] https://wiki.duraspace.org/display/FF/Acceptance+Testing
[5] https://wiki.duraspace.org/display/FF/Test+Clusters
[6] https://docs.jboss.org/author/display/MODE/Federation
[7] https://wiki.duraspace.org/display/FF/Sprint+B2+Summary
[8] https://wiki.duraspace.org/display/FF/Sprint+B4+Summary
[9] https://docs.jboss.org/author/display/MODE/File+system+connector
[10] https://wiki.duraspace.org/display/FF/Design+-+Large+Files#Design-LargeF...
[11] https://wiki.duraspace.org/display/FF/REST+API