The DSpace User Group Meeting at OAI7
Submitted by on Wed, 2011-06-29 13:48
From Bram Luyten, @mire.
Statistics and Repository Exposure
The day kicked off with two practical talks on repository search engine optimization and statistics. Isidro F. Aguillo, the driving force behind the Webometrics ranking, introduced the changes to the algorithm for the upcoming July 2011 ranking. He drew a strong analogy between an institution's IP rights and its URI domain. Basically, when you allow content and links originally owned by your institution to wander off into other domains (hdl.net, youtube, …), your institution effectively loses search engine exposure for those materials. For example, if I'm a researcher funded by an institution and I choose to host my scientific blog on www.wordpress.com, this will be very good for the ranking of wordpress for the keywords that are identified on my pages. But it doesn't contribute to the overall exposure of my institution's domain or to its search engine ranking, and that is becoming more and more of a problem.
–Isidro F. Aguillo
Robert Tansley, one of the original creators of both the DSpace and EPrints software, currently works at Google. He gave some practical insights into how search engines look at repositories: they don't treat them very differently than they would treat any other website. He made a strong case for uniformity and for keeping all the links that DSpace comes with out of the box (in particular, the Browse by pages are heavily used by crawlers to index your content). His slides will be available soon, after internal Google approval for public release.
–Robert Tansley
DSpace 1.8 Feature Preview
After the break, @mire's Ben Bosman and Lieven Droogmans presented a summary of the DSpace 1.8-related presentations that were recently given at Open Repositories 2011. The Configurable Workflow will be @mire's major contribution to the 1.8 release. This framework will allow more flexible workflow implementations in DSpace, compared to the limited enabling/disabling of Edit Metadata steps that DSpace ships with today. The framework was illustrated with the example of review by invitation, a workflow in which non-DSpace users can be invited by email to do a one-time task. An interesting highlight is that fully automated steps are also supported. In the example, one such step evaluated the different score-based reviews and turned them into an automated accept or reject decision.
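The automated accept/reject step described above can be pictured with a minimal Python sketch. It does not use the Configurable Workflow API: the `Review` class, the 0 to 10 score scale, the threshold and the minimum number of reviews are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Review:
    """One invited reviewer's verdict (hypothetical structure)."""
    reviewer_email: str
    score: float  # assumed scale: 0 (worst) to 10 (best)


def automated_decision(reviews, accept_threshold=6.0, min_reviews=2):
    """Return 'pending', 'accept' or 'reject' without human intervention.

    The step waits until enough reviews are in, then compares the
    average score with the (assumed) acceptance threshold.
    """
    if len(reviews) < min_reviews:
        return "pending"
    average = mean(review.score for review in reviews)
    return "accept" if average >= accept_threshold else "reject"


if __name__ == "__main__":
    reviews = [Review("a@example.org", 8), Review("b@example.org", 6)]
    print(automated_decision(reviews))
```

In a real workflow, the outcome of such a step would decide which step the submission moves to next, for instance archiving the item or returning it to the submitter.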
Please refer to our earlier report on Curation Tasks in DSpace for information about this feature.
Several features are being added to DSpace 1.8 to support integration with DuraCloud. One of the key components, already present in DSpace 1.7, is the export of DSpace items, collections and communities as individual AIPs (Archival Information Packages). Compared to a backup/restore procedure that involves the whole database and/or assetstore, AIPs enable much more granular operations. In the past, a repository manager would really scratch his or her head after a collection or community of items had been destroyed by an overzealous admin with the batch edit feature. Instead of having to restore the whole repository to its state of the day before, the manager can now restore just the affected objects from their AIPs. DuraCloud comes in when you want to store these AIPs in the cloud. Its benefit, compared to other solutions in this area, is that it is tailored to work smoothly with DSpace and Fedora.
The University of Edinburgh contributed to the SWORD implementation in DSpace. The new contribution in DSpace 1.8 will enable DSpace to act as a SWORD client; it has been able to act as a SWORD server since DSpace 1.5. To clarify: the server capabilities available since 1.5 already allow an external application to push content into DSpace using the SWORD v1 protocol. What's new with the client capabilities is that you will be able to push content from your DSpace into another application that supports the SWORD v1 protocol. In the XMLUI interface, this feature will be exposed on the item page through a "Copy Item to Another Repository" link in the context menu of the navigation bar.
–@mire co-founders Ben Bosman & Lieven Droogmans
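To make the client/server distinction concrete, here is a rough Python sketch of what a SWORD v1 deposit looks like on the wire: an HTTP POST of a packaged item to a collection's deposit URL. It is not the DSpace client code. The function name, the target URL and the choice of a zipped METS package are assumptions; the `X-Packaging`, `X-On-Behalf-Of` and `X-No-Op` headers are defined by the SWORD 1.3 profile.

```python
# Packaging identifier for a zipped METS SIP; assumed here for illustration.
SWORD_PACKAGING = "http://purl.org/net/sword-types/METSDSpaceSIP"


def build_sword_deposit(collection_url, package, filename,
                        on_behalf_of=None, dry_run=False):
    """Describe a SWORD v1 deposit request without sending it.

    Returns a plain dict so the request can be inspected or handed
    to any HTTP library of your choice.
    """
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        "X-Packaging": SWORD_PACKAGING,
    }
    if on_behalf_of:
        # Mediated deposit: the client deposits for another user.
        headers["X-On-Behalf-Of"] = on_behalf_of
    if dry_run:
        # Ask the server to validate the deposit without storing it.
        headers["X-No-Op"] = "true"
    return {"method": "POST", "url": collection_url,
            "headers": headers, "body": package}


if __name__ == "__main__":
    request = build_sword_deposit(
        "https://target.example.org/sword/deposit/123456789/2",  # hypothetical
        b"...zip bytes...", "item.zip", dry_run=True)
    print(request["method"], request["url"])
    print(request["headers"])
```

Whichever side sends this request is the SWORD client, and the repository receiving it is the SWORD server.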
Bojan Suzic has done a lot of work on building a REST API for DSpace. This completely new API is also planned for inclusion in DSpace 1.8. The feature has no visible elements in the DSpace web user interface, as it is a programming interface for accessing DSpace content from within other applications. In a REST API, all requests are expressed as HTTP URLs, and responses are returned in either XML or JSON. This API makes it a lot easier for your developers to integrate repository content into other applications or websites. For the ongoing work, Bojan is being supported by a Google Summer of Code student, Vibhaj Rajan.
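As an illustration of the "requests as URLs, responses as JSON" idea, here is a small Python sketch. The endpoint path, the query parameters and the shape of the JSON response are all hypothetical and are not taken from the DSpace REST API.

```python
import json
from urllib.parse import urlencode


def build_items_url(base_url, collection_id, limit=10):
    """Build the URL for listing items in a collection.

    The path and parameter names are illustrative assumptions only.
    """
    query = urlencode({"limit": limit, "format": "json"})
    return f"{base_url.rstrip('/')}/collections/{collection_id}/items?{query}"


def parse_titles(response_text):
    """Extract item titles from a JSON response of the assumed shape
    {"items": [{"id": ..., "title": ...}, ...]}."""
    payload = json.loads(response_text)
    return [item["title"] for item in payload.get("items", [])]


if __name__ == "__main__":
    print(build_items_url("https://repo.example.org/rest", 12, limit=5))
    sample = '{"items": [{"id": 1, "title": "Thesis A"}, {"id": 2, "title": "Report B"}]}'
    print(parse_titles(sample))
```

A website that wants to show, say, a department's latest publications would only need these two steps: request a URL, then read the titles out of the structured response.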
Repository Manager Session
Three interesting use cases were presented in short, 20-minute presentations. The Windmusic.org example featured a very rich subject browsing and discovery implementation. The work on AgriOcean DSpace aims to provide an easy-to-use, customized version of DSpace for agricultural and oceanographic institutions in developing countries. As a last case, INIST-CNRS presented how DSpace is being used in different ways in a national, multi-faceted research organization, and how they collaborated with a service provider to tackle specific challenges such as upgrading.
This session ended with a 30-minute brainstorm activity chaired by Iryna Kuchma (EIFL), in which the audience was divided into groups, each addressing large organizational and policy topics. In retrospect, the audience was too diverse and the time too short to cover serious ground in this session. Still, it enabled people to network and recognize common challenges.
Notes for some of the discussion groups are available here: http://bit.ly/jbJDq5
Bram Luyten (@mire) zoomed in on the organizational differences between Repository Introduction projects and actually managing an Operational Repository as a service to your end users. He identified some of the roles (repository administrator, systems administrator, developer) that could be part of a team. For each of these roles, he illustrated the benefits and risks of keeping it in-house versus outsourcing it to a service provider.
Michele Kimpton of DuraSpace introduced the DuraSpace organization (the result of the merger between the Fedora Commons organization and the DSpace Foundation). She also explained where the organization's funding comes from: its sponsorship programme for institutions, its registered service provider program and, new in 2011, commercial DuraCloud services.
Thanks and Acknowledgements
Thanks go out to all attendees, the presenters and the OAI7 organizing committee (especially Jean-Blaise Claivaz) for giving us the chance to link this meeting to the OAI7 conference. Acknowledgements also go to everyone collaborating on DSpace 1.8; please check out the slides for full credits.