Background Buzz: Open Repositories 2014
Submitted on Thu, 2014-06-19 10:43
Paasitorni Congress, site of Open Repositories 2014
Winchester, MA This year’s Open Repositories Conference began on June 9 under sunny Helsinki skies that lasted far into the night, and ended in the rain with a trip to the historically significant Suomenlinna sea fortress (http://www.suomenlinna.fi/en/fortress/history).
Nearly 500 delegates gathered over five days to participate in this leading open source, open access and open data repository event featuring multiple content tracks. The volume and diversity of content exchanged this year is evident from the packed agenda (https://www.conftool.com/or2014/sessions.php) of workshops, meetings, panels, main session presentations, 24/7s, a developer challenge and interest groups. The exchange of information through intense collaboration with colleagues from 38 countries is at the heart of what makes this conference so successful. This year’s conference was hosted by the National Library of Finland and the University of Helsinki.
This year the Open Repositories Program Committee is collaborating with F1000 Research, an Open Science journal, to publish a special edition featuring a selection of best papers from OR2014.
Recordings of main conference sessions are available here (please note that these sessions will be unavailable from June 19-23): http://or2014.helsinki.fi/?page_id=985
DuraSpace was an OR2014 conference supporter. Staff and open source projects were front and center, highlighting current developments in DSpace and Fedora for diverse audiences during meetings and conference sessions. At the DuraSpace table, staff discussed the organization’s role as a steward of open source projects, providing technical leadership, fundraising, administration, and marketing and communications services.
Fedora and DSpace held committer meetings on June 9. These small group meetings served as a kick-off for updates about the 2014 DuraSpace Membership Campaign, including a review of project-specific benefits, and as a venue for software development progress reports and in-depth conversations with committers and community members.
At the DSpace Committers meeting on June 9, repository managers and committers had a productive discussion about the process of identifying features for inclusion in a release. Making sure that valuable community contributions are added to the release version relies on calling attention to development work being done by community members. A new Working Group was established to discuss the best way for developers to announce formative work, for others to express interest, and for committers to track progress and ultimately review it for possible inclusion. This is envisioned as a kind of “developers hub” where the community can both contribute to and weigh in on works-in-progress.
Tim Donohue, Technical Lead for the DSpace Project, introduced meeting attendees to the new DSpace governance/membership model (http://dspace.org/governance). He underscored the relationship between becoming a member, as a way to gather resources to meet development goals, and participating in governance, as a way to contribute to decisions about how to allocate those resources for software development. He emphasized that this new sustainability model will only work with broad community participation. He noted that 59 contributors from all over the world had contributed to the release of DSpace 4.0.
Notes from the DSpace Committers meeting are available at: https://wiki.duraspace.org/display/DSPACE/DevMtg+2014-06-09+-+OR14+Meeting
With excitement about the June 5 release of Fedora 4 Beta in the air, the big question repeated many times during OR2014 was, “When will the production version of Fedora 4 be available?” Andrew Woods, Technical Lead for the Fedora Project, answered, “When enough developers volunteer to test Fedora 4.0 Beta the community will have a basis for deciding when it is hardened sufficiently to release as a production version–hopefully in 2014.”
Currently the Fedora 4 team is looking for “greenfield” Fedora 4 beta repository installations. Although migration paths from Fedora 3 to Fedora 4 would be desirable with the first release of Fedora 4, this functionality is planned for the following release. The group was reminded that migration use cases based on institutional needs will allow the Fedora 4 team to stand up data pilots and develop migration paths.
Discussion covered what committing people to testing means in terms of time and resources, and differences between Fedora 3 and Fedora 4, such as modeling objects with hierarchies and relationships.
Woods explained, “There was a big bucket in Fedora 3. You made relationships between digital objects in the bucket. There is an intrinsic structure to Fedora 4 that you may choose to take advantage of, or not, to build your objects into a hierarchy.
“Install and exercise the beta, someone else is not going to do it,” said Woods. He agreed to create a few skeleton test scenarios and suggested that it would be better if users created their own test patterns based on how they will use Fedora 4.
Dr. Erin McKiernan offered a view of what open source, open mind and open cooperation around research means to those who have limited access and a dodgy infrastructure to contend with. She opened with a statement and a question: “We need culture change in academia. Sharing is important everywhere but in academia. How will we accomplish this?”
She suggested several strategies: changing priorities so that proprietary publishing is not the only route to academic excellence, and using networks, especially social media, to highlight OA research. She reminded the audience that lack of access can slow scientific progress towards solving major societal problems like disease eradication.
Her advice to researchers: make a list of OA journals in your field; discuss OA with collaborators up front; discuss process with collaborators; document your altmetrics to highlight impact; blog about your work so people outside your field can understand; and be active in social media to increase visibility, rather than speaking in an echo chamber.
This year the conference offered a new, very popular track aimed at giving participants an opportunity to speak out about repository technologies and practices that had been bothering them.
Notable Rants included “Digital Scavenging” presented by Linda Newman, University of Cincinnati, who suggested that we should be thinking about preserving entire environments, not just parts and pieces, to become “anthropologists and archaeologists of the future”.
Paul Royster, University of Nebraska-Lincoln, offered “An Alternative American Approach to Repositories”. He suggested that institutions do not have to participate in well-known initiatives such as open access mandates and Creative Commons licenses to be successful. The U Nebraska repository is “the second-largest U.S. institutional repository, with more than 70,000 full-text documents available. We experience 7.5 million hits and furnish 6 million downloads annually. Our content ranks higher than Elsevier’s and Springer’s in Google search results.”
BUILDING SUCCESSFUL, OPEN REPOSITORY SOFTWARE ECOSYSTEMS: TECHNOLOGY AND COMMUNITY
Representatives of open source projects Nick Ruest, Courtney Mumma, Declan Fleming, Michael Giarlo, and Andrew Woods led a panel discussion on how the Archivematica, AtoM (Access to Memory), Fedora, Hydra, and Islandora communities collaborate, learn from challenges, inspire others, and engage other communities and organizations that are contributing to a thriving and more diverse repository ecosystem.
Active releases, interest groups, a sustainable community funding model, service providers, and “handshake” protocols with other systems were among the factors cited in making an ecosystem work.
The panel was asked, “Why are there so many projects and products? Do we need them?”
Andrew Woods pointed out that not all projects and products are doing the same thing at the same level in the stack. Where there is overlap, it is within the ecosystem, and it’s healthy to have multiple versions of similar kinds of work.
The discussion touched on how to figure out the right amount of product diversity. More effort might be placed on bringing projects together to work on key responsibilities and how to jointly leverage distributed resources. An audit of projects and products could lead to understanding similarities and differences.
Michael Giarlo, Pennsylvania State University, pointed out that there are “not a million of them” (projects and products) and there are thousands of institutions that have adopted them.
Each year developers who attend the conference are asked to consider participating in a “Developer Challenge”. This year the event was sponsored by the Digital Library Federation, represented by Rachel Frick. Other prizes were offered by the Fedora Project and ODIN. The aim is “to provide opportunities for growth for our software developers and show our support and appreciation by providing them with an event, the developer challenge, that is fulfilling for them and also valuable for the Open Repositories community.” This year ten entries answered the challenge to “Build or enhance a repository ecosystem, in line with the conference themes”. Descriptions of the winning entries may be found here.
DSpace presentations were numerous at the conference, ranging from open access discussions to new ways of using DSpace. These included the Edinburgh and U Auckland “Skylight”, a configurable front-end featuring faceted search and browse. It “tucks away” a DSpace repository as an administrative tool that writes to a Solr index, providing one system for many websites.
Several conference sessions were focused around COAR (Confederation of Open Access Repositories), an international consortium with a vision to create a global knowledge infrastructure built on a network of open and digital repositories. Many of those repositories are built with DSpace. Work with repository networks on a collaborative and internationally cooperative roadmap is moving forward. The COAR Repository Observatory monitors trends and the evolution of the open access repository landscape.
DSpace CRIS allows organizations’ business processes, records, and scholarly data to be dynamically integrated and viewed. Susanna Mornati, Senior Manager for Research Information Solutions, CINECA, explained that DSpace CRIS management functions are accomplished through a simple web interface that allows for management, collection, and access to data about people, organizational units, prizes, projects, grants and more.
The new DuraSpace membership model and work towards the release of DSpace 5.0 were also discussed in several sessions. A “wait and see” view was expressed with regard to how community sustainability for DSpace will evolve. UI improvements rank high on the list of features that the community would like to have in the next release of DSpace.
Fedora’s successful and rapid roll-out of Fedora 4 Beta was big news at the conference with many sessions filled with current users anxious to get details about product features and timing for a production release. As tech lead Andrew Woods repeated many times, the timing of a Fedora 4 production release depends on community participation in testing. When the community decides together that the software is “hardened” enough, then the production release will move forward.
A group of “Old China hands” led a Fedora Steering Group panel discussion. This group has helped to triple financial resources and developer contributions for the Fedora 4 project over the last two years, which in turn has contributed to the project’s rapid development process. Most of the panelists had been working with Fedora for some time. Matthias Razum, Head of eScience at FIZ Karlsruhe; Tom Cramer, Chief Technology Strategist at Stanford University Libraries; Neil Jeffries, Head of R&D, Bodleian Digital Library at University of Oxford; Andrew Woods, Technical Lead for Fedora; and Wolfram Horstmann, Director of Goettingen State and University Library, shared their motivations for adopting Fedora.
Matthias Razum began using Fedora to develop a science IR for the Max Planck Institute. He now uses Fedora to help manage research data. As a recent adopter of Fedora, Tom Cramer went through the painful process of developing a custom repository at Stanford. He says, “This was a good exercise in why you don’t want to roll your own.” By 2008 Fedora’s data modeling aligned with Stanford’s institutional objectives. Neil Jeffries says that the concept of Fedora was something Oxford liked when they realized that they were looking for a general way to store digital objects. Andrew Woods has worked in the preservation space on a number of initiatives and began working directly on Fedora in 2008. He believes in the mission of digital preservation and in how Fedora enables that functionality. Wolfram Horstmann tried his first iteration of Fedora in 2003-2004, and it’s still running in the same tech stack. He pointed out that you can use this technology for a long time!
Robin Ruggaber, Library CTO at University of Virginia, explained her understanding of the commitment of an FTE contribution to the Fedora project, how developers are onboarded to the process, how developer participation fits into University of Virginia’s organizational contribution and how direct participation in Fedora 4 development has benefited her staff.
The migration from Fedora 3 to Fedora 4 was on the minds of many attendees. Migration will be tackled in the release after the first production release. Panelists agreed that knowing about various migration use cases will improve the chances that the eventual solution will match particular users’ needs. Participants were encouraged to begin contributing their use cases now. Attendees were also encouraged to consider starting a beta “Greenfield” pilot project as a way to engage with the project right away.
Greater and more transparent community engagement with the Fedora 4 project, “as if we were in a room together”, was expressed as an ongoing Fedora project goal.
FEDORA 4 AND LINKED DATA
Among many other capabilities, Fedora 4 presents a brand new way to engage in the world of linked data. Capturing the patterns used by early adopters will be useful, and shared vocabularies and common patterns are what will make a difference. During Fedora Interest Group sessions Chris Beer, Stanford University Libraries, demonstrated how Fedora 4 allows the use of data in interoperable ways which makes it “easier to coexist in wider linked data world.”
With Fedora as a back-end, Islandora, Hydra, Hydramata and Avalon have emerged as significant platforms for open repositories that are individualized for specific institutional needs with technical standardization across user interface functionality and ease of use.
Many conference sessions focused on technical advancements, community support and examples of how these platforms are being implemented at institutions and organizations.
Looking forward to the Tenth International Conference on Open Repositories (OR2015) Jon Dunn, Indiana University Libraries, and Sarah Shreeves, University of Illinois Urbana-Champaign Library, concluded the conference by offering several reasons why next year’s conference in Indianapolis, IN (known for car racing, several famous authors and more) will be one of the best. Save the dates, June 8-11, 2015, for OR2015. The conference will be jointly hosted by Indiana University Libraries, University of Illinois Urbana-Champaign Library, and Virginia Tech University Libraries.