Cloud Forecast: Report from JISC Curation in the Cloud Workshop

Wed, 2012-03-21 10:21 -- carol

Ithaca, NY: During the JISC Curation in the Cloud Workshop (http://www.jisc.ac.uk/events/2012/03/curationinthecloud.aspx), held in London on March 7-8, 2012, attendees considered the role cloud computing could play in curation tasks and strategies and the benefits it could bring. The workshop focused on learning what purposes the cloud serves in curation and on discussing how to generate and analyze data in the cloud. A draft whitepaper, Digital Curation in the Cloud, authored by Brian Aitken, Patrick McCann, Andrew McHugh, and Kerry Miller, reviewed service models, provider models, and the general benefits of cloud computing, along with an overview of the risks. The draft whitepaper will be revised based on feedback from workshop attendees and distributed later.

David Wallom offered the response to the whitepaper, presenting the difficulty of calculating and comparing cloud costs, along with legal liability, as impediments to moving forward. While standardized accounting information from cloud providers would go a long way toward making it simpler for institutions to gauge costs and benefits, questions remain about how to meet legal obligations.

The draft whitepaper points to "...loss of data or service level, legal and governance incompatibilities and transfer bottlenecks" as risks associated with taking advantage of cloud economies of scale and rapid provision of resources for digital curation tasks. The authors expand on legal issues with regard to cloud adoption: "Institutions cannot avoid any of their legal duties by using cloud services; they will simply have to find other ways to fulfill them."

Kris Carpenter of the Internet Archive (IA) foresees a time when hardware is completely outsourced; current practice, however, is focused on maintaining control of content and implementing a hybrid cloud storage model. Right now more than 7 petabytes of IA data are publicly accessible, and multiple petabytes are not public. IA stores derivative data in the cloud, while primary data is mostly stored locally.

The Kindura Project (http://kindura.cerch.kcl.ac.uk/?page_id=2) came about to meet the needs of researchers who are now required to keep research results for a number of years. Using a custom interface for DuraCloud configured as a hybrid cloud over Amazon, Rackspace, and iRODS, and interacting with a wide range of researchers, the project has come up with a solution that includes a costing engine based on tracking and user input. One of the main objectives of the Kindura effort has been to find a way to manage the underlying cloud storage decisions effectively and semi-automatically, based on content requirements and its usage profile.

Andrew Woods, DuraSpace technical lead and developer for DuraCloud (http://DuraCloud.org), presented an overview of DuraCloud both as a managed service and as open source software. The focus of the talk was on lessons learned over the span of three years of implementing a cloud-based preservation platform. See the presentation for more details.

In break-out sessions attendees came up with a forecast for curation in the cloud that included a desire for more transparency with regard to costs, SLAs, and energy efficiency; standards for interoperability and portability; breaking out of the "silo mentality"; university workflows that incorporate cloud options; vendors offering "cloud appliances"; certification of cloud services; third-party auditing of cloud services; and greater availability of high-profile case studies.

Jonathan Markow, CSO of DuraSpace, offered case studies on how non-profit open source projects focus their efforts to sustain operations by emphasizing marketing and communications as well as community capacity-building activities and programs.

Cloud solutions offered by DuraCloud and EDUSERV (http://www.eduserv.org.uk/) are being adopted as viable options for adding cloud capabilities to institutional technology stacks, and the construction and architecture communities are diving into the cloud as well. Researchers are analyzing scientific workflows with the cloud in mind, and institutions are planning for digital preservation on the assumption that institutions and organizations will likely outlive cloud providers.

With thanks to Andrew Woods and Jonathan Markow for help in preparing this report.