[Update 2014-09-11: fixed some grammatical misalignments]
Avid readers of this blog will have noticed that we’ve suddenly started talking about DORA quite a lot. So who, why and/or what is a DORA?
The simple answer is that it is a Digital Object Repository for Academe, which tells you that we came up with a snappy acronym but perhaps leaves you wanting a little more information.
As part of our mission of supporting (? encouraging, enabling, proselytizing, enforcing, …) eResearch at UWS, we’ve taken a step back from the coalface and tried to paint The Big Picture™ about what systems to support eResearch should look like. One output of this was the set of principles and practices, and another is a high-level architecture of an eResearch system, which looks like this:
A Little Mora¹
The basic idea is that a DORA provides a good place on which to store research data while researchers are working on it. To this end there are a few key features any potential DORA must support, some of which come directly from our principles, but some of which are more process oriented. A DORA must:
- safely store research data and its associated metadata in a way which keeps them linked. Conceptually there is a combined object in the DORA which contains both the data and metadata (No Data Without Metadata)
- allow versioning of these combined data objects
- allow search for data objects
- support the Upload and Researcher APIs to allow scripted operations:
  - the Upload API allows automated upload of research data and associated metadata
  - the Researcher API allows searching for and downloading objects, and then uploading modified or processed versions of them
- support the Publisher API to allow a clean transfer of information on data objects to an institutional data catalogue, as well as a possible transfer of the data to another repository (depending on the nature of the data, the repositories and an institution’s policies)
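To make the list above a little more concrete, here is a minimal sketch of a toy, in-memory DORA in Python. Everything in it is an assumption made for illustration: the names `DataObject`, `InMemoryDora`, `upload`, `fetch`, `search` and `publish` are hypothetical and are not taken from any real repository or from the APIs we have yet to design.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DataObject:
    """One version of a combined object: the data plus its metadata."""
    object_id: str
    version: int
    data: bytes
    metadata: Dict[str, str]


class InMemoryDora:
    """Toy repository for illustration; a real DORA needs durable storage."""

    def __init__(self) -> None:
        # object_id -> every version ever uploaded, oldest first
        self._versions: Dict[str, List[DataObject]] = {}

    def upload(self, object_id: str, data: bytes,
               metadata: Dict[str, str]) -> DataObject:
        """Upload API (hypothetical): store data and metadata together."""
        if not metadata:
            raise ValueError("No Data Without Metadata")
        history = self._versions.setdefault(object_id, [])
        obj = DataObject(object_id, len(history) + 1, data, dict(metadata))
        history.append(obj)  # earlier versions are kept, never overwritten
        return obj

    def fetch(self, object_id: str,
              version: Optional[int] = None) -> DataObject:
        """Researcher API (hypothetical): latest version unless one is named."""
        history = self._versions[object_id]
        return history[-1] if version is None else history[version - 1]

    def search(self, term: str) -> List[DataObject]:
        """Return the latest version of each object whose metadata mentions term."""
        term = term.lower()
        return [
            history[-1]
            for history in self._versions.values()
            if any(term in value.lower()
                   for value in history[-1].metadata.values())
        ]

    def publish(self, object_id: str) -> Dict[str, object]:
        """Publisher API (hypothetical): a record for a data catalogue."""
        latest = self.fetch(object_id)
        return {"id": latest.object_id, "version": latest.version,
                **latest.metadata}
```

Uploading twice under the same identifier yields versions 1 and 2, and `fetch("some-id", 1)` still returns the original. The point of the sketch is only that data and metadata travel together as one object and that nothing is overwritten.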
In an ideal world, there would be one DORA which would do everything for everyone, but honestly that seems so unlikely that we have to acknowledge that we will end up with a small number of DORAe and for any given research project we will pick the most appropriate one. This is another place where the APIs come in – if all DORAe support the same APIs then they become drop-in functional replacements for each other. Additionally, behind these APIs there could be a small ecosystem of cooperating tools – a simple repository for storing, an indexer for searching, a preview generator, etc – further reducing the need to find One Perfect Tool which Does Everything Brilliantly. (Separation of Concerns)
The catch here is, of course, that it is unlikely that two different potential DORAe will come out of the box supporting exactly the same APIs, so there’s a good chance that we will have to write some code to adapt the out-of-the-box API to the generic one we design. One possible light in this particular darkness is how much we can use something like SWORD 2 as an API.
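The adapting code mentioned above might look something like the following sketch. It assumes a generic `UploadApi` with a single `upload` method and an imaginary product, `VendorRepo`, whose native call is `put_item`. Both are invented here for illustration and describe no real repository.

```python
from typing import Dict, List, Protocol, Tuple


class UploadApi(Protocol):
    """The generic Upload API (hypothetical) that every DORA would offer."""

    def upload(self, object_id: str, data: bytes,
               metadata: Dict[str, str]) -> None: ...


class VendorRepo:
    """Imaginary off-the-shelf repository with its own native API."""

    def __init__(self) -> None:
        self.items: Dict[str, Tuple[bytes, List[Tuple[str, str]]]] = {}

    def put_item(self, key: str, payload: bytes,
                 descriptors: List[Tuple[str, str]]) -> None:
        self.items[key] = (payload, descriptors)


class VendorRepoAdapter:
    """Translates the generic upload call into the vendor's native call."""

    def __init__(self, repo: VendorRepo) -> None:
        self._repo = repo

    def upload(self, object_id: str, data: bytes,
               metadata: Dict[str, str]) -> None:
        # The imaginary vendor wants metadata as a sorted list of pairs.
        self._repo.put_item(object_id, data, sorted(metadata.items()))


def archive(dora: UploadApi, object_id: str, data: bytes,
            metadata: Dict[str, str]) -> None:
    """Client code sees only the generic API, never the vendor's."""
    dora.upload(object_id, data, metadata)
```

Swapping in a different DORA would then mean writing one more adapter, while `archive` and every other script built on the generic API stays untouched.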
DORA and AAAA
So how does a DORA work with our AAAA data management methodology? To our great relief, pretty well:
- Acquire: the getting of the data and metadata in the first place. It’s not really shown on the diagram, but essentially the output of acquisition is the data and metadata on the filesystem at the bottom of the diagram.
- Archive: the combining of the data and metadata and the uploading of them into the DORA, via the Upload API.
- Act: the stuff a researcher does to the data. The data is fetched via the Researcher API and updated versions are written to the DORA, also via the Researcher API. This is where the versioning capability of a DORA comes into play.
- Advertise: information about the data is packaged up and transferred into the Institutional Data Catalogue. Optionally, the data described in the catalogue may be transferred to the Institutional Data Store.
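As a rough sketch of how the four steps chain together, the following Python walks one data set through them. The DORA and the catalogue are stood in for by a plain dict and list, and the function names, sample data and "processing" step are all made up for illustration.

```python
from typing import Dict, List, Tuple

Version = Tuple[bytes, Dict[str, str]]   # (data, metadata) kept together
Dora = Dict[str, List[Version]]          # object id -> versions, oldest first


def acquire() -> Version:
    """Acquire: stand-in for data and metadata landing on a filesystem."""
    return b"3,1,2", {"title": "Sensor run"}


def archive(dora: Dora, object_id: str, data: bytes,
            metadata: Dict[str, str]) -> None:
    """Archive: combine data and metadata and upload them as one object."""
    dora.setdefault(object_id, []).append((data, metadata))


def act(dora: Dora, object_id: str) -> None:
    """Act: fetch the latest version, process it, store a new version."""
    data, metadata = dora[object_id][-1]
    processed = b",".join(sorted(data.split(b",")))
    dora[object_id].append((processed, {**metadata, "processing": "sorted"}))


def advertise(dora: Dora, object_id: str,
              catalogue: List[Dict[str, object]]) -> None:
    """Advertise: send a description of the object to the catalogue."""
    _, metadata = dora[object_id][-1]
    catalogue.append({"id": object_id,
                      "versions": len(dora[object_id]),
                      **metadata})


dora: Dora = {}
catalogue: List[Dict[str, object]] = []
data, metadata = acquire()
archive(dora, "run-01", data, metadata)
act(dora, "run-01")
advertise(dora, "run-01", catalogue)
```

After the run, the DORA holds both the raw and the processed version of `run-01`, and the catalogue holds a single record describing it.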
As you can see, a DORA sits at the heart of this, and is pretty key to making it all work, which is why we might start to seem as if we’re banging on about DORAe rather.
Who is DORA? by David Clarke is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
¹ I should take a moment to apologise for having opened this particular PanDORA’s Box of punnery. Sorry.