Trip report: Peter Sefton @ Open Repositories 2014, Helsinki, Finland by Peter Sefton is licensed under a Creative Commons Attribution 4.0 International License.
From June 9th-13th I attended the Open Repositories
conference way up North in Helsinki. This year I was not only on the
main committee for the conference, but was part of a new extension to
the Program Committee, overseeing the Developer Challenge event,
which has been part of the conference since OR2008 in Southampton. I
think the dev challenge went reasonably well, but probably requires a
re-think for future conferences; more on that below.
In this too-long-you-probably-won’t-read post I’ll run through a
few highlights around the conference theme, the keynote and the dev
event.
Summary:
For me the take-away was that now we have a repository ecosystem
developing, and the OR catchment extends further and further beyond
the library, sustainability is the big issue, and conversations
around sustainability of research data repositories in particular are
going to be key to the next few iterations of this conference.
Sustainability might make a good theme or sub-theme. Related to
sustainability is risk: how do we reduce the risk of the data
equivalent of the serials crisis? If there is such a crisis it won’t
look the same, so how will we stop it?
View from the conference dinner
Keynote
The keynote this time was excellent. Neuroscientist Erin McKiernan
from Mexico gave an impassioned and informed view of the importance
of Open Access: Culture change in academia: Making sharing the new
norm (McKiernan, 2014). Working in Latin America, McKiernan could
talk first-hand about how the scholarly communications system we have
now disadvantages all but the wealthiest countries.
There was a brief flurry of controversy on Twitter over a question I
asked about the risks associated with commercially owned parts of the
scholarly infrastructure and how we can manage those risks. I did
state that I thought that Figshare was owned by Macmillan’s Digital
Science, but was corrected by Mark Hahnel; Digital Science is an
investor, so I guess “it is one of the owners” rather than
“owns”. Anyway, my question was misheard as something along the
lines of “How can you love Figshare so much when you hate Nature
and they’re owned by the same company”. That’s not what I meant
to say, but before I try to make my point again in a more considered
way, some context.
McKiernan had shown a slide like this:
My pledge to be open
- I will not edit, review, or work for closed access journals.
- I will blog my work and post preprints, when possible.
- I will publish only in open access journals.
- I will not publish in Cell, Nature, or Science.
- I will pull my name off a paper if coauthors refuse to be open.
If I am going to ‘make it’ in science, it has to be on terms I can
live with.
Good stuff! If everyone did this, the Scholarly Communications
process would be forced to rationalize itself much more quickly than
is currently happening and we could skip the endless debates about
the “Green Road” and the “Gold Road” and the “Fools Gold
Road”. It’s tragic we’re still debating this using this weird
colour-coded-speak twenty years into the OA movement.
Anyway, note the mention of Nature. What I was trying to ask was: How
can we make sure that McKiernan doesn’t find herself, in twenty
years’ time, with a slide that says:
“I will not put my data in Figshare”.
That is, how do we make sure we don’t make the same mistake we made
with scholarly publishing? You know, where academics write and review
articles, often give up copyright in the publishing process, and
collectively we end up paying way over the odds for a toxic mixture
of rental subscriptions and author-pays open-access, with some risk
the publisher will ‘forget’ to make stuff open.
I don’t have any particular problem with Figshare as it is now, in
fact I’m promoting its use at my University, and working with the
team here on being able to post data to it from our Cr8it data
publishing app. All I’m saying is that we must remain vigilant. The
publishing industry has managed to transform itself under our noses
from: much-needed distribution service of tangible goods; to rental
service where we get access to The Literature pretty-much only if we
keep paying; to its new position as The custodian of The Literature
for All Time, usurping libraries as the place we keep our stuff.
We need to make sure that the appealing free puppy offered by the
friendly people at Figshare doesn’t grow into a vicious dog that
mauls our children or eats up the research budget.
So, remember, Figshare is not just for Christmas.
Disclosure: After the keynote, I was invited to an excellent Thai
dinner by the Figshare team, along with Erin and a couple of other
conference-goers. Thanks for the Salmon and the wine, Mark and the
Figshare investors. I also snaffled a few T-Shirts from a later event
(Disruption In The Publishing Industry: Digital, Analytics & The
Future) to give to people back home.
Figshare founder and CEO Mark Hahnel (right) and product manager
Chris George hanging out at the conference dinner
Conference Theme, leading to discussions about sustainability
The conference theme was Towards Repository Ecosystems.
Repository systems are but one part of the ecosystem in 21st century
research, and it is increasingly clear that no single repository will
serve as the sole resource for its community. How can repositories
best be positioned to offer complementary services in a network
that includes research data management systems, institutional and
discipline repositories, publishers, and the open Web? When
should service providers build to fill identified niches, and where
should they connect with related services? How might these
networks offer services to support organizations that lack the
resources to build their own, or researchers seeking to optimize
their domain workflows?
Even if I say so myself, the presentation I delivered for the Alveo
project (co-authored with others on the team) was highly
theme-appropriate; it was all about researcher-needs driving the
creation of a repository service as the hub of a Virtual Research
Environment, where the repository part is important but it’s not the
whole point.
I had trouble getting to see many papers, given the dev-wrangling,
but there was definitely a lot of eco-system-ish work going on, as
reported by Jon Dunn:
Many sessions addressed how digital repositories can fit into a
larger ecosystem of research and digital information. A panel on
ORCID implementation experiences showed how this technology could be
used to tie publications and data in repositories to institutional
identity and access management systems, researcher profiles, current
research information systems, and dissertation submission workflows;
similar discussions took place around DOIs and other identifiers.
Other sessions addressed the role of institutional repositories
beyond traditional research outputs to address needs in teaching and
learning and administrative settings and issues of interoperability
and aggregation among content in multiple repositories and other
systems.
One session I did catch (and not just ‘cos I was chairing it) had a
presentation by Adam Field and Patrick McSweeney on Micro data
repositories: increasing the value of research on the web (Field and
McSweeney, 2014). This has direct application to what we need to do
in eResearch. Adam reported on their experience setting up bespoke
repository systems for individual research projects, with a key
ingredient missing in a lot of such systems: maintenance and support
from central IT. We’re trying to do something similar at the
University of Western Sydney, replicating the success of a
working-data repository at one of our institutes (reported at OR2013)
across the rest of the university. I’ll talk more to Adam and Patrick
about this.
For me the most important conversation at the conference was around
sustainability. We are seeing more research-oriented repositories and
Virtual Research Environments like Alveo, and it’s not always clear
how these are to be maintained and sustained.
Way back, when OR was mainly about Institutional Publications
Repositories (simply called Institutional Repositories, or IRs) we
didn’t worry so much about this; the IR typically lived in The
Library, the IR was full of documents and The Library already had a
mission to keep documents. Therefore the Library can look after the
IR. Simple.
But as we move into a world of data repository services there are new
challenges:
- Data collections are usually bigger than PDF files, many orders of
magnitude bigger in fact, making it much more of an issue to say
“we’ll commit to maintaining this ever-growing pile of data”.
- “There’s no I in data repostory (sic)” – i.e. many data
repositories are cross-institutional, which means that there is no
single institution to sustain a repository and collaboration
agreements are needed. This is much, much more complicated than a
single library saying “We’ll look after that”.
And as noted above, there are commercial entities like Figshare and
Digital Science realizing that they can place themselves right in the
centre of this new data-economy. I assume they’re thinking about
how to make their paid services an indispensable part of doing
research, in the way that journal subscriptions and citation metrics
services are, never mind the conflict of interest inherent in the
same organization running both.
Some libraries are stepping up and offering data services, for
example, work between large US libraries.
The dinner venue
The developer challenge
This year we had a decent range of entries for the dev challenge,
after a fair bit of tweeting and some friendly matchmaking by yours
truly. This is the third time we’ve run the thing with a clearly
articulated set of values about what we’re trying to achieve.
All the entrants are listed here, with the winners noted in-line. I
won’t repeat them all here, but wanted to comment on a couple.
The people’s choice winner was a collaboration between a person with
an idea, Kara Van Malssen from AV Preserve in NY, and a developer
from the University of Queensland, Cameron Green, to build a tool to
check up on the (surprisingly) varied results given by video
characterization software. This team personified the goals of the
challenge, creating a new network, while scratching an itch, and
impressing the conference-goers who gathered with beer and cider to
watch the spectacle of ten five-minute pitches.
My personal favorite, which came from an idea that I pitched (see the
ideas page), was the Fill My List framework, which is a start on the
idea of a ‘Universal Linked Data metadata lookup/autocomplete’. We’re
actually picking up this code and using it at UWS. So while the goal
of the challenge is not to get free software development for the
organizers, that happened in this case (yes, this conflict of
interest was declared at the judging table). Again this was a
cross-institutional team (some of whom had worked together and some
of whom had not). It was nice that two of the participants, Claire
Knowles of Edinburgh and Kim Shepard of Auckland Uni, were able to
attend a later event on my trip at a hackfest in Edinburgh. There’s a
GitHub page with links to demos.
But, there’s a problem. The challenge seems to be increasingly hard
work to run, with fewer entries arising spontaneously at recent
events. I talked this over with members of the committee and others.
There seem to be a range of factors:
- The conference may just be more interesting to a developer audience
than it used to be. Earlier iterations had a lot more content in the
main sessions about ‘what is a(n) (institutional) repository’
and ‘how do I promote my repository and recruit content’ whereas
now we see quite detailed technical stuff more often.
- Developers are often heavily involved in the pre-conference
workshops, leaving no time to attend a hack day to kick off the
conference.
- Travel budgets are tighter, so if developers do end up being the
ones sent they’re expected to pay attention and take notes.
I’m going to be a lot less involved in the OR committee etc next
year, as I will be focusing on helping out with Digital Humanities
2015 at UWS. I’m looking forward to seeing what happens next in the
evolution of the developer stream at the OR conference. At least
it’s not a clash.
The Open Repositories Conference (OR2015) will take place in
Indianapolis, Indiana, USA at the Hyatt Regency from June 8-11, 2015.
The conference is being jointly hosted by Indiana University
Libraries, University of Illinois Urbana-Champaign Library, and
Virginia Tech University Libraries.
This pic got a few retweets
References
Field, A., and McSweeney, P. (2014). Micro data repositories:
increasing the value of research on the web.
http://eprints.soton.ac.uk/364266/.
McKiernan, E. (2014). Culture change in academia: Making sharing the
new norm.
http://figshare.com/articles/Culture_change_in_academia_Making_sharing_the_new_norm_/1053008.