UWS eResearch in Research Cloud Land

[Parts of this post appeared on Peter Sefton’s blog last week – go there for some more detail about the demonstration discussed here]

UWS eResearch at the NeCTAR Research Cloud day

Last week NeCTAR and the Australian National Data Service hosted a Research Cloud day in Sydney designed to show researchers and research-support units like the eResearch team at UWS how to use the free virtual servers provided by NeCTAR. Every Australian researcher with a login to the Australian Access Federation (AAF) should have access to the service and is entitled to a couple of months of server up-time on two basic machines with no application process. Various flavours of Linux are available, but not Windows. If the service suits, then you can apply via an online form for more (free) time. These machines are good for running analysis or modelling processes, can be configured to run desktop software via a remote desktop, and are useful for development work. (NOTE: when I say you should have access, you may not. If you can’t log in using your institutional credentials, contact your local IT helpdesk and, if you like, send me an email; I will let the NeCTAR people know so they can track these problems.)

First, what is NeCTAR (National eResearch Collaboration Tools and Resources)? This is what it says on their ‘about’ page:

NeCTAR is an Australian Government project to build new infrastructure specifically for the needs of Australian researchers.

In this era of digital connectivity, NeCTAR is using existing and new information and communications technologies to create new digital efficiencies specifically for the needs of Australian researchers.

Australian researchers and their technical partners will drive the design of what NeCTAR will look like. NeCTAR is building:

NeCTAR is collaborating with a broad mix of international and national technology partners and research disciplines. From scientists to historians, archaeologists, software engineers and arts disciplines.

NeCTAR is a $47 million dollar, Australian Government project, conducted as part of the Super Science initiative and financed by the Education Investment Fund. The University of Melbourne is the lead agent, chosen by the Commonwealth Government.

There were a number of UWS people in attendance at the day apart from us, including researchers, a PhD student and one of the developers from the library systems team – we’ll be in contact with each of you to get your impressions of the day and to work out how to help UWS exploit the service.

Andrew Leahy and I (Peter Sefton) ran a workshop at the Research Cloud day, to help people get familiar with the process of starting up and connecting to virtual machines, and to explore some of the possibilities of the cloud. We put together a demonstration that showed how multiple cloud servers could be used in a chain to pull data from the Research Data Australia (RDA) portal, index it in another catalogue application, ReDBox, extract and re-index the geo-spatial data that’s in RDA and hook that up to Google Earth. The result is a demonstration app that lets you browse around Earth and see which areas have data.

Here’s a picture of all the senseis, i.e. workshop leaders, on the day – we’re the two furthest to the right. At far left is David Flanders from ANDS, who has put a huge amount of energy into building strong communities for technical people working in eResearch, both in the UK and now here in Australia.

Figure 1 The ‘senseis’ explaining what they’re going to be working on in their ‘dojos’. Photo by Steven Manos of Melbourne uni.

The UWS demo data browser

In our session we covered some of the basics of how to run servers in the Research Cloud, then focused on getting a particular application installed and running. To finish, we gave a brief walk-through of our demo.

Figure 2 The dojo demo. Photo by Steven Manos of Melbourne uni.

We have now installed the geo-index at UWS so you can try this out if you have Google Earth by downloading a KML file. This is a demo service only – let us know how you go.

We will work some more on this little, very part-time project, but here’s a snapshot of where we are at the moment with our mission of getting RDA data into Google Earth.

Figure 3 Screenshot of data sets from Research Data Australia mapped onto the earth (visualized via their bounding boxes from the data collection metadata) from a long way up (the sets with the (claimed) broadest geographical coverage show up first, more appear as you zoom in)

Right now the experience is basic but engaging. Just by using the geometry provided in RDA collections, and without changing any of the default ANDS styles, you can see a ‘heat map’ of where data have been collected.
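To make the idea concrete, here is a minimal sketch of how a bounding box from collection metadata could be turned into a KML polygon that Google Earth can draw. This is not the code behind the demo: the record layout (a `name` plus a `bbox` of west, south, east, north in decimal degrees) and the function names are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def bbox_to_placemark(name, west, south, east, north):
    """Return a KML Placemark element outlining one bounding box."""
    placemark = ET.Element("Placemark")
    ET.SubElement(placemark, "name").text = name
    polygon = ET.SubElement(placemark, "Polygon")
    outer = ET.SubElement(polygon, "outerBoundaryIs")
    ring = ET.SubElement(outer, "LinearRing")
    # KML coordinates are "lon,lat,alt" tuples; the ring must end where it starts.
    corners = [(west, south), (east, south), (east, north),
               (west, north), (west, south)]
    ET.SubElement(ring, "coordinates").text = " ".join(
        f"{lon},{lat},0" for lon, lat in corners
    )
    return placemark


def build_kml(records):
    """Wrap one Placemark per record in a KML document and return it as a string."""
    kml = ET.Element("kml", xmlns=KML_NS)
    document = ET.SubElement(kml, "Document")
    for record in records:
        document.append(bbox_to_placemark(record["name"], *record["bbox"]))
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical record, not a real RDA collection.
    sample = [{"name": "Example reef survey", "bbox": (142.5, -24.5, 154.0, -10.5)}]
    print(build_kml(sample))
```

Saving the output to a `.kml` file and opening it in Google Earth should show the box as an outlined region with the default style.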

Figure 4 Zooming in on research hotspots like The Reef shows how crowded it is there – we’re limiting this to 40 boxes or points per screen, so keep drilling to find more

Figure 5 You can click on a marker to pull up a description
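The behaviour described in the captions above (broadest coverage first, at most 40 boxes or points per screen) can be sketched roughly as follows. This is an illustrative assumption about how such a filter might work, not the geo-index's actual query: the record layout and function names are hypothetical, areas are measured crudely in square degrees, and boxes that cross the antimeridian are ignored.

```python
MAX_FEATURES = 40  # per-screen limit mentioned in the Figure 4 caption


def intersects(a, b):
    """True if two (west, south, east, north) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def area(box):
    """Rough size of a box in square degrees; a point has area 0."""
    west, south, east, north = box
    return (east - west) * (north - south)


def features_for_view(records, view, limit=MAX_FEATURES):
    """Pick the records to draw for the current view, broadest coverage first."""
    visible = [r for r in records if intersects(r["bbox"], view)]
    visible.sort(key=lambda r: area(r["bbox"]), reverse=True)
    return visible[:limit]


if __name__ == "__main__":
    # Hypothetical records for illustration only.
    records = [
        {"name": "small", "bbox": (150, -34, 151, -33)},
        {"name": "big", "bbox": (110, -45, 155, -10)},
        {"name": "far", "bbox": (0, 50, 1, 51)},
    ]
    view = (149, -35, 152, -32)
    print([r["name"] for r in features_for_view(records, view)])
```

With a cap like this, zooming in shrinks the view, fewer large boxes compete for the 40 slots, and smaller collections get a chance to appear.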

Why?

Why are we doing this?

  • To test out the NeCTAR Research Cloud, so we can advise researchers at UWS on how and when to use it, and integrate it into our UWS Research Computing strategy – we’ll cover this in future posts and publish advice on the UWS eResearch site. Initial observations include:

    • It’s good for testing out software ideas quickly – you can fire up a tool, pull in data from a networked source and do something to it, but some planning is required to package the tools and get them to operate as turnkey appliances. At the moment you need command-line skills and patience.

    • This infrastructure is a game-changer – it can take months for university IT departments to deliver a virtual machine to a research or dev group and policy prevents many of us from buying commercial cloud services on corporate credit cards.

    • The lack of easy-to-access persistent cloud storage is a bit of a problem: it has not yet come online from NeCTAR’s sister project, Research Data Storage Infrastructure (RDSI), so there is no simple way to mount a device with all your data on it.

    • A specific lesson from our demo: being able to fire up servers on demand to handle peak loads, such as building a big geo-spatial index (we only built a tiny one), seems promising, but we’d like to see this made as easy as possible for people who want to get research done rather than mess with computers.

  • To see how Andrew Leahy’s considerable expertise in practical, easy-to-deploy, low-end visualization, combined with my experience in repositories, metadata harvesting and so on, might be applied to resource discovery. Being able to fly around the landscape means you can follow geographical features and maybe discover useful data sets along the way.

  • To explore how the ReDBox Research Data Catalogue application might be able to geo-index collections (see notes on this below).

It turns out there is a competition running for the best demonstration of data re-use using the cloud. The work that Andrew and I have started looks like the nucleus of an entry. We’d welcome others from UWS or beyond to join us, particularly those with geo-located data we could add to the ANDS data set, or people with good ideas about information discovery and/or coding skills. Drop us a line if you’d like to join us in extending this demo.

For a longer discussion of the demo see Peter Sefton’s post.

Copyright Peter Sefton 2012. Licensed under Creative Commons Attribution-Share Alike 2.5 Australia. <http://creativecommons.org/licenses/by-sa/2.5/au/>