ANDS Uber Dojo

“Develop something researchers will find Cool. You’ve got 3 hours. Begin!” were the instructions barked by Dave Flanders from ANDS.

I turned to my esteemed colleague, Dr Peter Sefton: “I thought you said this was going to be fun?”

“It is fun, you just don’t know it yet”, he smiled.

Peter and I were UWS representatives at the “Uber Dojo: Advanced Black Belt Event for Tools & Data in the Cloud”, an ANDS event that invited 40 of Australia’s proven research cloud developers into a glass-lined workshop for two days of skill-sharing and development hacking on the Aussie Research Cloud.

But hang on, Uber what? Well…

Since Dave arrived on the scene the ANDS developer gatherings have taken a step away from Revenge of the Nerds towards The Karate Kid (the 1984 original, not the atrocious sequels): lead developers are labelled “senseis” and receive “dan stripes” based on their ability to deliver solutions. Small-group training sessions are called “dojos”, where participants take turns as Hands and Brains. Hands is the only person allowed to use the keyboard, while Brains wrangles the group and directs the Hands. Meanwhile the senseis lead the group by posing questions, Yoda-style.

Now back to Dave’s challenge: “Develop something researchers will find Cool.”

Peter and I were initially going to re-implement our whizz-bang Research Data Australia real-time exploration tool using some of the new techniques we’d learned on Day 1, but after getting a glimpse of super-quick VM deployment during our session on Chef with Steve Androulakis (Monash) and Tim Dettrick (UQ), we decided instead to work on an idea for reproducible research that Peter had been pondering with researchers at UWS.

In a nutshell, the idea was to bring together three things –

1. A dataset
2. Code or toolset
3. System configuration (something that will run 2 using 1)

Bundle these together and a researcher can spin up a short-term virtual research environment on the Nectar Cloud, do some work there (e.g. confirm output or modify the code) and, when finished, have the components placed safely back in their respective repositories and the VM instance shut down.
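To make that concrete, here is a minimal sketch of what such a bundle might look like as a simple manifest. It is entirely hypothetical: the field names, URLs and file names are made up for illustration, not something we actually built at the dojo.

```python
# Hypothetical bundle description: every name and URL here is made up,
# purely to illustrate the three-part structure.
bundle = {
    "dataset": {
        # 1. the data, fetched from (and returned to) its home repository
        "source": "https://data.example.edu.au/hie/forest-climate.zip",
    },
    "code": {
        # 2. the code or toolset that operates on the data
        "source": "https://code.example.edu.au/forest-climate-analysis.git",
        "language": "R",
    },
    "environment": {
        # 3. the system configuration that can run 2. using 1.
        "image": "ubuntu-12.04",          # base image on the research cloud
        "provisioning": "bootstrap.sh",   # script/recipe that installs R + RStudio
    },
}
```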

For our example use-case, the three components were: 1. forest-based climate data from our good friends doing climate-change experiments at the Hawkesbury Institute for the Environment; 2. the R code that manipulates and presents that data; and 3. a programmatically created Linux VM that installs RStudio, loads the data and code, and presents them to the researcher.
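As a rough sketch of step 3, this is the kind of cloud-init user-data a new VM could be handed so it builds itself into an RStudio Server environment and pulls in the data and code. It is not the script we wrote at the dojo: the package list follows the standard Ubuntu/RStudio Server install path, and every URL and path is a placeholder.

```python
# Hypothetical cloud-init user-data: turns a bare Ubuntu VM into an RStudio
# Server box with the dataset and analysis code pre-loaded. The URLs and
# paths are placeholders, not the ones we used at the dojo.
USER_DATA = """#!/bin/bash
set -e

# Placeholders: substitute real locations before use
RSTUDIO_DEB_URL="https://example.org/rstudio-server.deb"
DATA_URL="https://example.org/hie-forest-climate.zip"
CODE_REPO="https://example.org/forest-climate-analysis.git"

# Base R plus the tools needed to install RStudio Server from a .deb
apt-get update
apt-get install -y r-base gdebi-core git wget unzip

# Install RStudio Server (serves a web IDE on port 8787 by default)
wget -O /tmp/rstudio-server.deb "$RSTUDIO_DEB_URL"
gdebi -n /tmp/rstudio-server.deb

# Pull the dataset and the R code into a working directory
mkdir -p /srv/project
wget -O /srv/project/data.zip "$DATA_URL"
unzip /srv/project/data.zip -d /srv/project
git clone "$CODE_REPO" /srv/project/code
"""
```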

You can read much more about this in Peter’s blog post.

Thirty minutes before pens-down we had one successful end-to-end run under our belt. But, as with all good tech demos, I managed to botch the Apache permissions on my laptop, which stopped us from demoing the entire shoe-string-and-boot-lace apparatus. Probably not a bad thing: without snapshots, the Linux R build takes over five minutes to complete. For a couple of old hands hacking on brand-new tools, we probably did okay.

What did I take away?

A brand-new appreciation of ephemeral (a.k.a. cloud) computing resources. To date I’ve been treating the Nectar VMs much like our institutional VMs: as precious resources to be curated and managed. At UWS, creating a VM typically takes a couple of weeks from inception to login prompt. This means we build persistent, long-term, server-like solutions, which carry the long-term overhead of patching, maintaining, securing and sysadmin’ing.

Having a computing environment that appears when needed to do a specific piece of work and then goes away is a huge change. I just need to concentrate on the three critical components: the data, the code, and the instructions to build the environment. I no longer have to be concerned about administering lots of long-term computing environments.

Unfortunately, our current UWS processes aren’t geared to anything besides long-running persistent VMs. During the dojo’s challenge sessions we probably created and destroyed more VMs than I’ve submitted server-deploy requests for in the last three years of eResearch at UWS. This was completely mind-boggling for me, and it means a re-think of how we plan for research computing.

What makes all this possible is being able to spin up VMs from an API and provision software onto the host systems using automated tools like Puppet and Chef. The Nectar Research Cloud runs OpenStack, and we used its EC2-compatible API with the Python boto library to programmatically create VMs to our specification.
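For the curious, a launch along those lines looks roughly like the snippet below. It is a sketch of the boto 2 EC2 interface against an EC2-compatible OpenStack endpoint, not our actual dojo code; the endpoint, credentials, image id and key name are all placeholders you would swap for your own Nectar values.

```python
# Sketch only: launch (and later tear down) a VM through an EC2-compatible
# OpenStack endpoint using boto 2. Every identifier below is a placeholder,
# not a real Nectar endpoint, credential or image id.
import time

import boto
from boto.ec2.regioninfo import RegionInfo

# The bootstrap script from the earlier sketch (or read it from a file)
USER_DATA = "#!/bin/bash\n# ... install RStudio Server, fetch data and code ...\n"

region = RegionInfo(name="nectar", endpoint="ec2.example-cloud.org.au")
conn = boto.connect_ec2(
    aws_access_key_id="EC2_ACCESS_KEY",
    aws_secret_access_key="EC2_SECRET_KEY",
    region=region,
    port=8773,                  # typical OpenStack EC2 API port
    path="/services/Cloud",
    is_secure=False,
)

# Ask for a VM built to our specification, with the bootstrap script as user-data
reservation = conn.run_instances(
    image_id="ami-00000000",    # base Ubuntu image on the cloud
    instance_type="m1.small",
    key_name="my-keypair",
    security_groups=["default"],
    user_data=USER_DATA,
)
instance = reservation.instances[0]

# Wait until the instance is running, then point the researcher at RStudio
while instance.state != "running":
    time.sleep(10)
    instance.update()
print("RStudio Server should appear at http://%s:8787/" % instance.public_dns_name)

# ... the researcher does their work, results go back to the repositories ...

# The environment is ephemeral: throw it away once the work is done
conn.terminate_instances(instance_ids=[instance.id])
```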

I also have a new appreciation of modern coding environments in Python and Ruby, and we need more skills and training in this area.

So, was Peter right? Did it turn out to be fun?

Most definitely.

Looking forward to the developers’ challenge and hackfest at the eResearch Australasia conference, Sunday 28 Oct – Wednesday 31 Oct.