A new era for open source! Free, open and EASILY REUSABLE software for Academia
This is a quick post to celebrate what some may feel is just a ‘small victory’, but in fact this small battle won is the first in the larger War of sustainable software! Last night, at about midnight (Melbourne time), @spetnatz got the first ‘public image’ of MyTardis working in NeCTAR’s OpenStack cloud. <– That sentence might have meant nothing to you, so let me explain in human terms why this could mean more reusable software for us all.
What problem did these small victory tweets *really* solve? In a sentence, they announced that “the final barrier to reusing open source software has been removed”. Academic projects worldwide produce software: small and big projects produce code that meets the needs of different Academic users (and often those users are a very small group of scientists scattered around the globe with a very particular problem). The problem isn’t whether the software actually meets their needs (most of the time it does). The problem is that these small (or large) groups of Academics can’t get the software to work (despite it sitting in an open source code repository).
The scenario goes something like this:
A Researcher at a conference: “Oh wow, that looks like cool software that would be perfect for __[my niche subject or topic area]__.” The Academic goes back to their home institution very excited that such a specific piece of software exists, and finds the nearest developer to have a look at said cool software and see if they can get their own version working.
The developer, finding that the software is in an open source repository like Code.Google or Github (thanks to funder mandates), attempts to get the right ‘environment variables’ in place to get the software to work, e.g. Operating System, Database, the right version of Java <ugh!>, etc. It is here, in trying to get the “environment variables” (aka dependencies) to work, that the developer hits the single most significant hurdle to reusing more open source software.
The problem is that most projects in Academia run the project marathon but then don’t run the last mile; that is, they get the code to work on their own box but they don’t take the time to allow anyone to get it working with the click of a button. Having more *easily reusable* software, where the developer (or even better, the researcher) doesn’t have to go and figure out which dependencies in the form of ‘environment variables’ are needed, is essential if the project wants its software reused by others. Despite trying and trying, the failure rate for getting someone else’s OSS code to work is significant <– I used to think this was just me back in the day, until I conducted the following experiment, where I had the lead developers of each of the publications repository systems (ePrints, DSpace & FedoraCommons) try and launch each other’s repositories: they all failed despite trying to help one another for over two hours.
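To make that hurdle concrete, here is a minimal sketch of the check a developer ends up doing by hand: compare what the software says it needs against what is actually on the box. Everything in it is an assumption made for illustration; the names `Requirement`, `parse_version` and `check_environment` are hypothetical, and the components and version numbers are made up rather than taken from any real project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """One dependency a project needs (hypothetical structure)."""
    name: str           # component name, e.g. "java"
    min_version: tuple  # minimum version as a tuple, e.g. (1, 6)


def parse_version(text):
    """Turn '1.6.0' into (1, 6, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check_environment(required, installed):
    """Return a list of human-readable problems; an empty list means ready.

    required:  list of Requirement objects
    installed: dict mapping component name -> installed version string
    """
    problems = []
    for req in required:
        found = installed.get(req.name)
        if found is None:
            problems.append(f"{req.name}: not installed")
        elif parse_version(found) < req.min_version:
            wanted = ".".join(str(n) for n in req.min_version)
            problems.append(f"{req.name}: found {found}, need >= {wanted}")
    return problems


if __name__ == "__main__":
    # Illustrative values only, not the needs of any real project.
    needs = [Requirement("java", (1, 6)), Requirement("postgresql", (9, 1))]
    box = {"java": "1.5.0"}
    for problem in check_environment(needs, box):
        print(problem)
```

A public Virtual Machine image sidesteps this whole exercise, because the image already contains an environment in which the list of problems is empty.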
The point is simple: funders (ANDS, NeCTAR, RDSI, JISC, EU, Mellon, NSF, etc) have done a fantastic job of making sure that projects that produce software MUST publish it as Open Source. We now must run the final mile of RE-USABILITY and REQUIRE that the software is not only Open Source, but that a Virtual Machine Image is made *publicly* available via the likes of OpenStack and/or Amazon (or both since they use the same APIs!) so that anyone can EASILY RE-USE it.
To take it one step further (and remove another small but significant hurdle), funders need to provide OpenStack cloud platforms that the projects can then leave their VMs with for the long term after their funding ends. This is where software will start to become long term reusable infrastructure!!! <– this has the potential to not only solve the re-usability problem but also the SUSTAINABILITY problem. Quite simply, we could start to address the sustainability problem if the funders dedicated themselves to making sure that they not only keep the ‘Virtual Machine’ in the long term for all their funded projects, but also keep the ‘Environment Variables’ around so we can find a machine in the Cloud which can ‘spin it up’ for actual FREE REUSE (one of the biggest problems with Academic projects is that they are just ahead of their time and need to wait for the users to catch up!).
In summary, I am a massive advocate of and participant in the Open Source movement (especially in Academia, which is where its true home is), but we have had a major flaw over the years: we have not run the final mile of the marathon! We produce the code and get it working once (throwing it into a code repository), but we don’t then put it into a Virtual Machine that will guarantee that *anyone* can come along and launch the thing without having to be a developer (this is the problem that is the ‘Open Source Code Repository’).
Furthermore, we must make sure to remove the hurdle of needing a credit card to launch the software on the Cloud, e.g. Amazon, et al. By making sure government funds OpenStack instances for Academia, we get the guarantee of software sustainability Beyond Life Of Project (BLOP!). And better yet, we get the guarantee that it will be free for the individual (poor) Academic to have a go (notice, I’m just saying ‘have a go’, as I completely support the idea that if the software needs to be used by thousands then it should be moved over to the likes of Amazon, because they can scale better than an Academic Cloud can, right now). And of course, now that we have solved the Open Source “EASILY REUSABLE” problem, we need to look forward to the next big challenge – A Global Academic Cloud (“Mind the GAC”) <– you heard it here first 😉
Well done Steve, you rock (as usual). Congratulations on being the first FREE REUSABLE OPEN SOURCE software system in the Cloud (Worldwide from the looks of it <– that is pretty damn cool)!!! It is an achievement I’ve been waiting on for almost five years now, since we did the #Fedorazon project. Great work.
Also, the unsung technical support heroes who worked with @spetnatz late into the night to make it happen: Clint, Sean and Steve M. <– You guys rock and it is a privilege to watch what you are trying to achieve. Not least, thanks to Glen for being bold and shielding us hackers from the political hurdles that could quickly stop all this bottom-up innovation from occurring.
Also a quick disclaimer: be patient with OpenStack right now. It is one of the biggest open source projects in the world, and so there are lots of bear traps waiting to clamp onto your brain and drag you down. Ask for help, as we need to do this together in Academia. We are ALL responsible for making the Cloud work (this isn’t just NeCTAR’s job); it is something we all want as developers, so let’s make it happen and roll with the punches.
If you want to get trained up on how to use OpenStack, check out the Developer Dojos we are putting on: #nadojo
= By saying “most” projects meet the user need, I’m being a little bit ‘tongue in cheek’; there is still a very real problem around how projects build in usability testing methods to assure that the software does actually meet their users’ needs.
= I’ve personally looked over 200+ Academic project code repositories over the past five years and I can say that I have got (roughly) only 20% of them working by personally compiling the code. By contacting the developer (if they are still around) I can bump that number up slightly, to 30-40%. Otherwise, the biggest hurdle for re-use is quite simply just getting the code to compile and spin up.