Cloud Buzzword Glossary
I’ve been working on the final report for Fedorazon for the past week, aka thoughts on the past year’s experience of running a Repo in the Cloud. In the hope that my hard work doesn’t just end up summarised as a couple of bullet points in a JISC committee meeting, I’m trying to write it as if it were a series of blog posts. I began with a ‘Glossary of Cloud Buzzwords’ (below). I would of course be very grateful for feedback; feel free to twitter me if you want some dialogue: http://twitter.com/dfflanders
Also, it appears andypowe11 was thinking along the same lines with regard to the current state of buzzwords trying to define ‘the Cloud’; likewise, my favorite buzz-spin includes this mindmap from BAE Systems (who should be working as ontologists from the looks of it). I would agree with Andy that there needs to be some kind of consensus so we don’t spiral out of control </toLate>; accordingly, what follows is my dumbed-down version of the nomenclature of the Cloud.
The following are terms that have begun to emerge as ways of defining the cloud. The terms are in sequential logical order (rather than alphabetical), i.e. to understand any given term you’ll need to understand the term that precedes it.
- The Cloud: is ‘the Internet’, or ‘the Grid’, or ‘Utility Computing’ or ‘the Information Superhighway’, or any other buzzword you can think of for the latest re-branding of ‘the Web’ (if you are really cool call it ‘Web 3.0’). However, this time around ‘the Cloud’ is associated with very large data centres that can provide computing resources on demand with granular costing (keep your meme cache on for some kind of ‘green cloud’ coming soon). What the Cloud really means is less financial responsibility for the developer, who is usually thinking about code and not financial predictions or growth rates. In short, as your data or users increase, the server boxes you need to keep your users’ clicks reacting fast can be purchased as you grow. There are other advantages to the Cloud, but for the sake of bluntness ‘the Cloud’ in this report is a way for developers to get rid of the responsibility and time needed for predicting growth, aka “the cloud means no VC!”. And please keep in mind that ‘growth’ is key to what repositories must do if they are to be successful in the Cloud.
- The Stack: (see diagram) a stack is made up of various computing components which provide different functionality. Typically a stack will include (but is not limited to): Operating System (OS) + Web Server (WS) + Database (DB) + Software (SW) = base components for a service. Common platform stacks that are put onto “blank servers” are things like the LAMP stack: Linux OS, Apache WS, MySQL DB and a Programming Language (usually Perl, Python or PHP) based SW. The various kinds of components in a stack will define what kind of service it is.
- Service: for this report, a ‘service’ will mean the jobs that can be done to your data on the web. It accomplishes this by having various components in the stack that perform tasks upon your data. Genres of services include numerous buzz terms with “aa” in the middle of them: HaaS, IaaS, CaaS, etc**. However, for the purposes of this report, there are only two genres of services that repositories need to concern themselves with: SaaS and PaaS.
- PaaS: “Platform as a Service” is comparable to renting servers from large data centres. The “rented server” is yours to do with as you please, so you can put any stack you want on the platform. In the case of Fedorazon we have used a LAMP-like stack but with the software (SW) component being Fedora Commons’ Java software base. Currently we would define examples of PaaS to include: AWS’s EC2, IBM’s Blue Cloud and perhaps even Google App Engine once it has more SW stack component options.
- SaaS: “Software as a Service” refers to specialised companies who provide a very specific stack, and support for that stack, so as to get rid of the need for any technical headaches. Where the Cloud gets rid of the need for predicting growth rates (aka server purchases), SaaS gets rid of the need to maintain, update and support the specific piece of software you are running. While it does not get rid of the need for the local IT department, it does get rid of the need to call your local IT department any time you want an update done or a bug fixed. In terms of the stack, SaaS offers technical support for all the component layers beneath it. Accordingly, in the case of Fedorazon, we have preconfigured a PaaS stack so that anyone who wants to provide the human stack component layer atop it could call themselves a repository SaaS provider. Examples of SaaS: GDocs, WordPress.com, etc.
- SLA: a “Service Level Agreement” is the commitment that a Cloud provider makes to its clients to guarantee how its services will perform. In the case of the various Cloud services (SaaS, PaaS, etc.) the SLA will differ depending upon what you are paying for, e.g. Amazon Web Services guarantees a 99.9% uptime SLA or will compensate for time down.
- Hard Disk Storage (HDS) or Hard Drives (HD): the storage component of Cloud computing requires that data be online for real-time access by remote users. This requires lots of hard drives, and if you are going to provide 24/7 access to the data on those hard drives you usually need to put the data in a couple of places, to ensure that if one goes down the other is there to back it up. For repositories where preservation is a leading concern, more than two HDs are required, preferably in different locations for the sake of disaster recovery.
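To put some rough numbers on the SLA and storage entries above, here is a small back-of-the-envelope sketch in Python. It is only illustrative: the function names are made up for this post, the 30-day month is an assumption, and the replica calculation assumes that copies fail independently of one another (which real hard drives in the same rack or building often do not).

```python
def allowed_downtime_minutes(uptime_percent: float, period_hours: float) -> float:
    """Minutes of downtime permitted in a period under a given uptime percentage."""
    return (1 - uptime_percent / 100) * period_hours * 60


def combined_availability(per_copy: float, copies: int) -> float:
    """Chance that at least one copy is reachable.

    Assumption: each copy fails independently with the same probability.
    """
    return 1 - (1 - per_copy) ** copies


if __name__ == "__main__":
    month_hours = 30 * 24  # assumption: a 30-day month
    print("Downtime allowed at 99.9% per month (minutes):",
          round(allowed_downtime_minutes(99.9, month_hours), 1))
    for n in (1, 2, 3):
        print(f"{n} copies at 99.9% each:", combined_availability(0.999, n))
```

Under these assumptions a 99.9% SLA still allows roughly 43 minutes of downtime in a month, and each extra independent copy of the data shrinks the chance of everything being unavailable at once by a factor of a thousand. That is the arithmetic behind keeping more than two copies when preservation matters.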
**The only other “genre of service” worth mentioning, in my opinion, is DaaS (Data as a Service), which in my mind refers directly to low-level access to resources that are available online from storage services. This could be either ReST-based services like the Amazon S3 ReSTful interface, or perhaps RDF-based storage systems that have distilled triples and a SPARQL interface. But this is a cloud still on the horizon for the repository community.
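As a rough illustration of what “low-level access” could look like in both flavours, the sketch below only builds the request URLs and sends nothing over the network. The bucket name, object key and SPARQL endpoint are hypothetical, and the S3 URL assumes a publicly readable object (private objects need signed requests, which are not shown here).

```python
from urllib.parse import quote, urlencode


def s3_object_url(bucket: str, key: str) -> str:
    """URL of an object in a (hypothetical) publicly readable S3 bucket."""
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


def sparql_query_url(endpoint: str, query: str) -> str:
    """GET URL carrying a SPARQL query in the `query` parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(s3_object_url("my-repository", "reports/final report.pdf"))
    print(sparql_query_url("https://example.org/sparql",
                           "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5"))
```

In both cases the “service” is little more than a URL you can issue a plain HTTP GET against, which is the sense in which I mean the data itself is being offered as a service.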
For the full report on Fedorazon, please visit the Fedorazon Web Site.