Last Tuesday (Oct 21, 2008) was the one-night cloud computing conference "CloudCamp", hosted by Tech Cocktail. My company has some really time-consuming web analytics tasks that take days to run, and we're exploring using GridGain to distribute the work over several servers, so it was a good chance for me to get acquainted with this field.
I've included a few photos from the event, which come from the conference's Flickr photo gallery.
The meeting was alright. I didn't see anyone I knew there but had reasonably enjoyable small talk with mostly non-technical people during the drinking hour.
It wasn't a conference in any traditional sense of the word. There were no scheduled topics and no scheduled speakers. Instead, they used the format of O'Reilly's "Foo Camps". They have a grid of sessions and meeting rooms on a whiteboard. All the squares are empty. Then they ask everybody who has a topic they are interested in to write it in a square on the board. Presto...you are now the moderator of that session.
I volunteered the topic "Software Engineering and Grid Computing". Eight really smart people showed up, including two physicists from Italy, two doctoral students, a guy from UBS, and a consultant from CohesiveFT, a Chicago company specializing in cloud computing.
Physics people have been doing grid computing for years, so they were levels above me. But interestingly, a lot of their problems have to do with resource sharing. There can be other research teams that also want to use the grid, and maybe they don't want the nodes installed with the same software you do, and the people with the biggest grants tend to win out.
The most interesting guy there was the consultant from CohesiveFT, Pat Kerpan. He had two pieces of memorable wisdom. (1) Rule of thumb: count on a 30% performance penalty from the overhead of grid-enabling your problem. (2) It's easier to bring the computing to the data than to bring the data to the computing.
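To get a feel for what Pat's first rule of thumb means for a job like ours, here is a rough back-of-the-envelope sketch. It rests on my own assumptions, not anything Pat said: that the 30% penalty shows up as 30% extra total work, and that the work otherwise splits perfectly evenly across nodes. The function name and the 72-hour job are made up for illustration.

```python
def estimated_grid_hours(serial_hours: float, nodes: int, overhead: float = 0.30) -> float:
    """Estimate wall-clock hours for a job spread over a grid.

    Assumptions (mine, for illustration only):
      - grid-enabling inflates the total work by `overhead` (0.30 = 30%)
      - the inflated work divides evenly across `nodes`
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return serial_hours * (1.0 + overhead) / nodes


# Hypothetical job that takes 72 hours on a single server today.
serial = 72.0
for n in (1, 2, 4, 8):
    hours = estimated_grid_hours(serial, n)
    print(f"{n} node(s): {hours:.1f} h, speedup {serial / hours:.2f}x")
```

Under these assumptions, a one-node "grid" is slower than the plain serial run, and you need at least two nodes before grid-enabling pays off at all. Real jobs won't divide this cleanly, especially data-bound ones, so treat the numbers as an optimistic bound.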
He talked about a tool his company uses for its clients, called "Open Source Sun Hypervisor". It has an interface that lets you trick out your nodes with whatever setup you want (e.g. pick and choose between Java, Tomcat, flavors of Linux, Struts, etc.) and get a multinode environment all set up in six minutes.
Several of the people spoke knowingly of "paravirtualization". Pat distinguished between problems that are "compute bound" vs. "data bound".
A few people referred to Hadoop. No one had ever heard of GridGain, but I don't think that Java development was strongly represented in that collection of people.
People have different aims in cloud computing. A lot of people don't mind if many virtual nodes share one machine.
Virtualization was recommended as a convention even when you are doing one node per machine.
In many commercial applications, 4 virtual nodes per machine is typical.
Many people responded to my description of what we are trying to do at iCrossing with "why don't you just use Amazon's cloud computing?" To hear them describe it, Amazon gives you the flexibility to do whatever you want.
I could have attended some of the other sessions if I had wanted to stay two more hours, but I split after mine. The other sessions were on pretty soft or business-focussed topics. One guy led a session called “What color is your cloud?” There were two Microsoft people who found each other and made their own Microsoft-focussed session ("Cloud Computing in Windows 8 and SQL Server").