Cloud Computing News Digest for September 21st, 2012

I normally publish this over at my Sogeti blog at http://blogs.us.sogeti.com/ccdigest/ but that’s down at the moment, so we’re going to my backup copy. I know, the self-proclaimed “cloud guy” isn’t in the cloud. Well, there’s an old saying that goes something like ‘the cobbler’s children have no shoes’.

I’d say I’m late with this edition, but this is developing into enough of a pattern that I think I’m just going to start thinking of monthly as the new weekly. So on to the news…

The Cloud Security Alliance (CSA) and Fujitsu announced the launch of the Big Data Working Group. The intent of this organization is to help the industry by bringing forward best practices for security and privacy when working with big data. They will start with research across several industry verticals, with their first report due sometime this fall.

At the 2012 CloudOpen conference this past August, SUSE announced their OpenStack-based, enterprise-level private cloud solution called, amazingly enough, “SUSE Cloud”. This IaaS solution helps organizations deploy and manage private clouds with self-service and workload standardization capabilities.

I also found an article about a competitor to OpenStack: Eucalyptus. SearchCloudComputing has published a “deep dive” into using Eucalyptus 3.1. You’ll need to register as a member (it’s free) to read the full article.

In my job, I’m often asked what skills are needed for the cloud. This article by Joe McKendrick does a nice job of covering them, not just for individuals but for organizations as well.

When you talk to cloud vendors, they will eventually reference PUE (Power Usage Effectiveness) statistics in some way. But as this piece by David Linthicum over at Toolbox.com explains, the real savings are in the ability to adjust to changing needs and, in turn, change our consumption.

Last month the world watched the 2012 Summer Olympics, and it turns out the cloud had a major hand in delivering that content around the globe. Windows Azure Media Services helped deliver live and on-demand video content to several broadcasters. Eyes weren’t just on the games: Apica, a vendor of testing and monitoring solutions, monitored various Olympics-related web sites and scored them for uptime and performance.

For this edition I also found a presentation by Adrian Cockcroft of Netflix on Cassandra (another NoSQL database) performance and scalability on AWS. Even if you don’t plan to use Cassandra, I highly recommend listening to this and picking up what you can of their approach and lessons learned. The video lasts about an hour.

Pfizer (the drug…. er… pharmaceutical company) also ventured into the world of cloud computing to help with supply chain issues. If you ever thought your deliveries were critical, how about getting lifesaving medicine to patients?

On the Google front, they haven’t been quiet. They recently launched the Google Cloud Partner Program, giving them a way to promote and leverage delivery partners, not unlike the programs already in place at Amazon and Microsoft.

Related to topics that are close to my heart, I have a great article on resilience engineering from Jesse Robbins on GameDay. Having all this capacity for disaster recovery and failover doesn’t do us much good if we don’t create solutions that can take advantage of it. And on the subject of architecture, just yesterday I ran across a great list of architectural principles taken from Will Larson’s “Introduction to Architecting Systems for Scale”. Definitely give this a read.

And to close out this edition, I have an infographic on enterprise cloud adoption. I’m not a big fan of infographics, but I found this one useful and figured I’d share it with all of you.

Avoiding the Chaos Monkey

Yesterday I was pleased (and nervous) to be presenting at the Heartland Developers Conference in Omaha, NE. I’ve been hoping to present at this event for a couple of years and was really pleased that one of my submissions was accepted, especially given that the topic was more architecture/concepts than code. It was only my second time presenting this material and the first time for a non-captive audience. Given that it was the 2pm slot and only a handful of people fell asleep or left, I’m pretty pleased with how things went.

I’ve posted the deck for my Avoiding the Chaos Monkey presentation, so please feel free to take and reuse it. I just ask that you give proper credit, and I’d love any feedback on it. I received some great feedback from HDC on the material and will be making some updates to show some real-world scenarios and how applying the principles covered in this presentation can address them. I spoke to some of these during the presentation, but agreed with my colleague Eric that it would help to have more concrete and visual examples to drive the message home. I’ve already submitted the talk to two upcoming conferences, and hopefully it will get accepted at one. Meanwhile, feel free to snag a copy and drop me a comment with any feedback you have!

You don’t really want an SLA!

I don’t often do editorials (and when I do, they tend to ramble), but I feel I’m due, and this is a conversation I’ve been having a lot lately. When I sit down to talk with clients about cloud, one of the first questions I always get is “what is the SLA?” And I hate it.

The fact is that an SLA is an insurance policy. If your vendor doesn’t provide a basic level of service, you get a check. Not unlike my homeowner’s insurance: if something happens, I get a check. The problem is that most of us NEVER want to have to get that check. If my house burns down, the insurance company will replace it. But all those personal mementos, the memories, the “feel” of the house are gone. So that’s a situation I’d rather avoid. What I REALLY want is safety. So I install a fire alarm, I make sure I have an extinguisher in the kitchen, I keep candles away from drapes. I take measures to reduce the risk that I’ll ever need to cash in my insurance policy.

When building solutions, we don’t want SLAs. What we REALLY want is availability. So we as the solution owners need to take steps to help us achieve this. We have to weigh the cost vs. the benefit (do I need an extinguisher or a sprinkler system?) and determine how much we’re willing to invest in actively working to achieve our own goals.
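To make that cost-vs-benefit math concrete, here’s a back-of-the-envelope sketch (the numbers are made up for illustration): components in series multiply their availabilities, while redundant copies help because independent failures all have to happen at the same time.

    # Back-of-the-envelope availability math (illustrative numbers only).

    def series(*availabilities):
        """Components in series: all must be up, so availabilities multiply."""
        result = 1.0
        for a in availabilities:
            result *= a
        return result

    def redundant(availability, copies):
        """N independent copies: down only if ALL copies are down at once."""
        return 1.0 - (1.0 - availability) ** copies

    web, db = 0.999, 0.999

    print(f"web + db in series:        {series(web, db):.6f}")  # ~0.998
    print(f"with a redundant web pair: {series(redundant(web, 2), db):.6f}")

That second line is the whole trade-off in miniature: one extra web server buys back most of the availability, and you can decide whether the next nine is worth the next server.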

This is why, when I get asked the question, I usually respond by giving them the answer and then immediately jumping into a discussion about resiliency. What is a service degradation vs. an outage? How can we leverage redundancy? Can we decouple components and absorb service disruptions? These are the types of things we as architects need to start considering, not just for cloud solutions but for everything we build.
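As a minimal sketch of that “decouple and absorb” idea (my own toy example, not any particular platform’s API): put a queue between the front end and a flaky downstream service, so a disruption delays work instead of losing it.

    import queue
    import time

    # Toy example: a buffer queue sits between the front end and a flaky
    # downstream service, so the front end never blocks on it directly.
    work = queue.Queue()

    def submit(order):
        """Front end just enqueues; a disruption downstream can't stall it."""
        work.put(order)

    def process_forever(send):
        """Worker drains the queue, retrying with backoff while the service is down."""
        while True:
            order = work.get()
            delay = 1
            while True:
                try:
                    send(order)        # call the downstream service
                    break
                except IOError:
                    time.sleep(delay)  # disruption: wait, then retry the same order
                    delay = min(delay * 2, 60)

During an outage the queue simply grows, and when the service comes back the worker drains the backlog. That’s a degradation, not an outage.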

I continue to tell developers that the public cloud is a stepping stone. The patterns we’re using in the public cloud are lessons learned that will eventually get applied back on premises. As the private cloud becomes less vapor and more reality, the ability to think in these new patterns is what will make the next generation of apps truly useful. If a server goes down, how quickly does your load balancer see this and take that server out of rotation? How do the servers shift workloads?
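To make that last question concrete, here’s a toy sketch of the probe-and-evict loop behind a health check (the hosts and the /health endpoint are invented for illustration; this isn’t any particular load balancer’s implementation):

    import urllib.request

    SERVERS = ["http://app1:8080", "http://app2:8080"]  # hypothetical hosts
    in_rotation = set(SERVERS)

    def probe(server, timeout=2):
        """One health probe: any error or slow response counts as a failure."""
        try:
            with urllib.request.urlopen(server + "/health", timeout=timeout) as r:
                return r.status == 200
        except OSError:
            return False

    def check_all():
        """Run periodically: evict failed servers, readmit recovered ones."""
        for server in SERVERS:
            if probe(server):
                in_rotation.add(server)      # recovered: back into rotation
            else:
                in_rotation.discard(server)  # failed: stop sending it traffic

How often check_all runs, and how many consecutive failures you require before evicting, is exactly the “how quickly does your load balancer see this” question.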

When working towards availability, we need to keep several things in mind.

Failures will happen – how we deal with them is our choice. We can have the world stop, or we can figure out how to “degrade” our solution and keep whatever we can going (see the sketch after this list).

How are we going to recover – when things return to normal, how does the solution “catch up” with what happened during the disruption?

The outage is less important than how fast we react – we need to know something has gone wrong before our clients call to tell us.
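As promised above, here’s a small hypothetical sketch of the first two points (all names are invented): a read path that degrades to the last known-good cached value during an outage, and remembers what it served stale so it can catch up afterwards.

    import logging
    import time

    cache = {}         # last known-good values per item
    served_stale = []  # record of degraded responses, for catch-up later

    def get_price(item, fetch):
        """Try the live service; on failure, degrade to the cached value."""
        try:
            value = fetch(item)
            cache[item] = (value, time.time())
            return value
        except IOError:
            logging.warning("service down, degrading for %s", item)
            if item in cache:
                value, _ = cache[item]
                served_stale.append((item, time.time()))  # note it for catch-up
                return value
            raise  # nothing cached: for this item it really is an outage

    def catch_up(fetch):
        """When things return to normal, refresh everything served stale."""
        while served_stale:
            item, _ = served_stale.pop()
            cache[item] = (fetch(item), time.time())

The logging call is the third point in miniature: the system tells you it’s degraded the moment it happens, rather than waiting for a client to call.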

We (aka solution/application architects) really need to start changing the conversation here. We need to steer away from SLAs entirely, and when we can’t manage that, at least get to more meaningful, scenario-based SLAs. This can mean that instead of saying “the email server will be up 99% of the time”, we switch to “99% of emails will be transmitted within 5 minutes”. This is much more meaningful for the end users and also gives us more flexibility in how we achieve it.
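And measuring a scenario-based SLA like that is cheap. A quick sketch (with made-up delivery times) of checking “99% of emails transmitted within 5 minutes”:

    # Did 99% of emails arrive within 5 minutes? (times in seconds, made up)
    delivery_times = [12, 45, 80, 120, 200, 290, 310, 3600]  # one bad outlier

    TARGET_SECONDS = 5 * 60
    TARGET_RATIO = 0.99

    within = sum(1 for t in delivery_times if t <= TARGET_SECONDS)
    ratio = within / len(delivery_times)

    print(f"{ratio:.1%} delivered within 5 minutes "
          f"({'meets' if ratio >= TARGET_RATIO else 'misses'} the 99% target)")

Notice that the mail server’s uptime never appears in that calculation: you could meet this target through an outage if queued mail catches up fast enough.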

Anyway, enough rambling for now. I need to get a deck that discusses this ready for a presentation on Thursday, one that only about 20 minutes ago I realized I needed to do. Fortunately, I have an earlier draft of the session and definitely have the passion and know-how to make this happen. So time to get cracking!

Until next time!
