I have been spending a lot of my time lately consulting and talking about cloud computing, and specifically Amazon Web Services, mainly because our infrastructure sits completely in the cloud. One of the questions that always comes up is "How do you know everything is going OK?" Let me start by saying that it does not matter whether you house your application in the cloud, in a co-location facility, in the closet, or on the desk next to you: if you are supporting a production application, you need to know what is going on with its health. That lets you make informed decisions about any action you need to take BEFORE (hopefully) your customers are impacted.
We use several tools on DigitalChalk to help monitor our applications. One of our core monitoring components is Hyperic. We have customized the Hyperic tools so that we can monitor all of our servers soup to nuts, and we have spent a lot of time configuring Hyperic on our EC2 instances to give us some visibility into the AWS platform and our application. The good news is that Hyperic has just released CloudStatus into beta.
For quick updates on how AWS is performing, we have been using Amazon's RSS feeds (http://status.aws.amazon.com/). With CloudStatus, we get a better view of the historical data for the services. I am looking forward to having another tool in the toolbelt, and I hope it will improve our monitoring capabilities and streamline our diagnosis. I am also interested to see whether Hyperic will come out with a dashboard that targets the health of a single AWS account. That would be really nice, because we have seen before that even when AWS is having trouble in some areas, others are fine.
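For anyone who wants to watch a status RSS feed from a script instead of a feed reader, here is a minimal sketch of the idea in Python. It is not our production setup. It assumes the feed is plain RSS 2.0 with `title`, `pubDate`, and `description` elements on each item; the function names and the sample feed are made up for illustration, and you would pass in whichever feed URL you actually subscribe to.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Made-up sample feed, used only so the sketch runs without network access.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example service status</title>
    <item>
      <title>Service is operating normally</title>
      <pubDate>Mon, 01 Jan 2024 12:00:00 GMT</pubDate>
      <description>Made-up entry for illustration.</description>
    </item>
    <item>
      <title>Increased error rates</title>
      <pubDate>Mon, 01 Jan 2024 11:30:00 GMT</pubDate>
      <description>Another made-up entry.</description>
    </item>
  </channel>
</rss>"""


def parse_status_feed(xml_text):
    """Return a list of {title, published, summary} dicts from RSS 2.0 text."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns None for a missing element, so fall back to "".
            "title": (item.findtext("title") or "").strip(),
            "published": (item.findtext("pubDate") or "").strip(),
            "summary": (item.findtext("description") or "").strip(),
        })
    return items


def fetch_status_feed(feed_url, timeout=10):
    """Download and parse one feed; feed_url is supplied by the caller."""
    with urllib.request.urlopen(feed_url, timeout=timeout) as response:
        return parse_status_feed(response.read())


if __name__ == "__main__":
    for entry in parse_status_feed(SAMPLE_FEED):
        print(entry["published"], "-", entry["title"])
```

Keeping the parsing separate from the download means you can test the parsing against a saved copy of a feed, and then call `fetch_status_feed` from a scheduled job once you trust it.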