Tuesday, October 8, 2013

The Half-Hour, High-Availability Cluster!

Last night's Technology Radar Group tutorial was awesome! Kevin Epstein guided us through the live process of spinning up a load-balanced HA cluster that scales elastically between 2 and 10 instances. He began with a slide presentation classifying the many products that make up the Amazon Web Services constellation. Then he used the AWS Console to create a base instance with a MySQL database and quickly set up a WordPress blog. An awesome blog that the whole world will want to view, which will result in performance problems for the server!!! Well, OK, you have to use your imagination for that part, but he did set up a basic blog page and displayed the IP of the underlying instance, so we could verify what was going on in the demo.

One particularly interesting twist was his use of Route53 for a simple failover mechanism. He configured it to check the health of the server, which, in the beginning, had not yet been created! Since non-existence is kind of a "bad health" state, Route53 failed over to an S3 bucket and displayed some static content. Once he started the instance, Route53 no longer failed over, and the actual blog replaced the static content.
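To make the failover idea concrete, here is a minimal Python sketch of the decision being made. It is not Route53's actual API or Kevin's configuration; the `Endpoint` class, the `resolve` function, and the endpoint names are all hypothetical, invented for illustration. The point it shows is that a target that cannot answer its health check at all is treated the same as an unhealthy one.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Endpoint:
    # Hypothetical stand-in for a DNS target and its health check.
    name: str
    is_healthy: Callable[[], bool]


def resolve(primary: Endpoint, secondary: Endpoint) -> str:
    """Return the name of the endpoint that should receive traffic."""
    try:
        healthy = primary.is_healthy()
    except Exception:
        # A server that does not exist yet cannot answer the check,
        # so any error counts as "bad health".
        healthy = False
    return primary.name if healthy else secondary.name


def missing_instance() -> bool:
    # Simulates checking a server that has not been created yet.
    raise ConnectionError("no instance is listening yet")


static_site = Endpoint("s3-static-site", lambda: True)
before_launch = Endpoint("wordpress-instance", missing_instance)
after_launch = Endpoint("wordpress-instance", lambda: True)

print(resolve(before_launch, static_site))  # static content is served
print(resolve(after_launch, static_site))   # the blog is served
```

In the demo, the same switch happened on its own once the instance came up and started passing its health check.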

Next on the menu was replacing the database embedded in the instance with an Amazon Web Services database -- that is to say, an RDS database. He used the same MySQL database engine for this, and used MySQL utilities to dump the data to a flat file and then import it into what was now a clustered, backed-up database!
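For readers who have not done a dump-and-import before, the following Python sketch builds the two commands involved. It only assembles and prints the argument lists; it does not run anything. I did not record the exact commands Kevin typed, so treat this as one plausible way to do it: the database name, user name, file name, and RDS host name below are all made-up placeholders, and the helper functions `dump_command` and `import_command` are hypothetical.

```python
import shlex


def dump_command(db, user, out_file, host="localhost"):
    """Build a mysqldump invocation that writes the database to a flat file."""
    return [
        "mysqldump",
        "-h", host,
        "-u", user,
        "-p",                          # prompt for the password
        "--result-file=" + out_file,   # write the dump to this file
        db,
    ]


def import_command(db, user, host):
    """Build a mysql client invocation; the dump file is fed in on stdin."""
    return ["mysql", "-h", host, "-u", user, "-p", db]


# Placeholder values, assumed for illustration only.
dump = dump_command("wordpress", "wp_user", "wordpress.sql")
load = import_command("wordpress", "wp_user", "example-rds-host.invalid")

print(shlex.join(dump))
print(shlex.join(load) + " < wordpress.sql")
```

After the import, the blog's database settings would need to point at the RDS host instead of the local server.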

After the break he configured an ELB and an autoscaling group. His comments throughout the presentation were most helpful. For instance, he commented on the phenomenon of "flapping," where the failure heartbeat is set to too short a period. In such cases, the system can repeatedly fail over and fail back, and the user experiences a repeated loss of transactional state, which is a worse outcome than the brief outage that comes from a single failover.
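Flapping is easier to see in a toy simulation. The Python sketch below is my own illustration, not something shown in the talk: the function `count_switches` and the sample health-check results are hypothetical. It assumes the cure is to require several consecutive contrary health checks before switching, which stands in for a longer heartbeat period.

```python
def count_switches(checks, threshold):
    """Count failovers and failbacks for a sequence of health-check results.

    A switch happens only after `threshold` consecutive checks disagree
    with the current state (primary healthy vs. primary failed).
    """
    on_primary = True
    streak = 0      # consecutive results that contradict the current state
    switches = 0
    for ok in checks:
        disagrees = ok != on_primary
        streak = streak + 1 if disagrees else 0
        if streak >= threshold:
            on_primary = not on_primary
            streak = 0
            switches += 1
    return switches


# A noisy server: a few isolated blips, then one real outage and recovery.
checks = [True, False, True, True, False, True,
          False, False, False, True, True, True]

print(count_switches(checks, threshold=1))  # reacts to every blip
print(count_switches(checks, threshold=3))  # reacts only to the real outage
```

With the hair-trigger setting the sample produces six switches; with the more patient setting it produces two, one failover and one failback.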

He also, most helpfully, pointed out that in creating the "Gold Image" instance, from which the cluster would be initiated, you NEVER want to choose the "No Reboot" option! In the normal case, you want to ensure a consistent state on the image you are copying, so it has to be an unbelievably weird scenario where you would want to save the tiny bit of time possible by taking a snapshot without a reboot first. If anyone knows such a scenario, please send it on in, so we can take note of that rare, exceptional condition. In the meantime just leave that checkbox alone!

The cluster he defined was a "Multi-AZ" or Multi-Availability Zone cluster, which in AWS terms is a High Availability cluster. He noted that this was one area where the AWS console was insufficient, and that one had to resort to either the command line or a third-party tool. The tool he preferred was called ElasticWolf. But there was some kind of issue in seeing the instances and we were under time pressure, so he resorted to using ezautoscaling, which is another third-party tool, and a good one, but one that charges.

I like complex live presentations precisely because they often run into problems: you get to see a problem arise, and a work-around employed! That's invaluable experience, and to get it vicariously is a tremendous boost -- it gives a beginner some inkling of how to work/struggle with a new product. And what do we do every day, but struggle with technology? Nothing comes easy in the tech world!

For anyone interested, I highly recommend the Los Angeles AWS Users Group, which joined together with the Technology Radar Group for this tutorial presentation. We are also very lucky to have Eric Hammond's blog Alestic. Eric is a member of, and regular attendee at, the AWS Users Group. If you haven't read Eric's blog, you're missing out! And special thanks go to Kevin Epstein for a masterful presentation. One commenter claimed this was our best Meetup presentation yet, and I agree with him!

Presentation Slides
Screencast of Tutorial (a link will be provided here when available)