Showing posts with label Consistency. Show all posts

Thursday, December 5, 2013

Alice in NoSql Land



In the Alice-in-Wonderland of NoSql databases, unavailable means available!

The CAP Theorem, a.k.a. Brewer's Conjecture, holds basically that you can pick any two of "C", "A", or "P", where these stand for:

  • Consistency
  • Availability
  • Partition Tolerance
Consistency sounds like the familiar "C" in the ACID properties of relational database transactions, though CAP's version is narrower: every read sees the most recent write. ACID transactions must be:
  • Atomic
  • Consistent
  • Isolated
  • Durable
In the distributed world, and specifically in the NoSql world, consistency is often traded off for availability. Many NoSql databases are never brought down for any kind of maintenance window! But this comes at a cost: the cost of "eventual consistency."

In many applications, consistency isn't all that crucial. For example, in a social-network type of application, does it really matter if the number of "likes" is momentarily inconsistent? I've actually seen this number fluctuate in real time, and who knows or even cares whether the "likes" are actually being toggled by some user, or the fluctuation is simply an artifact of the cluster still propagating updates to all of its nodes.
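To make the fluctuating "likes" concrete, here is a minimal Python sketch. It is not any real database's replication protocol: the `Replica` class and the `sync` function are hypothetical names invented for illustration, and real clusters use far more elaborate mechanisms to converge.

```python
class Replica:
    """A hypothetical node holding its own copy of who liked a post."""

    def __init__(self, name):
        self.name = name
        self.likes = set()  # user ids that this replica has seen "like" the post

    def like(self, user):
        self.likes.add(user)

    def count(self):
        return len(self.likes)


def sync(a, b):
    """Assumed propagation step: both replicas end up with the union of likes."""
    merged = a.likes | b.likes
    a.likes = set(merged)
    b.likes = set(merged)


europe = Replica("europe")
america = Replica("america")

europe.like("alice")    # this write lands on the European node only
america.like("bob")     # these writes land on the North American node only
america.like("carol")

# Before propagation, the count depends on which node you happen to ask.
before = (europe.count(), america.count())

sync(europe, america)

# After propagation, both nodes agree: "eventual" consistency.
after = (europe.count(), america.count())
print(before, after)
```

Until `sync` runs, two readers can see two different counts for the same post, which is exactly the kind of flicker described above.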

But there are many "gotchas" with this new paradigm, and the "availability" meaning is perhaps the most troubling. TheHackerCIO hates it when people create a technical term that is diametrically opposed to common or popular usage, but this is a case where we are stuck with it, and there is nothing to do about it, except to note it carefully.

"Availability" in the CAP theorem is not your ordinary availability!

You might assume that "available" simply means a server is up and functioning, but such is NOT the case.

In the CAP theorem, "availability" means only this: if you can talk to a node in the cluster, it can read and write data.

So, if your cluster partitions -- say the European nodes get cut off from the North American nodes -- then one way to preserve "availability" would be for all nodes to stop talking to any clients! Then they would be "available," because "if you can talk to a node in the cluster (and you can't), it can read and write data (which you don't want to happen until the partitioning event resolves)." 

I'm not making this up. Here is a quotation from NoSQL Distilled, by Pramod Sadalage and Martin Fowler:
"However, this would mean that if a partition ever occurs in the cluster, all the nodes in the cluster would go down so that no client can talk to a node. By the usual definition of 'available,' this would mean a lack of availability, but this is where CAP's special usage of 'availability' gets confusing. CAP defines 'availability' to mean 'every request received by a non failing node in the system must result in a response.' [Lynch and Gilbert] So a failed, unresponsive node doesn't infer a lack of CAP availability."
So be careful with your "availability."
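
The definition quoted above can be put side by side with the everyday one in a small Python sketch. Everything here is an assumption made for illustration: the `Node` class, its `failed` and `blocked` flags, and the two checker functions are hypothetical, and no real cluster exposes its state this simply.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    failed: bool = False   # node is down; it receives no requests at all
    blocked: bool = False  # node is up, but refuses to answer during a partition

    def handle(self, request):
        """Return a response string, or None if the client gets no answer."""
        if self.failed or self.blocked:
            return None
        return f"{self.name}: ok ({request})"


def cap_available(nodes, request="read x"):
    """CAP sense: every NON-FAILING node that receives a request must respond."""
    return all(n.handle(request) is not None for n in nodes if not n.failed)


def everyday_available(nodes, request="read x"):
    """Ordinary sense: a client can get an answer from at least one node."""
    return any(n.handle(request) is not None for n in nodes)


# Scenario 1: a partition occurs and every node shuts itself down.
all_down = [Node("europe", failed=True), Node("north-america", failed=True)]

# Scenario 2: one node stays up but withholds answers; the other works normally.
one_blocked = [Node("europe", blocked=True), Node("north-america")]

shutdown_result = (cap_available(all_down), everyday_available(all_down))
blocked_result = (cap_available(one_blocked), everyday_available(one_blocked))
print(shutdown_result, blocked_result)
```

In the first scenario there are no non-failing nodes left to break the rule, so the cluster counts as CAP-"available" even though nobody can reach it. In the second, clients can still get answers, yet the one silent but running node is enough to violate CAP availability.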

I Remain,

TheHackerCIO