Showing posts with label NoSql. Show all posts

Monday, May 5, 2014

Jepsen & Distributed DataStores

Kyle Kingsbury is doing an amazing job with his Jepsen project. TheHackerCIO has long been disturbed by the tendency for people to make assertions and claims without the experimental evidence to back them up, or any basis for assessing them.

Especially in the database world.

Here are a handful of the problems:

I can't tell you how many times I've heard, "Oh, in the inner-join using RDBMS X, a nested-loop algorithm will of course perform better depending on which table is the outer and which is the inner."

No doubt.

But these DBMSs have an optimizer. They have tables full of statistics about the data, presumably updated on a regular basis. These vendors have had 20 years to tweak optimizations. Yet the documentation gives no indication as to whether their "optimizer" can pick the right outer table and inner table, or whether you must explicitly pick the right one yourself.

So lots of people just assume that the optimizer can/will do this. Which isn't unreasonable.

But the days have come where things need to be specified tighter.

We simply need clear, black-and-white statements in the documentation, preferably not greatly hedged. Statements that can be tested. Verified. Proven. Or disproven.

The newer world of NoSql is no exception to this rule or problem.

But Kyle has been there.

Kyle got interested in understanding the issues around the NoSql databases. But he did things the right way: he set up a controlled environment, and began systematically testing, examining, and proving out how the CAP theorem implications actually work in a partitioning environment. This led to a number of surprises for the vendors, ... not to mention the users???

You can take a look at his full Jepsen Project here. He's tested Cassandra (my current focus), Redis, Kafka, NuoDB, Zookeeper, Riak, Mongo, Postgres, possibly others ...

To get a proper sense for this correct, test-based approach, I recommend this. Here are just a few enticing flavor notes, taken from a section to which you should devote your most careful attention, entitled "Testing Partitions":

  • Theory bounds a design space, but real software may not achieve those bounds. We need to test a system's behavior to really understand how it behaves.
  • To cause a partition, you'll need a way to drop or delay messages: for instance, with firewall rules. 
  • Running these commands repeatably on several hosts takes a little bit of work.
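To give a feel for that "little bit of work," here is a minimal sketch of my own, not Kyle's actual tooling. It only builds the firewall commands that would split a cluster into two sides; it does not run them. The function names and the node addresses are hypothetical, and it assumes each host filters traffic with plain `iptables` rules on the INPUT chain.

```python
from typing import Dict, List


def partition_commands(side_a: List[str], side_b: List[str]) -> Dict[str, List[str]]:
    """Return, per host, the iptables commands that drop traffic from the other side."""
    commands: Dict[str, List[str]] = {}
    for host in side_a:
        commands[host] = [f"iptables -A INPUT -s {peer} -j DROP" for peer in side_b]
    for host in side_b:
        commands[host] = [f"iptables -A INPUT -s {peer} -j DROP" for peer in side_a]
    return commands


def heal_commands(hosts: List[str]) -> Dict[str, List[str]]:
    """Return, per host, the command that flushes the INPUT chain to heal the partition.

    Caution: flushing removes every rule in the chain, not only the ones added above.
    """
    return {host: ["iptables -F INPUT"] for host in hosts}


if __name__ == "__main__":
    # Hypothetical five-node test cluster; the addresses are placeholders.
    nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"]
    plan = partition_commands(nodes[:2], nodes[2:])
    for host, cmds in plan.items():
        for cmd in cmds:
            print(f"{host}: {cmd}")
```

You would still need some way to run each host's commands on that host (ssh, for instance), which is exactly the repeatable-execution chore the last bullet mentions.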

Work might be a necessary evil. But understanding isn't going to come without it. Or without actual, experimental testing.

In his article, you will see exactly what to set up to get started with your own multi-node, partition-able, experimental test-bed, within which you can see how your NoSql is going to behave.

Because there's no short-cut.

Or, as an earlier time might have put it,

There is no royal road to enlightenment.

I Remain,


Tuesday, December 17, 2013

When a Delete is a Write!

So Cassandra -- a No-Sql database -- has a few peculiarities that might take newbies by surprise. One of them is that deletion involves a write!

Before we take a brief look at that, do you know the story of Cassandra from Greek mythology? Cassandra was so beautiful that Apollo wanted to have carnal knowledge of her. She refused. Consequently, she was cursed by Apollo with prophesying the truth, yet with no-one believing it. Personally, TheHackerCIO knows how she felt. All the time I tell the truth, but it seems that very few actually believe it. It's enough to drive one crazy.

I wonder at this choice of mascot for the NoSql database which trades off consistency for availability per Brewer's Conjecture (A.K.A., "The CAP Theorem"). Is it that Cassandra will always return the truth, but we the DBAs won't believe it?

Well, leaving off the speculation, let's return to the peculiarity mentioned before: how can a delete be a write operation?

Remember, Cassandra uses an immutable data model. Data just continues to be written out to represent all changes. One consequence of this is that updates and inserts really are interchangeable. They call this the Cassandra "UpSert," because if you insert and a row with that primary key already exists, then it simply becomes an update. Conversely, if you update a row and the primary key involved doesn't exist, Cassandra will simply insert it. That is, either way, you will "UpSert" a row.
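A toy model makes the "UpSert" idea concrete. This is only a sketch under my own assumptions, not Cassandra's implementation: the table is an in-memory dictionary, every write carries a timestamp, and the newer timestamp wins for each column. The names `write`, `insert`, `update`, and `read` are hypothetical.

```python
from typing import Dict, Tuple

# Toy table: primary key -> column name -> (write timestamp, value).
Table = Dict[str, Dict[str, Tuple[int, str]]]


def write(table: Table, key: str, columns: Dict[str, str], timestamp: int) -> None:
    """Single write path: create the row if missing, then keep the newest value per column."""
    row = table.setdefault(key, {})
    for name, value in columns.items():
        current = row.get(name)
        # In this toy, a later or equal timestamp overwrites the stored value.
        if current is None or timestamp >= current[0]:
            row[name] = (timestamp, value)


# In this model INSERT and UPDATE are the same operation under two names.
insert = write
update = write


def read(table: Table, key: str) -> Dict[str, str]:
    """Return the current column values for a key (empty dict if the row is absent)."""
    return {name: value for name, (_, value) in table.get(key, {}).items()}


if __name__ == "__main__":
    dogs: Table = {}
    update(dogs, "rex", {"breed": "collie"}, timestamp=1)  # no such row: acts as an insert
    insert(dogs, "rex", {"breed": "poodle"}, timestamp=2)  # row exists: acts as an update
    print(read(dogs, "rex"))
```

Because both statements funnel into one write path, neither needs to check first whether the row exists.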

Another consequence of the immutable data model is that delete operations are really just "marking for deletion." We're all familiar with this from the file-system, but to have a database that does this adds a few wrinkles. For instance, you now have to deal with "compaction," where the deleted data element no longer remains within the working set of data elements.
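Here is a minimal sketch of "delete is a write," under stated assumptions: an append-only, in-memory log stands in for the immutable files, a `None` value stands in for the deletion marker (Cassandra calls it a tombstone), and `compact` is a hypothetical helper that writes a fresh log rather than editing the old one. Real Cassandra also keeps tombstones around for a grace period before purging them, which this toy ignores.

```python
from typing import Dict, List, Optional, Tuple

TOMBSTONE = None  # marker value meaning "this key was deleted"

# Append-only log of (key, value-or-tombstone); entries are never modified in place.
Log = List[Tuple[str, Optional[str]]]


def put(log: Log, key: str, value: str) -> None:
    """Record a new value by appending to the log."""
    log.append((key, value))


def delete(log: Log, key: str) -> None:
    """A delete is just another write: it appends a tombstone."""
    log.append((key, TOMBSTONE))


def get(log: Log, key: str) -> Optional[str]:
    """Scan newest-to-oldest; the most recent entry for the key wins."""
    for k, v in reversed(log):
        if k == key:
            return v  # None when the newest entry is a tombstone
    return None


def compact(log: Log) -> Log:
    """Build a new log holding only the latest live value for each key."""
    latest: Dict[str, Optional[str]] = {}
    for k, v in log:
        latest[k] = v
    return [(k, v) for k, v in latest.items() if v is not TOMBSTONE]


if __name__ == "__main__":
    log: Log = []
    put(log, "a", "1")
    put(log, "b", "2")
    delete(log, "a")
    print(len(log))      # the delete made the log longer, not shorter
    print(compact(log))  # only after compaction does the deleted key disappear
```

Note that the log grows on delete, and space is reclaimed only when compaction writes the replacement.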

So, for people from the relational database world -- and aren't we all -- you need to spend a little time wrapping your head around the world of NoSql in general, and Cassandra in particular.

As you do so,

I Remain,


Thursday, December 5, 2013

Alice in NoSql Land

In the Alice-in-Wonderland of NoSql databases, unavailable means available!

The CAP Theorem, a.k.a. Brewer's Conjecture, holds basically that you can pick any two of "C", "A", or "P", where these stand for:

  • Consistency
  • Availability
  • Partition Tolerance
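
Consistency sounds like the familiar "C" of the relational database's ACID properties of transactions, although CAP uses the word more narrowly, to mean that every read sees the most recent write. ACID transactions must be:
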
  • Atomic
  • Consistent
  • Isolated
  • Durable
In the distributed world, and specifically in the NoSql world, consistency is traded off for availability. Many NoSql databases are never brought down for any kind of maintenance window! But this comes at a cost: the cost of "eventual consistency." 

In many applications, consistency isn't all that crucial. For example, if you've got a social network type of application, does it really matter if the number of "likes" is inconsistent? I've actually seen this fluctuate in real time, and who knows or even cares whether the "likes" are actually being toggled by some user, or this is simply an artifact of the node-cluster getting the consistency propagated throughout. 

But there are many "gotchas" with this new paradigm, and the "availability" meaning is perhaps the most troubling. TheHackerCIO hates it when people create a technical term that is diametrically opposed to common or popular usage, but this is a case where we are stuck with it, and there is nothing to do about it, except to note it carefully.

"Availability" in the CAP theorem is not your ordinary availability!

You might assume that "available" means a server is up and functioning, but such is NOT the case.

In the CAP theorem, availability means that if you can talk to a node in the cluster, it can read and write data.

So, if your cluster partitions -- say the European nodes get cut off from the North American nodes -- then one way to preserve "availability" would be for all nodes to stop talking to any clients! Then they would be "available," because "if you can talk to a node in the cluster (and you can't), it can read and write data (which you don't want to happen until the partitioning event resolves)." 

I'm not making this up. Here is a quotation from Pramod Sadalage and Martin Fowler's NoSQL Distilled:
"However, this would mean that if a partition ever occurs in the cluster, all the nodes in the cluster would go down so that no client can talk to a node. By the usual definition of 'available,' this would mean a lack of availability, but this is where CAP's special usage of 'availability' gets confusing. CAP defines 'availability' to mean 'every request received by a non failing node in the system must result in a response.' [Lynch and Gilbert] So a failed, unresponsive node doesn't infer a lack of CAP availability."
So be careful with your "availability."
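The quoted definition is easy to state as a predicate. This is only my own sketch of that one sentence, with a hypothetical `Node` type: a node is either failed or not, and a non-failed node either answers requests or doesn't.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    failed: bool = False   # crashed or shut down
    responds: bool = True  # whether a live node answers the requests it receives


def cap_available(nodes: List[Node]) -> bool:
    """CAP availability: every non-failing node must respond to requests."""
    return all(node.responds for node in nodes if not node.failed)


if __name__ == "__main__":
    # Every node shut down during the partition: nobody can be served, yet the
    # definition is satisfied, because no non-failing node is refusing a request.
    all_down = [Node("eu-1", failed=True), Node("us-1", failed=True)]
    print(cap_available(all_down))

    # One node is up but refuses requests until the partition heals:
    # this is what violates CAP availability.
    one_refusing = [Node("eu-1"), Node("us-1", responds=False)]
    print(cap_available(one_refusing))
```

The first case is the Alice-in-Wonderland one: the predicate holds vacuously when there are no non-failing nodes left to check.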

I Remain,


Wednesday, November 13, 2013

An Evening's Evangelism

Last night was spent playing hooky from the Geeky Book club. But only because a particularly special speaker was in town. Patrick McFadin, chief Evangelist for Apache Cassandra, was speaking at DreamWorks in Glendale.

So, TheHackerCIO slogged through an hour and a half of LA traffic to get out to Glendale in time to see the talk. Not to mention hearing it.

Patrick is a good presenter, so the talk was well organized and interesting. His purpose was to convince us that C* [the semi-official abbreviation for Cassandra] was the best persistence tier for your application.

He predicated this on the tunable consistency available in C*, pointing out that if you were willing to specify ALL, and take the performance hit, you could construct the most consistent distributed database system possible: one where every replica had to acknowledge before an operation completed.

The talk was too long to go too in-depth, but I was particularly interested by the architecture of writing all files out immutably. Even compaction is accomplished by reading in the fragmented files and writing a new compacted one. So, in theory, you could always recover -- even from programmatic database corruption. Ideally, you use a snapshot to do point-in-time recovery, followed by writing a script to extract "post-point-in-time" updates from the files and apply them where required.

He mentioned that the joke among C* cognoscenti is that CQL has an UPSERT statement, because update and insert are so very similar. If a row doesn't exist, update will insert it, and if it exists, insert will replace the data in it! UPSERT is a fun way to remember this similarity of statements.

Patrick also pointed out that Netflix -- the poster boy for C* -- has just released the Chaos Monkey for C*! He challenged the Mainframe person attending to introduce the Chaos Monkey to the Mainframe systems, and see how they compare in terms of failover and availability. If you don't know about the Chaos Monkey, tomorrow I'll fill you in on it. Because it's important.

To summarize his talk, I liked his zinger the best: Use Oracle to count your money; Use Cassandra to make it.

I Remain,


Tuesday, November 5, 2013

Cassandra Last Night at the TRG

Cassandra was the topic at TRG last night.

That is to say, Apache Cassandra. I'm not clear why the project chose to name itself after a Greek prophetess who was doomed always to prophesy correctly, but never to be believed.

Perhaps the eventual consistency model?

Still, it doesn't seem like the greatest PR approach and it doesn't seem like the Big Data initiatives would like to think of their correct insights always being disregarded.

But such is the name of the product.

Our presenter, Adrian Rodriguez, did a nice hands-on tutorial where he built up a data model for a Social web application centered around dog photos. He provided a github account where the full blown application can be browsed.

He also pointed us to a very helpful consistency calculator website, where the implications of your consistency level choice are clearly shown: Cassandra Parameters for Dummies.

Adrian recommended the very sound policy of defining calls at the QUORUM consistency level and then relaxing this only where necessary, in keeping with the dictum: "don't prematurely optimize."
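The arithmetic behind that advice is short enough to sketch. This is my own illustration, not the calculator's code: a quorum is a majority of the replicas, and a read is guaranteed to overlap the latest write when the replicas read plus the replicas written exceed the replication factor. The function names are hypothetical, and only the ONE, QUORUM, and ALL levels are modeled.

```python
def quorum(replication_factor: int) -> int:
    """A quorum is a strict majority of the replicas."""
    return replication_factor // 2 + 1


def replicas_for(level: str, replication_factor: int) -> int:
    """Number of replicas that must acknowledge at a given consistency level."""
    levels = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return levels[level]


def reads_see_latest_write(read_level: str, write_level: str, replication_factor: int) -> bool:
    """True when the read and write replica sets must overlap: R + W > RF."""
    r = replicas_for(read_level, replication_factor)
    w = replicas_for(write_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    rf = 3
    for read_level, write_level in [("QUORUM", "QUORUM"), ("ONE", "ONE"), ("ONE", "ALL")]:
        ok = reads_see_latest_write(read_level, write_level, rf)
        print(f"RF={rf} read={read_level} write={write_level} overlap={ok}")
```

With a replication factor of 3, QUORUM on both sides means 2 + 2 > 3, so reads and writes overlap while one replica can still be down. Relaxing both sides to ONE gives 1 + 1, which does not exceed 3, and that is where eventual consistency comes in.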

I also liked his way of explaining that Cassandra databases grow out left to right, with everything attaching to the primary key as a new column, and with all the join overhead done upfront at update time in all the other relevant rows; in contrast to the Relational Model, where databases grow top to bottom as new rows are added. This is an excellent way for beginners to start wrapping their heads around this NoSql database.

Tonight is the Java Users group, so a report will be in order tomorrow on Groovy.

Full details of the presentation may be read here.

I Remain,