For no particular reason, I have been thinking a fair amount recently about the CAP theorem and how the basic problem that it presents is worked around in various ways by contemporary and even ancient systems.
I remember, years ago as a freshly minted SOA zealot, being confused by the pushback I got from mainframe developers who insisted that client applications needed more control over how services were activated and how they worked. I always thought that good, clean service API design and "separation of concerns," along with developer education and evangelism, would make this resistance go away. I was wrong.
I still think the basic idea of SOA (encapsulation and loose coupling) is correct; but once you shatter the illusion of the always-available, always-consistent central data store, you need to let the client do what it needs to do. The whole system has to be a little more "client-oriented."
The Dynamo paper provides a great example of what I am talking about here. I am not sure it is still an accurate description of how Amazon's applications work, but the practical issues and approaches described in the paper are really instructive. According to the paper, Dynamo is a key-value store designed to deliver very high availability but only "eventual consistency" (i.e., at any given time, there may be multiple, inconsistent versions of an object in circulation, and the system provides mechanisms to resolve conflicts over time). For applications that require it, Dynamo lets clients decide how to resolve version conflicts. To do that, services maintain vector clocks of version information and surface what would, in a "pure" SOA implementation, be service-side concerns to the client. To the further horror of SOA purists, the paper also reports that applications with very stringent performance demands can bypass the normal service location and binding infrastructure - again, letting clients make their own decisions. Finally, the paper even mentions the ability of clients to tune the "sloppy quorum" parameters that determine the effective durability of writes, the availability of reads, and the incidence of version conflicts.
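To make the vector-clock part of that concrete, here is a minimal sketch of what client-side reconciliation can look like. This is not code from the paper; the function names and the cart-union merge policy are my own illustration of the technique (the paper's shopping cart example effectively merges divergent carts by combining their items).

```python
# A minimal sketch of client-side reconciliation with vector clocks.
# Not Dynamo's code; names and the merge policy are illustrative.

def dominates(a, b):
    """True if vector clock a strictly dominates b (a is causally newer)."""
    keys = set(a) | set(b)
    return (all(a.get(k, 0) >= b.get(k, 0) for k in keys)
            and any(a.get(k, 0) > b.get(k, 0) for k in keys))

def reconcile(versions):
    """versions: list of (value, clock) pairs returned by a read.
    Drop any version that some other version strictly dominates; if more
    than one survives, the versions are concurrent and the client merges."""
    survivors = [(v, c) for v, c in versions
                 if not any(dominates(other, c) for _, other in versions)]
    if len(survivors) == 1:
        return survivors[0]
    # Concurrent versions: apply an application-specific merge policy.
    # For a shopping cart, the merge amounts to a union of the items.
    merged_items, merged_clock = set(), {}
    for items, clock in survivors:
        merged_items |= set(items)
        for node, count in clock.items():
            merged_clock[node] = max(merged_clock.get(node, 0), count)
    return merged_items, merged_clock

# Two carts written concurrently through different coordinators:
print(reconcile([({"book"}, {"node_a": 1}),
                 ({"pen"},  {"node_b": 1})]))
# -> ({'book', 'pen'}, {'node_a': 1, 'node_b': 1})
```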
Despite the catchy title for this post, I don't mean to suggest that SOA was a bad idea or that we should all go back to point-to-point interfaces and tight coupling everywhere. What I am suggesting is that just having clean service APIs at the semantic, or "model," level and counting on the infrastructure to make all decisions on behalf of the client doesn't cut it in the post-CAP world. Clients need to be allowed to be intelligent and engaged in managing their own QoS. The examples above illustrate some of the ways that can happen; I am sure there are lots of others. An interesting question is how much of this it makes sense to standardize and how much ends up as part of individual service API definitions. Dynamo's version context is a concrete example: it just rides along in the service payloads, so it is effectively standardized into the infrastructure.
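For what it is worth, the call shape the paper describes makes that last point nicely: get returns the stored values along with an opaque context that carries the version information, and the client passes that context back on its next put. The class below is a hypothetical illustration of that shape, not a real client library; the (N, R, W) parameters stand in for the quorum knobs the paper says clients can tune.

```python
# Rough sketch of the get/put call shape described in the paper: reads return
# values plus an opaque context (carrying the version information), and the
# client hands that context back on its next write.  The class name and
# constructor parameters are hypothetical, not a real client library.

class DynamoStyleClient:
    def __init__(self, n=3, r=2, w=2):
        # Per the paper, the (N, R, W) quorum parameters can be tuned to trade
        # write durability and read consistency against latency.
        self.n, self.r, self.w = n, r, w

    def get(self, key):
        """Return (values, context); multiple values mean concurrent versions."""
        raise NotImplementedError("illustrative interface only")

    def put(self, key, context, value):
        """Write value, passing back the context from the preceding get."""
        raise NotImplementedError("illustrative interface only")
```

The interesting design choice is that the context stays opaque to the client but flows through the application's own read/write calls, which is exactly what lets it ride along in the payloads and become, in effect, part of the infrastructure.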