RPC

QCon London: The Best One Yet

March 18th, 2009  |  Published in concurrency, conferences, erlang, functional, functional programming, RPC, standards  |  Bookmark on Pinboard.in

QCon is always very, very good, but QCon London 2009 last week was the best one yet. Highlights:

  • Ola Bini‘s “Emerging Languages in the Enterprise” track on Wednesday had some great talks, especially Rich Hickey’s Clojure talk. Ola’s deep knowledge and love of programming languages made him the perfect host for this track, and he put together a brilliant lineup.
  • Speaking of Rich, I consider myself very lucky to have gotten to meet and spend a fair amount of time with him. He’s very bright, talented, knowledgeable, and experienced, and both of his talks were outstanding.
  • I introduced Rich to Joe Armstrong at the conference party Wednesday evening and they spent the next few hours talking at length about functional programming, their respective languages, VM implementation issues, concurrency issues, etc. Ola jumped in as well. They also continued the conversation the next day. I just sat back, listened, and learned.
  • Getting to spend time again with Joe was excellent. He always has incredibly useful analyses and opinions to express, and in general is fun to be around and easy to learn from.
  • I also finally got to meet Ulf Wiger, Erlang developer extraordinaire, in person. He’s a laid back guy, quite well-informed and a deep thinker who can cover a wide variety of topics in amazingly useful detail. His talk on multicore programming in Erlang covered cutting edge Erlang development and presented some very difficult concurrency issues.
  • Ulf’s talk, as well as Rich’s second talk, which was on persistent data structures and managed references, were part of Francesco Cesarini‘s “Functional and Concurrent Programming Languages Applied” track on Thursday. I met Francesco, who like Ulf is one of the world’s top Erlang developers, at QCon London last year. He assembled a great track for this conference, with Rich’s and Ulf’s back-to-back talks being way more than enough to sober up any developer who thinks that multicore is not an issue and that today’s methods for dealing with concurrency will continue to work just fine. Best of luck with that!
  • Sir Tony Hoare‘s talk about the null reference being his “billion dollar mistake” was great because of all the detail he recounted from some of the early days of computing. He was both informative and entertaining. I was also impressed with Ulf during this talk, whom Professor Sir Hoare invited to come up to the front and present what turned out to be a pretty convincing argument in favor of the null reference.
  • Paul Downey‘s talk on the downsides of standardization was by far the most humorous talk I heard, perfect to close out the track, but it also presented a number of hard-won useful lessons about the perils of standardization efforts.

As with all QCon conferences, there were a bunch of interesting tracks running in parallel, and unfortunately I still haven’t figured out how to be in multiple places at once. I had to miss Michael Nygard‘s talk, for example, because my own talk got moved to the same time slot as his.

My talk (PDF) covered the history of RPC, why it got to be the way it was, and why the forces that created it really aren’t all that viable anymore.

The final conference panel was by far the most inventive panel I’ve ever been on. Modeled after the British game show “It’s a Bullseye!” and hosted by none other than Jim Webber, it had contestants from the audience throwing darts to become eligible for a prize. Once a contestant became eligible, Jim would ask the panel — Michael Nygard, Ian Robinson, Martin Fowler, and me — to answer a question submitted by conference attendees either earlier during the conference or live via Twitter. Based on our answers, audience members held up either a green card if they liked the answers or a red one if they didn’t, and if the majority was green, the contestant would win a book. The questions were hard! We had only two minutes each to answer, which for some questions seemed like an eternity but for most was way too short. Anyway, it was great fun, and given how many there were in the audience after three grueling conference days and how much they seemed to be enjoying themselves, it worked very, very well.

If you have any interest at all in leading edge software and computing topics being presented by the world’s most knowledgeable speakers in a fun atmosphere, go to QCon. I guarantee you won’t be disappointed.

The Technology Adoption Side of RPC and REST

September 3rd, 2008  |  Published in column, enterprise, innovation, REST, RPC, technology adoption  |  Bookmark on Pinboard.in

My latest Internet Computing column has been available since last Friday. It’s entitled “RPC and REST: Dilemma, Disruption, and Displacement” (PDF, HTML) and like my previous 2008 columns, it explores another angle of the “RPC vs. REST” debate.

Since previous columns have covered many of the technical angles, this time I present the debate from the technology adoption angle. As the abstract for the column says, many technologists tend to treat such debates as if they’re purely technical, but of course they’re never that black-and-white. What’s often behind some of the raging “technical” debates we’ve all seen or experienced is simply the difference between the arguing parties in their relative positions along the Technology Adoption Lifecycle curve. Nobody would be surprised at a disagreement over technology between someone classified as an early adopter or visionary (from the far left of the curve) and someone classified as a technology skeptic (from the far right), yet we always seem surprised when two people whose preferences aren’t too far apart on the curve — from the opposite edges of the mainstream band in the middle of the bell curve, for example — don’t see eye to eye, despite the fact that this sort of scenario is quite common. Even small differences in goals for adopted technologies and desired risk/reward trade-offs, along with the inevitable hidden and unstated assumptions resulting from such factors, can cause vigorous debate about what technology or approach is best for a given situation.

When it comes to published explanations of how innovation works and how technologies move along the adoption curve, my favorite author by far is Clayton Christensen. IMO all developers should study and learn from his books, specifically The Innovator’s Dilemma, The Innovator’s Solution, and Seeing What’s Next. All are amazingly insightful works that will open your eyes to how real-life markets react to technological change and advancement.

In this column I try to view and classify the “RPC vs. REST” debate based on Christensen’s theories about innovation and technology adoption. I hope you find it interesting, and as always, I welcome all constructive comments.

Spot On

July 28th, 2008  |  Published in design, distributed systems, REST, RPC, SOA, WS-*  |  Bookmark on Pinboard.in

Eric Newcomer gives us his take on my recent articles (1, 2, 3, 4) and blog postings (1, 2, 3, 4) about RPC, REST, and programming languages:

Anyway, after carefully reading the article and blog entries, I believe Steve is not against RPC per se. He wants people to think before just automatically using it because it’s convenient.

Exactly!

Also spot on are the following postings:

The beauty common to all these postings is the breadth, depth, and variety of thinking and reasoning they present. There’s a lot to read, but if you’re interested in critical thinking about the design and construction of distributed systems I encourage you to read them all the way through, including the comments, and to follow the links they offer as well.

Protocol Buffers: Leaky RPC

July 13th, 2008  |  Published in distributed systems, RPC, services  |  Bookmark on Pinboard.in

Mark Pilgrim tells us why Protocol Buffers are so nice. Notice, though, that everything he writes focuses entirely on their form and structure as messages. If you focus only on that perspective, then sure, they’re better than what many could come up with if they were rolling their own. In fact, if Google had stopped there, I think Protocol Buffers could be a superb little package.

But they didn’t stop there. No, they had to include the fundamentally flawed remote procedure call. Well, sort of, anyway:

The Protocol Buffer library does not include an RPC implementation. However, it includes all of the tools you need to hook up a generated service class to any arbitrary RPC implementation of your choice. You need only provide implementations of RpcChannel and RpcController.

Why ruin a perfectly good messaging format by throwing this RPC junk into the package? What if I want to send these messages via some other means, such as message queuing, for example? Do I have to pay for this RPC code if I don’t need it? If my messages don’t include service definitions, do I avoid all that RPC machinery?

In my previous post I talked about the message tunneling problem, where data that doesn’t fit the distributed type system are forced through the system by packing them into a type such as string or sequence of octets. Since Protocol Buffers require you to “hook up a generated service class to any arbitrary RPC implementation of your choice,” it’s likely that you’re going to run into this tunneling problem. For example, if you want to send this stuff over IIOP, you’re likely going to send the marshaled protobufs as Common Data Representation (CDR) sequences of octet. You’re thus unavoidably paying for marshaling twice: once at the protobuf level for the protobuf itself, and then again at the CDR level to marshal the sequence of octet containing the protobuf. Any worthwhile IIOP/CDR implementation will be very fast at marshaling sequences of octet, but still, overhead is overhead.

But there are other problems too. What about errors? If something goes wrong with the RPC call, how do I figure that out? The answer appears to be that you call the RpcController to see if there was a failure, and if so, call it again to get a string indicating what the failure was. A string? This implies that I not only have to write code to convert exceptions or status codes from the underlying RPC implementation into strings, but also write code to convert them back again into some form of exception, assuming my RPC-calling code wants to throw exceptions to indicate problems to the code that calls it.

What about idempotency? If something goes wrong, how do I know how far the call got? Did it fail before it ever got out of my process, or off my host? Did it make it to the remote host? Did it make it into the remote process, but failed before it reached the service implementation? Or did it fail sometime after the service processed it, as the response made its way back to me? If the call I’m making is not idempotent, and I want to try it again if I hit a failure, then I absolutely need to know this sort of information. Unfortunately, Protocol Buffers supplies nothing whatsoever to help with this problem, instead apparently punting to the underlying RPC implementation.

Still more problems: the RpcController offers methods for canceling remote calls. What if the underlying RPC package doesn’t support this? Over the years I’ve seen many that don’t. Note that this capability impacts the idempotency problem as well.

Another question: what about service references? As far as I can see, the protobuf language doesn’t support such things. How can one service return a message that contains a reference to another service? I suspect the answer is, once again, data tunneling — you would encode your service reference using a form supported by the underlying RPC implementation, and then pass that back as a string or sequence of bytes. For example, if you were using CORBA underneath, you might represent the other service using a stringified object reference and return that as a string. Weak.

All in all, the Protocol Buffers service abstraction is very leaky. It doesn’t give us exceptions or any ways of dealing with failure except a human-readable string. It doesn’t give us service references, so we have no way to let one service refer to another within a protobuf message. We are thus forced to work in our code simultaneously at both the Protocol Buffers level and also at the underlying RPC implementation level if we have any hope of dealing with these very-real-world issues.

My advice to Google, then, is to just drop all the service and RPC stuff. Seriously. It causes way more problems than it’s worth, it sends people down a fundamentally flawed distributed computing path, and it takes away from what is otherwise a nice message format and structure. If Google can’t or won’t drop it, then they should either remove focus from this aspect by relegating this stuff to an appendix in the documentation, or if they choose to keep it all at the current level of focus, then they should clarify all the questions of the sort I’ve raised here, potentially modifying their APIs to address the issues.

Protocol Buffers: No Big Deal

July 11th, 2008  |  Published in distributed systems, HTTP, RPC, services  |  Bookmark on Pinboard.in

I’ve gotten a few emails asking me to blog my opinion of Google’s Protocol Buffers. Well, I guess I pretty much share Stefan’s opinion. I certainly don’t see this stuff providing anything tremendously innovative, so as with Cisco Etch, it seems to me to be just another case of NIH.

Ted Neward already wrote a pretty thorough analysis — it’s almost 0.85 Yegges in length! — so I’ll just refer you to him. There are at least two important points he made that bear repeating, though:

Which, by the way, brings up another problem, the same one that plagues CORBA, COM/DCOM, WSDL-based services, and anything that relies on a shared definition file that is used for code-generation purposes, what I often call The Myth of the One True Schema.

Indeed. Usually when one points this out, those who disagree with you come back with, “Oh, that doesn’t matter — you can just send whatever data you want as a string or as an array of bytes!” Having been forced to do just that numerous times back in my CORBA days, I know that’s not a good answer. You have a bunch of distributed infrastructure trying to enforce your One True Schema Type System, yet you go the extra mile to tunnel other types that don’t fit that schema through it all, and the extra layers can end up being especially complicated and slow.

The second point I’ll quote from Ted says it all:

Don’t lose sight of the technical advantages or disadvantages of each of those solutions just because something has the Google name on it.

Most excellent advice.

There’s one last thing I’ll quote. This one’s not from Ted, but directly from the Protocol Buffers documentation:

For example, you might implement an RpcChannel which serializes the message and sends it to a server via HTTP.

Sigh.

[Update: more opinions — and some questions — in my next post.]