Convenience Over Correctness

July 1st, 2008  |  Published in column, integration, messaging, REST, RPC

My latest Internet Computing column, “Convenience Over Correctness,” (PDF) is now available. It continues the exploration of problems with RPC-oriented distributed programming approaches that I’ve been writing about in each of my three prior columns this year, as well as in columns from years gone by and in the erlang-questions mailing list.

For years we’ve known RPC and its descendants to be fundamentally flawed, yet many still willingly use the approach. Why? I believe the reason is simply convenience. Regardless of RPC’s well-understood problems, many developers continue to go down the RPC-oriented path because it conveniently fits the abstractions of the popular general-purpose programming languages they limit themselves to using. Making a function or method call to a remote or distributed function, object, or service appear just like any other function or method call allows such developers to stay within the comfortable confines of their language. Those who choose this approach essentially decide that developer convenience and comfort are more important than dealing with hard distribution issues like latency, concurrency, reliability, scalability, and partial failure.

Is this convenience for the developer the right thing to focus on? I really, really don’t think it is. There are ways of developing robust distributed applications that don’t require code-generation toolkits, piles of special code annotations, or brittle enterprisey frameworks. Perhaps the wonderful programming language renaissance we’re currently experiencing will help us to finally see the light and put tired old broken abstractions like RPC permanently out to pasture.

Defending Something Other Than RPC

May 24th, 2008  |  Published in CORBA, distributed systems, HTTP, messaging, objects, RPC

Josh Haberman takes me to task for my previous posting:

Steve Vinoski has come out very vocally against RPC in the last few days…

Actually, I’ve been saying similar things for years now, Josh, not just the last few days. For example, I noted problems with RPC in my Mar/Apr 2008 IEEE Internet Computing column entitled “Demystifying RESTful Data Coupling.” I noted problems with RPC in my Sep/Oct 2005 column entitled “RPC Under Fire.” I noted problems with RPC in my Jul/Aug 2002 column entitled “Web Services Interaction Models, Part 2: Putting the ‘Web’ Into Web Services.”

His blog entry basically makes fun of Cisco for inventing/releasing another RPC system. It’s not clear exactly what he thinks they should have done instead.

I think my posting pretty clearly implies that Cisco should have avoided writing their own and instead should have reused something that already exists.

What is strange about this criticism is that tons of technology companies have developed their own RPC system — Facebook and Cisco publicly, and other technology companies I am familiar with in a not-so-public way. Guess what: large commercial distributed systems are built largely on RPC. Is he arguing that all of the engineers at these companies simultaneously got the bad idea of investing in something they don’t need? If RPC is such a bad idea, then why is everybody doing it?

Is everybody really doing it? Are large commercial distributed systems really built largely on RPC? I’ve seen some non-trivial CORBA-based deployments over the years, but in my experience large systems are built using approaches other than RPC. Like the Web, which isn’t RPC. Like email, which isn’t RPC. Like pub/sub enterprise messaging systems, which aren’t RPC.

Let’s consider the definition of what an RPC actually is. The term is often misused to mean “a synchronous call to another system over the network.” This is not what an RPC is. For example, an HTTP request is synchronous, but it is not an RPC. RPC, rather, is a specific approach for developing networked applications where local calls wrap and hide operations that happen to be carried out on another system across the network. For starters, let’s check Wikipedia:

Remote procedure call (RPC) is a technology that allows a computer program to cause a subroutine or procedure to execute in another address space (commonly on another computer on a shared network) without the programmer explicitly coding the details for this remote interaction. That is, the programmer would write essentially the same code whether the subroutine is local to the executing program, or remote.

Next, let’s check RFC 707, where RPC comes from, in which James E. White specifically proposed a procedure call model for networked applications designed to hide the network, and thereby allow developers to use familiar approaches to developing applications that happened to perform network operations. Quoting from that RFC:

Ideally, the goal of both the Protocol and its accompanying RTE is to make remote resources as easy to use as local ones. Since local resources usually take the form of resident and/or library subroutines, the possibility of modeling remote commands as “procedures” immediately suggests itself. The Model is further confirmed by the similarity that exists between local procedures and the remote commands to which the Protocol provides access. Both carry out arbitrarily complex, named operations on behalf of the requesting program (the caller); are governed by arguments supplied by the caller; and return to it results that reflect the outcome of the operation. The procedure call model thus acknowledges that, in a network environment, programs must sometimes call subroutines in machines other than their own.

and also:

The procedure call model would elevate the task of creating applications protocols to that of defining procedures and their calling sequences. It would also provide the foundation for a true distributed programming system (DPS) that encourages and facilitates the work of the applications programmer by gracefully extending the local programming environment, via the RTE, to embrace modules on other machines. This integration of local and network programming environments can even be carried as far as modifying compilers to provide minor variants of their normal procedure-calling constructs for addressing remote procedures (for which calls to the appropriate RTE primitives would be dropped out).

Josh continues:

Yes, on a network sh*t happens, and no sane RPC system will try to hide this from you.

As you can see from the original definition of RPC, something called an RPC that doesn’t hide the network is, by definition, not an RPC. As I said above, unfortunately the term is often misused as meaning “synchronous messaging,” and that incorrect usage seems to be what Josh is defending. Josh then says:

But then again, I don’t know of any RPC system that tries to hide this from you except possibly CORBA.

That’s not correct either. What CORBA actually does is make everything appear remote, even local objects, but does so in a way that allows object request broker (ORB) implementations to bypass much of the overhead of remote invocations when the ORB knows that a target object is local. Still, not all the overhead can be eliminated due to object lifecycle and method dispatching requirements, meaning that such local calls are typically not as fast as true local calls. DCE also treats services as always being remote, but last I checked it included no local bypass optimizations (though a variant called OODCE once did this, IIRC). But either way, what’s important with these systems is that calls within your code look just like any other calls within your code, whether they’re calling remote operations or not. And that’s RPC.
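To make "calls look just like any other calls" concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: FakeTransport stands in for a real network link, and Calculator, CalculatorStub, and the JSON message shape are not taken from CORBA, DCE, or any other real system. The point is only that total() cannot tell which kind of object it was handed.

```python
import json


class FakeTransport:
    """Hypothetical stand-in for a network link; records every payload sent."""

    def __init__(self, handler):
        self.handler = handler
        self.sent = []

    def send(self, payload: bytes) -> bytes:
        self.sent.append(payload)
        return self.handler(payload)


def server(payload: bytes) -> bytes:
    """Hypothetical remote side: decodes a request and encodes a reply."""
    request = json.loads(payload)
    if request["op"] == "add":
        return json.dumps({"result": sum(request["args"])}).encode()
    return json.dumps({"error": "unknown op"}).encode()


class Calculator:
    """A plain local object."""

    def add(self, a, b):
        return a + b


class CalculatorStub:
    """RPC-style proxy: same signature as Calculator, with the network hidden."""

    def __init__(self, transport):
        self._transport = transport

    def add(self, a, b):
        message = json.dumps({"op": "add", "args": [a, b]}).encode()
        reply = self._transport.send(message)
        return json.loads(reply)["result"]


def total(calc):
    # The call site is identical for the local object and the remote stub.
    return calc.add(2, 3)
```

Nothing at the call site hints that one of those calls crosses a network, so nothing there prompts the caller to think about latency or partial failure. A real transport could time out or drop the reply, and this stub has no place to express that.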

Regarding versioning problems, Josh says:

But any RPC framework worth its salt makes it possible to have different interface versions interoperate. Adding a new parameter? No problem, old servers simply won’t see it. Completely changing the semantics of your call? No problem — just give the new call a new name.

Yes, Josh, there are generally ways to do versioning in such systems, but they’re not very good. CORBA includes some facilities to help with versioning, but in practice they don’t actually help that much. Both COM and CORBA promoted interface inheritance and runtime interface negotiation (called “narrowing” in CORBA) as a way to do versioning, which works, but only for a restricted set of changes. Add a parameter to an existing call? Sorry, no can do, unless your marshaling format carries complete information for the entire call, including parameter names, types, and positions, and also versions each parameter. Systems like CORBA, DCOM, and DCE specifically do not do any of this because of the large overhead it adds whether a given application uses it or not, and in CORBA’s case also because of the interference it can cause for local dispatching optimizations. All in all, versioning is hard, not only for RPC, but for distributed systems in general.
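Here is a rough sketch of that trade-off in Python. The function names, the transfer-style parameters, and both wire formats are made up for illustration; this is not the CDR encoding CORBA uses or the format of any other real system. The self-describing variant is also simplified: it carries only parameter names, not types or per-parameter versions.

```python
import json
import struct


def marshal_positional_v1(account_id, amount):
    # Compact positional format: two unsigned 32-bit integers, nothing else.
    return struct.pack("!II", account_id, amount)


def unmarshal_positional_v1(data):
    # The v1 server expects exactly the v1 layout.
    account_id, amount = struct.unpack("!II", data)
    return {"account_id": account_id, "amount": amount}


def marshal_positional_v2(account_id, amount, currency):
    # v2 adds a parameter, which changes the layout on the wire.
    return struct.pack("!III", account_id, amount, currency)


def marshal_named(**params):
    # Self-describing format: every message carries its parameter names.
    return json.dumps(params).encode()


def unmarshal_named_v1(data):
    # The v1 server picks out the names it knows and ignores the rest.
    params = json.loads(data)
    return {"account_id": params["account_id"], "amount": params["amount"]}


def old_server_accepts(unmarshal, data):
    """Return True if a v1 server can decode the given message."""
    try:
        unmarshal(data)
        return True
    except (struct.error, KeyError, ValueError):
        return False
```

In this sketch, a v1 server rejects the v2 positional message outright, while the named format survives the extra parameter. The price is that every named message, including those from applications that never change their interfaces, is several times larger than its positional equivalent.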

Middleware and distributed systems veterans are well aware of arguments like the ones I’ve made recently in my blog and elsewhere, and in various publications over the years; such arguments are generally common knowledge among us, and have been for years.

Cisco’s system is not available yet, but when it comes out, I’m quite certain you’ll find, Josh, that it’s the same old thing, just repackaged in a new box.

Wiger on Erlang-style Concurrency

February 7th, 2008  |  Published in concurrency, erlang, HTTP, messaging

Since a number of people seem to be experimenting with adding Erlang-style concurrency to other languages, Ulf Wiger has written a nice explanation of what Erlang-style concurrency actually is. Definitely informative.

On a related note, I chuckled when I saw this posting from Robert Virding, who helped create Erlang, in the erlang-questions list about a month ago:

After reading the blogs about how good Erlang’s concurrency model is and how we just made a super implementation of it in XXX I have been led to formulate Virding’s First Rule of Programming:

Any sufficiently complicated concurrent program in another language contains an ad hoc informally-specified bug-ridden slow implementation of half of Erlang.

This is, of course, a mild travesty of Greenspun but I think it is fundamental enough to be my first rule, not the tenth.

I can understand where he’s coming from. When I see the kind of blog postings he’s referring to, I always wonder why they don’t just use Erlang itself instead of trying to reinvent it in another system whose design trade-offs are unlikely to be able to support it. Well, I guess it would make sense in the case where they’re specifically studying various implementations of concurrency, but other than that, I don’t get it.

On another related note, I found the comment below in another blog. It refers specifically to my own use of Erlang:

it’s [sic] be interesting to see how Steve mix [sic] Erlang and HTTP, since Erlang is asynchronous in nature and does not have any “resource” concept and HTTP’s sweet spots are just the opposite

That’s easy. If you want to experience HTTP as implemented in Erlang, just go to the Yaws website, which of course is powered by Yaws itself. As Ulf’s blog posting explains, Erlang message passing is asynchronous, but that really has nothing to do with Erlang’s ability to support protocols like HTTP. For example, when I was learning Erlang, given my CORBA background I wrote a subset of IIOP, and it was really quite easy. Furthermore, Erlang doesn’t need a “resource” concept to support HTTP. A resource in that context is, after all, ultimately just a chunk of code that processes incoming requests, and of course Erlang can do that.
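To illustrate how a request/response exchange can sit on top of asynchronous message passing, here is a small sketch. It is written in Python rather than Erlang, with threads and queues as crude stand-ins for Erlang processes and mailboxes. The names (resource_process, request, demo) and the message tuples are hypothetical, and none of this reflects how Yaws is actually implemented.

```python
import queue
import threading


def resource_process(mailbox, handlers):
    """Receive loop: take a request message, send the reply to the sender's mailbox."""
    while True:
        message = mailbox.get()
        if message is None:  # shutdown signal for this sketch
            break
        reply_to, method, path = message
        handler = handlers.get((method, path))
        if handler is None:
            reply_to.put((404, "not found"))
        else:
            reply_to.put((200, handler()))


def request(server_mailbox, method, path, timeout=1.0):
    """Send a request asynchronously, then wait for the matching reply."""
    reply_box = queue.Queue()
    server_mailbox.put((reply_box, method, path))  # the send returns immediately
    return reply_box.get(timeout=timeout)  # the caller chooses to wait here


def demo():
    mailbox = queue.Queue()
    # A "resource" is just a chunk of code that processes incoming requests.
    handlers = {("GET", "/hello"): lambda: "hello, world"}
    worker = threading.Thread(target=resource_process, args=(mailbox, handlers))
    worker.start()
    try:
        return (
            request(mailbox, "GET", "/hello"),
            request(mailbox, "GET", "/missing"),
        )
    finally:
        mailbox.put(None)
        worker.join()
```

The send itself never blocks. The request/response behavior comes from the caller deciding to wait on its own reply mailbox, with a timeout it controls.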

If you want to know more, go pick up Joe Armstrong’s Erlang book — it’s quite excellent.