Defending Something Other Than RPC
May 24th, 2008 | Published in CORBA, distributed systems, HTTP, messaging, objects, RPC | 11 Comments
Josh Haberman takes me to task for my previous posting:
Steve Vinoski has come out very vocally against RPC in the last few days…
Actually, I’ve been saying similar things for years now, Josh, not just the last few days. For example, I noted problems with RPC in my Mar/Apr 2008 IEEE Internet Computing column entitled “Demystifying RESTful Data Coupling.” I noted problems with RPC in my Sep/Oct 2005 column entitled “RPC Under Fire.” I noted problems with RPC in my Jul/Aug 2002 column entitled “Web Services Interaction Models, Part 2: Putting the ‘Web’ Into Web Services.”
His blog entry basically makes fun of Cisco for inventing/releasing another RPC system. It’s not clear exactly what he thinks they should have done instead.
I think my posting pretty clearly implies that Cisco should have avoided writing their own and instead should have reused something that already exists.
What is strange about this criticism is that tons of technology companies have developed their own RPC system — Facebook and Cisco publicly, and other technology companies I am familiar with in a not-so-public way. Guess what: large commercial distributed systems are built largely on RPC. Is he arguing that all of the engineers at these companies simultaneously got the bad idea of investing in something they don’t need? If RPC is such a bad idea, then why is everybody doing it?
Is everybody really doing it? Are large commercial distributed systems really built largely on RPC? I’ve seen some non-trivial CORBA-based deployments over the years, but in my experience large systems are built using approaches other than RPC. Like the Web, which isn’t RPC. Like email, which isn’t RPC. Like pub/sub enterprise messaging systems, which aren’t RPC.
Let’s consider the definition of what an RPC actually is. The term is often misused to mean “a synchronous call to another system over the network.” This is not what an RPC is. For example, an HTTP request is synchronous, but it is not an RPC. RPC, rather, is a specific approach for developing networked applications where local calls wrap and hide operations that happen to be carried out on another system across the network. For starters, let’s check Wikipedia:
Remote procedure call (RPC) is a technology that allows a computer program to cause a subroutine or procedure to execute in another address space (commonly on another computer on a shared network) without the programmer explicitly coding the details for this remote interaction. That is, the programmer would write essentially the same code whether the subroutine is local to the executing program, or remote.
Next, let’s check RFC 707, where RPC comes from, in which James E. White specifically proposed a procedure call model for networked applications designed to hide the network, and thereby allow developers to use familiar approaches to developing applications that happened to perform network operations. Quoting from that RFC:
Ideally, the goal of both the Protocol and its accompanying RTE is to make remote resources as easy to use as local ones. Since local resources usually take the form of resident and/or library subroutines, the possibility of modeling remote commands as “procedures” immediately suggests itself. The Model is further confirmed by the similarity that exists between local procedures and the remote commands to which the Protocol provides access. Both carry out arbitrarily complex, named operations on behalf of the requesting program (the caller); are governed by arguments supplied by the caller; and return to it results that reflect the outcome of the operation. The procedure call model thus acknowledges that, in a network environment, programs must sometimes call subroutines in machines other than their own.
and also:
“The procedure call model would elevate the task of creating applications protocols to that of defining procedures and their calling sequences. It would also provide the foundation for a true distributed programming system (DPS) that encourages and facilitates the work of the applications programmer by gracefully extending the local programming environment, via the RTE, to embrace modules on other machines.” This integration of local and network programming environments can even be carried as far as modifying compilers to provide minor variants of their normal procedure-calling constructs for addressing remote procedures (for which calls to the appropriate RTE primitives would be dropped out).
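To make the definition concrete, here is a minimal Python sketch of the idea the quotes describe: a client-side stub that makes a remote operation look like an ordinary local call. Everything in it is hypothetical and for illustration only. FakeTransport stands in for a real network connection, and JSON stands in for whatever marshaling format an actual RPC system would use.

```python
import json


class FakeTransport:
    """Stands in for a network connection; a real one could time out or fail."""

    def __init__(self, handlers):
        self.handlers = handlers

    def send(self, payload: bytes) -> bytes:
        # "Server side": unmarshal the request, run the operation, marshal the reply.
        request = json.loads(payload)
        result = self.handlers[request["op"]](*request["args"])
        return json.dumps({"result": result}).encode()


class RemoteStub:
    """Client-side stub: any method called on it becomes a remote operation."""

    def __init__(self, transport):
        self._transport = transport

    def __getattr__(self, op):
        def call(*args):
            payload = json.dumps({"op": op, "args": list(args)}).encode()
            reply = self._transport.send(payload)  # the round trip is hidden here
            return json.loads(reply)["result"]
        return call


def local_add(a, b):
    return a + b


calculator = RemoteStub(FakeTransport({"add": local_add}))

local_result = local_add(2, 3)        # ordinary local call
remote_result = calculator.add(2, 3)  # reads the same, but goes through the transport
```

The last two lines are the point: nothing at the call site tells the reader which call can fail because of the network.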
Josh continues:
Yes, on a network sh*t happens, and no sane RPC system will try to hide this from you.
As you can see from the original definition of RPC, something called an RPC that doesn’t hide the network is, by definition, not an RPC. As I said above, unfortunately the term is often misused as meaning “synchronous messaging,” and that incorrect usage seems to be what Josh is defending. Josh then says:
But then again, I don’t know of any RPC system that tries to hide this from you except possibly CORBA.
That’s not correct either. What CORBA actually does is make everything appear remote, even local objects, but does so in a way that allows object request broker (ORB) implementations to bypass much of the overhead of remote invocations when the ORB knows that a target object is local. Still, not all the overhead can be eliminated due to object lifecycle and method dispatching requirements, meaning that such local calls are typically never as fast as true local calls. DCE also treats services as always being remote, but last I checked it included no local bypass optimizations (though a variant called OODCE once did this, IIRC). But either way, what’s important with these systems is that calls within your code look just like any other calls within your code, whether they’re calling remote operations or not. And that’s RPC.
Regarding versioning problems, Josh says:
But any RPC framework worth its salt makes it possible to have different interface versions interoperate. Adding a new parameter? No problem, old servers simply won’t see it. Completely changing the semantics of your call? No problem — just give the new call a new name.
Yes, Josh, there are generally ways to do versioning in such systems, but they’re not very good. CORBA includes some facilities to help with versioning, but in practice they don’t actually help that much. Both COM and CORBA promoted interface inheritance and runtime interface negotiation (called “narrowing” in CORBA) as a way to do versioning, which works, but only for a restricted set of changes. Add a parameter to an existing call? Sorry, no can do, unless your marshaling format carries complete information for the entire call including parameter names, types, and positions, and also versions each parameter, all of which systems like CORBA, DCOM, and DCE specifically do not do due to the large overhead it adds whether a given application uses it or not, and in CORBA’s case also because of the interference it can cause for local dispatching optimizations. All in all, versioning is hard, not only for RPC, but for distributed systems in general.
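The trade-off in the "add a parameter" case can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how CORBA, DCOM, or DCE actually encode calls: the transfer operation, its fields, and both wire layouts are made up. The first format is purely positional, as compact binary encodings tend to be; the second carries parameter names on every call.

```python
import json
import struct

# Version 1 of a hypothetical operation: transfer(account_id, amount)
V1_FORMAT = "!ii"   # two 32-bit ints, network byte order, no names or tags
# Version 2 adds a parameter: transfer(account_id, amount, currency_code)
V2_FORMAT = "!iii"


def marshal_v2(account_id, amount, currency_code):
    return struct.pack(V2_FORMAT, account_id, amount, currency_code)


def unmarshal_v1(payload):
    # An old server knows only the V1 layout and expects exactly 8 bytes.
    return struct.unpack(V1_FORMAT, payload)


request = marshal_v2(42, 100, 840)

try:
    unmarshal_v1(request)
    old_server_accepted = True
except struct.error:
    # 12 bytes arrived where 8 were expected, and nothing in the
    # payload says which bytes belong to the new parameter.
    old_server_accepted = False


def unmarshal_v1_named(payload):
    # With names on the wire, an old server can simply ignore unknown fields.
    fields = json.loads(payload)
    return fields["account_id"], fields["amount"]


named_request = json.dumps(
    {"account_id": 42, "amount": 100, "currency_code": 840}
).encode()
named_result = unmarshal_v1_named(named_request)
```

The named form tolerates the new parameter, but it is several times larger than the 12-byte positional request, and every call pays that cost whether or not it ever needs versioning.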
Middleware and distributed systems veterans are well aware of the arguments like the ones I’ve made in my blog and other places recently and in various publications over the years; such arguments are generally common knowledge among us, and have been for years.
Cisco’s system is not available yet, but when it comes out, I’m quite certain you’ll find, Josh, that it’s the same old thing, just repackaged in a new box.
May 24th, 2008 at 7:20 pm (#)
Steve,
I would like nothing more than for this disagreement to boil down to nothing more than a terminology mismatch. I am more than happy to accept public scorn and humiliation if indeed I’m just misusing terms. I don’t really know or care what the most widely-accepted definition of RPC is, so let’s set that aside. What I’m really talking about is systems for doing request/reply over a network with structured data, so I’ll just use that term (“request/reply”) instead of RPC.
We seem to agree that trying to hide the fact that remote invocations are different than local ones is a bad idea, so we can set that aside as well.
I also didn’t mean to imply that this was a new position for you — just something that I’d seen you express multiple times within the last few days.
All that said, it’s still not clear to me what you think Cisco should have done instead. You say they “should have reused something that already exists.” Ok: name for me a viable open-source request/reply stack they should have used instead of inventing their own.
“They should use REST” is not an answer. REST is not software or even a specific protocol/format, it’s an architectural pattern. I’m asking very specifically for an example of actual software they should use instead to do programmatic request/reply over a network instead of writing their own.
I think we both agree that SOAP, CORBA, and other widely available solutions leave much to be desired. So what’s out there that serves this need?
May 24th, 2008 at 8:19 pm (#)
@Josh: well, I do care about the terminology, because when I say “RPC” I really mean RPC, as in the definitions I pointed to. It’s important because what I’ve written depends precisely on that definition, and changing the definition changes the whole argument. Now that I’ve made that point, though, sure, we can put it aside for now.
BTW, why does Cisco have to use something that’s open source? That’s one assumption you’re making that might not be true. Another assumption you seem to be making based on the comments you put on your own blog entry, as well as your comments above, is that they need something other than RESTful HTTP services, which I’d be willing to bet is also not true.
A number of alternatives have already been mentioned in the comments of my previous posting; feel free to check them out. Thrift would be an option. ICE would be an option, assuming the open source restriction doesn’t actually exist for this case. You seem to dislike CORBA for some reason — have you ever actually used it? — but I’m sure it would work quite well here, and there are several open source versions of it available.
Even if Cisco took an honest look around with eyes open and still couldn’t find just what they needed, I find it hard to believe they’d need to invent their own IDL, their own protocol, and the whole nine yards. They could have reused an existing IDL, they could have adopted an existing protocol. By not doing so, they’re just adding a new set of stuff to be integrated, as I already explained. That’s expensive.
And as if that weren’t enough, Cisco thinks the world needs this new thing so much that they feel the need to standardize it? Give me a break. Even if we actually needed yet another standard in this space, their system isn’t even available yet, but they feel it’s already ready to be standardized? Standardization is best done when something first becomes a de facto standard because it’s proven itself in a variety of scenarios for a number of independent parties. Etch isn’t even close to that and won’t be for at least a couple years, even if it’s the best stuff ever invented.
May 25th, 2008 at 1:04 am (#)
Steve,
Sure, Cisco doesn’t have to use something that’s open-source. But if the alternative that you’re proposing is not open-source (or doesn’t have open specifications), I think the argument that they should have reused instead of writing their own gets a lot weaker. What good to the world at large is a proprietary request/reply system? What harm has Cisco brought to the world by developing one that they plan to open-source?
As I mentioned, “RESTful HTTP services” is not an example of software, it’s a pattern. It’s comparing apples to oranges. It’s impossible to make meaningful comparisons between Etch and REST because they’re different sorts of things.
I’m very surprised to hear you recommend ICE. How can you support ICE but criticize Etch? These two projects are doing exactly the same thing. In this very comment you say “I find it hard to believe [Cisco would] need to invent their own IDL, their own protocol, and the whole nine yards.” But this is exactly what ICE has done! Why is it OK for Michi Henning to write ICE but blame-worthy for Cisco to do the same? Especially since Cisco plans to release under a more liberal license than GPL, which will promote wider adoption and therefore better interoperability than something like ICE.
Furthermore, ICE does things that you explicitly argue against in this mailing list post, like using an IDL.
As for Thrift, it was released just over a year ago, and I’d bet dollars to donuts that if Cisco is talking about open-sourcing Etch now, they’ve definitely been working on it for longer than a year. It will be interesting to compare these two and see which one is more compelling.
I am also surprised to hear you suggest CORBA. I know you wrote the book, but given your position against RPC, CORBA seems to be exactly what you are arguing against. And in the mailing list post, you seem to lump CORBA in with all the things you criticize about RPC. I don’t quite understand your position here: if you’re not against CORBA and you’re not against ICE, what sort of RPC are you against?
My beef with CORBA is that it is way, way too complicated and heavyweight for what most applications need. I was briefly at a company that used it, and while I didn’t get too deep into the CORBA code, the little I saw of it strongly reinforced this belief. Complexity in systems you depend on is a major liability, because if something goes wrong (even if it’s your own fault) you have to dig into the complex mess to figure out the problem. There’s no such thing as a black box when it comes to systems you depend on.
Maybe it’s premature for Cisco to talk about standardizing Etch, but IMO we desperately need a decent standard in this space (de facto or de jure). The open-source world has nothing reasonable to turn to in this space, though ICE looks pretty awesome and would be a great contender if it were more permissively licensed.
May 25th, 2008 at 10:29 am (#)
@Josh: let’s go through your questions:
The better question is why are they wasting their money developing their own? As for harm, introducing yet another package means dilution, and dilution means no clear winner, and when there’s no clear winner, that means different users end up using different systems that can later result in the need for integration when two different packages meet due to merger, acquisition, enterprise reorganization, need to integrate with purchased apps, etc.
No, it’s not a pattern, nor is the comparison meaningless. I specifically said “RESTful HTTP” because that makes it concrete. REST is a style, HTTP is an implementation. The beauty of RESTful HTTP is that you don’t need any big framework underneath like you do with RPC systems. I’ve architected, designed, and built a number of RPC-oriented systems over the years, starting in the 80s. I chose to leave my previous job as the company’s long-time chief architect because I couldn’t get them to go down the REST route — that’s how strongly I believe in it, and how sure I am of its effectiveness. Not surprisingly, my work in my new role has proven that choice to be correct.
Given your arguments, my guess is that you’ve never written a real RESTful HTTP application. You should do so, and with an open mind. I might be mistaken, but I think I saw in the conversation in your blog entry that you have four years of professional experience? If so, you’re much too young to already be set in your ways like most REST critics are. If you want the advice of someone who’s been doing this for about a quarter of a century, you should be exploring and learning all that you can at this stage of your career, with your mind open and receptive to new ideas and new ways of thinking, because most people find that such learning only gets more and more difficult as time passes. Shutting yourself off from viable approaches at this point will only noticeably restrict your options in the future.
I know quite a few who have left RPC-oriented systems behind and gone over to REST. I know of none who willingly went back.
Because ICE has already been around for many years, and it was designed by very capable people known to the industry and known to have learned and learned well from the dark corners of previous attempts. For all I know, the Cisco stuff was designed and built by people with little to no industry experience. The fact that they felt the need to invent their own could very well indicate that they don’t care to learn from previous attempts, which means Etch might contain many classic mistakes.
Umm, Josh, you asked me what RPC packages Cisco could have used instead (as did Klacke in the previous posting), so I answered that question. Like I keep saying, if you really have to use one, then ICE is a good one to use. But as all my writings and even the paragraph above make very clear, I’d recommend RESTful HTTP over RPC any day. There are no inconsistencies or contradictions here.
See above — you asked what RPC package Cisco could have used instead, so that’s what I answered.
Well, as you might imagine, I’ve been around quite a bit of CORBA code, and the average CORBA application is much less complicated than all the rumors would have you believe. As I always say, don’t criticize it until you’ve actually used it yourself to write an actual system.
Like I said in my original posting, those who don’t know history are doomed to repeat it. Perhaps there’s no standard in this space because, as all the critique of the approach that I and numerous others have written over the years explains, it’s ultimately A Bad Idea™ and there are better approaches available. But if you insist on one, perhaps you should direct your energies toward the ICE guys to get them to provide more favorable licenses, or even open up their sources.
May 25th, 2008 at 1:04 pm (#)
Steve,
As far as the Cisco thing goes, my opinion is that if they’ve done a good job and they’re willing to be liberally licensed, kudos to them. I think this design space will continue to be fragmented until there is a solid, liberally-licensed piece of software in this area. Imagine if something like TCP/IP only had a GPL’d implementation with a company that wanted a cut on every commercial application. Would you criticize a company that introduced a liberally-licensed alternative? It is liberal licensing of a solid product that encourages everyone to converge. A GPL’d implementation of a solid product isn’t the same.
RESTful HTTP is not software. I can’t download, compile, and develop with it. A printout of RFC2616 cannot serve my traffic. I can’t run performance tests on it. I can’t analyze how much work (in practical terms) it takes to accomplish my use cases. And I can’t reason about how well RESTful HTTP promotes interoperability — one of the most important criteria (IMO) of any communications system.
You could mention an implementation of HTTP like Apache, and we could have a meaningful conversation about that. But even Apache alone isn’t enough, because Apache will only handle the HTTP part and I’m still left deciding what my on-the-wire format will be. Even if I choose an on-the-wire format like JSON (and choose an implementation of JSON) I still need more code to actually dispatch incoming requests to my request handlers. And if you’re making the traditional “caches, proxies, and tunnels, oh my!” arguments that REST advocates make, you’ll need yet more software to actually give you these things.
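As a rough illustration of the pieces listed above (HTTP handling, a JSON wire format, and dispatching code), here is a minimal sketch using only Python's standard library rather than Apache. The /orders resource, the ORDERS store, and the port are all hypothetical, and caching, proxies, and anything beyond GET are left out.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory resource store, for illustration only.
ORDERS = {"1": {"item": "router", "qty": 2}}


def get_order(order_id):
    order = ORDERS.get(order_id)
    return (200, order) if order is not None else (404, {"error": "not found"})


def dispatch(method, path):
    """Map an HTTP method and path onto a handler; returns (status, body)."""
    parts = [p for p in path.split("/") if p]
    if method == "GET" and len(parts) == 2 and parts[0] == "orders":
        return get_order(parts[1])
    return 404, {"error": "not found"}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = dispatch("GET", self.path)
        data = json.dumps(body).encode()  # JSON as the on-the-wire format
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def main():
    # Serves until interrupted; try GET http://localhost:8000/orders/1
    HTTPServer(("localhost", 8000), Handler).serve_forever()
```

Calling main() starts the server. The dispatch function is kept separate from the handler class so the routing can be exercised without opening a socket.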
All I’m saying is that to make concrete evaluations of the technical merits of X vs Y, you have to actually have concrete implementations of X and Y. In this case:
X = RESTful HTTP (Apache + JSON + dispatching code + Squid + ?)
Y = ICE
(I use ICE here because we both have respect for it as a reasonable implementation of request/reply (I’ve only looked at it for about 20 minutes but it seems very reasonable), and it’s open-source which means we have enough information to evaluate it concretely).
ps. you indicate in your last message that you consider ICE an RPC system, and yet it does not do one of your most criticized aspects of RPC — hiding network failures. For example, ICE appears to surface both timeouts and marshalling/demarshalling errors to the user. As a result, I’m still confused about your definition of RPC.
May 25th, 2008 at 2:05 pm (#)
@Josh: yes, that’s one of the wonderful features of RESTful HTTP: I don’t have to name any particular implementation, because you can use whatever you want. I didn’t know this was suddenly turning into some sort of bake-off. Use Apache if you like; I’ll use Erlang and Yaws. They’ll interoperate, as the web clearly proves. You have a huge number of implementations to choose from, and finding one that best fits your problem is not at all difficult.
But it’s clear that you don’t like REST, so don’t use it! I honestly couldn’t care less. It’s your loss, and it’s unfortunate that you make choices like that without being fully informed. Unlike you, I’ve built multiple significant systems with both approaches, and I’m quite comfortable with my choice, thanks. I’m not going to waste further time arguing with you about it.
As for your final comment, I think I’ve provided very clear definitions of RPC. I think you just like to argue, and I know baiting when I see it. I always encourage readers to post comments here that help everyone reading, myself included, to learn, but non-constructive comments will be denied.
May 25th, 2008 at 3:18 pm (#)
Steve,
It can be hard to tell the difference between good-faith and bad-faith discussion over electronic mediums (especially when you don’t know the person already), so I can’t fault you for thinking that I’m just being antagonistic. Really I’m trying to get to the bottom of this, but I think we’ve gotten to the point where we’re talking past each other. Since you don’t wish to discuss this further here, I’m happy to close this discussion out and make further arguments on my blog.
Sincerely,
Josh
May 25th, 2008 at 3:43 pm (#)
@Josh: get to the bottom of what? IMO we’ve already gone through the bottom, and now it just seems that you’re trying to argue with me on new topics just for the sake of arguing. But if you say that’s not true, I’ll give you the benefit of the doubt and try to address your question.
Sure, ICE, like CORBA and other systems, allows access to network-related issues. CORBA does this through exceptions that can come back to the caller from the server side or from the ORB on either side; in a language that supports true exceptions, the application will have to catch them and deal with them. Does that mean these systems are not RPC? I think it’s a gray area that can be argued either way. If you look at the code, it looks just like any local code, so from that point of view, it’s RPC. However, since it’s not necessarily trying to hide network effects and is making the programmer deal with them, then from that point of view, these systems are better in that regard than traditional RPC systems.
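The gray area can be sketched in Python. This is not ICE or CORBA code: AccountProxy and RemoteTimeout are hypothetical names, and the "network" is simulated with a flag. The call site looks like any local method call, yet the caller still has to handle a failure that only exists because the call is remote.

```python
class RemoteTimeout(Exception):
    """Hypothetical exception a middleware runtime might raise on a timeout."""


class AccountProxy:
    """Hypothetical client-side proxy; the transport is simulated with a flag."""

    def __init__(self, reachable=True):
        self._reachable = reachable

    def balance(self, account_id):
        if not self._reachable:
            raise RemoteTimeout(f"no reply for account {account_id}")
        return 100


def show_balance(proxy, account_id):
    try:
        # Syntactically an ordinary method call...
        return f"balance: {proxy.balance(account_id)}"
    except RemoteTimeout as exc:
        # ...but the caller still has to plan for the network failing.
        return f"unavailable: {exc}"


ok = show_balance(AccountProxy(reachable=True), 7)
failed = show_balance(AccountProxy(reachable=False), 7)
```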
HOWEVER, and this is a big however, whether or not these systems conform to some strict definition of RPC is not really the important issue; rather, the main problem is that the RPC-oriented approach in general brings with it a number of problems that go well beyond this. And I’ve already written about those issues in my postings and publications — issues like coupling, which is huge, impedance mismatches between IDLs and language mappings, versioning, reuse, etc. In answering your question, I’m therefore starting to repeat myself, which is why I said above that we’ve already gotten to the bottom of this.
May 26th, 2008 at 10:55 am (#)
“I’m very surprised to hear you recommend ICE. How can you support ICE but criticize Etch? These two projects are doing exactly the same thing.”
In my opinion, this is also an excellent reason why they shouldn’t have done Etch: if ICE is exactly the same thing, done by industry veterans well-versed in the previously existing RPC systems, then they shouldn’t have made their own and should have used ICE instead.
After all, it is exactly the same!
June 1st, 2008 at 11:19 am (#)
I think we should start way back: this new thing is for the Cisco Unified Application Environment.
How many people here have built an application on it, or even know when to select the Cisco Unified Application Environment over the Cisco Unified Customer Respond Solution?
Does anybody know of any other products at Cisco that are making use of “Etch”? I know there are many products making use of SOAP.
Just my 2 great British pennies