Protocol Buffers: Leaky RPC
July 13th, 2008 | Published in distributed systems, RPC, services | 72 Comments
Mark Pilgrim tells us why Protocol Buffers are so nice. Notice, though, that everything he writes focuses entirely on their form and structure as messages. If you focus only on that perspective, then sure, they’re better than what many could come up with if they were rolling their own. In fact, if Google had stopped there, I think Protocol Buffers could be a superb little package.
But they didn’t stop there. No, they had to include the fundamentally flawed remote procedure call. Well, sort of, anyway:
The Protocol Buffer library does not include an RPC implementation. However, it includes all of the tools you need to hook up a generated service class to any arbitrary RPC implementation of your choice. You need only provide implementations of RpcChannel and RpcController.
Why ruin a perfectly good messaging format by throwing this RPC junk into the package? What if I want to send these messages via some other means, such as message queuing, for example? Do I have to pay for this RPC code if I don’t need it? If my messages don’t include service definitions, do I avoid all that RPC machinery?
In my previous post I talked about the message tunneling problem, where data that doesn’t fit the distributed type system are forced through the system by packing them into a type such as string or sequence of octets. Since Protocol Buffers require you to “hook up a generated service class to any arbitrary RPC implementation of your choice,” it’s likely that you’re going to run into this tunneling problem. For example, if you want to send this stuff over IIOP, you’re likely going to send the marshaled protobufs as Common Data Representation (CDR) sequences of octet. You’re thus unavoidably paying for marshaling twice: once at the protobuf level for the protobuf itself, and then again at the CDR level to marshal the sequence of octet containing the protobuf. Any worthwhile IIOP/CDR implementation will be very fast at marshaling sequences of octet, but still, overhead is overhead.
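To make the double marshaling concrete, here is a minimal Python sketch. It assumes a payload that has already been serialized at the protobuf level (the three bytes below are a hand-written encoding of field 1 with the varint value 150), and it imitates only the general shape of a CDR sequence of octet, a length followed by the raw bytes. The function names are hypothetical, and this is not a real IIOP encoder.

```python
import struct


def wrap_as_octet_sequence(payload: bytes) -> bytes:
    """Second marshaling step: 4-byte big-endian length, then the raw octets.

    Simplified on purpose; real CDR also deals with byte-order flags and alignment.
    """
    return struct.pack(">I", len(payload)) + payload


def unwrap_octet_sequence(data: bytes) -> bytes:
    """Reverse the outer step; the inner protobuf bytes still need their own parse."""
    (length,) = struct.unpack(">I", data[:4])
    payload = data[4:4 + length]
    if len(payload) != length:
        raise ValueError("truncated octet sequence")
    return payload


# Assumed result of the first marshaling step (protobuf level), written by hand.
protobuf_bytes = bytes([0x08, 0x96, 0x01])

# What the underlying RPC layer would carry: the same bytes, marshaled again.
on_the_wire = wrap_as_octet_sequence(protobuf_bytes)
```

The outer layer sees only opaque bytes, so the receiver has to run both unmarshaling steps before it gets back to a typed message.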
But there are other problems too. What about errors? If something goes wrong with the RPC call, how do I figure that out? The answer appears to be that you call the RpcController to see if there was a failure, and if so, call it again to get a string indicating what the failure was. A string? This implies that I not only have to write code to convert exceptions or status codes from the underlying RPC implementation into strings, but also write code to convert them back again into some form of exception, assuming my RPC-calling code wants to throw exceptions to indicate problems to the code that calls it.
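A rough Python sketch of that round trip follows. The controller class is a hypothetical stand-in that exposes only a failure flag and a string, and the "PREFIX: detail" convention is my own assumption, since nothing in a plain string tells the caller which exception to rebuild.

```python
class SketchController:
    """Hypothetical stand-in for a controller: failure is a flag plus a string."""

    def __init__(self) -> None:
        self._error = None

    def set_failed(self, reason: str) -> None:
        self._error = reason

    def failed(self) -> bool:
        return self._error is not None

    def error_text(self) -> str:
        return self._error or ""


class RpcError(Exception):
    """Fallback when the string cannot be mapped to anything more specific."""


class RpcTimeout(RpcError):
    pass


class RpcUnavailable(RpcError):
    pass


# Assumed convention shared by both sides: "PREFIX: detail".
_BY_PREFIX = {"TIMEOUT": RpcTimeout, "UNAVAILABLE": RpcUnavailable}


def exception_to_text(exc: Exception) -> str:
    """Transport side: flatten an exception into the only thing we can carry."""
    for prefix, cls in _BY_PREFIX.items():
        if isinstance(exc, cls):
            return f"{prefix}: {exc}"
    return f"UNKNOWN: {exc}"


def raise_if_failed(controller: SketchController) -> None:
    """Caller side: turn the string back into an exception, as best we can."""
    if not controller.failed():
        return
    text = controller.error_text()
    prefix, sep, detail = text.partition(": ")
    if not sep:  # no recognizable prefix, so keep the whole text as the detail
        prefix, detail = "", text
    raise _BY_PREFIX.get(prefix, RpcError)(detail)
```

Both halves are code the application author ends up writing and keeping in sync by hand.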
What about idempotency? If something goes wrong, how do I know how far the call got? Did it fail before it ever got out of my process, or off my host? Did it make it to the remote host? Did it make it into the remote process, but failed before it reached the service implementation? Or did it fail sometime after the service processed it, as the response made its way back to me? If the call I’m making is not idempotent, and I want to try it again if I hit a failure, then I absolutely need to know this sort of information. Unfortunately, Protocol Buffers supplies nothing whatsoever to help with this problem, instead apparently punting to the underlying RPC implementation.
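The information a caller would need can be sketched in a few lines of Python. The stage names below are my own labels for the questions in the paragraph above and are not part of any Protocol Buffers API; the point is only that a retry decision for a non-idempotent call depends on knowing the stage.

```python
from enum import Enum


class FailureStage(Enum):
    """Hypothetical labels for how far a failed call got."""

    BEFORE_SEND = 1       # never left the calling process or host
    IN_TRANSIT = 2        # left the client; unknown whether the server saw it
    BEFORE_DISPATCH = 3   # reached the remote process, but not the service code
    AFTER_EXECUTION = 4   # the service ran; the response was lost on the way back


def safe_to_retry(stage: FailureStage, idempotent: bool) -> bool:
    """Retry freely if the call is idempotent; otherwise only when we know
    the service implementation never ran."""
    if idempotent:
        return True
    return stage in (FailureStage.BEFORE_SEND, FailureStage.BEFORE_DISPATCH)
```

With only a failure flag and a string, the caller cannot tell IN_TRANSIT from BEFORE_SEND, so the conservative answer for a non-idempotent call is never to retry.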
Still more problems: the RpcController offers methods for canceling remote calls. What if the underlying RPC package doesn’t support this? Over the years I’ve seen many that don’t. Note that this capability impacts the idempotency problem as well.
Another question: what about service references? As far as I can see, the protobuf language doesn’t support such things. How can one service return a message that contains a reference to another service? I suspect the answer is, once again, data tunneling: you would encode your service reference using a form supported by the underlying RPC implementation, and then pass that back as a string or sequence of bytes. For example, if you were using CORBA underneath, you might represent the other service using a stringified object reference and return that as a string. Weak.
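Here is a small Python sketch of what that tunneling looks like from the receiver's side. The message class, the field name, and the reference value are all hypothetical; the only real-world detail assumed is that a stringified CORBA object reference starts with "IOR:".

```python
from dataclasses import dataclass


@dataclass
class LookupReply:
    """Hypothetical message: as far as the schema knows, this is just a string."""

    service_ref: str


def split_reference(ref: str) -> tuple:
    """Recover the scheme and opaque body; what to do with them is agreed
    out of band, because the message type says nothing about it."""
    scheme, sep, body = ref.partition(":")
    if not sep or not body:
        raise ValueError(f"not a recognizable service reference: {ref!r}")
    return scheme, body


# Made-up value standing in for a stringified object reference.
reply = LookupReply(service_ref="IOR:0123abcd")
scheme, body = split_reference(reply.service_ref)
```

Nothing in the message definition stops a sender from putting any other string in that field, which is exactly the weakness.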
All in all, the Protocol Buffers service abstraction is very leaky. It doesn’t give us exceptions or any ways of dealing with failure except a human-readable string. It doesn’t give us service references, so we have no way to let one service refer to another within a protobuf message. We are thus forced to work in our code simultaneously at both the Protocol Buffers level and also at the underlying RPC implementation level if we have any hope of dealing with these very-real-world issues.
My advice to Google, then, is to just drop all the service and RPC stuff. Seriously. It causes way more problems than it’s worth, it sends people down a fundamentally flawed distributed computing path, and it takes away from what is otherwise a nice message format and structure. If Google can’t or won’t drop it, then they should either remove focus from this aspect by relegating this stuff to an appendix in the documentation, or if they choose to keep it all at the current level of focus, then they should clarify all the questions of the sort I’ve raised here, potentially modifying their APIs to address the issues.
July 16th, 2008 at 5:36 pm (#)
>> Protocol Buffers when used with RPC does not mask RPC
>> problems”
@Dhananjay
Take a look at Kenton Varda’s reply a few comments above. It goes like this:
“The Protocol Buffer RPC support is, in fact, used for practically all inter-machine communication at Google. We were unable to make our RPC system part of this release, but we have one, and it works on top of exactly the RPC stubs that are in this release. We certainly are not going to remove this support. We rely on it.”
Do you still think Steve’s post is misplaced?
July 16th, 2008 at 7:44 pm (#)
@steve, I’ve taken a day to mull over what you’ve written and read back over some of the arguments that have been presented in these comments and in other posts. I still have a fundamental problem that I disagree on what RPC means and I think others have similar issues. However, that disagreement is not really getting to the core of the issue. This would probably get solved faster over a few beers (even if a few people had a bag over their head with “anonymous” written on it). I really want to get past the word RPC and make sure we agree on the meaning.
I’ll try and link this problem back to a concrete example to explain my position better. Let’s say I’m a developer in an organisation that has been given a library that has a single method, int createOrder( Order ). It takes a complex object as the argument and returns an order number. I’ve been told to go and expose this as a client interface so that it can be used across the organisation. They want low latency, so it means I can’t use any queuing and the client will block waiting for the response. The other developers in my team want to use whatever I build on the client. The other developers want the convenience of calling a single method to perform the action. It could be int sendOrderToServer( Order ) throws RemoteException; or something similar. The other developers in my team know full well that this is not a local operation. They know that they need to handle any exceptions that are due to timeouts, etc.; however, they still want me to deal with those implementation issues. They will build the rest of the software to recognise any exceptions and deal with them appropriately.
As far as I’m concerned, it doesn’t matter how you implement sendOrderToServer; the result is that sendOrderToServer is a remote procedure call. It is bound by requiring blocking synchronous request/reply semantics. It doesn’t matter if sendOrderToServer is implemented using REST, CORBA, Protocol Buffers or Colony. The end result is that my manager wants me to deliver a convenient interface to the remote library. So my first question to you is: have I already been bitten by the RPC bug and convenience over correctness? Second, if this was done with REST, is it actually just RPC over HTTP and not REST?
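For readers who want to see the shape of what I mean, here is a rough Python sketch of such a client-side wrapper. Everything in it is a hypothetical illustration: the Order fields, the JSON encoding, the RemoteError name, and the pluggable transport are all made up, and the transport could sit on top of any of the technologies I listed.

```python
import json
from dataclasses import asdict, dataclass


class RemoteError(Exception):
    """What callers are told to expect for timeouts and similar failures."""


@dataclass
class Order:
    item: str       # hypothetical field
    quantity: int   # hypothetical field


def send_order_to_server(order: Order, transport) -> int:
    """Blocking request/reply: encode, hand to the transport, wait, decode.

    `transport` is any callable taking request bytes and returning reply bytes.
    """
    request = json.dumps(asdict(order)).encode("utf-8")
    try:
        reply = transport(request)  # blocks until a reply arrives or the call fails
    except (TimeoutError, ConnectionError) as exc:
        raise RemoteError(str(exc)) from exc
    return int(json.loads(reply)["order_number"])


def fake_transport(request: bytes) -> bytes:
    """In-process stand-in for the server, so the sketch runs without a network."""
    order = json.loads(request)
    return json.dumps({"order_number": 1000 + order["quantity"]}).encode("utf-8")


order_number = send_order_to_server(Order(item="widget", quantity=3), fake_transport)
```

The other developers only ever see send_order_to_server and RemoteError; what sits behind the transport is my problem.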
It’s my view that all the code and protocols from sendOrderToServer all the way to the library method createOrder is inclusively labelled a Remote Procedure Call. It doesn’t matter if that is hand coded, generated, or uses reflection; the full solution is part and parcel of the RPC. As long as it meets the blocking synchronous request/reply semantics it’s RPC.
It’s my view that CORBA and similar products are “monolithic RPC” solutions. They attempt to provide a convenient and quick solution for every aspect of the problem for the developer. I view REST as a “decomposed RPC” solution. For REST, I choose the encoding, the location and various other aspects. It’s up to me as a developer to go and implement many parts of the RPC. I’m pretty sure you’ll say I’m wrong on these definitions, so if you’ve got a better way to describe them… I’m all ears.
btw. I like the green. It just changed colour between previews. :)
July 16th, 2008 at 9:22 pm (#)
In my experience idempotency is not important because it’s the *default* status for an RPC call. Every RPC call used in the software I work on is idempotent (this is a major consumer website). Requests for an ad to display, searches, news queries, and checking for new RSS stories are all idempotent operations. Non-idempotent operations are handled via async message queues and direct SQL access.
Google, Facebook, et al. are an existence proof that RPC is a successful solution for certain problems in large scale systems. The more interesting question is which kinds of problems it is useful for and which it is not.
July 16th, 2008 at 11:06 pm (#)
@Adam: I can assure you that a purely idempotent RPC system is quite unusual. This is no exaggeration: you’re the very first person I’ve encountered in 20 years of building such systems who has ever made such a claim. I’m not saying that your description is not accurate — I’m simply saying that it’s very unusual, especially in enterprise integration scenarios.
@David: RPC is RPC. RFC 707 is pretty darn clear on why RPC was conceived. I don’t see how continuing to try to Humpty-Dumpty its definition is going to provide any useful insights.
In your most recent comment you are making the very mistake that I normally point out at the start of any RPC explanation, which is that blocking synchronous request/reply == RPC. No, no, a thousand times no. RPC is not purely about networking operations, but rather, it’s a view of such operations from the programming language perspective. As RFC 707 says, it’s a model “that encourages and facilitates the work of the applications programmer by gracefully extending the local programming environment…this integration of local and network programming environments can even be carried as far as modifying compilers to provide minor variants of their normal procedure-calling constructs for addressing remote procedures.” That ‘P’ in the middle of “RPC” refers to programming language procedures/functions/methods, not to some generic use of the term “procedure” in the sense of “a series of actions conducted in a certain order or manner” (definition courtesy of the dictionary on my Mac).
REST is absolutely, definitely not RPC. REST is a well-defined architectural style that has nothing at all to do with programming languages or programming environments. The fact that REST promotes in part a client-server request-reply approach does not make it RPC. @anonymous took you to task for your previous comment because it got some fundamental aspects of REST very wrong, and calling it RPC is very wrong as well, so I’m not sure how s/he will react to this latest gaffe! ;-) Please do yourself a favor and sit down and read Fielding’s thesis — it will change your understanding of all this stuff for the better. It is one of the very best documents on distributed systems I’ve ever read.
I could go on and on, but I won’t. I don’t know how many times I can cite the same sources and explain the same things over again.
July 17th, 2008 at 12:29 am (#)
Alright, I’ll go off and read Fielding’s thesis. I already hunted it out earlier today. Honestly, I am just trying to find a language to describe my views and am not trying to continue a Humpty-Dumpty approach to calling things what I want. I see what you’re saying with regard to P in RPC. I’ll stop using the word RPC for fear of inciting violence.
As I said, if you’ve got a better way to describe the general concept of blocking synchronous request-reply semantics then I’m all ears. Can I say that RPC and REST both contain the ability to remotely call methods? How do I talk about these approaches at the level I described above? I have a method on the client, int sendOrderToServer( Order ), and on the server it calls a method int createOrder( Order ). I want a general way of talking about this end to end process without bringing up discussions of nursery rhymes.
July 17th, 2008 at 8:35 am (#)
Geez, you know a discussion is going bad when people start quoting scripture at each other :-)
Regardless of the definition you use for RPC, it is fundamentally an abstraction devised to assist in distributed computing. There are many abstractions we use every day in IT and some of them work well – others do not. Object-Oriented programming is one such abstraction which works well in domain modeling and in most every-day applications. But in some areas such as user-interfaces or distributed systems, objects don’t work well as an abstraction.
For many years Power Programming with RPC was my bible and I think I’ve got a pretty good handle on what it was about. RPC was originally devised as an abstraction of procedural programming as applied to distributed systems. Later with DCOM and especially CORBA the distributed systems abstraction became “method calls” so that OO programmers would feel at home.
The problem with RPC when used as an abstraction is that it promotes tightly coupled systems which are difficult to scale and maintain. That is the lesson of 20 years of distributed systems development. One problem with “out of the box” Web-Services is that it continues the RPC abstraction.
Other abstractions have been more successful in building distributed systems. One such abstraction is message queueing where systems communicate with each other by passing messages through a distributed queue. REST is another completely different abstraction based around the concept of a “Resource”. Message queuing can be used to simulate RPC-type calls (request/reply) and REST might commonly use a request/reply protocol (HTTP) but they are fundamentally different from RPC as most people conceive it.
So my point is that RPC is generally frowned upon because of its architectural implications. I try to avoid it in my line of work. There are some cases where it is useful, but like many of these things – caveat emptor.
July 17th, 2008 at 8:45 am (#)
[…] You know a discussion is going downhill when people start quoting scripture at each other :-) Here is my […]
July 17th, 2008 at 5:15 pm (#)
gimme a few free bears and, forget the bag, I will dance naked!
Great! I view myself as Brad Pitt in the mirror everyday!
I will try and come up with a better idea for you here.
You can do whatever you fucking want. It doesn’t matter. REST doesn’t give a tiny rat’s ass what you do at the server. Although remember that just using RESTful RUBY doesn’t promise scalability if you fuck up the design at the server. REST is an interface model. Repeat again, 100 times!!
(italics/bold mine)
The whole idea behind having an interface model is that you can do whatever you want. Of course, a dumb person (by mapping all function calls to URIs, for e.g. (sound familiar?)) can screw it all up and make an ass hat of a system. But REST at least allows the smart person not to screw it all up. The smart person with REST can make a kick ass application. REST at least gives you the choice, RPC doesn’t.
Read the whole post from where I flicked it : Stu’s Blog
It has all the cool words that people use in companies (Zachman something something for e.g) , so maybe you will understand it .
I always wondered how people got insane performance increases by just using HTTP and designing URIs correctly.
Thanks for clearing it up for me!
And when you think you have understood it, read it again. The first time I read it, I was like “Hmm , interesting” .. only after some rereads did the whole “WOW!! OMFG” thing come in.
(cheap attempt at humour)
No no no, you got it all wrong, REST requires you to use a uniform interface .. English is fine by us. You can’t just go about finding and using languages you dig up.
(end cheap attempt at humour)
whats that gotta do with REST? You can do such dumb things with REST too. Block on every HTTP request till you get the response .. have an awesome time with that application!! Hell, while you are at it , change your OS to “synchronously” block for every I/O request. That would be damn easy to program for!
I won’t even care to repeat myself.
(read the part that I asked to ignore).
July 17th, 2008 at 7:37 pm (#)
@soabloke, thanks for stepping in and providing a higher level view. Between you and the first sixty pages (all I’ve read so far) of Fielding’s dissertation, I think I’m finally starting to see Steve’s/REST’s point of view. I also watched the video presentation Steve did on InfoQ. It’s also worth a watch.
I’ve been buried in the detail of how to implement these distributed computing problems and couldn’t see the forest for the trees. (Steve, notice I did not say RPC :). It’s interesting that I’ve never separated “architectural styles” from “guiding principles” in my head. By guiding principles I mean all the good ideas that guide how I build distributed systems; many of the decisions are buried in some of the most basic detail. What I think Fielding is doing is taking those guiding principles that people like me have been taking for granted and creating a set of explicit rules for hypermedia systems. As long as people don’t break these rules then they should end up with a good solution (as Anonymous said. I wrote this before reading his response). As I said, I’m only sixty pages in so far, so there’s still a way to go.
@steve, just to see if I understand your position on RPC, I’ll go back to my concrete example. What I think you’ve been saying is that RPCguy (using the same idea as your presentation) receives some library for creating and working with orders. The library has been designed for local use and RPCguy has been told to make it available to a client. RPCguy writes some IDL that looks as close to the library as possible, churns it through an IDLtoCode generator and announces how easy it was. He hasn’t taken into consideration any of the real issues that distributed computing raises.
Now, ORBgirl (I shouldn’t be sexist) comes along and says to RPCguy, “you’ve got no clue! That won’t work!” She says you need to do at least some OO design and worry about the life-cycle of the objects. She changes the IDL to introduce the concept of creating an object to represent an order on the server. She also has a bit of a think about the interfaces to the Order object. She writes some IDL, passes it through a CORBA vendor’s solution and assumes the container will look after everything.
Next comes along SOAguy (btw, nice picture for the SOAguy in your presentation). He says, “But CORBA won’t work for the Internet! You have fifty methods on that Order object.. didn’t you even think about the latency issues?” SOAguy gets onto designing a much simpler interface to the library. He also designs an XMLSchema that allows passing full Order objects back and forth between systems. He writes some WSDL, churns it through a code generator, sits back and says “See, that’s how it’s done!”
Finally(?), comes along RESTguy and says “Yo Fool, this is da Internet Age! You’re still living with RPCguy.. Get with the Hypermedia dude! And btw, you’ve got no style, Architectural Style!” He gets set to creating a set of URLs which allows doing a GET to find an Order. The URL can be used as a reference in other queries. He still uses basically the same XMLSchema as SOAguy, but can return PDFs or anything else. He’s thinking about the problem from a completely different perspective.
I’ve been looking at all four people and thinking, “I’ve been looking at the guts of all of this and in the end, for computer to computer communications it all ends up in code somewhere”. It all has to map back to calling a method. But I am starting to see where the REST idea is coming from now. It’s the architectural style that, if followed correctly, should end up with a system that scales with the Internet.
There’s still a hole in my thinking which is how does REST get mapped back to OO systems when that’s the requirement. I understand how documents and hypermedia work so well together, but I’ve got this legacy of OO libraries and thinking that I need to map to hypermedia.
As I said, I’ll continue on with reading Fielding’s dissertation.. I may even read it twice as suggested by Anonymous. Thankfully it’s a good read, as Steve has said numerous times.
But the big question is… am I on the right track and heading in the right direction?
July 17th, 2008 at 7:50 pm (#)
Sorry, I should have read Stu’s blog on “Understanding hypermedia as the engine of application state” before asking the question of how OO systems link back to hypermedia. Thanks anonymous, that filled another gap!
July 17th, 2008 at 10:13 pm (#)
yay .. video for the weekend ..
I hope to God this is because you have only read 60 pages. If you still get this impression after reading it through fully, please read it again. Please , I beg of you … cos right now I feel really sorry for you.
Hopefully , Stu’s post also helped in changing that perception.
July 17th, 2008 at 10:48 pm (#)
@David: I’m glad to hear that you’re reading the REST thesis. There are a bunch of folks out there who continually whine/whinge about REST and write reams of prose telling us how terrible it is, and yet they’ve never even bothered to read even the title page. Congratulations on not being one of them. :-)
BTW, today someone tried to post a comment here suggesting that you take your questions elsewhere because I am nothing but a “REST zealot” and so apparently I would be unable to answer them. I didn’t post it because not only is it wildly inaccurate, but it was negative and adds no value. Now, some of anonymous’s comments are also a bit rough and arguably negative, but I’ve passed them through because overall s/he is contributing some useful observations, insights, and links, plus some of the things s/he says are kinda funny. :-)
So yes, I’d say you’re on the right track, anonymous’s latest comment notwithstanding. :-) But like anonymous said earlier, it generally takes a few passes through the thesis, and I’ll add that it also takes some experimentation, before it really sinks in.
July 18th, 2008 at 2:30 pm (#)
@David:
http://www.xent.com/FoRK-archive/may98/0120.html
(and also for all other REST bashers here…)
July 19th, 2008 at 8:48 am (#)
FYI:
ZeroC created an Ice patch that allows one to use Ice as a transport for protocol buffers in C++, Java, and Python.
http://www.zeroc.com/labs/protobuf/index.html
July 22nd, 2008 at 12:21 am (#)
[…] Steve Vinoski has been busy trying to convince the world at large that RPC is “fundamentally flawed”. I think it is interesting to take a look at RPC and see what those fundamental flaws are (and whether there are flaws, for that matter). Doing this will definitely take more than one post, so don’t expect the answers all at once. I will deal with various aspects of the topic over a number posts over the next few weeks, so please bear with me. […]
July 23rd, 2008 at 12:53 am (#)
[…] Protocol Buffers: Leaky RPC Steve has a very tight definition for RPC as per Note On Distributed Computing and RFC 707. Unfortunately most real-world RPC mechanisms do not fit with this definition. Yet he still critiques them as if they did fit. (tags: rpc) […]
July 24th, 2008 at 3:24 am (#)
@Steve & Anonymous. I hope you didn’t think I was finished here. I’ve been off reading the Fielding dissertation on REST. I also read the “Tao of Pooh”. Both were a great read. I suggest you both read the second one sometime.
Before I start (or repeat some of my earlier arguments), I wanted to address what Steve said about me taking my questions elsewhere. I do tend to agree with the person’s remarks you didn’t post. You and Anonymous are coming across as being REST zealots. That’s ok though. You’ve got your opinion, and I’ve got mine. The first step to actually having a real discussion with people with strong opinions is to understand where they’re standing. To do that, I’ve had to deal with Anonymous standing on his pedestal throwing belittling remarks. Those remarks didn’t achieve much and show his zealot nature. However, in between those useless remarks I’ve got a glimpse into his point of view and some useful information. If I’ve had to look foolish to come to a better understanding of these topics then I’m ok with that.
Also, it’s good to see Michi joining the discussion. :)
OK, First off, let me repeat that I really enjoyed reading Fielding’s dissertation. The ability to evolve the architectural style of the web was ingenious. His method of creating an architectural style is a step above systems architecture. I think a lot of architects do develop architectural styles in their projects, however, most of the time they would be implicit in designs. Fielding does a great job of making the concepts explicit. Obviously an easy pot shot at every other solution to distributed computing is that they have not been developed this way. However, this is not to say they can’t be retrofitted in the same way REST was.
Next, REST is fundamentally not RPC. REST is an architectural style that is designed to ensure that the web’s hypermedia solution to distributed computing will not be ruined by future changes. REST is not a design pattern or an implementation. You could look at the actions of REST and loosely suggest as I’ve done in the past, and Michi has, that they have some similarities to RPC. I don’t think it is an argument worth pursuing. This does not mean that the REST architecture doesn’t look like RPC on the client, but I’ll get to that later. REST is as different from RPC as it is from Message Queuing or Publish Subscribe systems.
An important point is that REST is an architectural style for an open hypermedia system. REST is designed to ensure that new additions to the Web architecture don’t remove any of the positive traits that were carefully designed into the system. There is no claim that REST provides any of the architectural style required for point to point system to system communications. REST also isn’t designed for the security conscious world where point to point solutions are required and caches provide no benefit either. It is designed for one purpose; to define the open web hypermedia system.
Hypermedia and the web is a fantastic solution for human/browser to computer communications. However, it offers only part of the solution in the area of computer to computer communications.
I’ll go back to one of my earlier points. A blocking synchronous request/response semantic interaction starts when a local procedure is called on a client. It finishes when that local procedure call returns with a valid result or exception information. From the developer building these systems, RPC and REST look the same; the entry point is a local procedure call.
Adapting REST to the problem of system to system interactions provides little help to the developer. REST provides an architecture for the Session layer of the communications system. The developer must choose the Presentation layer (mime type), and then must encode the information they need into the Presentation to create their own Application layer. In system to system distributed computing problems, all the same issues are there that RPC/SOA/ORBS have, without the mature tools to assist the developer.
The problem is that there’s a disconnect between a developer working in an environment where they are making local method/function calls and REST which offers a hypermedia solution. This creates a situation where a developer is trying to provide a local library which is OO based and bolt it onto a hypermedia solution.
The REST approach to architectural style can obviously benefit system to system communications. There are a couple of things that REST provides which I’ll probably take away and explore. The first is a separation of the request parameters from the underlying request structure, to allow any data to be sent to the server. I’d actually already thought about doing this previously but hadn’t got around to it. I was going to put the data structures here to show how I’d make my Colony solution behave more REST like, but that’s probably a little too far off topic. I’ll do that on my Blog sometime.
The other part of REST I’d like to bring back to Colony is the ability to specify caching behaviour in the response. Colony is all binary so this would need to have Colony specific caches built; however, the design is a good one. Once again, I’ll put up the structures on my Blog sometime to show that.
So, I’m still looking for words to describe the group of all blocking synchronous request/response calls that the client makes in system to system communications. In these situations it doesn’t matter what implementation is used; the client still makes a local function call which blocks and eventually returns with a result from a server or cache. I’m told I can’t call it RPC because that has an ideology of ignorance; I don’t want to call it SOA because that has its own ideology. It’s definitely not an ORB. What is it?
Just to throw in another example. In Colony I’ve built two different implementations of calling a server. One uses a simple data structure with location, method id and parameters. The second uses a mini virtual machine using a simplistic byte code. The VM has a heap, stack, byte code and program counter. It can be used to call multiple methods over multiple machines. This is obviously not RPC; so what do I call it? I can build the same interface and call it in exactly the same way, yet the actions they perform on the client and server are completely different. It still fits into the class of blocking synchronous request/response semantics, yet I haven’t got a generic name to call it.
Steve, one of the things you’ve said you’re trying to achieve is make people aware of other solutions to distributed computing. I’m now aware of what REST is, and more importantly, what it isn’t. However, to have some more meaningful conversations I’d like to have the words to categorise and dissect the various solutions to distributed computing. If you’re saying I’m using the wrong words to describe things, please give me the right words. In particular how do we talk about the class of problems associated with system to system communications which involve blocking synchronous request/response semantics?
July 25th, 2008 at 6:07 pm (#)
Man my finger is sore from scrolling through all these comments!
Seriously, great stuff Steve. I wrote a blog entry about taking into account the human element in software abstractions – i.e. the impact on productivity and development cost by making it easier for people to do things like distributed computing. I would argue that sometimes this is more important than purely technical considerations such as performance or clean code.
But I completely agree about the main point – which as I understand it is being sure to use the right tool for the right job, and not just blindly using RPC because it’s convenient.
I have a couple of other more detailed comments that I thought I’d add to the discussion.
On the definition of RPC – using RFC 707 is a bit like using the SOAP specification to define SOAP. I mean that there’s often a difference between theory and practice, and what we complain about with SOAP (myself included, despite the fact people are getting value from it) is that the spec isn’t implemented correctly. If you read the SOAP and WSDL specifications, especially the latest versions, you would think they are very RESTful. But no one implements the RESTful bits – they tend to focus on the RPC style (as you have said, IONA has been as guilty of this as anyone despite our efforts to lobby for implementing the document-oriented style).
I don’t want to go into a huge digression on this, but to me this is a great example of a kind of innovator’s dilemma, or a side effect of one. When we talk with our customers about SOAP, WSDL, etc. they tend to say something like “if you want me to use that, it has to be just as performant, reliable, secure etc. as what we already have.”
This of course entirely misses the point, since what’s important are the application requirements, not the technology used to develop the application. As you have pointed out many times, a RESTful approach could as easily meet many if not all enterprise application requirements. But (as you have also pointed out) this would require a change in thinking that a lot of people seem unwilling to tackle.
Another minor comment – the inability of the industry to solve the data type mapping problem does not in itself mean that the RPC mechanism is useless. It is true that interoperability decreases in proportion to the complexity and number of data types involved, but that doesn’t mean RPCs aren’t useful.
It is interesting to read about Erlang, REST, and explicit programming for distributed computing as a kind of historical advance. When RPCs first came out, we viewed them as a technological advance over the dominant style of the day, which was P2P (LU6.2 was the leader – and man, no one wanted to program that if they could avoid it).
But I also take the point that you are going to get better results for many types of distributed applications by explicitly programming. And I am also very impressed by what I’ve been reading about your Erlang work.
I don’t really think of RPC = transparent distribution. I think of RPC as a programming model. By definition people know they are doing remote calls because they have to create some kind of interface definition, compile proxies and stubs, link them into their applications, etc.
When I think about the big picture of distributed computing, it seems like there will always be some number of applications for which RPC is a better fit, and some number of applications for which asynchronous messaging is a better fit. One of the big problems I think we all have is that there is so much overlap in what can be done using either approach. I have a rule of thumb for this that depends on the significance of the reply. If the reply to a message needs to indicate whether or not the database update was performed, for example, RPC is a better fit. If the reply simply needs to indicate that the message was received, then asynchronous messaging is a better fit.
I know the above is kind of impressionistic, and not very precise. I suppose the point is to think about the tradeoffs and not just try to use one or the other for everything.
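One way to picture that rule of thumb is to look at what the reply actually tells the caller in each style. The following is only a sketch under assumptions: the function names, the in-memory dictionary standing in for a database, and the local queue standing in for a message broker are all hypothetical.

```python
import queue
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class UpdateResult:
    committed: bool


def rpc_update(db: Dict[str, int], key: str, value: int) -> UpdateResult:
    """Request/response: the caller blocks until it knows the outcome."""
    db[key] = value
    return UpdateResult(committed=True)


def send_async(outbox: "queue.Queue[Tuple[str, int]]",
               message: Tuple[str, int]) -> bool:
    """Messaging: the reply only says the message was accepted."""
    outbox.put(message)
    return True  # acknowledges receipt, says nothing about the update


def drain(outbox: "queue.Queue[Tuple[str, int]]", db: Dict[str, int]) -> None:
    """A consumer that applies queued updates some time later."""
    while not outbox.empty():
        key, value = outbox.get()
        db[key] = value
```

With `rpc_update`, the returned `UpdateResult` reports whether the update was performed. With `send_async`, a `True` reply means only that the message is in the queue; the database is unchanged until `drain` runs.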
July 27th, 2008 at 7:57 am (#)
After so many posts, O.K., we arrive here: it depends, and it is all trade-offs, as with all arguments. :) But we have indeed learned much from many of the posts.
July 27th, 2008 at 7:07 pm (#)
Standing on a pedestal – yes I do that sometimes …
belittling – they were intended to be fun, sorry if you found them belittling …
talk about belittling … but I don’t really care … although I would disagree with you calling me a zealot, I wouldn’t care to argue why …
style (not his method) is a step above architecture – read the original paper documenting what Style is – Garlan and Shaw (and those guys were the first to study software architecture)
You had earlier argued exactly the opposite iirc, so are we zealots for asking you to just RTFM ?!?
That’s how Fielding intended it. When Fielding first came out with REST, he only intended it to be for information services, which when you think about it are a HUGE part of services and are VERY important – if your information services are hidden or a pain in the ass to use, your employees are gonna be straitjacketed. So first, please don’t “belittle” information services.
Second, it was only when everyone got us into WS-MESS that saner heads came together and said – look at that REST thing out there, it seems to work nice for me, I am going to use it. This is not to say REST is complete and ready to use NOW. But as Martin Fowler/Jim Webber presented at InfoQ, it is more like using the Agile method / evolutionary approach to distributed systems rather than the “intelligent design” that WS-* seems to favour.
The whole OSI 7 layer model is to me a pain to understand / use – I much prefer the TCP/IP 4 layer model. I can’t really reply to what you have written as I have long ago forgotten what these session / presentation layers are – and I don’t see much point to the argument even if I find out.
As far as I can see, the only tools that developers really use (in WS-*) are the WSDL (with SOAP, of course) ones. That’s ALL. No one uses the other junk load of WS-* specs / tools that vendors have come up with and are trying to sell. And as a result, all those most certainly aren’t mature.
Barriers ? :)
August 2nd, 2008 at 8:41 pm (#)
The problem I have with Protocol Buffers is that they mislead people about XML’s performance characteristics.
XML doesn’t have a performance issue. The issue belongs to XML parsers. I have written an article on this…
http://soa.sys-con.com/node/250512
August 24th, 2008 at 3:14 am (#)
Steve,
Your message tunneling problem is covered in the classical systems design theory paper, End-to-end Arguments in System Design by Saltzer et al.
Take care and best regards,
Z-Bo