forked from imatix/restms
-
Notifications
You must be signed in to change notification settings - Fork 0
/
restms_background.txt
174 lines (98 loc) · 18.3 KB
/
restms_background.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
This document is an introduction to RestMS, the RESTful Messaging Service. RestMS provides web applications with enterprise-level messaging via an asynchronous RESTful interface that works over standard HTTP/HTTPS.
* Author: Pieter Hintjens <[email protected]>
++ License
This document is licensed the Creative Commons Attribution Share-Alike (cc-by-sa) License version 3.0 or later.
++ Messaging for a Connected World
Wikipedia [http://en.wikipedia.org/wiki/CORBA describes CORBA] - arguably the principle messaging standard of the last century - as "//a standard [...] that enables software components written in multiple computer languages and running on multiple computers to work together.//" This is an excellent problem statement for messaging.
It's been almost two decades since CORBA was launched, and the connected world has utterly changed in that time. As the technology of connecting people, machines, and applications together has doubled in power and halved in cost every two years or so, so have the challenges of scaling and security grown by ten orders of magnitude.
In this article I'll look at one specific aspect - messaging for Web applications, and specifically the [http://www.restms.org/ RestMS] web messaging protocol that a small group is developing. To set the stage, I'll look at how messaging - the problem as well as the solution - has evolved since 1991, and how RestMS fits into a vision what messaging will look like in another 20 years.
++ History and evolution of messaging
CORBA was fairly unique in being a deliberate //standard// rather than a product. Messaging is still dominated by products rather than standards and CORBA was ahead of its time in attempting to lay down rules for interoperability and that perfect supplier economy that a free and open standard enables.
In 2002 de Jong et al [http://www.xs4all.nl/~irmen/comp/CORBA_vs_SOAP.html noted five main things] wrong with CORBA, in favour of [http://en.wikipedia.org/wiki/SOAP a new standard called SOAP]: "//CORBA... is too complex... too expensive... does not work through firewalls or proxy web servers... isn't being hyped by IBM and MS//" and is not human-readable.
My own view is that CORBA failed because it became a vehicle for vendors: too easy to capture, and in the end what customers really wanted was interoperability and choice of vendors, rather than functionality as such.
There are not many messaging standards that aim at the whole stack. After CORBA, SOAP is perhaps the best known one. Based on [http://en.wikipedia.org/wiki/XML-RPC XML-RPC], the "simple object access protocol" became something much larger and soon landed in exactly the same place as CORBA: too ambitious, too complex, too dominated by large vendors. Today SOAP lives only in the ghetto of Enterprise IT.
Given free choice, it seems many more people still prefer XML-RPC over SOAP. Compare the two Wikipedia pages and it's clear which has the more thriving community. Similarly, Christopher Browne [http://linuxfinances.info/info/corbaalternatives.html lists one link] for SOAP and dozens for XML-RPC.
One lesson that seems to shine through. The technologies that succeed are the ones which are standardized, open, simple, cheap, and human readable. Being HTTP-friendly is a plus. Being hyped by IBM and MS is not.
What is striking is how few attempts there have been to design messaging protocols. Browne's page lists numerous proprietary products but practically no standards that would permit such products to interoperate. Or rather, would permit clients to freely choose vendors. He does list one, the "[http://www.amqp.org Advanced Message Queuing Protocol]".
++ What changed?
In 1991, I released my first FOSS product, [http://legacy.imatix.com/html/libero/index.htm Libero]. I sent out packages by email to people who asked for it. My email account was a dial-up to a Unix shell. It cost 1,000 Euro a year and ran at 33.6Kb.
In 1991, there were still fears of [http://en.wikipedia.org/wiki/No_Silver_Bullet a global software shortage], which were calmed only around the turn of the century when it became obvious that software had become as cheap as ice.
In 1991, it was still much more expensive to write a piece software than to connect it to another one. Messaging, like databases, was just a small part of the cost. Who cared if CORBA was expensive and complex? Everything was, in 1991. And don't get me started on 1981... :-)
Today, it is become trivial to write sophisticated applications: better abstractions, better tools, and above all access to oceans of reusable components that are at a finger's reach, across a global Internet.
Suddenly if messaging is not also //trivially// cheap, it becomes a serious problem.
++ Future of messaging
It seems clear the future of messaging lies in free and open protocol standards. Closed products are so expensive and slow to evolve that they cannot compete with the new generation of FOSS products based on free and open standards. CORBA was right, just fifteen years too early, and alien to the emerging Internet world.
What kind of standardized protocols can we expect to see emerge over the next years?
Although it seems counter-intuitive, I believe we'll see open standards that are simpler, not more complex. "Simpler" could of course mean "just doing less", but to me it means "better abstractions that do much more with less effort". As an example: if I want to use regular expressions, Perl offers better abstractions than C. Similarly, over time we have learned better abstractions for messaging, and should thus be able to develop simpler standards.
There are many areas where messaging standards are needed. The main ones are:
* Connecting pieces on a single box or on a data-center LAN (the [http://en.wikipedia.org/wiki/D-BUS D-Bus] world).
* Connecting pieces across a broad LAN (the MQ-Series and JMS world).
* Connecting pieces across the Internet (the XML-RPC world).
These areas are so different that in my view there cannot be any single universal messaging protocol. What we'll see instead is a //fabric// of simpler, specialized protocols that work together because they start to embed common assumptions about what messaging should look like. The breakthrough I'm searching for is not a standard protocol as such but a standard set of //simple abstract messaging models// that let arbitrary protocols collaborate.
++ Messaging essentials
Based on my experience of AMQP, and from involvement in many other messaging projects, these are what I consider the essential aspects of a good messaging protocol:
* It carries messages as opaque blobs of arbitrary size (0 bytes upwards);
* It carries message envelopes with arbitrary properties;
* It can route messages to recipients in a variety of ways;
* It lets recipients select messages based on envelope properties;
* It can queue messages, before and/or after routing;
* It lets applications construct their coupling at runtime;
* It does synchronous control (so each control request has a response);
* It does asynchronous (event-driven) message transfer;
* It is a free and open standard.
Other aspects like performance, security, and scalability depend on the context. They are very different for inter-process communications than they are for Web use.
I've documented the main routing patterns that a messaging system can implement:
* [http://www.restms.org/wiki:housecat Housecat] - send message to recipient.
* [http://www.restms.org/wiki:wolfpack Wolfpack] - distribute work among recipients.
* [http://www.restms.org/wiki:parrot Parrot] - distribute information to recipients.
We see all three patterns crop up over and over in real life, so it seems safe to say that a messaging protocol should address all three of these (and their variations).
++ Messaging for the Web
The Internet throws up some very specific challenges. The Slashdot effect. Zero-day exploits. The Facebook effect. OK, I just invented that one, it means attention spans are limited so you only have until the next Facebook status update to get your point across.
So we can say that web messaging is about "Scalability, Security, and Simplicity":
* Scalability means working with caches, proxies, and accelerators.
* Security means using the most common, proven security mechanisms.
* Simplicity means designing smart high-level abstractions and then //getting out of the way//.
If simple is good inside the firewall, it's a hundred times better on the wild wild web. Web messaging must above all be simple to understand, simple to implement, simple to use. Web developers just won't tolerate complexity they don't absolutely need. This is the XML-RPC vs. SOAP lesson.
AMQP is a very good protocol. I like it. But it is not simple: it uses a binary wire encoding scheme that is highly efficient. Yet it's a case of premature optimization. If you login once a day, you do not care whether that login dialog uses highly efficient binary, or simpler HTTP-like text. 90% of AMQP is used no more than 1% of the time.
A cynical old programmer like me has learned to write plain, simple code that is regular and predictable and unsurprising and lazy. And one puts effort only into areas that need it, knowing that optimizing distorts the code and makes it more fragile.
So it is with messaging protocols.
A web messaging protocol should be human-readable for everything except high-volume message transfers that //need// to be optimized. And while it can use XML for expressing itself, it should be equally happy with, for example, JSON.
++ Origins of RestMS
I have not found a messaging protocol that does all these things. Some get part of the way. But one day last year, as we were thinking about how to map AMQP for web use, Steve Vinoski suggested I read the AtomPub protocol. I thought I understood what http://en.wikipedia.org/wiki/Representational_State_Transfer REST] meant but [http://www.ietf.org/rfc/rfc4287.txt AtomPub] made it much clearer.
The beauty of Roy Fielding's [http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm REST pattern] is that it throws away so much and leaves the essentials in a coherent, scalable structure. REST's simplicity seems to offend some people, who perhaps cannot believe that four verbs (GET, PUT, DELETE, POST) can do everything. But I've found it to be a revelation.
Very roughly, we took the AMQP messaging models, which implement Housecat, Wolfpack, and Parrot, and have those essential aspects I spoke of, and made these models work over HTTP using the RESTful rules.
And thus RestMS is that: a REST-over-HTTP messaging protocol that has those ten or so essential aspects, and in theory does it in a Scalable, Secure, and Simple manner.
++ The design of RestMS
The first drafts of RestMS were one document. This grew and grew and finally Brad Clements asked, "Pieter, please chop it up, it's becoming unreadable". So for several months we played with bits and pieces of the puzzle and eventually it took shape as a stack of specifications.
First, large parts of RestMS were very repetitive. "This is how you create a pipe. This is how you update a pipe. This is how you delete a pipe," and then "this is how you create a join, this is how you update a join, this is how you delete a join". And so on.
So a large chunk of RestMS can be abstracted into a once-and-for-all description of what resources look like (XML or JSON documents) and how they work (how to create, update, retrieve, and delete them in a RESTful way). That became the RESTful Transport Layer, or [http://www.restms.org/spec:1/RestTL RestTL]. RestTL could be a basis for any RESTful protocol that uses HTTP.
Next, we described the RestMS resources themselves: domains, feeds, pipes, joins, messages, and contents. You can [http://www.restms.org/spec:2/RestMS read the spec] itself, but very briefly, feeds are queues of messages before routing, pipes are queues of messages after routing, and joins are the connections between them.
The RestMS feed-join-pipe model supports both 'push routing', where messages are routed into queues and then fetched, and 'pull routing', where unrouted messages are held in a queue and then selected.
But at some stage we need to clearly specify how the routing works. In earlier drafts we hard-coded this into RestMS, making it dependent on AMQP routing semantics. That's not a viable option, since AMQP itself is still changing (from a push to a pull routing model).
Thus RestMS defines a third layer, "profiles". A [http://www.restms.org/wiki:profile profile] defines the semantics of a set of feed, join, and pipe types. It basically offers a plug-in messaging abstraction. Having profiles means we do not need to try to design the Ultimate Simple Messaging Abstraction today. So long as it's cheap and safe to experiment, that's fine. RestMS comes with two profiles, a [http://www.restms.org/spec:3/Defaults Defaults] profile that does Housecat and simple Parrot, and a [http://www.restms.org/spec:4/AMQP9 AMQP] profile that does more sophisticated messaging and interoperates with AMQP networks.
++ The RestMS specification process
The www.restms.org site is the hub of the small but relaxed - restful, even - RestMS community. I've mentioned that RestMS is a "free and open standard", which is a term coined by the [http://www.digistan.org Digital Standards Organization] (Digistan). If you like the politics of open standards, Digistan may be fun.
The restms.org website is worth a short story. It's based on work done by Digistan, which I'm part of. If you look at http://spec.digistan.org you'll see the template we used for www.restms.org.
In 2007, Digistan started on a project to develop peer-to-peer tools for decentralized standards development. Most standards are still built by 'Foundations' such as the XMPP Foundation. I've nothing against not-for-profit entities (and have started one or two in my time) but it is a large step to make for small teams.
The main trick to making a peer-to-peer collaborative standardization project work is clear unilateral grants by all contributors. This removes the need for a centralizing entity, at least as long as there are no trademarks to acquire. The Digistan [http://www.digistan.org/spec:1 COSS process] guarantees the right to branch and merge specifications. So if Johan likes a profile that Hubert has written but want to experiment with changes, he can branch Hubert's profile specification, and make his own. And Hubert has the right to merge any successful experiments back into his profile specification.
Peer-to-peer, branch-and-merge, and share-alike are perhaps radical ideas for standards development but I think their time has come, and they form the core of the RestMS process.
++ Where is RestMS going?
The experimentation in RestMS happens in profiles. It would be fun to make a profile for streamed messaging - high-volume publishers and subscribers. Profiles for exotic types of routing, based on geographic coordinates, regular expressions, free text matching, soundex, randomization, etc. Profiles to bridge other messaging systems. A profile for plugging pipes right back into feeds so that RestMS becomes a universal glue between arbitrary messaging systems. And so on.
There are two projects to implement RestMS servers (Thilo Fromm's [http://t-lo.github.com/ahkera/ Akhera] and iMatix's [http://www.zyre.com Zyre], and various clients projects including a RestMS [http://github.com/pieterh/restms/tree/master reference client] in Perl. It's pretty easy to build a client that sits on top of an HTTP stack and gives applications access to the various RestMS resources. A RestMS client is a bit larger than an XML-RPC one but it's possible to access RestMS with [http://www.restms.org/wiki:telnet raw HTTP].
For some people, RestMS acts as a web front-end to an AMQP network. This is how Zyre works with iMatix's [http://www.openamq.org OpenAMQ] package. For other people, RestMS is fine as a stand-alone messaging protocol.
++ Get involved
We've set-up [http://lists.digistan.org/mailman/listinfo/restms a mailing list] for discussing RestMS. That's the best place for comments and questions. Here are some of the ways you can get involved in RestMS:
* If you're a blogger, write about RestMS.
* If you're a critic, review the specs and tell us what is unclear, inconsistent or downright wrong.
* If you're a REST expert, tell us what you think of [http://www.restms.org/spec:1/RestTL RestTL].
* If you're designing a RESTful API, think about using RestTL as the basis, or tell us why not.
* If you're looking for a web messaging system, tell us whether RestMS would work, or would not.
* If you want to see how a RestMS server works, grab Thilo's [http://t-lo.github.com/ahkera/ Akhera] written in Python/Django.
* If you want to see what a RestMS client looks like, grab the Perl classes from [http://github.com/pieterh/restms/tree/master the RestMS git repository].
* If you want to experiment with protocol design, join the restms.org community, and develop your own profile.
Above all, use RestMS in applications and tell us how to improve it. We've set up a live RestMS server at [http://live.zyre.com live.zyre.com] but you can also download free RestMS servers to run on your own systems.
++ Conclusions
I've drawn a view of messaging as necessary but too-complex glue that gets simpler over time as we learn how to do it right. I've predicted we'll see a fabric of different protocols aimed at different spaces, and RestMS is one of these, aimed at web messaging. The web demands security, scalability, and simplicity, and these are built into RestMS thanks to its use of plain HTTP/HTTPS as a wire level protocol.
Part of the challenge of designing a messaging protocol is deciding how to queue and route messages. I've explained that there are several alternatives, and explained how RestMS leaves this choice open for future protocol designers, by using //profiles// to define the queuing and routing semantics. RestMS has two profiles today, and I expect that number will grow, as people experiment, and then shrink again as the best experiments become widely used.
I've explained how RestMS is built as a stack of small specifications, and how the community works, as a peer-to-peer effort that uses [http://www.digistan.org/spec:1 Digistan's COSS] lifecycle, based on branch-and-merge and share-alike. This is a new approach to specification work, maybe even radical, but it ensures that the freedom of RestMS users to stay in control is guaranteed, forever.
Finally, I've explained how to contribute, if you have an itch that RestMS can scratch for you.