Hack day
The hack day will center around a new media type designed for machine-to-machine interactions. The media type vnd.WORK_ORDER is pretty generic in nature, and falls under the domain of "workflow" or "job control". Documents of the type vnd.WORK_ORDER will be provided by servers that have work that needs to be done; agents that are looking for work will find them somehow, react to the vnd.WORK_ORDER media type by performing the work described, and finally invoke the hypermedia controls contained therein.
The intended use of the media type is for agents that can perform some specialized task, like sending an e-mail, posting a tweet, sounding an alarm, powering on a cluster of machines or getting some information from a human being.
Meta comment: the term vnd.WORK_ORDER is a placeholder until I choose a good name. Possibilities include application/vnd.mogsie.work-order+json and application/workflow+json.
Put simply, the media type describes "work that needs to be done, and what to do when you're done". When an agent receives a document of this type, it may find among other things:
- what needs to be done
- any information about the work that needs to be done
- how progress about the work can be reported
- how to indicate that work has been completed
Using this media type it will be possible to construct machine agents (automatons) that, when they discover a vnd.WORK_ORDER document which conforms to the agent's requirements, will be able to perform their (narrow) work, and report back when they're done.
In order to achieve machine-to-machine hypermedia, the media types and affordances need to be laid down so that messages can be exchanged and understood.
Just like browsers understand HTML and CSS and so on, a machine-to-machine agent needs to understand certain media types minted for machine work. An example of this is Collection+JSON which allows machine clients to understand collections of things, and even manipulate them.
In this hack day we will be working with a media type which describes "work that needs to be done". The type of work itself is not defined in the media type definition; just as Collection+JSON doesn't specify what things are in the collection, be they blog posts or line items. The work can be anything, like deploying some code, shutting down a server or tweeting a message.
Hopefully we will have a functioning server that has a collection of work waiting to be performed. In this hack day we hope to get several agents that can perform some of the work, and maybe other servers that might provide other types of work too.
Here's an example of an exchange between a client that's able to perform some work, and a server that provides work orders. Here "WORK_ORDER" signifies the media type that describes one piece of work that needs to be done. The exchange is probably the simplest of exchanges.
Like a browser needs a home page, the agent starts out with the URL it was configured by its owner to start at. In our example it sees a Collection+JSON which contains perhaps three or four items, retrieves one of the vnd.WORK_ORDERs, performs the work described in the work order, and finally reports back:
> GET /work-queue
< 200 OK Content-Type: collection+json
# the agent is now looking at a collection of (open) work orders.
# /work-orders/su9fw is one of the items. Follow that link:
> GET /work-orders/su9fw
< 200 OK Content-Type: vnd.WORK_ORDER
# agent is now looking at a work order which we'll assume the agent knows how to handle
# it also finds out where to report completion of work (e.g. /work-orders/su9fw/complete)
... the agent does what it was told ...
# The agent has completed the work; in the work o
> POST /work-orders/su9fw/complete (some data)
< 204 NO CONTENT
# Agent goes back to the queue, and grabs another item, and so on ad infinitum
Here, the agent starts by finding a list of work items (GET /work-queue), finds one (GET /work-orders/su9fw), performs the work described and then invokes the hypermedia control in the work order to report back that the work completed (/work-orders/su9fw/complete).
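The exchange above can be sketched as a small agent loop. This is only an illustration under stated assumptions: the `transport` callable, the `run_once` and `fake_transport` names, and the simplified queue shape (a plain `"items"` list of URLs instead of a full Collection+JSON document) are all hypothetical and not part of the media type. The server is faked in memory so the sketch runs without a network.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical transport: (method, url, body) -> (status, parsed JSON or None).
Response = Tuple[int, Optional[dict]]
Transport = Callable[[str, str, Optional[dict]], Response]
Handler = Callable[[dict], Optional[dict]]


def run_once(transport: Transport, queue_url: str,
             handlers: Dict[str, Handler]) -> Optional[str]:
    """Fetch the queue, perform the first work order we understand, and
    report completion. Returns the work order URL, or None if nothing was done."""
    status, queue = transport("GET", queue_url, None)
    if status != 200 or queue is None:
        return None
    # Assumption: the queue is reduced to a plain list of item URLs.
    for item_url in queue["items"]:
        status, order = transport("GET", item_url, None)
        if status != 200 or order is None:
            continue
        handler = handlers.get(order.get("type"))
        if handler is None:
            continue  # not our kind of work; leave it for another agent
        result = handler(order.get("input", {}))
        # Invoke the hypermedia control found in the work order itself.
        status, _ = transport("POST", order["complete"], result)
        if 200 <= status < 300:
            return item_url
    return None


# In-memory stand-in for the server.
posted: List[tuple] = []


def fake_transport(method: str, url: str, body: Optional[dict]) -> Response:
    if (method, url) == ("GET", "/work-queue"):
        return 200, {"items": ["/work-orders/su9fw"]}
    if (method, url) == ("GET", "/work-orders/su9fw"):
        return 200, {
            "type": "send-a-tweet",
            "input": {"handle": "mogsie", "message": "Hey, Erik, check this out!!"},
            "complete": "/work-orders/su9fw/complete",
        }
    if method == "POST":
        posted.append((url, body))
        return 204, None
    return 404, None


handlers = {"send-a-tweet": lambda work_input: {"sent_to": work_input["handle"]}}
done = run_once(fake_transport, "/work-queue", handlers)
print(done)  # prints /work-orders/su9fw
```

Note that the agent never builds the completion URL itself; it only follows the `complete` link it found in the work order.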
Now let's look at an example work order itself.
It's a JSON document!
{
  "type": "send-a-tweet",
  "input":
  {
    "handle" : "mogsie",
    "message": "Hey, Erik, check this out!!"
  },
  "start": "...",
  "progress": "...",
  "complete": "...",
  "fail": "..."
}
There are two main parts to a work order:
- the input to the agent identifying the work to be done.
- the hypermedia controls used to advance the state of the agent.
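As a rough sketch of that split, an agent might parse a work order into its input part and its controls part. The `WorkOrder` class, the `parse_work_order` function and every control URL except `/work-orders/su9fw/complete` are assumptions made for illustration; the draft leaves the control URLs elided.

```python
import json
from dataclasses import dataclass, field
from typing import Dict

# Control names used in this draft; anything else in the document is ignored here.
CONTROL_NAMES = ("start", "progress", "complete", "cancel", "fail")


@dataclass
class WorkOrder:
    work_type: str                 # the "type" attribute
    work_input: dict               # the "input" attribute
    controls: Dict[str, str] = field(default_factory=dict)  # control name -> URL


def parse_work_order(text: str) -> WorkOrder:
    """Split a work order document into input and hypermedia controls."""
    doc = json.loads(text)
    controls = {name: doc[name] for name in CONTROL_NAMES if name in doc}
    return WorkOrder(doc["type"], doc.get("input", {}), controls)


# Hypothetical control URLs, filled in only so the example parses.
EXAMPLE = """
{
  "type": "send-a-tweet",
  "input": {"handle": "mogsie", "message": "Hey, Erik, check this out!!"},
  "start": "/work-orders/su9fw/start",
  "progress": "/work-orders/su9fw/progress",
  "complete": "/work-orders/su9fw/complete",
  "fail": "/work-orders/su9fw/fail"
}
"""

order = parse_work_order(EXAMPLE)
print(order.work_type)         # prints send-a-tweet
print(sorted(order.controls))  # prints ['complete', 'fail', 'progress', 'start']
```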
The type and input attributes are all the worker needs in order to determine whether the work fits the agent at all. If an agent is good at sending e-mails, it shouldn't try to perform work that is clearly labelled as having to do with stopping a server.
The type attribute names a domain-specific kind of work (here "send-a-tweet") so that agents can figure out whether they should even attempt the work shown.
There will be a level of coupling between the agents themselves and the component responsible for enqueuing work for agents to discover. This coupling will be where a lot of the "domain specifics" and "semantics" will be.
It is expected that a registry of types be maintained to increase interoperability between workers.
Consider requiring a URN or absolute URI for the type attribute.
The input attribute contains arbitrary information needed to carry out the work; it is an open-ended JSON structure, which is specific to the domain of the worker.
The contents of the input will vary wildly depending on the type of work being described. For a work order describing the need to send a tweet, the input might need to state the message of the tweet. On the other hand, a work order describing that a certain sound should be played on the machine's internal speaker would probably carry a URL for the sound.
It is expected that the registry for types of work includes a description of the requirements for the input for each type.
These controls identify other resources that govern the work itself. Each of these controls means something very specific, and an agent should follow them when certain situations occur. Most of the controls are invoked with POST, which is neither safe nor idempotent. These "controllers" allow the agent to inform the server about the agent's progress and ultimately complete the work.
When the start control is present, an agent SHOULD invoke it and await a successful response before work starts. This control is added to work items where it is important that the same piece of work isn't worked on by two agents at the same time. The server in question might achieve this by only responding successfully once; only one agent would "get" the work item.
POST (something ....)
...
{ "about" : "something about the agent perhaps, for logging purposes" }
If the response is not successful, the agent MUST NOT start working on the work, and should disregard the item. In this case the agent should continue to find other work to do.
For 5xx server errors, the client may of course retry at a later time, as specified in HTTP. For 4xx errors, the client should respond as appropriate. In particular, a 409 CONFLICT typically indicates that the item is not available anymore, possibly because another agent has already started work.
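One way an agent might turn the response to the start control into a decision is sketched below. The `StartOutcome` enum and `interpret_start_response` function are hypothetical names. Treating every non-2xx, non-5xx status (including redirects) as "skip" is an assumption; the draft only says that 4xx errors should be handled "as appropriate".

```python
from enum import Enum


class StartOutcome(Enum):
    PROCEED = "proceed"          # we "got" the work item; start working
    SKIP = "skip"                # disregard the item and look for other work
    RETRY_LATER = "retry-later"  # server trouble; the item may still be available


def interpret_start_response(status: int) -> StartOutcome:
    """Map the HTTP status of the start control to the agent's next move."""
    if 200 <= status < 300:
        return StartOutcome.PROCEED
    if 500 <= status < 600:
        return StartOutcome.RETRY_LATER
    # Assumption: everything else, notably 409 Conflict, means the item is
    # not ours to work on.
    return StartOutcome.SKIP


print(interpret_start_response(204).value)  # prints proceed
print(interpret_start_response(409).value)  # prints skip
print(interpret_start_response(503).value)  # prints retry-later
```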
When the progress control is present, the agent MAY invoke it whenever it wants to report progress about the work. Progress may be reported using a completion factor (a decimal from 0 to 1) and/or an estimate of how much time is left (in seconds):
POST /...
Content-Type: TBD
{ "factor": 0.7847, "remaining": 92 }
The response from this is a document that describes the current state of the work order (see cancel, below).
For long-running work (e.g. several minutes) the agent SHOULD report progress every few minutes to ensure that the origin server doesn't think that the agent has died.
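A minimal sketch of building the progress report body follows. The `progress_body` helper is a hypothetical name, and the range checks are assumptions drawn from the description (factor from 0 to 1, remaining time in seconds); the content type is still TBD in the draft, so only the JSON body is produced.

```python
import json
from typing import Optional


def progress_body(factor: Optional[float] = None,
                  remaining: Optional[float] = None) -> str:
    """Build the JSON body for the progress control.

    factor: completion factor between 0 and 1.
    remaining: estimated seconds left.
    At least one of the two must be given.
    """
    if factor is None and remaining is None:
        raise ValueError("report a factor, a remaining time, or both")
    body = {}
    if factor is not None:
        if not 0 <= factor <= 1:
            raise ValueError("factor must be between 0 and 1")
        body["factor"] = factor
    if remaining is not None:
        if remaining < 0:
            raise ValueError("remaining must not be negative")
        body["remaining"] = remaining
    return json.dumps(body)


print(progress_body(0.7847, 92))  # prints {"factor": 0.7847, "remaining": 92}
```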
When the complete control is present, the agent MUST (SHOULD?) invoke it when work has been completed, optionally passing along with it the results of the work, if applicable. Any media type goes; some agents want to provide results as a binary file (e.g. a JPEG image), others just some JSON data or plain text.
POST ...
Content-Type: anything
The format of the body sent to the complete control is governed by the type of work. Again, it is expected that the registry that describes the types of work will include information about the desired results (and their media type).
A successful response indicates that the work has been handed off, and the agent can continue looking for work, typically by backing up a few paces.
TBD: response codes; if you get 200 here isn't the agent really obliged to process the response?
Cancellation is something that happens in two phases. First of all, the agent needs to be reporting its progress; the response from the progress MAY indicate that the server no longer wants the work to continue.
200 OK
Content-Type: TBD
{ "state" : "cancelled" }
If this happens, and the cancel control is present, the agent SHOULD invoke it. The cancel control informs the origin server that the agent successfully cancelled the work. If the agent is unable to cancel the work, it should ignore the desire to cancel the work and complete the work as normal.
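The two-phase cancellation described above could be decided roughly as follows. The function name, its parameters and the returned strings are hypothetical. Continuing the work when no cancel control is present is an assumption; the draft does not cover that case.

```python
def next_step_after_progress(progress_response: dict,
                             controls: dict,
                             can_cancel: bool) -> str:
    """Decide what to do after a progress report.

    progress_response: parsed body returned by the progress control.
    controls: the hypermedia controls found in the work order.
    can_cancel: whether this agent is able to abort the work in flight.
    """
    if progress_response.get("state") != "cancelled":
        return "continue"
    if "cancel" in controls and can_cancel:
        # Stop the work, then tell the server the cancellation succeeded.
        return "invoke-cancel"
    # Unable to cancel (or, by assumption, no cancel control):
    # finish the work and complete it as normal.
    return "continue"


controls = {"cancel": "...", "complete": "..."}
print(next_step_after_progress({"state": "cancelled"}, controls, True))   # prints invoke-cancel
print(next_step_after_progress({"state": "cancelled"}, controls, False))  # prints continue
```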
When the fail control is present, the agent SHOULD invoke it if it has started progress on a work item, but is unable to complete it for whatever reason. This control informs the server that the work could not be completed, and that the agent is no longer working on it.
The work order types MUST be URLs that SHOULD provide documentation about their intended use, inputs and outputs and other non-functional requirements.
Plays a sound on a machine's speakers.
- uri: (required string) URI of the sound to play
- volume: (optional number between 0 and 100) volume at which to play the sound
none
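A worker for this kind of work might check its input before playing anything. The sketch below assumes only the two fields listed above; the `validate_sound_input` name and the error messages are hypothetical.

```python
from typing import List


def validate_sound_input(work_input: dict) -> List[str]:
    """Return a list of problems with the input; an empty list means it is usable."""
    errors = []
    uri = work_input.get("uri")
    if not isinstance(uri, str) or not uri:
        errors.append("uri is required and must be a non-empty string")
    if "volume" in work_input:
        volume = work_input["volume"]
        # bool is a subclass of int in Python, so reject it explicitly.
        is_number = isinstance(volume, (int, float)) and not isinstance(volume, bool)
        if not is_number or not 0 <= volume <= 100:
            errors.append("volume must be a number between 0 and 100")
    return errors


print(validate_sound_input({"uri": "http://example.org/beep.wav", "volume": 50}))  # prints []
print(len(validate_sound_input({"volume": 150})))  # prints 2
```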
It would be cool to have the attendees write small workers that can do various things like play sounds on their speakers, or take a photo with a webcam, or do face recognition on photographs, and have servers that pass out work to these agents as they see fit.
To do this, we would need a server that can add work to various queues; we would need to define some different types of work with their inputs and outputs; not very hard.
It would also probably be useful to have a few simple workers written in JavaScript in a browser, and have all the attendees open up workers on their phones, laptops, tablets, and see the result; this should probably be done at the beginning, to ensure that the attendees have something to tinker with so that they can grasp the architecture of it all.
Metacomment
At the moment there is no server that implements this; but it shouldn't be too hard to whip one up before REST Fest, at least one that provides the MUSTs.
An additional piece of this problem domain is that of deciding what work goes into what queue.
Traditional thinking might lead to an implementation where the server does all this, thus introducing the semantic coupling between the agents and the server. A better and more decentralized idea is to build upon the notion of workers, and introduce a similar media type which deals exclusively with the assignment of work. This allows a "decision maker" agent to literally be in control over the other "worker agents" by instructing the server about what work should be in what queues, based on what work completes where and so on.
This would allow the agents to again be decoupled from the server components (since the servers wouldn't know what on earth "send a tweet" is), and keep the coupling between agents.
to be continued