The Mechanics of Messaging in Axum

This article is in response to some great commentary from someone writing under the screen name “sylvan” over on the Channel 9 site:

This is pretty cool, but I think the semantics are overly complicated. I couldn't say that I know of a better way of doing it off hand, but I feel that there *must* be some way of making this simpler. As it stands writing agents still seems to be quite painful and clumsy, and something you would avoid doing up front, and instead do as an afterthought once you realise you need it. I think it's critical that writing agents should be as "light weight" as possible so that people write *all* their code using agents not because they necessarily believe they need them, but because they're the most convenient way of getting stuff done even when running on a single-threaded machine.

For example, there seems to be two main ways of interacting with an agent, either by just passing messages and reading from the channels, or by using request-reply ports if you want to be able to send off multiple requests and then get the response back while keeping track of which response belongs to which request. It seems to me that this duplication is unnecessary. If you want to send multiple requests couldn't you just be required to use multiple agents, one for each "transaction" (associating a result with a given request is then trivial)? If they need to share state you could use a domain, right? I've only briefly looked at it but it does seem that the request-reply ports just complicate things and aren't actually necessary.

Also, I think first-class tuples will be very important for this, as you tend to want to make quick ad-hoc groupings of data all the time when sending and receiving messages.

The semantics and syntax of this needs to be simplified a lot to make it easier to use, it still seems that you spend far too much time and screen real-estate dealing with the details of coordination, rather than your algorithm.

There are several really important things to talk about in here, and I’ll try to get to them all.

First, let me address the easy one: tuples. Yes, you’re right, and I wish we had just gone ahead and added them from the start. F# has them, and so should Axum. Same thing with a unit type – we have added ‘Signal’ as a poor man’s unit type (no literal support), but we haven’t made it interoperate with ‘void.’ This is absolutely something we would like to fix.

On to the deeper and more subjective issues! What sylvan touches on are some fundamental design choices we made for the language, so let me elaborate on them and then you can all chime in on whether they were good choices or not.

Agents vs. Request / Reply Ports

Sylvan is absolutely right that one could create a new agent instance for correlation purposes. There are two main reasons for not always doing so, and for relying instead on request/reply ports for correlated replies:

First, you may actually want to fit the use of the port(s) into the overall protocol of the channel. This is only relevant if you have added states to the channel. There is no way to express protocols across channels (doing so may seem desirable until you consider how complex they would be to reason about), so if the use of correlated requests and replies needs to be incorporated into a protocol, this is the way to do it.

Second, there’s the issue of performance. While I would like it if agents were as cheap to use as classes, that is not the case, and we are not trying to hide it in the language. One of our core design principles is not to pretend that there’s no cost to the higher-level concepts the language introduces. Models, such as RPC, that make messaging look like method calls don’t call out the places where overhead is significantly higher than the code suggests, and I think that is bad.

There are other actor-oriented languages (I’m naming no names here) where the common perception is that messaging is so cheap that you can use agents instead of classes, but that perception is far removed from reality.

Thus, we made creating agents look different from creating classes, and we made message-passing explicit and “in your face” with operators that stand out in your code. We make you opt in to asynchronous methods, because you really don’t want to use them unless your code will block (do a receive).

I don’t actually believe that you should create agents all the time – it should be a conscious choice to deviate from object-oriented concepts. We want to strike a balance between shared-memory and message-passing in this language, and we are not trying to replace the object-oriented paradigm – within an agent, OO rules!
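
To make the correlation idea a little more concrete, here is a rough sketch in Python rather than Axum. It is only an analogy: the `SquareAgent` class, its `request` method and the use of `Future` as the correlation handle are all made up for illustration and do not reflect how Axum implements request/reply ports. The point is simply that one long-lived agent can have several requests in flight while each reply stays tied to the request that produced it, without spinning up a new agent per "transaction."

```python
import queue
import threading
from concurrent.futures import Future


class SquareAgent:
    """Hypothetical stand-in for an agent exposing one request/reply port."""

    def __init__(self):
        self._requests = queue.Queue()
        # The agent runs on its own thread and only talks via messages.
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def request(self, value):
        """Send a request; the returned Future is the correlation handle."""
        reply = Future()
        self._requests.put((value, reply))
        return reply

    def _run(self):
        while True:
            value, reply = self._requests.get()
            # The reply travels back through the handle that came with the request.
            reply.set_result(value * value)


agent = SquareAgent()

# Several requests are outstanding against ONE agent at the same time.
pending = [agent.request(n) for n in (2, 3, 4)]

# Each handle yields the reply to its own request, in whatever order we ask.
results = [p.result(timeout=5) for p in pending]
```

Creating one agent per request would give the same correlation for free, but at the price of an agent (and its channel) per request, which is exactly the cost discussed above.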

Details of Coordination

Regarding sylvan’s last comment, I think I know what is meant. For example, why do I have to do all this just to respond to messages:

while (true)
{
    var x = receive ( PrimaryChannel::Port1 );

    doSomething ( x );
}

When all I really want is that all messages from Port1 go to ‘doSomething’?

We did this because one thing we wanted to make easier was writing very stateful agents, something that is typically quite challenging with the usual callback-based solutions: you wind up with a tangle of ad hoc state-machine goo. Our observation is that old-fashioned program counters and compiled structured control-flow are great for managing complex program state (I’m not talking about the kind of state you store in variables, but the state of the algorithm’s progress).

Thus, we have built rich support for control-flow based messaging, as described in the documentation that is now available via Dev Labs.
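
As an illustration of what "state in the program counter" means, consider the following sketch. Again, it is Python, not Axum, and the little protocol it implements (a count followed by that many values, with `None` as a shutdown signal) is invented for the example. Notice that there is no explicit state variable saying "waiting for count" or "waiting for value number two": where the agent is in the protocol is simply where it is in the code.

```python
import queue
import threading


def summing_agent(inbox, outbox):
    """Repeatedly: read a count, then that many values, then reply with the sum."""
    while True:
        count = inbox.get()            # step 1: how many values follow
        if count is None:              # hypothetical shutdown signal
            break
        total = 0
        for _ in range(count):         # step 2: receive exactly `count` values
            total += inbox.get()
        outbox.put(total)              # step 3: send the result


inbox, outbox = queue.Queue(), queue.Queue()
worker = threading.Thread(target=summing_agent, args=(inbox, outbox))
worker.start()

# Two rounds of the protocol, followed by the shutdown signal.
for message in (3, 10, 20, 30, 2, 1, 1, None):
    inbox.put(message)
worker.join(timeout=5)

totals = [outbox.get(timeout=5), outbox.get(timeout=5)]
```

Writing the same agent with one callback per incoming message would require an explicit "how many values are still expected" field and a hand-rolled state machine around it.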

Let me then come back to sylvan’s issue – spending too much screen real estate on the mechanics of messaging. An early prototype of the language had only the control-flow-based messaging, and this was, as pointed out, problematic. Even if we don’t have to build completely stateless agents because we’re not distributing all parts of the application, it will still be the case that many agents will be mostly stateless and that even stateful agents can handle some of their messages using patterns typical of statelessness.

This was in fact how we came to introduce the data-flow concepts into the language: the desire to just forward messages to a method led to a generalization where you can build pipelines of methods that messages are passed through and possibly out again. The simplest network, which corresponds to hooking a callback to a port, is the forward operator:

PrimaryChannel::Port1 ==> doSomething;

We think the generalization of forwarding messages into the network concept is valuable because it allows for another form of parallelism through pipelining: each stage in a pipeline can run in parallel, subject to the same reader / writer rules that other agent and domain code is subject to, but more fine-grained (much less costly than creating a new channel for a new agent).

Networks also allow us to forward not just to methods, but to buffers of various kinds, such as queues and single-assignment variables.
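
For readers who want a mental model of such a pipeline, here is a minimal sketch, once more in Python and once more only an analogy. The `stage` helper, the `_DONE` marker and the `parse` and `double` functions are hypothetical names chosen for the example, and the sketch says nothing about how Axum schedules networks or enforces its reader / writer rules. Each stage runs on its own thread, so one stage can work on a message while the next stage works on the previous one, and the final stage forwards into a queue rather than a method.

```python
import queue
import threading

_DONE = object()  # hypothetical end-of-stream marker


def stage(func, source, sink):
    """Run one pipeline stage on its own thread: source -> func -> sink."""
    def run():
        while True:
            item = source.get()
            if item is _DONE:
                sink.put(_DONE)    # pass the marker along and stop
                return
            sink.put(func(item))

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


def parse(text):
    return int(text)


def double(number):
    return number * 2


# port -> parse -> double -> doubled (a plain queue acting as the final buffer)
port, parsed, doubled = queue.Queue(), queue.Queue(), queue.Queue()
stage(parse, port, parsed)
stage(double, parsed, doubled)

for text in ("1", "2", "3"):
    port.put(text)
port.put(_DONE)

output = []
while True:
    item = doubled.get(timeout=5)
    if item is _DONE:
        break
    output.append(item)
```

Because every stage here is a single thread reading from a FIFO queue, messages leave the pipeline in the order they entered it.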

Setting all this up in the agent constructor is really easy and something we find ourselves doing all the time. However, it is still very programmatic. We have considered a more declarative approach, something similar to VB’s ‘Handles’ syntax, which would be useful for the most common network: forwarding from a port to a method. It would be interesting to hear your thoughts on this.

Channels

Why do we require you to define channels? This also wastes screen real estate and is undoubtedly cumbersome. Why not just deal directly with agents? The reason is that tightly coupled component models invariably lead to brittle programs that do not easily allow themselves to be distributed or partially re-implemented without breaking a lot of working code. By taking a hard line on loose coupling, we are hoping to establish that Axum is not about cutting corners: safe parallelism will require a level of formalism and rigor between components that hasn’t been common to date.

We believe that you either pay the price by doing more stuff upfront when designing your components and their interfaces, or later when you are trying to debug your already deployed application on a client-owned server.

That said, there could be much better ways of accomplishing this than what we have come up with, so don’t take the above as a dismissal of the concern. On the contrary, I share sylvan’s interest in making it much easier; I just don’t know how to do so (yet) without compromising what I consider some pretty critical aspects of the language model.

Thanks,

Niklas Gustafsson