What Is Queuing?

In the computer sense, which we're discussing here, queuing is a way to get bundles of data, normally referred to as messages, from one program to another. Most descriptions of queuing use a postal metaphor, probably because it's a very good one: queuing has many features and characteristics in common with mail, both paper and electronic.

Consider: the Post Office makes a logical connection between you and the recipient over which you send your message, a letter. You initiate the connection by writing the letter and adding an address, then giving it to the Post Office. The logical connection is actually made up of a lot of physical connections and processing points: trucks, possibly airplanes, sorting facilities, etc., but you don't have to care about that. The system insulates you from that complexity. Furthermore, a lot of things can go wrong along the way: trucks break down, roads become impassable, sorting facilities close, and the recipient may be busy, on vacation, or unable to reach their mailbox. Once again, you don't have to care. The design of the postal system insulates you from transient errors by ensuring that each stage retains your message until the next stage is ready to take it, including the final stage, the recipient. There is no requirement for all parts of the system to be functioning simultaneously; if they are, that makes the process faster, but it still works even if they aren't.

Queuing works in a similar way, but a bit more behind the scenes. Normally a program assembles the message on your behalf, based on some activity of yours, such as entering an order. The program gives the message to the queuing service and its responsibilities are over, just as yours are once you've given a letter to the Post Office. Like the Post Office, it's up to the queuing service to figure out the specifics of how to get the message to its destination, and the process may involve multiple intermediate stops (machines) and means of transportation (network connections). In both systems, intervening stages can be down or disconnected without causing the overall process to fail. The Post Office doesn't throw your letter away because a bridge is out, or the recipient is away on vacation, or you've gone on vacation, and queuing doesn't either — both systems keep your message and move it along as conditions permit. After however many transfers are required, the message is placed in a location where the recipient knows to look for it, and the recipient picks it up at their convenience.
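To make the hand-off concrete, here is a minimal sketch in Python. It is a toy, in-memory stand-in for a queuing service, not any real product's API: the class name QueueService, its send and receive methods, and the "inventory" destination are all invented for illustration.

```python
from collections import deque


class QueueService:
    """Toy in-memory stand-in for a queuing service (hypothetical API)."""

    def __init__(self):
        # One first-in, first-out queue per destination name.
        self._queues = {}

    def send(self, destination, message):
        # Accept the message and hold it. The sender's job ends here,
        # just as yours does once the letter is in the mailbox.
        self._queues.setdefault(destination, deque()).append(message)

    def receive(self, destination):
        # The recipient collects the oldest waiting message,
        # or None if nothing is waiting.
        q = self._queues.get(destination)
        return q.popleft() if q else None


service = QueueService()

# The sending program hands over two orders and moves on.
service.send("inventory", {"order_id": 1, "item": "widget"})
service.send("inventory", {"order_id": 2, "item": "gadget"})

# Later, at its own convenience, the recipient picks up its mail.
first = service.receive("inventory")
```

Note that the sender never talks to the recipient directly; both sides only ever deal with the service. A real queuing service would also move messages between machines and keep them somewhere safer than memory, which this sketch leaves out.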

The key characteristic that distinguishes queuing from other types of program-to-program messaging is the temporal disconnection: there is absolutely no requirement that the sender and the recipient, or even all parts of the queuing system, be functioning at the same time. This makes queuing a poor choice if the sender needs an immediate response from the recipient, but a good one if it does not. A queuing system decouples the sender from the recipient, allowing the sender to continue working normally even if the recipient goes down or is unreachable at the moment. That sort of resiliency is great for mobile systems: the sender doesn't have to care whether it currently has connectivity, because it can just give its messages to the queuing service and know that they will be sent whenever a connection next becomes available. It's also useful for stationary systems because it reduces the uptime requirements for the recipient and the intervening network links. For example, if an order-entry system must have a live connection to the inventory system, then any fault which breaks that connection means that orders cannot be entered, and business may be lost. If the order-entry system uses queuing instead, orders can still be entered: the order-entry system gives the orders to the queuing service and neither knows nor cares whether the inventory system is available at the moment, allowing it to ride out many kinds of failures.
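The order-entry scenario can be sketched as follows. This is only an illustration under simplifying assumptions: the Outbox class, its connected flag, and the delivered list (standing in for the inventory system) are hypothetical names, and everything lives in memory.

```python
from collections import deque


class Outbox:
    """Hypothetical local piece of a queuing service on the sender's side."""

    def __init__(self):
        self.pending = deque()   # messages accepted but not yet sent on
        self.delivered = []      # stand-in for the remote inventory system
        self.connected = False   # whether the link is currently usable

    def send(self, message):
        # Always accept the message, connection or not.
        self.pending.append(message)
        self.flush()

    def flush(self):
        # Move messages onward only while the link is up.
        while self.connected and self.pending:
            self.delivered.append(self.pending.popleft())


outbox = Outbox()

# The link is down, but order entry carries on regardless.
outbox.send("order 1")
outbox.send("order 2")
held = list(outbox.pending)

# The link comes back; the held orders go out in their original order.
outbox.connected = True
outbox.flush()
```

The order-entry code calls send the same way whether or not the link is up, which is the whole point: the decision about when to transmit belongs to the queuing service, not to the program that created the message.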

The downside of this disconnection is that it opens up the possibility of losing messages without the sender knowing. If the sender has a live connection to the recipient, then the sender finds out immediately whether the recipient got the message. If you're talking to someone on the telephone, you know whether they heard you or not! By contrast, if you send them a letter, then you have to go to some extra effort to find out if they received it. Of course, not all messages are equally important. If a postcard disappears, maybe that doesn't matter; similarly, if you have a program which reports the outside temperature every five minutes, maybe it doesn't matter if one reading gets lost. On the other hand, if your tax payment gets lost, or an order doesn't reach the inventory system, that's a real problem! The Post Office and queuing have adopted similar features to allow problems to be detected or prevent them from happening in the first place. While these features aren't technically part of the definition of queuing, out in the real world they provide the reliability which makes queuing systems usable and useful.

One such feature can broadly be called "acknowledgements". With the Post Office, if a letter cannot be delivered, the sender can arrange for it to be returned. The sender can also arrange for a receipt to be returned if the letter was delivered successfully. Similarly, queuing systems allow the sender to request that the system return messages which indicate what happened to the original message.
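As a rough sketch of how such acknowledgements might look, the toy service below returns a small report message to the sender's own queue when asked to. The class AckingQueueService, the want_reports flag, and the report wording are assumptions made up for this example; real queuing products have their own names and options for this.

```python
from collections import deque


class AckingQueueService:
    """Toy queuing service that can send report messages back (hypothetical)."""

    def __init__(self, known_destinations):
        self.queues = {name: deque() for name in known_destinations}

    def send(self, sender, destination, body, want_reports=False):
        if destination not in self.queues:
            # Like a letter returned to sender: tell them it could not be delivered.
            if want_reports:
                self.queues[sender].append(
                    {"report": "undeliverable", "original": body}
                )
            return
        self.queues[destination].append(
            {"from": sender, "body": body, "want_reports": want_reports}
        )

    def receive(self, destination):
        q = self.queues[destination]
        if not q:
            return None
        msg = q.popleft()
        # Like a return receipt: tell the sender the message was picked up.
        if msg.get("want_reports"):
            self.queues[msg["from"]].append(
                {"report": "delivered", "original": msg["body"]}
            )
        return msg


svc = AckingQueueService(["orders_app", "inventory"])
svc.send("orders_app", "inventory", "order 1", want_reports=True)
svc.send("orders_app", "warehouse", "order 2", want_reports=True)  # unknown destination

svc.receive("inventory")  # the inventory system picks up order 1

# The sender checks its own queue for reports at its convenience.
reports = [svc.receive("orders_app"), svc.receive("orders_app")]
```

The reports are themselves just messages travelling through the same system, so the sender reads them the same way any recipient reads its mail.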

Another such feature is having a class of service which offers greater reliability. Consider using a courier service instead of the Post Office: you handcuff your letter to a courier, and they don't release the handcuff until it has been handcuffed to the next courier, and so on. Your message could still be destroyed or stolen, but it would be very hard for it to get accidentally lost! Queuing can do the same sort of thing, but even more effectively, since electronic messages can be copied: machine B can keep a copy on its disk until it receives a response from machine C that C has received the message intact and written it to its disk. If C crashes while receiving the message, then B can send it again. If C crashes after receiving the message, the message is safe on the disk and the queuing service can find it and continue the process when it is restarted. Now, C could catch fire and the disk be destroyed before it has a chance to transfer the message onwards to D, but because an electronic message can be copied, machine A, the sender, can still have a copy on disk, which it keeps until it hears from machine E, the final destination, that the message has arrived there. If A doesn't hear from E within some timeframe, A can send another copy, doing so as many times as required until it finally gets a positive response from E.
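The keep-a-copy-until-confirmed idea can be sketched like this. It is a simplified illustration, not how any particular product does it: FlakyLink and send_reliably are invented names, a plain dictionary stands in for the sender's disk, and the link fails a fixed number of times so the example behaves predictably. Giving each message an ID, so that a repeated copy does no harm at the receiving end, is an added assumption.

```python
class FlakyLink:
    """Hypothetical link that loses the first `failures` transfers."""

    def __init__(self, failures):
        self.failures = failures
        self.received = {}  # what the far machine has safely stored

    def transfer(self, msg_id, body):
        if self.failures > 0:
            self.failures -= 1
            return False  # no confirmation came back
        # Stored by ID, so receiving the same message twice is harmless.
        self.received[msg_id] = body
        return True  # confirmation: received intact and stored


def send_reliably(link, store, msg_id, body, max_attempts=5):
    # Keep our own copy before attempting the transfer.
    store[msg_id] = body
    for attempt in range(1, max_attempts + 1):
        if link.transfer(msg_id, store[msg_id]):
            del store[msg_id]  # confirmed, so our copy can be erased
            return attempt
    return None  # still unconfirmed; the copy stays in the store


store = {}  # stand-in for the sender's disk
link = FlakyLink(failures=2)
attempts = send_reliably(link, store, "msg-1", "order 1")

# A link that never recovers within the attempt limit.
dead_store = {}
dead_link = FlakyLink(failures=10)
gave_up = send_reliably(dead_link, dead_store, "msg-2", "order 2")
```

In the second case the message has not been delivered, but it has not been lost either: the copy is still in the store, ready for another round of attempts later.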

As with the Post Office, additional reliability in a queuing system comes with a cost. The Post Office (or a courier service) charges you more money for additional features because those features add to the load on the system and require more work. In queuing, acknowledgements mean more messages flowing through the system, requiring additional bandwidth and processing power. A more reliable class of service, such as the one described above, requires disk space and time to both write the message to disk and then erase it later when it is no longer needed, decreasing message throughput. In both cases, it is a decision that the sender has to make based on how important the message is and the consequences of it not reaching the recipient. Maximum reliability requires combining both of these features: a more reliable class of service decreases the likelihood of the sender having to take any action, and acknowledgements allow the sender to identify the few messages for which it must take some action.

To put it all in a nutshell: queuing is a way of sending messages between programs which allows them to send and receive messages at their convenience and removes the dependency on a direct, constant connection between the two. It is most useful when the receiver only needs to be notified of some action or event at the sender and least useful if the sender requires some action on the part of the receiver before it can continue — in that case, the design of the programs forces them to be tightly coupled and trying to use a messaging system intended for decoupling just makes things more complicated.