Queues are like Singletons

I talked with a friend of mine about the article you just started reading. After I walked him through my arguments, he said queues are like Singletons. Here is why:

Queues go by a few different names, mostly depending on the context they are used in: log, PubSub, or message queue. They let clients publish messages to a topic. They also let clients subscribe to topics and read the published messages (hence the name PubSub).

Most queues share a property called “at least once” delivery. That means a subscriber can receive the same message one or more times from the queue. A more convenient property would be “exactly once” delivery. That is much harder to achieve when designing a queue, so it is usually either unavailable or not the default in these products. This matters because some software developers are not aware of the “at least once” property and write software as if the same message will never be processed twice. Let’s say an order management system places a customer order in the queue. A fulfilment agent picks up that order from the queue, ships an item and, oh no, ships the same item to the same customer again. That is a substantial loss. If a developer is aware of this property and handles duplicates, it is not as bad anymore. I’ve seen developers who were aware of the property but ignored it. Pretty bad.
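
One way to live with “at least once” is to make the subscriber idempotent. Here is a minimal Python sketch, assuming every order carries a unique order_id set by the publisher. The names OrderMessage, FulfilmentAgent and order_id are made up for illustration, and the in-memory set stands in for the durable store a real system would need.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OrderMessage:
    order_id: str  # hypothetical unique key, assumed to be set by the publisher
    item: str


@dataclass
class FulfilmentAgent:
    shipped: list = field(default_factory=list)
    _seen: set = field(default_factory=set)  # assumption: a durable store in real life

    def handle(self, msg: OrderMessage) -> bool:
        """Ship the order once; return False for a redelivered duplicate."""
        if msg.order_id in self._seen:
            return False  # already handled, do not ship again
        self._seen.add(msg.order_id)
        self.shipped.append(msg.item)
        return True


agent = FulfilmentAgent()
order = OrderMessage(order_id="A-1", item="book")
# The second call simulates the queue delivering the same message again.
results = [agent.handle(order), agent.handle(order)]
```

The second delivery is recognised by its order_id and skipped, so the customer gets one item. Note that this only works if remembering the id and shipping happen atomically, which the sketch glosses over.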

So what else is there to say about hidden properties? Well, queues can have a capacity. And by default this capacity is usually unlimited. What is so bad about an unlimited queue, you might wonder? All computers have finite resources. That means every queue will run full eventually if messages arrive faster than they are consumed. With a full queue, a publisher won’t be able to submit further messages, so the publisher is forced to notify its own clients that it is unavailable.

It is better to work with a small queue capacity from the beginning. You will quickly experience the effects of a full queue and learn to take action accordingly. The publisher should not cache to-be-published items, because caches are finite too, so eventually you will deal with the same problem, just in a different place. In the worst case you have two problems instead of one: an exhausted queue and an exhausted cache.
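
To get a feeling for a full queue, here is a small sketch that uses the bounded queue.Queue from Python’s standard library as a stand-in for a real message queue. The publish helper and the capacity of 2 are assumptions for illustration; a real broker will signal a full queue in its own way.

```python
import queue

# Deliberately tiny capacity so the "queue full" case shows up immediately.
orders = queue.Queue(maxsize=2)


def publish(q: queue.Queue, message: str) -> bool:
    """Try to enqueue without blocking; report failure instead of caching."""
    try:
        q.put_nowait(message)
        return True
    except queue.Full:
        # The caller has to tell its own client that it is unavailable.
        return False


# Nobody is consuming, so the third publish hits the capacity limit.
accepted = [publish(orders, f"order-{i}") for i in range(3)]
```

The third message is rejected right away, and the publisher has to decide what to tell its own client instead of silently piling things up somewhere else.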

The queue (hopefully you’ve set it to a low capacity) runs full for a reason. Usually the subscriber of a topic either can’t keep up with the number of messages being added, or it experiences an outage of its own or of one of the services it depends on. In that case the queue runs full, and the publisher will notice and stop doing its own business. Pretty straightforward.

There is another, more problematic case: the subscriber does not understand a message.

The subscriber has three options in that situation:

  1. Acknowledge the message but discard it
  2. Re-queue the message (good luck with low capacity queues)
  3. Put the message into a dead letter queue

With (1), the publisher will never know that something is wrong. I would say this is the most harmful approach. (2) is self-explanatory: re-inserting a defective message will not benefit anyone.

The dead letter queue (3) is an interesting take. It is rather unlikely that a publisher looks there, though. After all, messages can end up in the dead letter queue for all kinds of reasons, including a full queue (here we are again), an expired message TTL, or an exceeded read threshold, that is, the message was read too often. That is not very helpful for a publisher that wants to react in an automated way.
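
If you go with the dead letter queue anyway, the least a subscriber can do is record why a message ended up there. The following Python sketch assumes JSON messages with an order_id field; the deque, the consume function and the reason format are all made up for illustration and not taken from any particular queue product.

```python
import json
from collections import deque

dead_letters = deque()  # stand-in for a real dead letter queue
processed = []


def consume(raw: str) -> bool:
    """Return True if processed, False if parked with an explicit reason."""
    try:
        payload = json.loads(raw)
        order_id = payload["order_id"]
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        # Keep the original message and say why it was not understood.
        dead_letters.append({"raw": raw, "reason": f"unreadable: {type(exc).__name__}"})
        return False
    processed.append(order_id)
    return True


results = [consume('{"order_id": "A-1"}'), consume("not json")]
```

This separates “not understood” from “expired” or “queue full”, but someone still has to read the dead letter queue and get the news back to the publisher.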

In HTTP there is the status code 422 Unprocessable Content. Isn’t that a nice means of communication? A service telling the client “I cannot process what you sent”. Not with queues though. If a subscriber can’t accept a message, you will need to invent a backchannel to tell the publisher and come up with business logic to cover the case. The publisher’s own client is long gone by then, so the publisher has a problem notifying that client in turn. You end up with asynchronous behaviour where you could have had easier-to-manage synchronous behaviour without a queue.

Coming back to the initial statement that queues are like Singletons: where does that come from? As you have read, queues have some hidden properties that only cause trouble when you need the queue most, under load in production. These properties are not obvious, and that makes it so easy to use queues in big architectural diagrams. It looks really neat. All services speak only to that queue. Need another service-to-service communication? Throw it in the queue!

The real work is to write resilient software that can handle the queue’s properties. Every service will need that inherent knowledge of dealing with asynchronicity.

My friend and I had the same professor at university. His words stuck with us: “In object-oriented programming, you will be tempted to use Singletons in many situations. Resist that temptation!” This is what my friend meant. In system architectures, you are often confronted with problems that look like they can be solved with a queue. Resist that temptation! Think about alternative options first.


Queues have their right to exist, and there are good use cases for them. In my opinion they shine in high-throughput systems where many entities need to stay informed about an ongoing situation and where one missing (or duplicated) message won’t break anything. Think of sensor data, cache refreshes, or big-picture analysis of data streams such as network throughput or user interactions on a website.

By Raphael Sprenger, licensed under CC BY-NC 4.0