Like many others, we at Turret.IO had to decide whether to use RabbitMQ or Amazon SQS for our message queueing needs. Since we run the majority of our operations inside AWS, we felt it necessary to at least give SQS a fair shot, even though quite a lot of developers seem to dislike it.
We have two distinct use cases (communicating with our internal SMTP servers and communicating with external clients), so it was important to keep in mind that we might not be making a fair comparison. RabbitMQ’s community support, along with its plugins (including authentication), makes for a very robust product. SQS, on the other hand, is simple. It’s not designed to compete with the extensive configurability of RabbitMQ, but it is a distributed and highly available service. Establishing the same level of availability on our own with RabbitMQ would be neither simple nor inexpensive.
In line with RabbitMQ’s extensive configuration options, there are several methods for creating a highly available RabbitMQ service. A clustering option allows multiple RabbitMQ servers to operate as a single logical server, while federation and shoveling provide ways to accept and forward messages to other servers. Clusters can also be federated to create an even more resilient platform that more closely emulates SQS’s high availability, but this comes at the cost of more servers and moving parts to configure and maintain.
SQS has a fairly straightforward method for publishing messages: create a queue and then publish messages to it. It’s that simple.
RabbitMQ requires a more complicated combination of exchanges, queues, and routing keys. While it is possible to bypass them, you won’t fully appreciate the power of RabbitMQ without a basic understanding of how they work.
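To make the difference concrete, here is a minimal sketch of publishing one message each way. It assumes `sqs` is a boto3 SQS client and `channel` is a pika channel; the helper names and the choice of a direct exchange are ours for illustration, not something either service requires.

```python
def publish_sqs(sqs, queue_name, body):
    """SQS: create (or look up) a queue, then send a message to it."""
    # create_queue returns the URL of an existing queue when the name
    # and attributes match, so calling it repeatedly is safe.
    queue_url = sqs.create_queue(QueueName=queue_name)["QueueUrl"]
    sqs.send_message(QueueUrl=queue_url, MessageBody=body)
    return queue_url


def publish_rabbitmq(channel, exchange, queue_name, routing_key, body):
    """RabbitMQ: declare an exchange and a queue, bind them, then publish."""
    # A direct exchange delivers to queues whose binding key equals
    # the routing key of the published message.
    channel.exchange_declare(exchange=exchange, exchange_type="direct")
    channel.queue_declare(queue=queue_name)
    channel.queue_bind(queue=queue_name, exchange=exchange, routing_key=routing_key)
    # The publisher addresses the exchange, not the queue.
    channel.basic_publish(exchange=exchange, routing_key=routing_key, body=body)
```

The RabbitMQ version has more steps, but the indirection is what lets one publish fan out to several queues or be routed by key without the publisher knowing who consumes it.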
Perhaps one of the most contested characteristics of SQS is the manner in which messages are consumed. Unlike RabbitMQ (and many other message queues), which lets a consumer block until a message arrives, SQS must be polled for messages with an optional wait time. If no wait time is specified, the request returns immediately, often with no messages at all. A maximum 20-second wait time allows the client to poll and wait up to 20 seconds for a message before the request returns. Unfortunately, this is not a very idiomatic way to consume messages from a queue, which makes it an awkward fit for messaging frameworks like Celery.
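A polling consumer might look roughly like the sketch below. It assumes `sqs` is a boto3 SQS client; `receive_one`, `consume_forever`, and `handle` are hypothetical names of our own.

```python
def receive_one(sqs, queue_url, wait_seconds=20):
    """Long-poll the queue once; return a message dict, or None if the wait expired."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=wait_seconds,  # 0 returns immediately; 20 is the maximum
    )
    messages = response.get("Messages", [])
    return messages[0] if messages else None


def consume_forever(sqs, queue_url, handle):
    """Emulate a blocking consumer by long-polling in a loop."""
    while True:
        message = receive_one(sqs, queue_url)
        if message is None:
            continue  # nothing arrived within the wait time; poll again
        handle(message["Body"])
        # Deleting the message is what marks it as done.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
```

Every pass through the loop is a billable request, whether or not a message came back.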
RabbitMQ, on the other hand, supports blocking connections, enabling a client to simply sit and wait for a message to become available without the need to poll. In many cases, this is a more standard and familiar approach to consuming from message queues, and it’s compatible with messaging frameworks like Celery.
Message acknowledgement is an important characteristic of some message queues, particularly work queues. To ensure a single task is only being processed by one worker at a time, many message queues use acknowledgements (ACKs) to signal that the message has been fully consumed. If the message queue doesn’t receive the ACK after a certain amount of time, the message is considered lost and re-queued for another worker.
Instead of acknowledgements, SQS uses a visibility timeout. Whenever a client receives a message from the queue, a clock is started and the message is hidden from other consumers. Once a set time has passed, the message is automatically re-queued unless it’s been deleted by the client. While this does work in theory, it requires an extra step for any messages that may take considerable time to consume: resetting the visibility timeout. If a client knows the work required for a message will take longer than normal, it should reset the visibility timeout to prevent the message from being re-queued while it’s still being worked on. This additional requirement makes SQS more prone to work duplication.
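Resetting the visibility timeout could be sketched as follows, assuming `sqs` is a boto3 SQS client and `message` is one entry from a `receive_message` response. The function name and the 300-second default are placeholders, not recommendations.

```python
def process_slow_message(sqs, queue_url, message, handle, extra_seconds=300):
    """Extend the visibility timeout before slow work, then delete the message."""
    receipt = message["ReceiptHandle"]
    # The new timeout is counted from the moment of this call, not from
    # when the message was first received.
    sqs.change_message_visibility(
        QueueUrl=queue_url,
        ReceiptHandle=receipt,
        VisibilityTimeout=extra_seconds,
    )
    handle(message["Body"])
    # If handle() raises, the delete never happens and the message
    # reappears on the queue once the timeout runs out.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=receipt)
```

If the work can overrun even the extended timeout, the worker would need to repeat the `change_message_visibility` call periodically while it runs.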
RabbitMQ has two modes of message acknowledgement: ack and noack. When using noack, messages are automatically acknowledged as soon as they’re delivered to the consumer. If the consumer fails to actually process the message, it will not be re-queued. If ack is used, the client must acknowledge that the message was consumed; otherwise it will be re-queued automatically once the worker is disconnected. This means that a worker that does not acknowledge a message but remains connected for an extended period of time will prevent the message from being re-queued.
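In ack mode, a worker might be wired up roughly like this. The sketch assumes the pika 1.x client, where `auto_ack=False` selects manual acknowledgement; `make_callback`, `start_worker`, and `handle` are names we made up for the example.

```python
def make_callback(handle):
    """Wrap a handler so the message is acknowledged only after it succeeds."""
    def on_message(channel, method, properties, body):
        handle(body)
        # Reached only if handle() did not raise. Without this ack the
        # broker keeps the message and re-queues it once the worker's
        # channel or connection closes.
        channel.basic_ack(delivery_tag=method.delivery_tag)
    return on_message


def start_worker(channel, queue_name, handle):
    """Register the consumer and block until messages arrive."""
    channel.basic_consume(
        queue=queue_name,
        on_message_callback=make_callback(handle),
        auto_ack=False,  # True would give the noack behaviour
    )
    channel.start_consuming()
```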
In both cases, it’s important that the worker functions are idempotent. Both SQS and RabbitMQ (when acknowledgements are enabled) guarantee at-least-once delivery, meaning it’s possible that the same message will be consumed multiple times. The developer is responsible for ensuring that processing the same message multiple times has no ill effects.
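One simple way to get idempotence is to remember which message IDs have already been handled and skip repeats. The sketch below is illustrative only: `IdempotentHandler` is our own name, and the in-memory set is an assumption that would have to become a shared store (a database table, for example) once several workers are involved.

```python
class IdempotentHandler:
    """Run a handler at most once per message ID."""

    def __init__(self, handle):
        self._handle = handle
        self._done = set()  # assumption: a single process; not shared between workers

    def __call__(self, message_id, body):
        if message_id in self._done:
            return False  # duplicate delivery; do nothing
        self._handle(body)
        # Record the ID only after the work succeeded, so a failed
        # attempt can still be retried on redelivery.
        self._done.add(message_id)
        return True
```

For example:

```python
sent = []
handler = IdempotentHandler(sent.append)
handler("msg-1", "welcome email")
handler("msg-1", "welcome email")  # redelivered; skipped
assert sent == ["welcome email"]
```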
On the surface it appears as though SQS is lacking in the area of authentication when compared to the plugin architecture RabbitMQ provides. However, being under the AWS umbrella gives SQS access to IAM, offering read and write privileges on individual queues along with the full account management features provided by IAM.
RabbitMQ’s authentication plugins provide hooks to authentication backends like LDAP and other SASL-compatible backends, but unless you’re already running one, it’s another moving part to maintain. For the most basic authentication, plaintext options are available (which are read from a configuration file) and work without installing any further plugins. SSL would be required if authentication details are sent over the wire.
Depending on your workload, SQS may or may not be more cost efficient. Light workloads may fit under the 1M free requests/mo Amazon provides and even if you exceed that, at $0.50/1M requests, lighter loads should be much cheaper than the per-instance-hour fees for running a RabbitMQ cluster. But remember, a single process running 24×7 could produce 4,320 requests each day if polling every 20 seconds.
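The polling arithmetic is easy to check. The sketch below uses only the figures quoted above (1M free requests per month, $0.50 per million after that); the function names are ours, and current AWS pricing may differ.

```python
SECONDS_PER_DAY = 24 * 60 * 60
FREE_REQUESTS_PER_MONTH = 1_000_000
PRICE_PER_MILLION = 0.50  # USD, as quoted in this post


def idle_polls_per_day(wait_seconds=20, consumers=1):
    """Requests made per day by consumers that long-poll back to back."""
    return consumers * SECONDS_PER_DAY // wait_seconds


def monthly_cost(requests):
    """Cost in USD for a month's requests, after the free tier."""
    billable = max(0, requests - FREE_REQUESTS_PER_MONTH)
    return billable / 1_000_000 * PRICE_PER_MILLION


print(idle_polls_per_day())  # 4320
```

A single idle consumer stays well inside the free tier over a 30-day month (129,600 requests), but eight of them polling around the clock would use it up before a single real message is sent.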
When your publishing and consuming traffic exceeds hundreds of millions of requests per month, the instance-hour charges for running a RabbitMQ cluster could be less expensive than paying the SQS per-request price. But again, the benefits of SQS’s high availability and zero upkeep should be considered.
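A rough break-even point can be estimated the same way. In this sketch the $150 monthly cluster cost is purely hypothetical, and the SQS price and free tier are the figures quoted in this post; plug in your own numbers.

```python
def break_even_requests(cluster_cost_per_month, price_per_million=0.50,
                        free_requests=1_000_000):
    """Monthly SQS requests at which the bill equals the cluster's cost."""
    paid_requests = cluster_cost_per_month / price_per_million * 1_000_000
    return free_requests + int(paid_requests)


# Hypothetical cluster costing $150/month in instance-hours.
print(break_even_requests(150))  # 301000000
```

Keep in mind that a single message usually costs at least three requests (send, receive, delete), so the break-even point measured in messages is lower than the raw request count suggests.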
For our current needs, we ended up choosing RabbitMQ because of its extensible plugin system and standard approach to message acknowledgement and consumption.