
Like many others, we at Turret.IO had to decide whether to use RabbitMQ or Amazon SQS for our message queueing needs. Since we run the majority of our operations inside AWS, we felt it necessary to at least give SQS a fair shot, even though quite a lot of developers seem to dislike it.

Because we have two distinct use cases (communicating with our internal SMTP servers and communicating with external clients), it was important to keep in mind that we might not be creating a fair comparison. RabbitMQ’s community support, along with its plugins (including authentication), makes for a very robust product. SQS, on the other hand, is simple. It’s not designed to compete with the extensive configurability of RabbitMQ, but it is a distributed and highly available service. Establishing the same level of availability on our own with RabbitMQ would be neither simple nor inexpensive.

Availability

Like most AWS services, SQS is highly available and distributed, which makes it reliable. Messages are replicated inside AWS, so message loss due to node failure is virtually non-existent.

In line with RabbitMQ’s extensive configuration options, there are several methods for creating a highly available RabbitMQ service. Clustering allows multiple RabbitMQ servers to operate as a single logical server, while the federation and shovel plugins provide ways to accept and forward messages to other servers. Clusters can themselves be federated to create an even more resilient platform that more closely emulates SQS’s high availability, but that comes at the cost of more servers and moving parts to configure and maintain.

Publishing

SQS has a fairly straightforward method for publishing messages: create a queue and then publish messages to it. It’s that simple.

RabbitMQ requires a more complicated combination of exchanges, queues, and routing keys. While it is possible to bypass them, you won’t fully appreciate the power of RabbitMQ without a basic understanding of how they work.

Consuming

Perhaps one of the most contested characteristics of SQS is the manner in which messages are consumed. Unlike RabbitMQ (and many other message queues), which supports blocking consumers, SQS must be polled for messages with an optional wait time. If no wait time is specified, the poll returns immediately and may come back empty even when messages are waiting. A maximum wait time of 20 seconds allows the client to poll and wait up to 20 seconds for a message before disconnecting. Unfortunately, this is not a very idiomatic way to consume messages from a queue, which makes it a less natural fit for messaging frameworks like Celery.
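To make the polling model concrete, here is a minimal sketch of a long-polling consumer loop. It assumes a client object shaped like boto3's SQS client (`receive_message` and `delete_message` with keyword arguments); the function name `drain_queue`, the `max_polls` cutoff, and the handler callback are illustrative choices, not something from our own code.

```python
from typing import Any, Callable


def drain_queue(client: Any, queue_url: str, handle: Callable[[str], None],
                wait_seconds: int = 20, max_polls: int = 3) -> int:
    """Long-poll an SQS queue and return the number of messages handled.

    `client` is assumed to behave like boto3.client("sqs"). It is passed in
    so the loop can be exercised without AWS credentials.
    """
    handled = 0
    for _ in range(max_polls):
        # WaitTimeSeconds > 0 turns on long polling; SQS caps it at 20 seconds.
        response = client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=wait_seconds,
        )
        # The "Messages" key is absent when the wait expires with nothing to read.
        for message in response.get("Messages", []):
            handle(message["Body"])
            # A message only leaves the queue once the consumer deletes it.
            client.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            handled += 1
    return handled
```

A real worker would loop until told to stop rather than for a fixed number of polls; the cutoff here just keeps the sketch easy to run against a stub client.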

RabbitMQ, on the other hand, supports blocking connections, enabling a client to simply sit and wait for a message to be available without the need to poll. In many cases, this is a more standard and familiar approach to consuming from message queues, and it’s compatible with messaging frameworks like Celery.

Additionally, RabbitMQ can selectively consume messages based on topics, providing the opportunity to create robust message processing schemes.
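As a rough illustration of topic-based consumption, the sketch below re-implements AMQP-style topic matching in plain Python: routing keys are dot-separated words, `*` stands for exactly one word and `#` for zero or more. This is a toy model of the matching rule, not RabbitMQ's own code, and the queue names and patterns in the usage example are made up.

```python
from typing import Dict, List


def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if an AMQP-style topic pattern matches a routing key."""
    p = pattern.split(".")
    k = routing_key.split(".") if routing_key else []

    def match(i: int, j: int) -> bool:
        if i == len(p):
            return j == len(k)
        if p[i] == "#":
            # "#" may absorb zero or more words.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j == len(k):
            return False
        # "*" matches any single word; anything else must match literally.
        return (p[i] == "*" or p[i] == k[j]) and match(i + 1, j + 1)

    return match(0, 0)


def route(bindings: Dict[str, List[str]], routing_key: str) -> List[str]:
    """Return the (sorted) queues whose binding patterns match the key."""
    return sorted(
        queue for queue, patterns in bindings.items()
        if any(topic_matches(pattern, routing_key) for pattern in patterns)
    )


# Hypothetical bindings: one queue for bounces only, one for all mail events.
bindings = {"bounces": ["mail.*.bounce"], "audit": ["mail.#"]}
print(route(bindings, "mail.smtp1.bounce"))
print(route(bindings, "mail.smtp1.delivered"))
```

The first call routes to both queues and the second only to `audit`, which is the kind of selective consumption a single SQS queue does not offer on its own.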

Message Acknowledgement 

Message acknowledgement is an important characteristic of some message queues, particularly work queues. To ensure a single task is only processed by one worker at a time, many message queues use acknowledgements (ACKs) to signal that the message has been fully consumed. If the message queue doesn’t receive the ACK after a certain amount of time, the message is considered lost and re-queued for another worker.

Instead of acknowledgements, SQS uses a visibility timeout. Whenever a client receives a message from the queue, a clock is started and the message is hidden from other consumers. Once the set time has passed, the message becomes visible again (effectively re-queued) unless the client has deleted it. While this does work in theory, it requires an extra step for any message that may take considerable time to process: extending the visibility timeout. If a client knows the work required for a message will take longer than normal, it should extend the visibility timeout to prevent the message from being re-queued while it’s still being worked on. This additional requirement makes SQS more prone to work duplication.
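The visibility timeout is easier to reason about with a small model. The class below is a toy in-memory queue, not the SQS API: the class and method names are invented for illustration, and time is passed in explicitly so the behaviour is deterministic. In SQS itself, the `extend` step corresponds to the `ChangeMessageVisibility` call.

```python
from typing import Dict, Optional, Tuple


class VisibilityQueue:
    """Toy in-memory queue with an SQS-style visibility timeout."""

    def __init__(self, visibility_timeout: float) -> None:
        self.visibility_timeout = visibility_timeout
        self._messages: Dict[int, str] = {}           # message id -> body
        self._invisible_until: Dict[int, float] = {}  # message id -> time
        self._next_id = 0

    def send(self, body: str) -> int:
        self._next_id += 1
        self._messages[self._next_id] = body
        self._invisible_until[self._next_id] = 0.0
        return self._next_id

    def receive(self, now: float) -> Optional[Tuple[int, str]]:
        """Hand out the oldest visible message and start its clock."""
        for msg_id, body in self._messages.items():
            if self._invisible_until[msg_id] <= now:
                self._invisible_until[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None

    def extend(self, msg_id: int, now: float, seconds: float) -> None:
        """Reset the clock for a message that is still being worked on."""
        self._invisible_until[msg_id] = now + seconds

    def delete(self, msg_id: int) -> None:
        """Remove a message once the work is finished."""
        self._messages.pop(msg_id, None)
        self._invisible_until.pop(msg_id, None)


queue = VisibilityQueue(visibility_timeout=30)
job = queue.send("resize-image")
first = queue.receive(now=0)    # worker A picks the job up
second = queue.receive(now=30)  # A is still busy, so worker B gets it too
```

With a 30 second timeout and a job that runs longer, `first` and `second` are the same message: exactly the duplication described above. Calling `extend` before the clock runs out avoids it.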

RabbitMQ has two modes of message acknowledgement: ack and noack. When using noack, messages are automatically acknowledged as soon as they’re delivered. If the consumer then fails to process the message, it will not be re-queued. If ack is used, the client must acknowledge that the message was consumed; otherwise it will be re-queued automatically once the worker is disconnected. This means that a worker that does not acknowledge a message but remains connected for an extended period of time will prevent the message from being re-queued.
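The difference between the two modes can be sketched with a toy broker. This is a simplified model of the behaviour described above, not RabbitMQ's implementation or client API; `AckBroker` and its method names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple


class AckBroker:
    """Toy model of one queue with manual and automatic acknowledgement."""

    def __init__(self) -> None:
        self.ready: Deque[str] = deque()
        # delivery tag -> (consumer name, message body) awaiting an ack
        self.unacked: Dict[int, Tuple[str, str]] = {}
        self._tag = 0

    def publish(self, body: str) -> None:
        self.ready.append(body)

    def deliver(self, consumer: str, auto_ack: bool) -> Optional[Tuple[int, str]]:
        if not self.ready:
            return None
        body = self.ready.popleft()
        self._tag += 1
        if not auto_ack:
            # Manual mode: the broker keeps the message until it is acked.
            self.unacked[self._tag] = (consumer, body)
        return self._tag, body

    def ack(self, tag: int) -> None:
        self.unacked.pop(tag, None)

    def disconnect(self, consumer: str) -> None:
        """Re-queue everything the departing consumer never acknowledged."""
        for tag, (owner, body) in reversed(list(self.unacked.items())):
            if owner == consumer:
                del self.unacked[tag]
                self.ready.appendleft(body)
```

In this model a message delivered with `auto_ack=True` is gone as soon as it is handed out, while a manually acknowledged message stays with its worker until that worker either calls `ack` or disconnects.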

In both cases, it’s important that the worker functions are idempotent. Both SQS and RabbitMQ (when acknowledgements are enabled) guarantee at-least-once delivery, meaning it’s possible that the same message will be consumed multiple times. The developer is responsible for ensuring that processing the same message multiple times has no ill effects.
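One common way to get idempotent workers is to de-duplicate on a message ID. The sketch below keeps the seen IDs in memory purely for illustration; it assumes every message carries a stable ID, and a real deployment would need a shared, durable store so that all workers see the same set. `IdempotentWorker` is a made-up name.

```python
from typing import Callable, Set


class IdempotentWorker:
    """Runs an action at most once per message id (in-memory sketch)."""

    def __init__(self, action: Callable[[str], None]) -> None:
        self._action = action
        self._seen: Set[str] = set()

    def process(self, message_id: str, body: str) -> bool:
        """Run the action once per message id; return False for duplicates."""
        if message_id in self._seen:
            return False
        self._action(body)
        # Record the id only after the action succeeds, so a crash mid-action
        # leaves the message eligible for a retry.
        self._seen.add(message_id)
        return True


sent = []
worker = IdempotentWorker(sent.append)
worker.process("msg-1", "welcome email")
worker.process("msg-1", "welcome email")  # redelivery of the same message
print(sent)  # ['welcome email']
```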

Authentication

On the surface it appears as though SQS is lacking in the area of authentication when compared to the plugin architecture RabbitMQ provides. However, being under the AWS umbrella gives SQS access to IAM, offering read and write privileges on individual queues along with the full account management features provided by IAM.
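For a sense of what per-queue privileges look like, here is a sketch that builds two IAM policy documents, one for a producer and one for a consumer. The queue ARN, account number, and queue name are placeholders, and `queue_policy` is just a helper for this example; the action names are standard SQS actions.

```python
import json
from typing import Any, Dict, List


def queue_policy(queue_arn: str, actions: List[str]) -> Dict[str, Any]:
    """Build an IAM policy document allowing `actions` on a single queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": actions, "Resource": queue_arn}
        ],
    }


# Placeholder ARN: region, account id, and queue name are hypothetical.
QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:outbound-mail"

# A producer may only write to the queue.
producer = queue_policy(QUEUE_ARN, ["sqs:SendMessage"])

# A consumer may read, delete, and extend the visibility timeout.
consumer = queue_policy(
    QUEUE_ARN,
    ["sqs:ReceiveMessage", "sqs:DeleteMessage", "sqs:ChangeMessageVisibility"],
)

print(json.dumps(producer, indent=2))
```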

RabbitMQ’s authentication plugins provide hooks to authentication backends like LDAP and other SASL-compatible backends, but unless you’re already running one, it’s another moving part to maintain. For the most basic authentication, plaintext options are available (which are read from a configuration file) and work without installing any further plugins. SSL would be required if authentication details are sent over the wire.

Cost

Depending on your workload, SQS may or may not be more cost efficient. Light workloads may fit under the 1M free requests per month Amazon provides, and even if you exceed that, at $0.50 per 1M requests, lighter loads should be much cheaper than the per-instance-hour fees for running a RabbitMQ cluster. But remember, a single process running 24×7 produces 4,320 requests each day if it polls every 20 seconds.
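The polling arithmetic is easy to check. The sketch below uses the free tier and per-request price quoted in this post (not necessarily current AWS pricing) and assumes a 30-day month; the function names are illustrative.

```python
FREE_REQUESTS_PER_MONTH = 1_000_000  # free tier as quoted in this post
PRICE_PER_MILLION = 0.50             # USD per 1M requests, as quoted in this post


def polls_per_day(poll_interval_seconds: int, consumers: int = 1) -> int:
    """Requests generated per day by consumers that poll back to back."""
    return consumers * (24 * 60 * 60 // poll_interval_seconds)


def monthly_cost(requests: int) -> float:
    """SQS request charges for one month, after the free tier."""
    billable = max(0, requests - FREE_REQUESTS_PER_MONTH)
    return billable / 1_000_000 * PRICE_PER_MILLION


print(polls_per_day(20))  # 4320

# One idle consumer stays inside the free tier over 30 days...
one_consumer = polls_per_day(20) * 30
# ...while ten idle consumers already exceed it.
ten_consumers = polls_per_day(20, consumers=10) * 30
```

Here `one_consumer` is 129,600 requests and `ten_consumers` is 1,296,000, so idle polling alone can push a fleet of workers past the free tier before any real traffic is counted.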

When your monthly publishing and consuming traffic reaches hundreds of millions of requests, the instance-hour charges for running a RabbitMQ cluster could be less expensive than paying the SQS per-request price. But again, the benefits of SQS’s high availability and lack of upkeep should be considered.
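A rough break-even point can be estimated by asking how many requests the cluster's monthly bill would buy on SQS. The $150 per month cluster cost below is a made-up figure for illustration, the default price and free tier are the ones quoted in this post, and the calculation ignores the operational time a self-hosted cluster needs.

```python
def break_even_requests(cluster_cost_per_month: float,
                        price_per_million: float = 0.50,
                        free_requests: int = 1_000_000) -> int:
    """Monthly request count at which SQS charges equal the cluster cost."""
    paid_requests = round(cluster_cost_per_month / price_per_million * 1_000_000)
    return free_requests + paid_requests


# Hypothetical cluster bill of $150 per month.
print(break_even_requests(150.0))  # 301000000
```

Under these assumptions the crossover sits at roughly 300 million requests per month, which is consistent with the "hundreds of millions" range mentioned above.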

Conclusion

For our current needs, we ended up choosing RabbitMQ because of its extensible plugin system and standard approach to message acknowledgement and consumption.

 


13 Responses to “RabbitMQ vs Amazon SQS: A Short Comparison”

  1. Matt Crampton

    I think you’re leaving out something important in your evaluation of these. With RabbitMQ you’re going to have to host it yourself, while SQS is a managed service on AWS. For RabbitMQ you’re going to be stuck spinning up a Linux machine somewhere, and you’re on the hook for making sure it’s backed up, the OS has security patches added when exploits are announced, etc. This is a big concern for a small company without a dedicated ops team.

    • tim

      Thanks for the feedback Matt!

      While I mentioned that at the end of the Availability section and a bit in the second paragraph, it probably deserves its own section.

    • Victor R. Volkman

      I disagree. If you already have 1+ Linux machines, you already have the issue of managing security patches ad nauseam. RabbitMQ is lightweight enough that it can ride along on any existing Linux machine you might have. If you already have at least one c3.8xlarge or similar, you won’t notice the additional load.

      If you somehow manage to develop in a fantasy world where no Linux hosted machines are ever needed, then yes, SQS has no hosting burden. Bully for you.

  2. Oded Arbel

    I would have really liked to see some comments about performance. SQS uses HTTP(S) as a transport, so it has some overhead, and I also understand that it has high latency for publishing. What kind of performance can I expect from RabbitMQ, by comparison?

  3. Victor R. Volkman

    I disagree with Matt Crampton here. We have all kinds of minor outages of SQS every day. Messages fail to get enqueued and boto throws an Exception and you’re done.

    We bought the Amazon KoolAid which said that no server we could run ourselves would ever be as cheap or as reliable as SQS. I doubt RabbitMQ needs very much of an 8 or 16 CPU box you’re already renting from Amazon, so the expense argument holds no water.

    • Elvis Ligu

      Probably in your region you have issues. Using SQS for nearly 8 months now in EU West, we haven’t had a single outage. It is reliable, always on, and it takes no more than 5 minutes to set up a queue (or FIFO if you want to). It is one of the simplest services to use in AWS, and from a developer’s point of view sending and receiving messages from it is as simple as a Hello World program. Of course, it is not similar to Kinesis, hence the latency might be between 10ms (we benchmarked it, but this latency is mostly due to HTTP Send – Receive rather than SQS itself).

      Regarding scaling, well, we benchmarked it, and it can scale really aggressively, sending and receiving thousands of messages per second (FIFO is 300 tps though). It is just a matter of how many workers you have to consume and produce messages.

      Regarding cost, well, using SQS you have practically 0 maintenance costs (provisioning & monitoring & alerts all provided out of the box). Having RabbitMQ, on the other hand, and spending on average (after the initial setup) 1 developer day per month on managing the infrastructure, you already have a cost of at least $200, which translates to 400M ops! I am not counting any VM costs you would need to provision RabbitMQ anyway.

  4. Mark Sorin

    RE: Managed Service: CloudAMQP offers RabbitMQ hosting on AWS and other platforms. It is feature rich and well implemented, with VPC peering, CloudWatch exports, etc.

  5. abiya

    Using your code, I can easily manage SQS, thanks!
    But there is a question: at the end of May everything was fine, but these days, when I use ‘receiveMessage’ there is a warning:

    “SQS::receiveMessage(): Error SignatureDoesNotMatch caused by Sender.
    Message: The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.”

    But it’s functional when I use ‘createQueue’

    Many Thanks.

