Sure, you can just use RabbitMQ

December 13, 2017

Note: This post was adapted from an answer I originally posted to a Stack Overflow question.

People ask (frequently) why they need NServiceBus. “I’ve got RabbitMQ and that has built-in Pub/Sub,” they might say. “Isn’t NServiceBus just a wrapper around RabbitMQ? I could probably write that in less than a weekend. After all, how hard could it be?”

Well sure, you can definitely just use pure RabbitMQ. I’ll even help you get started writing that wrapper. You just have to keep a couple things in mind.

First you should read Enterprise Integration Patterns cover to cover and make sure you understand it well. It is 736 pages, and a bit dry, but extremely useful information. It also wouldn’t hurt to become an expert in all the peculiarities of RabbitMQ.

Then you just have to decide how you’ll define messages, how to define message handlers, how to send messages and publish events. Before you get too far you’ll want a good logging infrastructure. You’ll need to create a message serializer and infrastructure for message routing. You’ll need to include a bunch of infrastructure-related metadata with the content of each business message. You’ll want to build a message dequeuing strategy that performs well and uses broker connections efficiently, keeping concurrency needs in mind.
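To give a rough idea of what that first step involves, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the Envelope fields, the handles decorator and the dispatch function are hypothetical names, JSON is just one possible serializer, and no broker is involved at all.

```python
import json
import uuid
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class Envelope:
    """A business message plus infrastructure metadata (field names are hypothetical)."""
    message_type: str
    body: dict
    message_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    conversation_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    sent_at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())


def serialize(envelope: Envelope) -> bytes:
    # JSON is an arbitrary choice here; any wire format would do.
    return json.dumps(asdict(envelope)).encode("utf-8")


def deserialize(payload: bytes) -> Envelope:
    return Envelope(**json.loads(payload.decode("utf-8")))


# Maps a message type name to the handler functions registered for it.
HANDLERS = {}


def handles(message_type):
    """Decorator that registers a function as a handler for one message type."""
    def register(func):
        HANDLERS.setdefault(message_type, []).append(func)
        return func
    return register


def dispatch(payload: bytes) -> int:
    """Deserialize a payload and invoke every registered handler; returns the handler count."""
    envelope = deserialize(payload)
    handlers = HANDLERS.get(envelope.message_type, [])
    for handler in handlers:
        handler(envelope)
    return len(handlers)
```

That covers defining, serializing and dispatching a message inside one process. Routing, connection management and concurrency are still entirely up to you.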

Next you’ll need to figure out how to retry messages automatically when the handling logic fails, but not too many times. You have to have a strategy for dealing with poison messages, so you’ll need to move them aside so your handling logic doesn’t get jammed and block valid messages from being processed. You’ll need a way to show those messages that have failed and figure out why, so you can fix the problem. You’ll want some sort of alerting options so you know when that happens. It would be nice if that poison message display also showed you where that message came from and what the exception was so you don’t need to go digging through log files. After that you’ll need to be able to reroute the poison messages back into the queue to try again. In the event of a bad deployment you might have a lot of failed messages, so it would be really nice if you didn’t have to retry the messages one at a time.
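The core of that retry-and-error-queue idea might look something like the sketch below. It uses in-memory deques as stand-ins for real queues, and the names (process_queue, retry_all, MAX_ATTEMPTS, the "source" key) are made up for this example. A real implementation would also need delays between retries, alerting, and a UI.

```python
import traceback
from collections import deque

MAX_ATTEMPTS = 3  # assumed limit; pick whatever suits your system


def process_queue(queue, handler, error_queue, max_attempts=MAX_ATTEMPTS):
    """Drain the queue, retrying failures and moving poison messages aside."""
    while queue:
        message = queue.popleft()
        attempts = message.get("attempts", 0) + 1
        try:
            handler(message["body"])
        except Exception as exc:
            if attempts < max_attempts:
                # Put it back at the end of the queue so other messages keep flowing.
                queue.append({**message, "attempts": attempts})
            else:
                # Give up: keep the exception details with the failed message.
                error_queue.append({
                    **message,
                    "attempts": attempts,
                    "exception": repr(exc),
                    "stack_trace": traceback.format_exc(),
                })


def retry_all(error_queue, queue):
    """Return every failed message to the input queue in one go; returns the count."""
    count = 0
    while error_queue:
        failed = error_queue.popleft()
        # Drop the failure details so the attempt counter starts over.
        queue.append({"body": failed["body"], "source": failed.get("source")})
        count += 1
    return count
```

Keeping the exception and the originating endpoint on the failed message is what saves you from digging through log files later.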

Since you’re using RabbitMQ, there are no transactions on the message broker, so ghost messages and duplicate entities are very real problems. You’ll need to code all message handling logic with idempotency in mind or your RabbitMQ messages and database entities will begin to get inconsistent. Alternatively you could design infrastructure to mimic distributed transactions by storing outgoing messaging operations in your business database and then executing the message dispatch operations separately. That results in duplicate messages (by design) so you’ll need to deduplicate messages as they come in, which means you need a well-defined strategy for consistent message IDs across your system. Be careful, as anything dealing with transactions and concurrency can be extremely tricky.
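Here is one way the store-then-dispatch approach and the matching deduplication could be sketched, using an in-memory SQLite database as a stand-in for your business database. The table names and functions (outbox, inbox, place_order, dispatch_outbox, handle_once) are hypothetical, and sender and receiver share one database here only to keep the example short.

```python
import sqlite3
import uuid


def open_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
    db.execute("CREATE TABLE outbox (message_id TEXT PRIMARY KEY, payload TEXT, "
               "dispatched INTEGER DEFAULT 0)")
    db.execute("CREATE TABLE inbox (message_id TEXT PRIMARY KEY)")
    return db


def place_order(db, item):
    """Store the business data and the outgoing message in one database transaction."""
    message_id = str(uuid.uuid4())
    with db:  # commits both inserts together, or rolls both back on an exception
        db.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        db.execute("INSERT INTO outbox (message_id, payload) VALUES (?, ?)",
                   (message_id, item))
    return message_id


def dispatch_outbox(db, send):
    """Send stored messages separately. A crash between send and UPDATE causes a resend."""
    rows = db.execute(
        "SELECT message_id, payload FROM outbox WHERE dispatched = 0").fetchall()
    for message_id, payload in rows:
        send(message_id, payload)
        with db:
            db.execute("UPDATE outbox SET dispatched = 1 WHERE message_id = ?",
                       (message_id,))
    return len(rows)


def handle_once(db, message_id, payload, handler):
    """Run the handler only if this message ID has not been recorded before."""
    try:
        with db:
            # The primary key rejects a second insert of the same message ID.
            db.execute("INSERT INTO inbox (message_id) VALUES (?)", (message_id,))
            handler(payload)
    except sqlite3.IntegrityError:
        return False  # duplicate, already handled
    return True
```

Note that the deduplication only works because the same message ID travels with the message from the outbox to the receiving side, which is exactly why the ID strategy has to be consistent everywhere.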

You’ll probably want to do some workflow-type stuff, where an incoming message starts a process that’s essentially a message-driven state machine. Then you can do things like trigger an action once two required messages have been received. You’ll need to design a storage system for that data. You’ll probably also need a way to have delayed messages, so you can do things like the buyer’s remorse pattern. RabbitMQ has no way to have an arbitrary delay on a message, so you’ll have to come up with a way to implement that.
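As a rough illustration of those two pieces, the sketch below shows a tiny state machine that acts once both required messages have arrived, plus a delayed-message store. ShippingSaga, DelayedMessages and the message type names are invented for this example, a plain dict stands in for durable storage, and the caller supplies the current time instead of a real clock.

```python
import heapq
import itertools


class ShippingSaga:
    """Triggers an action once both required messages have arrived for an order."""

    REQUIRED = {"OrderPlaced", "PaymentReceived"}  # hypothetical message types

    def __init__(self, store, ship):
        self.store = store  # order_id -> set of message types seen so far
        self.ship = ship    # callback invoked when the saga completes

    def handle(self, message_type, order_id):
        seen = self.store.setdefault(order_id, set())
        seen.add(message_type)
        if self.REQUIRED <= seen:
            # Both messages have arrived, in either order: finish and clean up.
            del self.store[order_id]
            self.ship(order_id)


class DelayedMessages:
    """Holds messages until their due time; the caller supplies the current time."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal due times

    def defer(self, due_at, message):
        heapq.heappush(self._heap, (due_at, next(self._counter), message))

    def pop_due(self, now):
        """Remove and return every message whose due time has passed, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due
```

In a real system both the saga state and the delayed messages would have to survive a restart, and concurrent handlers updating the same saga would need some form of locking or optimistic concurrency.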

You’ll probably want some metrics and performance counters on this system to know how it’s performing. You’ll want some way to be able to have tests on your message handling logic, so if you need to swap out some dependencies to make that work you might want to integrate a dependency injection framework.

Because these systems are decentralized by nature it can get pretty difficult to accurately picture what your system looks like. If you send a copy of every message to a central location, you can write some code to stitch together all the message conversations, and then you can use that data to build message flow diagrams, sequence diagrams, etc. This kind of living documentation based on live data can be critical for explaining things to managers or figuring out why a process isn’t working as expected.
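The stitching part could start out as simply as the sketch below. It assumes every audited copy carries a message ID, a conversation ID and the ID of the message that caused it. Those header names (message_id, conversation_id, related_to) are assumptions for this example, not something RabbitMQ gives you.

```python
from collections import defaultdict


def build_conversations(audited_messages):
    """Group audited messages by conversation and link each one to its cause.

    Returns a dict of conversation_id -> list of (parent type, child type) edges,
    which is the raw material for a message flow diagram.
    """
    by_id = {m["message_id"]: m for m in audited_messages}
    conversations = defaultdict(list)
    for message in audited_messages:
        parent = by_id.get(message.get("related_to"))
        if parent is not None:
            conversations[message["conversation_id"]].append(
                (parent["type"], message["type"]))
    return dict(conversations)
```

Fed a PlaceOrder command that leads to an OrderPlaced event that leads to a BillOrder command, this yields the two edges of that chain. Turning those edges into an actual diagram is left as yet another exercise.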

Speaking of documentation, make sure you write a whole lot of it for your message queue wrapper, otherwise it will be pretty difficult for other developers to help you maintain it. Or if someone else on your team is writing it, you’ll be totally screwed when they get a different job and leave the company. You’re also going to want a ton of unit tests on the RabbitMQ wrapper you’ve built. Infrastructure code like this should be rock-solid. You don’t want losing a message to result in lost sales or anything like that.

So if you keep those few things in mind, you can totally use pure RabbitMQ without NServiceBus.

Hopefully, when you’re done, your boss won’t decide that you need to switch from RabbitMQ to Azure Service Bus or Amazon SQS.

