Robust 3rd Party Integrations with NServiceBus

A common question about NServiceBus is how to use it to integrate with an external partner. The requirements usually go something like this:

  • The third party will contact us via a web service, passing us a transaction identifier and a collection of fields.
  • If we successfully receive the message in the web service, we respond with an HTTP 200 OK status code. If the third party does not receive the acknowledgement, they will assume a failure and retry the web service call later.
  • Once we receive the message from the third party, we need to distribute (think publish) its contents to more than one internal process, each of which is completely independent of the others.
  • We need to logically receive each message once and only once. In other words, it would be a “Very Bad Thing” for one of the internal subscribing processes to receive the same notification more than once.

This was most recently asked in this StackOverflow question, where it became difficult to explain further within the 600-character comment limit. The best explanation is example code, so here it is.

Check out NServiceBus External WebService Example on GitHub. Here is a high-level overview of the project:

  • WebServiceHost
    • This project implements a simple ASMX web service.
    • The web request information is translated into an NServiceBus command message, and then Sent on the Bus.
  • TestClient
    • This console app project tests invoking the web service using a standard .NET web service proxy. No NServiceBus to be found here.
  • InternalService
    • An NServiceBus endpoint containing a Saga that receives the NServiceBus message from the web service.
    • Even if the web service received the message successfully, we can’t know that our partner’s server didn’t fail before it received or was able to record the acknowledgement, or that a network failure didn’t prevent our reply from arriving at all.
    • Because of this, it’s possible that our partner may retry sending the message even though we’ve already received it once.  This means we may receive duplicate messages, and we need to insulate our internal processes from that.
      • To do that, we accept and publish an event corresponding to the first message received.  We also store the fact that we received that message in saga data so that we will know to ignore any duplicate messages.
      • We also request a timeout notification from the Timeout Manager so that after some reasonable period (after we know the partner could no longer possibly be retrying) we can clean up the saga data.
  • InternalService.Messages
    • This assembly contains all the message schema for the project, including:
      • ExternalServiceMsg – the web service sends this to InternalService.
      • IExternalMessageReceivedEvent – the event that is published upon receipt of a non-duplicated ExternalServiceMsg. We are using an interface to define the event, which is recommended as it enables easier versioning later on thanks to an interface’s multiple inheritance abilities.
      • ExternalServiceSagaData – this defines the state data our saga will use to keep track of which messages it has received.
    • Note that it is probably NOT best practice to keep all these things in the same assembly.  Udi would probably recommend that the command and saga data (which are internal to the logical service) be segregated from the event (which forms the external contract for the service).
  • Subscriber1 and Subscriber2
    • These endpoints subscribe to the IExternalMessageReceivedEvent and pump some info to the Console so that we can watch it happen.
  • ExampleTimeoutManager
    • For active development, you should probably just keep a Timeout Manager running as a service on your development machine, but this is provided to keep the example self-contained and as an example of how to configure the Timeout Manager package available from NuGet.  I used a fairly nonstandard queue name so that it won’t conflict if you do already have a running timeout manager on your system.
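The duplicate handling that InternalService performs can be illustrated with a small model. This is a minimal sketch in plain Python, not the NServiceBus saga API and not the code from the GitHub project: the class `DedupSaga`, its methods, the `publish` callback, and the 14-day retention period are all hypothetical names and values chosen for illustration. It shows only the three behaviors described above: publish on the first message, ignore duplicates by consulting stored state, and discard that state when a timeout fires after the retry window.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Callable, Dict


@dataclass
class DedupSaga:
    """In-memory stand-in for the saga and its saga data (hypothetical model)."""
    publish: Callable[[str, dict], None]        # stands in for publishing the event
    retention: timedelta = timedelta(days=14)   # assumed partner retry window
    first_seen: Dict[str, datetime] = field(default_factory=dict)

    def handle(self, transaction_id: str, fields: dict, now: datetime) -> bool:
        """Publish the first message per transaction id; swallow any duplicates."""
        if transaction_id in self.first_seen:
            return False                        # duplicate: take no action
        self.first_seen[transaction_id] = now   # remember that we received it
        self.publish(transaction_id, fields)
        return True

    def on_timeout(self, transaction_id: str, now: datetime) -> bool:
        """Clean up stored state once the partner can no longer be retrying."""
        seen = self.first_seen.get(transaction_id)
        if seen is not None and now - seen >= self.retention:
            del self.first_seen[transaction_id]
            return True
        return False


published = []
saga = DedupSaga(publish=lambda tid, f: published.append((tid, f)))
t0 = datetime(2020, 1, 1, 12, 0)
first = saga.handle("TX-1", {"amount": "10"}, t0)                         # published
retry = saga.handle("TX-1", {"amount": "10"}, t0 + timedelta(minutes=5))  # ignored
cleaned = saga.on_timeout("TX-1", t0 + timedelta(days=14))                # state removed
```

In a real endpoint, recording the saga data and publishing the event would need to succeed or fail together; the sketch ignores that concern entirely.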

The project uses the following NuGet packages to make it as easy as possible to get started:

  • Log4Net, as a dependency of NServiceBus.
  • NServiceBus – for the core NServiceBus DLLs.
    • A messages assembly, however, doesn’t need NServiceBus.Core.dll or Log4Net.dll, so you can just “Install-Package NServiceBus” on this assembly and then manually remove those two references.
  • NServiceBus.Host – for the InternalService endpoint.
  • NServiceBus.TimeoutManager

The project also uses the NuGetPowerTools package to automatically download all the required packages when you build the solution as described in this article from David Ebbo.  I highly recommend it.

Updates November 11, 2013 – NServiceBus 4.2

It has been almost 2 years since I originally wrote this blog post, and Mark Holdt asked in the comments whether there was anything I would do differently now. Based on the passage of time and the changes from NServiceBus 2.6 (in which the code was originally written) to today’s current version of 4.2, yes, there are quite a few things that would be different.

Hopefully I’ll get a chance soon to update this code, but until then, here are some things that would be different today:

  • As of V4, NServiceBus no longer has a dependency on log4net, so I would probably remove that dependency and use the built-in NServiceBus.Logging namespace (which is an API copy of log4net) instead. If I wanted to use log4net or NLog I could drop that in, but it’s much easier to just keep the example simple.
  • There would be no external timeout manager. As of V3, the timeout manager is integrated with the NServiceBus Host.
  • Obviously some API changed from NServiceBus 2.6 to 4.2. Off the top of my head:
    • MsmqTransportConfig, which set the input/error queues, number of worker threads, and max retries, is now deprecated and replaced by several different configuration sections that are more accurate now that MSMQ is no longer the only transport available. I’m pretty sure the runtime warnings would provide good pointers on what to update.
    • The Timeout API has changed to reflect that you don’t have to send timeout messages anywhere – it’s handled by the internal timeout manager much more cleanly.
  • In the original, the saga data is in the InternalService.Messages assembly. My thinking on this has changed quite a bit in the past 2 years. Saga data is the storage for the saga and completely internal to its implementation. Nobody else has any business knowing anything about it! Therefore I would put it in the same assembly as the saga itself (InternalService), potentially even as a nested class inside the Saga.
  • The web service: Now that Microsoft has officially declared ASMX web services to be a “legacy technology” it is hard to recommend their use anywhere. I really didn’t mind ASMX services at all and I really despise WCF. Therefore I would try to implement the web service with WebAPI if possible, or even as a vanilla ASP.NET MVC action method. It would really depend upon the external partner’s abilities.

Other than these things, everything remains pretty much the same. The overall concept of how to handle this situation with messaging, after all, is still sound.

  • Tezler

    Amazing. Can’t thank you enough for the explanation and the time you are giving. I hope it will help others as much as this has helped guide me.

    Just a few queries regarding the implementation:

    1. The subscribers although interested in a message may ignore the message if the message data doesn’t concern them i.e. subscriber filtering. Is this normal?

    2. Whether wrong or not, the third party will continually (no timeout) try to resend the message until they do get a 200 response. Sagas in this scenario do not seem to help solve the duplicate message issue. We do know the transaction, can this be used? Is it ugly for the subscriber’s handler, possibly as part of the orchestration, to check a db or similar to see if it contains the transaction id? Obviously part of the orchestration would then have to log the transaction id too.

    3. This is a big one because I can’t fit it into the picture. The third party wants us to notify them when the one message in their terms has been completely processed by us. i.e. we have done whatever we are going to do with that message. Reason for this is they show the cycle in an audit log at their side. Notification sent > notification accepted > notification process complete.
    I was thinking maybe the subscribers could each publish a ‘complete’ message and have another subscriber listen out ‘in saga fashion’ for all the complete messages, handling them and calling back to the third party when all complete messages have been received. One issue I see with this is that I don’t and shouldn’t know how many subscribers there were in the first place, so I don’t know when the whole thing has been completed.

    … if you can’t tell, my head is spinning! I’m trying to convince myself and my PM that NServiceBus can help, BUT I’m finding it really hard to convince him where this will help and not hinder. Not that he or I have any alternative.

    Once again I thank you though.

  • David Boike

    @Tezler – I meant to reply to this long ago and I apologize. My wife and I had a baby and so I suppose these things happen.

    #1 – yes, subscribers will receive all messages and it is just fine for them to decide they aren’t interested in particular messages based on the content. That is their right, and the publisher doesn’t have to know about it (since it doesn’t really know or care about the subscribers at all).

    #2 – Sagas do solve the duplicate message issue if implemented correctly. The first message received successfully by the web service (whether or not the remote sender is aware of the success) will flow through the saga and be published. Any additional web service hits with the same ServiceTransactionIdentifier will be swallowed up by the Saga, because the Saga will look up the Saga Data (based on that identifier), see that the LastSeen property of the Saga Data is not equal to DateTime.MinValue, and then take no action (i.e. not publish a follow-up event).

    #3 – If the third party wants notification when EVERYTHING is done on your side, that really breaks the ideal disconnected, eventually consistent nature of the distributed system. This probably comes from their distrust of anything they can’t directly see. I would attempt to push back on this requirement, on the grounds that it is expensive to implement, due to the extra coupling created between your organizations that must be maintained long-term, with very little added REAL value, assuming you live up to your end of the bargain and process what you promise to process.

    That said, if you MUST do it, here are a few suggestions:

    #3a: Each subscribing endpoint individually updates a database record regarding their progress, and the third-party organization could be given a window to this via a read-only web service or REST API.

    #3b: While you can’t have each subscribing endpoint publish the same event to indicate completion (an event is published by only one logical publisher), each one could perhaps do a Bus.Reply(), which would send messages back to the Saga, which could take appropriate actions, including accessing the original saga data. Take care when doing this, however, not to make the Saga really aware of the subscribers, as creating that coupling would be counter-productive.

  • David Boike

    On #2:

    There isn’t a reason that a saga MUST ever end, only if it makes sense for that particular business application. While I would say that an incident of the magnitude you describe (previously acknowledged messages being resent 6 months later) is HIGHLY unlikely, if it’s that important to guard against that, then yes, those sagas could stay alive forever – all that means is they still exist in the saga storage.

    Another way to handle that kind of problem is to require a timestamp on the messages. The message must be less than 1 week old to be processed, and sagas are completed after 2 weeks, for example.
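The timestamp rule above can be checked with a little date arithmetic. The following is a rough sketch in plain Python rather than NServiceBus code. The function names are hypothetical, and it assumes the sender stamps the message once and reuses that same timestamp on every retry. Under that assumption, a retry either arrives while the saga still exists (and is swallowed as a duplicate) or is already too old to be accepted, so completing the saga never reopens the door to duplicates.

```python
from datetime import datetime, timedelta
from typing import Optional

MAX_MESSAGE_AGE = timedelta(weeks=1)   # messages must be younger than this
SAGA_LIFETIME = timedelta(weeks=2)     # sagas are completed after this long


def should_process(message_timestamp: datetime, now: datetime) -> bool:
    """Accept only messages less than one week old."""
    return now - message_timestamp < MAX_MESSAGE_AGE


def classify(message_timestamp: datetime,
             first_received: Optional[datetime],
             now: datetime) -> str:
    """Return 'rejected', 'duplicate', or 'new' for an incoming message.

    first_received is when the saga first saw this transaction, or None if
    no saga data exists (never seen, or already cleaned up).
    """
    if not should_process(message_timestamp, now):
        return "rejected"
    if first_received is not None and now - first_received < SAGA_LIFETIME:
        return "duplicate"
    return "new"


sent = datetime(2020, 1, 1)
received = sent + timedelta(days=6)

on_first_arrival = classify(sent, None, received)
on_quick_retry = classify(sent, received, received + timedelta(hours=12))
on_very_late_retry = classify(sent, None, sent + timedelta(days=180))
```

Any saga lifetime at least as long as the maximum message age closes the gap; the extra week in the 1 week / 2 week example is simply a safety margin.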

    On #3:

    I believe the most common use case is for the external entity not really caring about all of what occurs in subscriberland, only that the ownership of the message was transferred.

    Now, there may be valid use cases where the external entity DOES care about the completion of Tasks A, B, and C after the message is received. If that is the case, then we’re really not talking about Pub/Sub anymore. In that case, the “publisher” has intimate knowledge of its subscribers, so it’s really not a “publisher” in the classical sense.

    In that case, I wouldn’t publish anything. I would have my Saga, which must know about Tasks A, B, and C, send messages to the processors for A, B, and C, each of which would reply with a DoneWithA, DoneWithB, and DoneWithC message upon completion. Then the Saga could track the completion of all those tasks and do whatever is necessary when all were complete.
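The command-and-reply arrangement described above can be modeled as follows. This is a minimal sketch in plain Python and not the NServiceBus saga API. The class `CompletionSaga`, the `send` and `notify_partner` callbacks, and the task names are hypothetical stand-ins for sending commands to the A, B, and C processors and for handling their DoneWithA, DoneWithB, and DoneWithC replies.

```python
from dataclasses import dataclass, field
from typing import Callable, Set

REQUIRED_TASKS = frozenset({"A", "B", "C"})  # the saga must know its tasks up front


@dataclass
class CompletionSaga:
    """Tracks task completion for one transaction (hypothetical model)."""
    transaction_id: str
    send: Callable[[str, str], None]        # (task name, transaction id)
    notify_partner: Callable[[str], None]   # called once when everything is done
    done: Set[str] = field(default_factory=set)
    completed: bool = False

    def start(self) -> None:
        """Send one command to each processor instead of publishing an event."""
        for task in sorted(REQUIRED_TASKS):
            self.send(task, self.transaction_id)

    def handle_done(self, task: str) -> None:
        """Handle a DoneWith<task> reply; repeated or unknown replies are ignored."""
        if self.completed or task not in REQUIRED_TASKS:
            return
        self.done.add(task)
        if self.done == REQUIRED_TASKS:
            self.completed = True
            self.notify_partner(self.transaction_id)


sent_commands = []
notifications = []
saga = CompletionSaga(
    transaction_id="TX-1",
    send=lambda task, tid: sent_commands.append((task, tid)),
    notify_partner=notifications.append,
)
saga.start()
saga.handle_done("A")
saga.handle_done("A")   # repeated reply changes nothing
saga.handle_done("C")
pending_after_two = list(notifications)
saga.handle_done("B")   # last outstanding task triggers the notification
```

The fixed `REQUIRED_TASKS` set is the point of the design: the saga can only know when everything is finished because it knows exactly which processors it commanded, which is the coupling that makes this something other than pub/sub.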

  • Mark Holdt

    This is still excellent work & well explained!

    Given that this is two years ago, would you do anything different now?
    WCF possibly rather than ASMX Web Service?

  • David Boike

    Great question! I added some updates to the bottom of the post to reflect reality in November 2013 and NServiceBus 4.2.