Event Synchronization with NATS #970

beatrausch · 2022-05-04T18:43:53Z

beatrausch
May 4, 2022

Hi,

I am developing a kind of ordering system in which an order run through different states. In a certain state, an order must be processed by multiple consumers (independent subsystems). Afterwards it is required to wait that all consumers have been processed the order before moving on.
What would be the best way to achieve that with nats?

With Kafka I basically would rely on its ordering guarantee related to is partition-key, thus each message related to an order would arrive on the same consumer (client), in which the synchronization can take place. As far as I know, nats does not support that.

My first approach/idea: simply publish a synchronization event for each single order. Thus, each synchronization event will be picked by a random client based on its queued subscription. That client can hold on until each subsystem have signaled that their tasks have been finished.

Regards

Answered by jnmoyne

May 5, 2022

As it happens NATS does now (as of nats-server v2.8) support this very feature of being able to deterministically partition streams of messages: i.e. the same functionality as Kafka's partitions, but done in a 'NATS way'!

What makes this possible is actually additional functionality to an existing Core NATS functionality: subject mapping now allows you to, besides changing and re-ordering subject name tokens, also insert a partition number token with the partition number being automatically calculated for each message using a deterministic hashing of one or more of the subject's tokens. This allows you to scale many things: consumers requiring strict ordering (as in your case) is one but …

View full answer

jnmoyne · 2022-05-05T01:44:02Z

jnmoyne
May 5, 2022
Collaborator

As it happens NATS does now (as of nats-server v2.8) support this very feature of being able to deterministically partition streams of messages: i.e. the same functionality as Kafka's partitions, but done in a 'NATS way'!

What makes this possible is actually additional functionality to an existing Core NATS functionality: subject mapping now allows you to, besides changing and re-ordering subject name tokens, also insert a partition number token with the partition number being automatically calculated for each message using a deterministic hashing of one or more of the subject's tokens. This allows you to scale many things: consumers requiring strict ordering (as in your case) is one but because this is a core NATS functionality it can actually also be used to scale Core NATS subscribers (deterministically, unlike queue groups), or even be used to speed up processing by leveraging local caching of the joined data on workers more efficiently.

The functionality is the same but let me try to compare how NATS does it vs Kafka

In Kafka partitioning is something the application developer needs to be aware of if they want to be able to use partitioning, in NATS the partitioning happens at the administrative (including dynamically changing the number of partitions) level, publishing applications do not need to know if they are publishing to a partitioned stream or not, and the subscribing applications only need to have a partition number as part of their run-time configuration (e.g. provided by the container orchestration system).
In Kafka, there is only one topic per stream, and partitioning happens on a key provided by the publishing application. In NATS streams can capture any number of topics (using wildcards) and the key (or the parts of a composite key) are part of the subject name.
In Kafka you partition a stream, while in NATS you leverage subject based addressing to split a message stream into partitions and create one stream per partition.

Here is a preview of the upcoming update to the documentation regarding this new feature:

Subject Mapping and Traffic Shaping

Subject mapping is a very powerful feature of the NATS server, useful for canary deployments, A/B testing, chaos testing, and migrating to a new subject namespace.

There are two places where you can apply subject mappings: each account has its own set of subject mappings, which will apply to any message published by client applications, and you can also use subject mappings as part of the imports and exports between accounts.

When not using operator JWT security, you can define the subject mappings in server configuration files, and you simply need to send a signal for the nats-server process to reload the configuration whenever you change a mapping for the change to take effect.

When using operator JWT security with the built-in resolver you define the mappings and the import/exports in the account JWT so after modifying them they will take effect as soon as you push the updated account JWT to the servers.

Simple Mapping

The example of foo:bar is straightforward. All messages the server receives on subject foo are remapped and can be received by clients subscribed to bar.

Subject Token Reordering

Wildcard tokens may be referenced by position number in the destination mapping using (only for versions 2.8.0 and above of nats-server) {{wildcard(position)}}. E.g. {{wildcard(1)}} references the first wildcard token, {{wildcard(2)}} references the second wildcard token, etc...

You can also (for all versions of nats-server) use the legacy notation of $position. E.g. $1 references the first wild card token, $2 the second wildcard token, etc...

Example: with this mapping "bar.*.*" : "baz.{{wildcard(2)}}.{{wildcard(1)}}", messages that were originally published to bar.a.b are remapped in the server to baz.b.a. Messages arriving at the server on bar.one.two would be mapped to baz.two.one, and so forth.

Deterministic Subject token Partitioning

Deterministic token partitioning allows you to use subject based addressing to deterministically divide (partition) a flow of messages where one or more of the subject tokens make up the key upon which the partitioning will be based, into a number of smaller message flows.

For example: new customer orders are published on neworders.<customer id>, you can partition those messages over 3 partition numbers (buckets), using the partition(number of partitions, wildcard token positions...) function which returns a partition number (between 0 and number of partitions-1) by using the following mapping "neworders.*" : "neworders.{{wildcard(1)}}.{{partition(3,1)}}".

This particular mapping means that any message published on neworders.<customer id> will be mapped to neworders.<customer id>.<a partition number 0, 1, or 2>. i.e.:

Published on	Mapped to
neworders.customerid1	neworders.customerid1.0
neworders.customerid2	neworders.customerid2.2
neworders.customerid3	neworders.customerid3.1
neworders.customerid4	neworders.customerid4.2
neworders.customerid5	neworders.customerid5.1
neworders.customerid6	neworders.customerid6.0

The mapping is deterministic because (as long as the number of partitions is 3) 'customerid1' will always map to the same partition number. The mapping is hash based, it's distribution is random but tending towards 'perfectly balanced' distribution (i.e. the more keys you map the more the number of keys for each partition will tend to converge to the same number).

You can partition on more than one subject wildcard token at a time, e.g.: {{partition(10,1,2)}} distributes the union of token wildcards 1 and 2 over 10 partitions.

Published on	Mapped to
foo.1.a	foo.1.a.1
foo.1.b	foo.1.b.0
foo.2.b	foo.2.b.9
foo.2.a	foo.2.a.2

What this deterministic partition mapping enables is the distribution of the messages that are subscribed to using a single subscriber (on neworders.*) into three separate subscribers (respectively on neworders.*.0, neworders.*.1 and neworders.*.2) that can operate in parallel.

When is deterministic partitioning needed

the core NATS queue-groups and JetStream durable consumer mechanisms to distribute messages amongst a number of subscribers are partition-less and non-deterministic, meaning that there is no guarantee that two sequential messages published on the same subject are going to be distributed to the same subscriber. While in most use cases a completely dynamic, demand-driven distribution is what you need, it does come at the cost of guaranteed ordering because if two subsequent messages can be sent to two different subscribers which would then both process those messages at the same time at different speeds (or the message has to be re-transmitted, or the network is slow, etc...) and that could result in potential 'out of order' message delivery.

This means that if the application requires strictly ordered message processing, you need to limit distribution of messages to 'one at a time' (per consumer/queue-group, i.e. using the 'max acks pending' setting), which in turns hurts scalability because it means no matter how many workers you have subscribed only one at a time is doing any processing work.

Being able to evenly split (i.e. partition) subjects in a deterministic manner (meaning that all the messages on a particular subject are always mapped to the same partition) allows you to distribute and scale the processing of messages in a subject stream while still maintaining strict ordering per subject.

Another reason to need deterministic mapping is in the extreme message rates scenarios where you are reaching the limits of the throughput of incoming messages into a stream capturing messages using a wildcard subject. This limit can be ultimately reached at very high message rates due to the fact that a single nats-server process is acting as the RAFT leader (coordinator) for any given stream and can therefore become a limiting factor. In that case, distributing (i.e. partitioning) that stream into a number of smaller streams (each one with their own RAFT leader and therefore all these RAFT leaders are spread over all of the JetStream-enabled nats-servers in the cluster rather than a single one) in order to scale.

Yet another use case where deterministic partitioning can help is if you want to leverage local data caching of data (context or potentially heavy historical data for example) that the subscribing process need to access as part of the processing of the messages.

Weighted Mappings for A/B Testing or Canary Releases

Traffic can be split by percentage from one subject to multiple subjects. Here's an example for canary deployments, starting with version 1 of your service.

Applications would make requests of a service at myservice.requests. The responders doing the work of the server would subscribe to myservice.requests.v1. Your configuration would look like this:

  myservice.requests: [
    { destination: myservice.requests.v1, weight: 100% }
  ]

All requests to myservice.requests will go to version 1 of your service.

When version 2 comes along, you'll want to test it with a canary deployment. Version 2 would subscribe to myservice.requests.v2. Launch instances of your service.

Update the configuration file to redirect some portion of the requests made to myservice.requests to version 2 of your service.

For example the configuration below means 98% of the requests will be sent to version 1 and 2% to version 2.

    myservice.requests: [
        { destination: myservice.requests.v1, weight: 98% },
        { destination: myservice.requests.v2, weight: 2% }
    ]

Once you've determined Version 2 is stable you can switch 100% of the traffic over to it and you can then shutdown the version 1 instance of your service.

Traffic Shaping in Testing

Traffic shaping is also useful in testing. You might have a service that runs in QA that simulates failure scenarios which could receive 20% of the traffic to test the service requestor.

myservice.requests.*: [{ destination: myservice.requests.$1, weight: 80% }, { destination: myservice.requests.fail.$1, weight: 20% }

Artificial Loss

Alternatively, introduce loss into your system for chaos testing by mapping a percentage of traffic to the same subject. In this drastic example, 50% of the traffic published to foo.loss.a would be artificially dropped by the server.

foo.loss.>: [ { destination: foo.loss.>, weight: 50% } ]

You can both split and introduce loss for testing. Here, 90% of requests would go to your service, 8% would go to a service simulating failure conditions, and the unaccounted for 2% would simulate message loss.

myservice.requests: [{ destination: myservice.requests.v3, weight: 90% }, { destination: myservice.requests.v3.fail, weight: 8% }] the remaining 2% is "lost"

0 replies

derekcollison · 2022-05-05T01:51:23Z

derekcollison
May 5, 2022
Maintainer

Pretty awesome stuff!

0 replies

beatrausch · 2022-05-05T17:27:25Z

beatrausch
May 5, 2022
Author

Great! That's pretty much what I am looking for.

@jnmoyne Does it mean that it requires a separate nats consumer per partition? or how would a typical setup would look like, that is able to scale out.

0 replies

jnmoyne · 2022-05-05T18:45:30Z

jnmoyne
May 5, 2022
Collaborator

If you are trying to scale the strictly ordered consumption of the messages then you could simply use one stream for all the partitions (e.g. listens to foo.bar.*) and then have one durable consumer per partition (i.e. one with subject filter foo.bar.0, one with filter foo.bar.1, etc...).

If you have a very high message rate you want to capture in a stream and you are reaching the limit of performance a single stream can handle and you want to scale that stream then you would create a stream per partition (and one durable per partition stream).

2 replies

niondir Jul 18, 2024

SubjectTransform applies the transformation when a message enters a stream.

When I scale the number of partitions up and down I see several issues to be considered:
Increasing Partition Count
It's pretty easy to spawn new consumers, but the partition number will change for e.g. a single User (or what ever is the partition key)

Decreasing Partition Count
When removing a consumer and decreasing the partition count, the old (now unused) partitions still need to be considered by the remaining consumers.

What we did is to have just a fixed set of 100 partitions and distribute the partitions over multiple workers. When the worker count changes the partitions need to be re-distributed over all workers.

Are there any better solutions to handle this cases?

jnmoyne Jul 22, 2024
Collaborator

Indeed the way to provide some form of elasticity without having to 'repartition' is to consider the number of partition as the 'maximum number of consumers you will ever need' and then distribute the partitions evenly over the set of consumers that are present at any given time (meaning that if you want to change the number of consumers you need to re-distribute the partitions over the members and synchronize the consumers migrating from the old distribution to the new one).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Event Synchronization with NATS #970

{{title}}

Replies: 4 comments 2 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

Event Synchronization with NATS #970

beatrausch May 4, 2022

Replies: 4 comments · 2 replies

jnmoyne May 5, 2022 Collaborator

Subject Mapping and Traffic Shaping

Simple Mapping

Subject Token Reordering

Deterministic Subject token Partitioning

When is deterministic partitioning needed

Weighted Mappings for A/B Testing or Canary Releases

Traffic Shaping in Testing

Artificial Loss

derekcollison May 5, 2022 Maintainer

beatrausch May 5, 2022 Author

jnmoyne May 5, 2022 Collaborator

niondir Jul 18, 2024

jnmoyne Jul 22, 2024 Collaborator

beatrausch
May 4, 2022

Replies: 4 comments 2 replies

jnmoyne
May 5, 2022
Collaborator

derekcollison
May 5, 2022
Maintainer

beatrausch
May 5, 2022
Author

jnmoyne
May 5, 2022
Collaborator

jnmoyne Jul 22, 2024
Collaborator