Adding benchmarking utility to rpk #3929

marcofaggian · 2022-03-02T18:48:03Z

marcofaggian
Mar 2, 2022

travisdowns · 2022-03-03T23:31:55Z

travisdowns
Mar 3, 2022
Maintainer

Yes, we are definitely interested in a benchmarking utility that overcomes some of the limitations of existing tools. It's worth noting what's already out there, too: you've mentioned perf-tools, but have you also looked at the OpenMessaging Benchmark and librdkafka's rdkafka_performance?

Some things we'd be looking for in a benchmarking tool, in addition to what you've mentioned:

- Avoids the "coordinated omission" problem, especially when testing latency. This probably takes the form of an explicit schedule for messages, calculated up-front or on-demand (but independent of the submit/complete timing of earlier messages), rather than one based on relative sleeps. That means setting a given rate; if that rate can't be achieved, the producer will fall behind without bound and the test should probably fail.
- Allows extracting key statistics (as you've mentioned), optionally in a machine-readable format like JSON. In particular, we'd want more than just latency: we'd want the client's idea of how much latency was incurred on the client (and why) and how much was on the "other side" of the network request (i.e., related to server latencies or network delays).
- Allows tuning of relevant client settings.
- Allows a variety of consumer scenarios, such as using consumer groups or hitting a specific partition.
- Covers at least a few of the most interesting performance scenarios, such as compacted topics.

Ultimately, a "proper" benchmark will probably involve many clients sending messages to a large cluster. Setting all of that up is probably mostly outside the scope of rpk itself, so I don't know if we would want to gold-plate everything in terms of analysis and statistics: real-world tests are going to need additional scaffolding around the tool, which would also change the analysis part.

1 reply

emaxerrno Mar 4, 2022
Maintainer

I think submitting an RFC that covers what Travis mentioned would be the next step.

twmb · 2022-03-04T17:20:39Z

twmb
Mar 4, 2022
Maintainer

I think there is a need for two benchmarking utilities: one that an individual command can generate in-process, and one that orchestrates a more advanced suite. An in-process CLI can be used to generate workloads for clusters, or to test some simpler aspects of throughput & generate higher level numbers. A more advanced suite would entail the aspects that @travisdowns brought up.

The simpler one can basically be something similar to franz-go, or librdkafka's bench, or Java's kafka-console-producer. To me it seems a definite yes that an in-process, single producer / consumer command can be useful, especially because there are already so many CLI tools providing exactly this. We want this in rpk.

I'm not sure if rpk should contain the more advanced spin-up-large-servers, spin-up-clients aspect. That would be quite a lot, and may need customizing for every use case. If it's possible to fit into rpk, that'd be great, but this would require more planning.

0 replies
