-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GetTopicConfigs takes much longer to return since PR #97 #132
Comments
Thinking about, maybe it makes sense to cache topic config like metadata is https://github.com/cloudhut/kminion/blob/cc3d571a471a772b9e5b2a76830c9fb234f2775f/prometheus/collect_topic_info.go#L12 Topic configs won't change often. |
Hey @rmb938 , I wasn't aware that requesting more/all topic configs has such a large impact on the performance for clusters with large numbers of topics. We will figure out something! Making it configurable via a config could be an option for sure, unsure if we should reuse the config from the infoMetric because it's been created specifically for one metric series. Good idea though! |
I second this! Thanks for filing this @rmb938! |
Thanks @weeco , I completely understand being busy and have no issue staying on 2.1.0 until this is fixed. @amuraru Yea it is a bit weird and I'm not 100% sure your PR caused the issue but it was the only one that seemed to make any change related to querying Kafka, however using a version from before your PR was merged in doesn't have the same performance issue. It may make sense to "make metrics for the metrics" i.e add timer metrics to the Kakfa function calls so we could easily measure and record these types of perf issues. |
@weeco Any progress on fixing this? I see there was a PR opened a while ago but it was closed? |
In version 2.1.0 the
GetTopicConfigs
was limited to just thecleanup.policy
config https://github.com/cloudhut/kminion/blob/v2.1.0/minion/describe_topic_config.go#L22However in PR #97 the limitation was removed https://github.com/cloudhut/kminion/pull/97/files#diff-ea87a33952172ffd5cd3de385e172c2398f88c1c76808a63919c6bed164fa4a8L23
In Kafka clusters with tens of thousands of topics this starts taking a very long time to return and eventually hits the 60 second timeout here https://github.com/cloudhut/kminion/blob/master/prometheus/exporter.go#L224 and/or promethues scrape timeouts.
Can the
GetTopicConfigs
function be modified to use the config optioninfoMetric.configKeys
so it doesn't try to get all config items for the topics?The text was updated successfully, but these errors were encountered: