Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

log something useful when dry-running cluster scaling policies #621

Closed
wants to merge 2 commits into from

Conversation

erulabs
Copy link

@erulabs erulabs commented Mar 1, 2023

When using dry-run for cluster autoscaling, it's easy to get a bit confused because we see no output whatsoever!

The change means that AWS/GKE/Azure cluster autoscaling will print an INFO log letting us know what would have happened.

@erulabs erulabs requested review from jrasell and lgfa29 as code owners March 1, 2023 00:11
@hashicorp-cla
Copy link

hashicorp-cla commented Mar 1, 2023

CLA assistant check
All committers have signed the CLA.

@erulabs
Copy link
Author

erulabs commented Mar 1, 2023

Ah, it seems that I should have tried TRACE level logging first:

// Log at trace level the details of the strategy calculation. This is
// helpful in ultra-debugging situations when there is a need to understand
// all the calculations made.
s.logger.Trace("calculated scaling strategy results",
"check_name", eval.Check.Name, "current_count", count, "new_count", newCount,
"metric_value", metric.Value, "metric_time", metric.Timestamp, "factor", factor,
"direction", eval.Action.Direction)

I'd suggest that this is probably the most important thing nomad-autoscaler could log! Particularly in dry-run mode, it's very useful to know what would happen and why.

I'll change these to INFO level logs and that ought to be enough for me!

@erulabs erulabs closed this Mar 10, 2023
@lgfa29
Copy link
Contributor

lgfa29 commented Mar 10, 2023

Hi @erulabs 👋

Thanks for the PR, but yeah, we have log a lot of information at TRACE level because the Autoscaler can be quite chatty depending on how many policies you have and how often they are evaluated.

But I agree that they are easy to miss, specially since the default level is INFO. I created #624 to describe the problem and also propose some options.

Feel free to add any more comments there or take on it if you are interested 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants