Reduce or remove MON and OSD alerts during the maintenance #120

Open
megian opened this issue Apr 28, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

megian commented Apr 28, 2023

Context

The maintenance restarts MONs and OSDs.
This is a regular process and not a problem, as long as only a tolerable number of components are down at the same time.

Currently, the regular maintenance process triggers P1 alerts for MONs and OSDs being down.
This misleads the operator, because the alert is not actionable: the components recover automatically as the maintenance proceeds.

Implementation idea

  • Relax the alerts so they are P3 rather than P1. This still causes noise.
  • Extend the time MONs and OSDs can be down before an alert fires. This increases the detection delay in a real event.
  • Find a way to count MONs and OSDs as down only if more than the tolerable number are down, that is, if fewer running services remain than the minimum needed to cover the service.
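
The third idea could be sketched roughly as follows. This is a minimal illustration in Python, not an existing alert rule from this project: `ServiceState`, `min_up`, `mon_quorum_size` and `should_page` are hypothetical names, and the majority rule for MONs is used only as an example of a "minimum amount of running services". The right threshold for OSDs depends on pool and failure-domain settings and is left as an input here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceState:
    """Snapshot of one service type (all names here are hypothetical)."""
    name: str    # e.g. "mon" or "osd"
    total: int   # daemons deployed
    up: int      # daemons currently running
    min_up: int  # assumed minimum of running daemons needed to keep serving


def mon_quorum_size(total: int) -> int:
    """Smallest strict majority of `total` monitors."""
    return total // 2 + 1


def should_page(state: ServiceState) -> bool:
    """Page only when fewer daemons are running than the assumed minimum.

    A single daemon restarting during maintenance leaves `up >= min_up`,
    so it would be counted but would not raise a high-priority alert.
    """
    return state.up < state.min_up


# Three MONs, one restarting for maintenance: a majority remains, no page.
maintenance = ServiceState("mon", total=3, up=2, min_up=mon_quorum_size(3))
# Three MONs, two down: the majority is lost, page the operator.
outage = ServiceState("mon", total=3, up=1, min_up=mon_quorum_size(3))

print(should_page(maintenance), should_page(outage))
```

Compared with the first two ideas, this keeps the priority and the reaction time of a real outage unchanged, at the cost of having to define a sensible minimum per service.
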
megian added the enhancement (New feature or request) label Apr 28, 2023