Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Take host state into account when sending suppressed notifications #9348

Merged
merged 1 commit into from
Apr 20, 2022

Conversation

julianbrost
Copy link
Contributor

Checkable::FireSuppressedNotifications() compares the time of the current checkable with the last recovery time of parents to avoid notification right after a parent recovered and before the current checkable was checked.

This commit makes this check also include to host if the checkable is a service. This makes the behavior consistent with the documentation that states there is an implicit dependency on the host (which isn't realized as implicitly generating a Dependency object unfortunately).

ref/IP/38985
ref/NC/747622

Tests

Config snippet (needs standard notification apply rules):

object Host "ip-38985" {
        check_command = "dummy"
        check_interval = 5s
        retry_interval = 5s
        vars.dummy_state = {{ if (get_time() % 120 < 60) { 0 } else { 2 } }}
}

for (i in range(100)) {
        object Service i {
                host_name = "ip-38985"
                check_command = "dummy"
                check_interval = 7s
                retry_interval = 7s
                vars.dummy_state = {{ if (get_time() % 120 < 60) { 0 } else { 2 } }}
        }
}

Before

Every 2 minute cycle looks something like this: first, the host receives a problem notification, then a recovery notification and after that, a bunch of random services receive a suppressed notification (not visible in the logs that it's a suppressed one, but the hard state is reached for all services well before the host recovers) and a recovery notification shortly after.

[2022-04-19 15:57:13 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!dummy-host-notification' for user 'dummy'
[2022-04-19 15:58:03 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!dummy-host-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!15!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!63!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!82!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!72!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!30!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!84!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!29!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!71!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!21!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!4!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!7!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!63!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!84!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!72!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!30!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!82!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:05 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!15!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:06 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!7!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:06 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!29!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:06 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!71!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:06 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!21!dummy-service-notification' for user 'dummy'
[2022-04-19 15:58:06 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!4!dummy-service-notification' for user 'dummy'

After

Even when letting this run for a a longer time, only the host receives notifications:

[2022-04-19 16:23:11 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!dummy-host-notification' for user 'dummy'
[2022-04-19 16:24:01 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!dummy-host-notification' for user 'dummy'
[2022-04-19 16:25:11 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!dummy-host-notification' for user 'dummy'
[2022-04-19 16:26:01 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!dummy-host-notification' for user 'dummy'
[2022-04-19 16:27:11 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!dummy-host-notification' for user 'dummy'
[2022-04-19 16:28:01 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!dummy-host-notification' for user 'dummy'
[2022-04-19 16:29:11 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!dummy-host-notification' for user 'dummy'
[2022-04-19 16:30:01 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!dummy-host-notification' for user 'dummy'
[2022-04-19 16:31:11 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!dummy-host-notification' for user 'dummy'
[2022-04-19 16:32:01 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!dummy-host-notification' for user 'dummy'
[2022-04-19 16:33:11 +0200] information/Notification: Sending 'Problem' notification 'ip-38985!dummy-host-notification' for user 'dummy'
[2022-04-19 16:34:01 +0200] information/Notification: Sending 'Recovery' notification 'ip-38985!dummy-host-notification' for user 'dummy'

Checkable::FireSuppressedNotifications() compares the time of the current
checkable with the last recovery time of parents to avoid notification right
after a parent recovered and before the current checkable was checked.

This commit makes this check also include to host if the checkable is a
service.  This makes the behavior consistent with the documentation that states
there is an implicit dependency on the host (which isn't realized as implicitly
generating a Dependency object unfortunately).
@julianbrost julianbrost added bug Something isn't working area/notifications Notification events ref/IP ref/NC consider backporting Should be considered for inclusion in a bugfix release labels Apr 19, 2022
@julianbrost julianbrost added this to the 2.14.0 milestone Apr 19, 2022
@julianbrost julianbrost requested a review from Al2Klimov April 19, 2022 14:41
@cla-bot cla-bot bot added the cla/signed label Apr 19, 2022
@Al2Klimov Al2Klimov merged commit 67dfefe into master Apr 20, 2022
@icinga-probot icinga-probot bot deleted the bugfix/suppressed-notifications-host-dependency branch April 20, 2022 09:18
@julianbrost
Copy link
Contributor Author

For now, only backporting this to 2.13. 2.12 would need some extra testing and I think this might have an implicit dependency on #9207, in particular this part.

@julianbrost julianbrost added backported Fix was included in a bugfix release and removed consider backporting Should be considered for inclusion in a bugfix release labels Jun 14, 2022
yhabteab pushed a commit that referenced this pull request Sep 5, 2022
…host-dependency

Take host state into account when sending suppressed notifications
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/notifications Notification events backported Fix was included in a bugfix release bug Something isn't working cla/signed ref/IP ref/NC
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants