Downtimes on a removed object are never closed. #10303

Open
w1ll-i-code opened this issue Jan 15, 2025 · 5 comments · May be fixed by #10311

Comments

@w1ll-i-code

Describe the bug

If an object with a Downtime gets disabled (even just temporarily), the end of the associated Downtime is never written out to the IDO / Icinga DB.

To Reproduce

  1. Create a host in the Director and deploy it.
  2. Create a downtime on the host (e.g. via the API, see the sketch after this list).
  3. Use the Director to roll back to an older version.
  4. Redeploy the new version.
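
For step 2, one way to create the downtime is the schedule-downtime action of the Icinga 2 REST API. The following is a minimal sketch using libcurl; the host name, API credentials and time range are placeholders, and TLS verification is disabled only to keep the example short.

```cpp
// Sketch: schedule a fixed two-hour downtime for a host via the Icinga 2 REST
// API (POST /v1/actions/schedule-downtime). Host name, credentials and the
// time range are placeholders. Build with: g++ schedule_downtime.cpp -lcurl
#include <ctime>
#include <string>
#include <curl/curl.h>

int main()
{
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    if (!curl)
        return 1;

    const std::time_t now = std::time(nullptr);
    const std::string body =
        "{\"type\": \"Host\","
        " \"filter\": \"host.name==\\\"example-host\\\"\","
        " \"author\": \"icingaadmin\","
        " \"comment\": \"maintenance\","
        " \"fixed\": true,"
        " \"start_time\": " + std::to_string(now) + ","
        " \"end_time\": " + std::to_string(now + 7200) + "}";

    curl_slist* headers = nullptr;
    headers = curl_slist_append(headers, "Accept: application/json");

    curl_easy_setopt(curl, CURLOPT_URL,
        "https://localhost:5665/v1/actions/schedule-downtime");
    curl_easy_setopt(curl, CURLOPT_USERPWD, "root:icinga"); // API user:password
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body.c_str());
    curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0L); // self-signed API cert
    curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 0L);

    const CURLcode res = curl_easy_perform(curl);

    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return res == CURLE_OK ? 0 : 1;
}
```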

Expected behavior

I would expect the Downtime to be terminated once the object is deactivated (the actual_end_time set to the current time). But since the downtime is dropped without this field ever being set, the object looks in the reports as if it were in a constant downtime, which does not correspond to the internal state of Icinga 2.


@w1ll-i-code
Author

Here is my proposed solution: whenever an object gets removed, all of its currently active downtimes get closed as well.

@w1ll-i-code
Author

I am willing to implement the change myself, but I'd like to coordinate with you first to make sure my proposed solution is the right approach. Since the downtimes are dropped from the icinga2.state file afterwards, this seems like the most reasonable solution to me. I'd prefer it if the downtimes persisted through deploys, but that would be a more invasive change that I don't feel comfortable implementing myself.

@yhabteab
Member

I would expect the Downtime to be terminated once the object is deactivated (the actual_end_time set to the current time).

There is no such thing as deactivating a downtime when a new version of the configuration is deployed via Icinga Director. When the host the downtimes belong to does not exist in the newly deployed configuration, the downtimes become dangling objects that Icinga 2 cannot map to their respective host/service, and they will not even survive the config validation. However, since they are created with the ignore_on_error flag, they will not stop Icinga 2 from loading the rest of the configuration, and once Icinga 2 is done loading/validating it, it will simply erase them from disk.

Here is my proposed solution: whenever an object gets removed, all of its currently active downtimes get closed as well.

If you don't mind wasting time on something that can't be fixed, then go ahead, but bear in mind that this is simply impossible to fix right now. Once the corresponding downtime host/service object is gone, the downtime object itself becomes pretty much useless and is not even a valid object anymore. If you don't want such strange history views, I suggest manually clearing the downtimes before removing the host/service object via Icinga Director.

@w1ll-i-code
Author

If you don't mind wasting time on something that can't be fixed, then go ahead, but bear in mind that this is simply impossible to fix right now.

I already wasted that time and implemented my solution. It seems to work for MariaDB/MySQL, but I still need to test it with PostgreSQL and Icinga DB. I'll probably have to do a second pass to make it completely correct.

it will simply erase them from disk.

I am well aware of that; that's the problem we are currently facing. It happens often, but randomly enough that cleaning it up manually for all objects that may be affected is not feasible. Mostly we notice it once the SLA uptime report is generated and a host is completely out of bounds because the downtime was not handled correctly. If we trigger an OnDowntimeRemoved before the downtime gets erased from disk, that solution already works for us.

@w1ll-i-code
Author

The logic I am thinking of is this:

  1. The configuration for the object gets removed; it is no longer active.
  2. The object still exists in the icinga2.state file together with the downtime.
  3. The config gets loaded and the object gets set to inactive.
  4. The inactive object gets synced to the IDO.
    1. Here I propose to also trigger the OnDowntimeRemoved hook for each downtime associated with the host (see the sketch below).
  5. The host and downtime are now inactive and will no longer get synced to the icinga2.state file. (Or maybe just the host, I'm not sure, but the effect is the same.)

Let me know if there are any holes in my understanding here, but from what I can observe right now, this is what's happening.
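
Concretely, the change I have in mind for step 4.1 would look roughly like the sketch below. Checkable::GetDowntimes() and the Downtime::OnDowntimeRemoved signal exist in the icinga2 source tree, but the helper name and its exact call site during deactivation are my own assumptions, not necessarily what the linked PR ends up doing.

```cpp
// Rough sketch of step 4.1: when a checkable is deactivated because it
// disappeared from the deployed configuration, emit OnDowntimeRemoved for all
// of its downtimes so the IDO / Icinga DB backends can record an end for them
// before the downtime objects are erased from disk.
#include "icinga/checkable.hpp"
#include "icinga/downtime.hpp"

using namespace icinga;

// Hypothetical helper; name and call site are assumptions.
static void CloseDowntimesOfDeactivatedCheckable(const Checkable::Ptr& checkable)
{
	for (const Downtime::Ptr& downtime : checkable->GetDowntimes()) {
		// The DB backends listen on this signal and write out the
		// downtime end when it fires.
		Downtime::OnDowntimeRemoved(downtime);
	}
}
```

Whether this belongs in the checkable's deactivation path itself or somewhere closer to the DB backends is exactly the kind of thing I'd like to coordinate on.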

@w1ll-i-code w1ll-i-code linked a pull request Jan 21, 2025 that will close this issue