Scheduled downtime for host+services not always being honored #224

@user-name-cannot-be-blank

Description

We scheduled downtime for a host and all of its services for a maintenance window last night, but some notifications were still sent. Unfortunately, one of the contacts was not involved in the maintenance and received those notifications after hours. The notifications covered only some of the OK messages sent when the host came back up (11 out of 104 service checks), but I think something is not working as intended.

Here are some (slightly redacted) sample entries from the Thruk and Naemon logs that show this:

# Thruk log
[2025/08/27 21:08:37][monitoring_host][INFO] [external_command][user1][dc49ff89c5ce4634b4227be42a77317eb203d838ff1fb1409a19c08527aff275] [monitoring_host] cmd: COMMAND [1756354116] SCHEDULE_HOST_SVC_DOWNTIME;target_host;1756354080;1756364280;1;0;7200;User1 Name;Downtime Comment
# downtime window: 09:08 PM to 11:58 PM PDT
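
For anyone checking the window, here is a minimal Python sketch that decodes the command above. The field order (host, start, end, fixed, trigger_id, duration, author, comment) is my understanding of the `SCHEDULE_HOST_SVC_DOWNTIME` external command, and the fixed UTC-7 offset for PDT is an assumption based on the timestamps in the Naemon log; `parse_downtime_command` is a hypothetical helper, not part of Naemon or Thruk.

```python
from datetime import datetime, timedelta, timezone

# Assumed field order for SCHEDULE_HOST_SVC_DOWNTIME
FIELDS = ("host", "start", "end", "fixed", "trigger_id",
          "duration", "author", "comment")

# Assumption: the server logs in PDT, modeled here as a fixed UTC-7 offset
PDT = timezone(timedelta(hours=-7))


def parse_downtime_command(cmd: str) -> dict:
    """Split a SCHEDULE_HOST_SVC_DOWNTIME command into named fields."""
    name, _, args = cmd.partition(";")
    if name != "SCHEDULE_HOST_SVC_DOWNTIME":
        raise ValueError(f"unexpected command: {name}")
    # Limit the split so a comment containing ';' stays in one piece
    parsed = dict(zip(FIELDS, args.split(";", len(FIELDS) - 1)))
    for key in ("start", "end", "fixed", "trigger_id", "duration"):
        parsed[key] = int(parsed[key])
    return parsed


cmd = ("SCHEDULE_HOST_SVC_DOWNTIME;target_host;1756354080;1756364280;"
       "1;0;7200;User1 Name;Downtime Comment")
downtime = parse_downtime_command(cmd)

start_local = datetime.fromtimestamp(downtime["start"], PDT)
end_local = datetime.fromtimestamp(downtime["end"], PDT)
window_seconds = downtime["end"] - downtime["start"]

print(start_local.strftime("%Y-%m-%d %H:%M:%S"), "to",
      end_local.strftime("%Y-%m-%d %H:%M:%S"),
      f"({window_seconds} s, fixed={downtime['fixed']})")
```

The start and end values give a window of 10200 seconds (2 h 50 min), which matches the 09:08 PM to 11:58 PM range. As far as I know, the `7200` duration field only matters for flexible downtime, so with `fixed=1` it should not shorten the window.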

# Naemon log
# Wed Aug 27 09:33:19 PM PDT 2025
[1756355599] SERVICE ALERT: target_host;MSDB_VersionGenerationRate;OK;HARD;4;OK: VersionGenerationRate = 0
[1756355599] SERVICE NOTIFICATION: user2;target_host;MSDB_VersionGenerationRate;OK;service-notify-by-email;OK: VersionGenerationRate = 0
[1756355599] SERVICE NOTIFICATION: user3;target_host;MSDB_VersionGenerationRate;OK;notify-service-by-telegram;OK: VersionGenerationRate = 0

# Wed Aug 27 09:33:47 PM PDT 2025
[1756355627] SERVICE ALERT: target_host;MSSQL_Latches_Wait_Time;CRITICAL;HARD;4;CRITICAL - DBI connect('Driver=ODBC Driver 18 for SQL Server:Server=target_host.local.io:Database=:TrustServerCertificate=yes:MultiSubnetFailover=Yes:',...) failed: [Microsoft][ODBC Driver 18 for SQL Server]Login timeout expired (SQL-HYT00) [state was HYT00 now 08001]
[1756355627] SERVICE NOTIFICATION SUPPRESSED: target_host;MSSQL_Latches_Wait_Time;Notification blocked for object currently in a scheduled downtime.
[1756355627] SERVICE ALERT: target_host;MSSQL_Page_Life_Expectancy;CRITICAL;HARD;4;CRITICAL - DBI connect('Driver=ODBC Driver 18 for SQL Server:Server=target_host.local.io:Database=:TrustServerCertificate=yes:MultiSubnetFailover=Yes:',...) failed: [Microsoft][ODBC Driver 18 for SQL Server]Login timeout expired (SQL-HYT00) [state was HYT00 now 08001]
[1756355627] SERVICE NOTIFICATION SUPPRESSED: target_host;MSSQL_Page_Life_Expectancy;Notification blocked for object currently in a scheduled downtime.

# Wed Aug 27 09:34:27 PM PDT 2025
[1756355667] SERVICE ALERT: target_host;Windows_Service_SQL_Engine_Service;CRITICAL;HARD;4;CRITICAL: SQL Server (MSSQLSERVER): starting delayed ()
[1756355667] SERVICE NOTIFICATION SUPPRESSED: target_host;Windows_Service_SQL_Engine_Service;Notification blocked for object currently in a scheduled downtime.
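
To find every notification that slipped through, something like the following Python sketch could be run against the Naemon log. It assumes the `[epoch] TYPE: fields` line layout shown in the excerpts above and uses the start and end timestamps from the Thruk command; `notifications_in_window` is a hypothetical helper, and the sample lines are shortened copies of the entries above.

```python
import re

# Start and end timestamps taken from the SCHEDULE_HOST_SVC_DOWNTIME command
DOWNTIME_START = 1756354080
DOWNTIME_END = 1756364280

# Assumed log layout: "[epoch] ENTRY TYPE: semicolon-separated fields"
LOG_LINE = re.compile(r"^\[(\d+)\] ([A-Z ]+): (.*)$")


def notifications_in_window(lines, start, end):
    """Return (timestamp, contact, host, service, state) for every
    SERVICE NOTIFICATION entry logged between start and end inclusive.
    SERVICE NOTIFICATION SUPPRESSED entries are a different type and are skipped."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue
        timestamp = int(match.group(1))
        entry_type = match.group(2)
        if entry_type == "SERVICE NOTIFICATION" and start <= timestamp <= end:
            contact, host, service, state = match.group(3).split(";")[:4]
            hits.append((timestamp, contact, host, service, state))
    return hits


sample = [
    "[1756355599] SERVICE ALERT: target_host;MSDB_VersionGenerationRate;OK;HARD;4;OK: VersionGenerationRate = 0",
    "[1756355599] SERVICE NOTIFICATION: user2;target_host;MSDB_VersionGenerationRate;OK;service-notify-by-email;OK: VersionGenerationRate = 0",
    "[1756355599] SERVICE NOTIFICATION: user3;target_host;MSDB_VersionGenerationRate;OK;notify-service-by-telegram;OK: VersionGenerationRate = 0",
    "[1756355627] SERVICE NOTIFICATION SUPPRESSED: target_host;MSSQL_Latches_Wait_Time;Notification blocked for object currently in a scheduled downtime.",
]

leaked = notifications_in_window(sample, DOWNTIME_START, DOWNTIME_END)
for timestamp, contact, host, service, state in leaked:
    print(timestamp, contact, host, service, state)
```

On the sample lines this reports the two `MSDB_VersionGenerationRate` notifications to user2 and user3, both inside the downtime window, while the suppressed entry is ignored.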