Resolved Alerts

By default, kpulse sends a second notification when a previously firing alert clears. The message is prefixed [RESOLVED], uses a green color, and reuses the same dedupe key as the original alert.

This applies to state-based monitors — the ones where "the condition no longer holds" is meaningful. Discrete one-shot monitors (warning events, pod restarts, job failed) do not resolve and are unaffected.

Toggle

resolution:
  enabled: true   # default

Set enabled to false to keep the v0.1 behavior (fire only, never resolve).

Which monitors resolve

| Monitor | Resolves on |
|---|---|
| pod_crashes | Container transitions to Running, or Pod deleted |
| pvc_usage | Usage drops back below warn_at |
| node_disk | Usage drops back below warn_at |
| node_conditions | Condition no longer reported True (or Ready=True for NotReady) |
| tls_cert_expiry | Cert renewed (NotAfter pushed past warn window) |
| rollout_stuck | Deployment Progressing=True, or StatefulSet ready==replicas |
| hpa_at_max | HPA scales below maxReplicas |
| daemonset_unscheduled | All desired pods ready |
| cronjob_missed | Next scheduled run lands |
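
For the threshold-style monitors (pvc_usage, node_disk), the fire/resolve transition can be pictured roughly as below. This is an illustrative sketch, not kpulse's code: the class name ThresholdMonitor, the observe method, and the choice to fire at usage >= warn_at are all assumptions. Only "resolves when usage drops back below warn_at" comes from the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdMonitor:
    """Hypothetical model of a state-based threshold monitor."""
    warn_at: float        # usage percentage that triggers the alert
    firing: bool = False  # whether an alert is currently active

    def observe(self, usage: float) -> Optional[str]:
        """Return "firing", "resolved", or None if the state did not change."""
        # Assumption: the alert fires at or above warn_at.
        if usage >= self.warn_at and not self.firing:
            self.firing = True
            return "firing"
        # Documented behavior: resolve once usage drops back below warn_at.
        if usage < self.warn_at and self.firing:
            self.firing = False
            return "resolved"
        return None


monitor = ThresholdMonitor(warn_at=80.0)
transitions = [monitor.observe(u) for u in (70.0, 85.0, 90.0, 75.0, 75.0)]
```

Only the two state changes produce a notification in this model; repeated readings on the same side of the threshold return None.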

Which monitors do NOT resolve

  • warning_events — events are point-in-time; "stopped firing" isn't a concept
  • pod_restarts — a restart burst is a discrete event
  • job_failed — Jobs don't unfail (a re-run would be a new Job)

What a resolved alert looks like

| Channel | Resolved format |
|---|---|
| Slack | :white_check_mark: *[RESOLVED]* [cluster] ... |
| Email | green top bar, [RESOLVED] in subject, X-Kpulse-State: resolved header |
| Webhook | "state": "resolved" in JSON body |
| Discord | [OK] **[RESOLVED]** [cluster] ... |
| Teams | green theme color, [RESOLVED] in title |
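
A receiver that needs to tell resolved messages from firing ones could key off the two machine-readable markers listed above: the "state" field in the webhook JSON and the X-Kpulse-State email header. The sketch below assumes nothing else about the payload; the function names are hypothetical, and any state value other than "resolved" is simply treated as not resolved, since the other values are not documented here.

```python
import json
from email import message_from_string


def webhook_is_resolved(raw_body: str) -> bool:
    """True if a webhook JSON body carries "state": "resolved"."""
    payload = json.loads(raw_body)
    # Guard against non-object bodies (for example a JSON list).
    if not isinstance(payload, dict):
        return False
    return payload.get("state") == "resolved"


def email_is_resolved(raw_message: str) -> bool:
    """True if an email carries the X-Kpulse-State: resolved header."""
    msg = message_from_string(raw_message)
    # Header lookup is case-insensitive; a missing header yields "".
    return msg.get("X-Kpulse-State", "").strip().lower() == "resolved"
```

For example, webhook_is_resolved('{"state": "resolved"}') returns True, while a body without a "state" field returns False.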

Behavior notes

  • Resolved alerts bypass the dedupe window (always send) and bypass the digest (always immediate).
  • The dedupe key matches the original alert: (monitor, namespace, kind, name, reason). If the original alert never fired (or was suppressed by dedupe before reaching the engine), no resolved message is sent.
  • The active-alert set is in-memory. If kpulse restarts, it won't know about alerts that were firing before the restart, so it won't send resolved messages for them. (The next scan from a periodic monitor will re-establish state and send a fresh firing alert if the condition still holds.)
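
The notes above can be modeled with a small in-memory set keyed on the dedupe tuple. This is a sketch under stated assumptions rather than kpulse's actual engine: AlertKey, ActiveAlerts, and their methods are hypothetical names, and the example key values are invented. It shows why a resolve is only sent for an alert that previously fired, and why a restart loses that knowledge.

```python
from typing import NamedTuple, Set


class AlertKey(NamedTuple):
    """Dedupe key fields as listed above: (monitor, namespace, kind, name, reason)."""
    monitor: str
    namespace: str
    kind: str
    name: str
    reason: str


class ActiveAlerts:
    """Hypothetical in-memory set of currently firing alerts."""

    def __init__(self) -> None:
        self._active: Set[AlertKey] = set()

    def fire(self, key: AlertKey) -> None:
        # Remember that this alert reached the engine.
        self._active.add(key)

    def resolve(self, key: AlertKey) -> bool:
        """Return True if a resolved message should be sent for this key."""
        if key in self._active:
            self._active.remove(key)
            return True
        # Never fired (or state was lost): nothing to resolve.
        return False


key = AlertKey("pvc_usage", "prod", "PersistentVolumeClaim", "data-0", "AboveWarn")

engine = ActiveAlerts()
engine.fire(key)
sent_before_restart = engine.resolve(key)  # True: the alert was active

engine.fire(key)
engine = ActiveAlerts()                    # simulated restart: set starts empty
sent_after_restart = engine.resolve(key)   # False: no record of the alert
```

Persisting the set across restarts would change the second outcome, but the notes above describe in-memory state only.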

When to turn it off

  • You forward everything into Alertmanager / PagerDuty and those tools already track alert state.
  • You want strict "fire and forget" semantics for compliance.