DENIC .de DNSSEC outage — 3.5 h registry-side trust failure traced to keytag 33834 collision and an alerting-layer fire-without-page
From CTI Weekly Summary — 2026-W19 (May 04 – May 10, 2026) · published 2026-05-11
On 2026-05-05, starting at approximately 19:30 UTC (Cloudflare's recorded incident-start timestamp), DENIC, the .de registry, began distributing invalid DNSSEC signatures for the .de TLD. Resolution of .de names failed across DNSSEC-validating resolvers for roughly 3.5 hours. Cloudflare's write-up describes potential impact on "millions of domains" without quantifying the count.
The 2026-05-08 post-mortem confirmed the root cause: a code defect in DENIC's third-generation custom signing infrastructure (deployed April 2026 atop Knot DNS). During a routine Zone-Signing-Key (ZSK) rotation, the defect generated three key pairs that were all assigned the same Key Tag (33834), while only one corresponding public DNSKEY record was published to the zone. RRSIG records produced by the two unpublished keys could therefore not be validated, and resolvers marked all .de delegations as "Bogus". Because the NSEC3 trust path was itself bogus, resolution also failed for .de domains that are not DNSSEC-signed. Knot DNS itself is not implicated; the bug was in DENIC's automation layer.
Cloudflare deployed an RFC 7646 Negative Trust Anchor on its resolvers at 22:17 UTC, a mitigation gap of roughly 2 hours 47 minutes from the recorded incident start. Critically, DENIC notes that the monitoring pipeline detected anomalous resolver behaviour but the alerting layer did not correctly forward the alerts: a fire-without-page failure (DENIC analysis blog, 2026-05-08 · Cloudflare blog · heise online, 2026-05-08 · daily 2026-05-09 · daily 2026-05-10 post-mortem UPDATE).
Defender takeaway: from a resolver's perspective, registry-side DNSSEC errors are indistinguishable from attacker-induced trust failures. Validating-resolver operators in DACH and EU public-sector environments should keep RFC 7646 Negative Trust Anchor capability live for continuity during registry incidents, and should ensure runbooks distinguish a "registry KSK/ZSK rollover defect" from a "zone-level attack on a downstream domain".
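For readers less familiar with Key Tags, the sketch below illustrates why a shared tag such as 33834 does not identify a key. It implements the key tag checksum from RFC 4034, Appendix B, which is only a 16-bit value that validators use to shortlist candidate DNSKEYs. It does not attempt to reproduce DENIC's defect. The key bytes are toy values, and `unpublished_signers` is a hypothetical helper showing the kind of pre-publication check that compares full public keys rather than tags.

```python
import struct


def key_tag(flags: int, protocol: int, algorithm: int, public_key: bytes) -> int:
    """Compute a DNSKEY key tag per RFC 4034, Appendix B.

    Not valid for the obsolete algorithm 1 (RSA/MD5), which uses a different rule.
    """
    # DNSKEY RDATA wire format: flags (2 bytes), protocol (1), algorithm (1), key.
    rdata = struct.pack("!HBB", flags, protocol, algorithm) + public_key
    acc = 0
    for i, byte in enumerate(rdata):
        # Even offsets are the high byte of a 16-bit word, odd offsets the low byte.
        acc += byte if i & 1 else byte << 8
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


def unpublished_signers(signing_keys: list[bytes], published_keys: list[bytes]) -> list[bytes]:
    """Hypothetical check: signing keys whose full public key is not published."""
    published = set(published_keys)
    return [key for key in signing_keys if key not in published]


# Toy key material (assumption: far too short to be real keys).
key_a = bytes([1, 2, 3, 4])
key_b = bytes([3, 2, 1, 4])  # two high-byte positions swapped, so the sum is unchanged

tag_a = key_tag(256, 3, 13, key_a)
tag_b = key_tag(256, 3, 13, key_b)
print(tag_a == tag_b, key_a == key_b)

# Only key_a is "published"; a tag-only comparison would not notice key_b.
missing = unpublished_signers([key_a, key_b], [key_a])
print(missing)
```

A check keyed on the tag alone would treat both keys as the published one, which is why comparing full key material before signatures go live is the safer design.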
The cross-finding for incident-response leaders is more general: alerting-pipeline reliability is itself a critical-infrastructure component, and a monitored anomaly that doesn't page is functionally an unmonitored anomaly.
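One way to catch a fire-without-page condition is to reconcile the alerts that monitoring raised against the alerts the paging system confirms it delivered. The following is a minimal sketch under stated assumptions: the `Alert` record, the `unpaged_alerts` function, the 300-second deadline, and the alert names are all hypothetical and do not describe DENIC's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    alert_id: str
    fired_at: float  # Unix seconds at which monitoring raised the alert


def unpaged_alerts(fired, delivered_ids, now, deadline_seconds=300.0):
    """Return alerts older than the deadline that have no delivery record."""
    overdue = []
    for alert in fired:
        age = now - alert.fired_at
        if alert.alert_id not in delivered_ids and age > deadline_seconds:
            overdue.append(alert)
    return overdue


# Illustrative data: three alerts fired, only the synthetic canary was paged.
fired = [
    Alert("dnssec-bogus-rate", 1000.0),
    Alert("canary", 1200.0),
    Alert("latency", 1290.0),
]
delivered = {"canary"}

stuck = unpaged_alerts(fired, delivered, now=1400.0)
print([alert.alert_id for alert in stuck])
```

Running such a reconciliation from a system that does not share the alerting layer's failure modes, and paging through a second channel when it finds overdue alerts, keeps the check from failing silently along with the pipeline it watches.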