Architecture
A single, end-to-end map of every component in the repository: what it is, what it reads, what it writes, and how it talks to the others. If you are debugging an unexpected commit or onboarding a new operator, start here.
One picture
                         ┌──────────────────────────┐
                         │  Claude Code routine     │
                         │  (cloud, scheduled)      │
                         │  prompt: "Read           │
                         │  prompts/daily-cti-      │
                         │  brief.md and execute"   │
                         └─────────────┬────────────┘
                                       │ git push
                                       ▼
  reads ┌──────────────────────────────────────────────────────────┐
──────► │                        repository                        │
        │                                                          │
        │  prompts/                  state/                        │
        │   ├ daily-cti-brief.md      ├ covered_items.json         │
        │   ├ weekly-summary.md       ├ cves_seen.json             │
        │   ├ CHANGELOG.md            ├ deep_dive_history.json     │
        │   ├ verification.md         └ run_log.json               │
        │   ├ brief-template.md      sources/                      │
        │   └ check-brief-fixes.md    └ sources.json               │
        │                            tools/                        │
        │  briefs/                    ├ check_brief.py (Phase 5.5) │
        │   ├ YYYY-MM-DD.md           └ fetch_source.py            │
        │   └ weekly/YYYY-Www.md     docs/                         │
        │                             ├ architecture.md (this file)│
        │  .claude/agents/            ├ operating.md               │
        │   ├ cti-research.md         └ analytics.md               │
        │   └ cti-verification.md                                  │
        └──────────────────────────────┬───────────────────────────┘
                                       │
                                       │ git push (claude/** branches only)
                                       ▼
                          ┌────────────────────────────┐
                          │  auto-merge-claude.yml     │
                          │  ff-merges (or merges with │
                          │  state/* → ours,           │
                          │  sources.json → theirs)    │
                          └────────────┬───────────────┘
                                       ▼
                                     main
                                       │
                                       ▼ workflow_run (success only)
                          ┌────────────────────────────┐
                          │  deploy-site.yml           │
                          │  runs site/build.py        │
                          │  force-pushes to gh-pages  │
                          └────────────┬───────────────┘
                                       ▼
                              GitHub Pages reader
                           (real HTML pages emitted
                           by site/build.py, no SPA)
Components
prompts/ — everything the routine loads at runtime
The two master prompts plus the runtime-policy / template / debug docs they reference. Each master prompt is the entire runtime contract for a routine; the routine is invoked with a one-line wrapper ("Read this prompt and execute it"). The supporting files are also under prompts/ because the master prompts Read them at runtime — they are part of the prompt machinery, not operator-facing documentation.
- prompts/daily-cti-brief.md: the daily brief. Phases 0–6 + a 5.5 self-check gate (preflight → parallel research → verification → deep dive → compose → state update → self-check → commit/push). Spawns four parallel research sub-agents.
- prompts/weekly-summary.md: the weekly consolidating summary (12 sections, 0–11). Reads the past 7 days of dailies, runs a Phase 2.5 verification & triage pass, then composes via two horizon sub-agents (W1 long-running campaigns + threat-actor developments + research findings + annual reports; W2 policy + regulatory). Same procedure as the daily (gold standard); the lens differs: broader threat picture, multi-day chains, research / actor developments, annual reports, long-horizon "looking ahead". The weekly may repeat a daily item with a new lens; the daily never repeats the weekly and carries no long-horizon synthesis.
- prompts/CHANGELOG.md: the version history of the prompts. Treat as the audit trail for editorial-policy changes.
- prompts/verification.md: the editorial / fake-news verification policy. The agent's quality gates are derived from this; the prompt's Phase 2 references it by name.
- prompts/brief-template.md: the canonical Markdown skeleton for the rendered brief / weekly. The prompt's Phase 4 Reads it before composing.
- prompts/check-brief-fixes.md: fix recipes for common tools/check_brief.py FAILs. The prompt's Phase 5.5 references it for remediation.
.claude/agents/ — custom sub-agent definitions
- cti-research.md: isolated context, per-role model bound by the agent definition's YAML frontmatter (operator rebindable). Phase 1 (daily) / Phase 2 (weekly) parallel research workers; also reused for verification follow-ups (max 3 per iteration). Embeds the WebFetch outbound-links template, the tools/fetch_source.py contract for known-403 hosts, the discovery-trace return format, the mandatory **Model:** self-identification line. v2.47 additions: env-var self-identification (reads CLAUDE_FRIENDLY_NAME / CLAUDE_MODEL_ID set by the harness, falls back to runtime-context reasoning), prior-coverage dedup at fetch time (reads work/<run-id>/prior_coverage.json before fetching to avoid spending wall-clock on already-covered items), URL-liveness ledger append (one TSV line per successful Source fetch to work/<run-id>/url-liveness.tsv so tools/check_brief.py can skip redundant HEAD/GET).
- cti-verification.md: read-only, isolated context, per-role model bound by the agent definition's frontmatter (Opus by default since v2.46, as gatekeeper of the publish gate). Phase 5.7 (daily) / Phase 4.7 (weekly) cold-reader verifier, runs AFTER tools/check_brief.py exits 0 (cheap mechanical gate first), looped iteratively (cap 5, fresh spawn each time, no shared memory; each iteration re-runs check_brief.py between fix and re-spawn). Same self-identification contract. v2.47 additions: F12 single-source-flag finding category promoted to numbered finding; iteration-rotation note (don't assume same model as prior iteration); env-var self-identification.
- cti-verification-alt.md: v2.47 Sonnet-pinned variant of cti-verification. Byte-identical operational system prompt; only the YAML model: frontmatter differs (sonnet vs opus). The Phase 5.7 / Phase 4.7 main-agent loop spawns this on even iterations (iter 2, iter 4) so model-specific blind spots are caught when the next iteration runs on a different model. The two verifier definitions move in lockstep: when you edit one, edit the other.
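The alternation between the two verifier definitions can be sketched as follows. This is a minimal illustration only: the function name `verifier_for_iteration` is hypothetical, and the assumption that odd iterations use the default `cti-verification` definition is inferred from the "even iterations" wording above rather than taken from the prompt itself.

```python
MAX_ITERATIONS = 5  # iteration cap stated in the agent description above


def verifier_for_iteration(iteration: int) -> str:
    """Pick the verifier definition for a 1-based iteration number."""
    if not 1 <= iteration <= MAX_ITERATIONS:
        raise ValueError(f"iteration must be between 1 and {MAX_ITERATIONS}")
    # Even iterations use the Sonnet-pinned variant; odd ones are assumed
    # to use the default definition.
    return "cti-verification-alt" if iteration % 2 == 0 else "cti-verification"


schedule = [verifier_for_iteration(i) for i in range(1, MAX_ITERATIONS + 1)]
print(schedule)
```

Because each iteration is a fresh spawn with no shared memory, a pure function of the iteration number is enough to decide which definition to load.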
briefs/ — the canonical output
One Markdown file per day at briefs/YYYY-MM-DD.md, one per ISO week at
briefs/weekly/YYYY-Www.md. Sections 0–8 per the structure pinned in
briefs/README.md:
0 TL;DR · 1 Immediate Actions (often absent) · 2 Active Threats / Trending
Actors / Notable Incidents & Disclosures · 3 Trending Vulnerabilities ·
4 Research & Investigative Reporting · 5 Updates to Prior Coverage ·
6 Deep Dive · 7 Action Items · 8 Verification Notes. Each individual H3
item carries a structured metadata footer (— *Source: … · Tags: … ·
Region: … [· CVE: …] [· CVSS: …] [· Vector: …] [· Auth: …] [· Status: …]*)
parseable by the build. These files are immutable once committed —
corrections happen in the next brief, not by editing past ones.
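To make the footer shape concrete, here is a minimal parsing sketch. It is not the parser in site/build.py (which is the authoritative one); the regular expression, the function names, and the sample footer line are assumptions for illustration. The three required keys follow the self-check description later in this document (Source, Tags, Region).

```python
import re

# Assumed shape: an em dash (U+2014), then one italic run of "Key: value"
# pairs separated by " · ". Optional keys (CVE, CVSS, ...) may be absent.
FOOTER_RE = re.compile(r"^\s*\u2014\s*\*(?P<body>.+)\*\s*$")
REQUIRED_KEYS = ("Source", "Tags", "Region")


def parse_footer(line: str) -> dict:
    """Return the footer fields as a dict, or {} if the line is not a footer."""
    match = FOOTER_RE.match(line)
    if match is None:
        return {}
    fields = {}
    for part in match.group("body").split(" · "):
        # Split on the first colon only, so URLs inside values stay intact.
        key, sep, value = part.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def missing_required(fields: dict) -> list:
    """List the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not fields.get(key)]


# Hypothetical footer line, not taken from a real brief.
sample = (
    "\u2014 *Source: [Vendor](https://example.com/advisory) · "
    "Tags: ransomware · Region: CH · CVE: CVE-2025-0001*"
)
parsed = parse_footer(sample)
print(parsed)
print(missing_required(parsed))
```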
state/ — rolling memory across runs
The agent re-reads these every run before writing.
- state/covered_items.json: full coverage records for every CVE / actor / campaign / incident / tool / annual report ever referenced. Each item has a structured appearances[] array ({date, section, brief_path, delta_summary}); the site uses this to render the Story timeline on each /entities/<key>/ page. CVE-only entries (in cves_seen.json but not yet promoted to a topic) synthesise a stub timeline from their flat brief-name list so every entity has a coverage timeline regardless of which state file carries it.
- state/cves_seen.json: flat fast-lookup CVE index for sub-agent dedup. A subset of covered_items.json (CVEs only) with a tighter schema.
- state/deep_dive_history.json: rolling 30-day list of {date, topic, category} entries used by Phase 3 to apply the deep-dive category-rotation rule.
- state/run_log.json: rolling 90-day per-run record: run_id (v2.47: deterministic <YYYY-MM-DD>-<sha8 of brief_path|started_minute>, for idempotent retry), model, sub-agent source allocation (sources_attempted / sources_used / items_returned per S1–S4), fetch_failures, items_published, deep_dive, verification.iterations[] (per-iteration model + verdict + truth/editorial/advisory finding counts), verification_iterations, verification_residual_count (v2.47: derived from final-iteration truth + editorial when verdict is NEEDS_FIXES; 0 when CLEAN; the v2.47 cap-breach signal builds on this), and sources_changed (v2.62: one {id, change, from, to, reason} per sources/sources.json edit the run made: status transitions, new candidates, and fetch-method / category / reliability / url corrections). Surfaced on the operations dashboard at /ops/.
- state/source_health.json: v2.47, written by tools/source_health.py on a weekly GitHub Actions cron. Bounded history (12 runs ≈ 3 months at weekly cadence) of (id, status_code, latency_ms, fetched_at, class) per active source. Lets the daily routine's source-demotion logic key off a stable failing pattern instead of the day-of-week luck of its single fire. v2.62: rendered on /ops/ (the "Sources" cluster's health-snapshot panel: class breakdown + any non-ok source).
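The deterministic run_id and the "no duplicate entry on retry" behaviour can be sketched as below. Several details are assumptions, labelled in the comments: that "sha8" means the first eight hex characters of a SHA-256 digest, that the minute is formatted as an ISO-like string, and that the date prefix is the run's start date. The helper names `make_run_id` and `upsert_run` are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def make_run_id(brief_path: str, started_at: datetime) -> str:
    """Build <YYYY-MM-DD>-<sha8 of brief_path|started_minute>."""
    # Assumption: the minute is rendered as YYYY-MM-DDTHH:MM before hashing.
    started_minute = started_at.strftime("%Y-%m-%dT%H:%M")
    payload = f"{brief_path}|{started_minute}".encode("utf-8")
    # Assumption: "sha8" is the first 8 hex characters of SHA-256.
    digest = hashlib.sha256(payload).hexdigest()[:8]
    return f"{started_at:%Y-%m-%d}-{digest}"


def upsert_run(runs: list, record: dict) -> None:
    """Replace the entry with the same run_id, or append a new one."""
    for index, existing in enumerate(runs):
        if existing.get("run_id") == record["run_id"]:
            runs[index] = record
            return
    runs.append(record)


first = datetime(2026, 6, 20, 5, 30, 12, tzinfo=timezone.utc)
retry = datetime(2026, 6, 20, 5, 30, 48, tzinfo=timezone.utc)  # same minute
run_id = make_run_id("briefs/2026-06-20.md", first)

runs = []
upsert_run(runs, {"run_id": run_id, "items_published": 9})
upsert_run(runs, {"run_id": make_run_id("briefs/2026-06-20.md", retry),
                  "items_published": 10})
print(run_id, len(runs))
```

A retry that starts within the same minute hashes to the same id, so the second write overwrites the first instead of adding a second entry.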
sources/ — the curated source list
sources/sources.json — ~80 entries spanning
national CERTs, vendor TI, journalism, breach trackers. Schema:
{
  "id": "stable-id-never-changes",   // referenced from covered_items.json
  "publisher": "Display name",
  "url": "https://...",
  "category": ["ch-eu", "vulns", ...],
  "reliability": "HIGH | MEDIUM | LOW",
  "language": ["en", "de", ...],
  "status": "active | candidate | demoted",
  "last_successful_fetch": "YYYY-MM-DD | null",
  "consecutive_failures": 0,
  "notes": "history of changes, dated"
}
The agent maintains this file autonomously per the lifecycle in the top-level README.
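A small validator makes the schema above easier to reason about. This is a sketch under assumptions, not the check that tools/check_brief.py performs: the function name `validate_source`, the treatment of every listed field as required, and the example entry are all hypothetical.

```python
REQUIRED_FIELDS = (
    "id", "publisher", "url", "category", "reliability", "language",
    "status", "last_successful_fetch", "consecutive_failures", "notes",
)
ALLOWED_VALUES = {
    "reliability": {"HIGH", "MEDIUM", "LOW"},
    "status": {"active", "candidate", "demoted"},
}


def validate_source(entry: dict) -> list:
    """Return human-readable problems; an empty list means the entry passes."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS
                if name not in entry]
    for name, allowed in ALLOWED_VALUES.items():
        if name in entry and entry[name] not in allowed:
            problems.append(f"{name}: unexpected value {entry[name]!r}")
    for name in ("category", "language"):
        value = entry.get(name)
        if name in entry and not (isinstance(value, list) and value):
            problems.append(f"{name}: expected a non-empty list")
    failures = entry.get("consecutive_failures")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if "consecutive_failures" in entry and (
        isinstance(failures, bool) or not isinstance(failures, int) or failures < 0
    ):
        problems.append("consecutive_failures: expected a non-negative integer")
    return problems


example = {
    "id": "example-cert",  # hypothetical entry, not taken from sources.json
    "publisher": "Example CERT",
    "url": "https://cert.example/advisories",
    "category": ["vulns"],
    "reliability": "HIGH",
    "language": ["en"],
    "status": "active",
    "last_successful_fetch": None,
    "consecutive_failures": 0,
    "notes": "",
}
print(validate_source(example))
```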
tools/ — small operator-shipped helpers
- tools/fetch_source.py: stdlib-only Python bridge that re-issues HTTP requests with a current desktop-Chrome User-Agent (v2.62: Chrome 138 + the matching Sec-CH-UA client-hint headers a real Chrome sends, so WAFs that cross-check UA ↔ client-hints stop filtering it; the bump recovered databreaches.net and prodaft.com in the 2026-06-20 audit). Solves the recurring 403 / 302-to-login that the routine container hits on high-signal publishers (CISA pages, the Swiss NCSC Cyber Security Hub) where the upstream WAF filters the agent's default UA. Mandatory every run for CISA + NCSC.ch: do not even attempt WebFetch on those hosts; go straight to the bridge. Structured subcommands (cisa-kev, ncsc-csh, enisa-euvd, bsi-rss/csaf, ncsc-nl, cert-eu, cert-fr, ico-uk, sec-edgar, feed, msrc) wrap the publishers whose listing pages are JS-rendered SPAs. Read-only by design: no auth, no JS execution, no third-party deps; the v2.52 host allow-list was removed in favour of the layer-3 SSRF defences (https-only, resolved-IP deny list, redirect re-validation, body-size cap).
- tools/check_brief.py: the institutionalised Phase 5.5 self-check gate. Stdlib-only Python script that bundles every pre-commit consistency check (state JSON parses, CVE sync, H3 footer presence and field completeness, taxonomy validation, UPDATE citations, multi-CVE / multi-source / primary-source-quality checks, tools/fetch_source.py-for-CISA/NCSC.ch enforcement, covered_items.json appearance heuristic, run_log.json Ops-dashboard population, sources.json last-fetched bookkeeping, IOC heuristic scan with version-string suppression) plus runs the build-side smoke tests in site/test_build.py. Imports the footer parser + taxonomy loader from site/build.py so script and build agree on parsing rules. Read-only: the agent fixes drift, the script reports it. Non-zero exit aborts the commit. Maintained as part of the agent's self-evolution authority. v2.47 additions: cap-breach WARN (final-iteration NEEDS_FIXES); verification_residual_count derived from final iteration's truth + editorial; deterministic run_id field required + idempotent (no duplicate runs[] entries); tldr-deadline-lead WARN (PD-13 enforcement at the bullet level); aggregator-only-sourcing WARN (≥2 Sources all from news aggregators); single-source-flag WARN (single Source missing [SINGLE-SOURCE]); URL-liveness cache (skip live HEAD/GET on URLs the sub-agents already verified live in work/<run-id>/url-liveness.tsv).
- tools/source_candidates.py: v2.47. Walks last 30 days of briefs, counts every outbound-link host, subtracts hosts already in sources.json and the news-aggregator allowlist, outputs the top-N missing-but-cited domains with citation counts and brief samples. Operator runs manually to spot publishers worth promoting to status: candidate. Pure post-hoc analytics; no runtime cost.
- tools/source_health.py: v2.47, rebuilt v2.63. Periodic accessibility probe of every source (active + candidate + demoted), now probed via its actual recipe: feed (with common-path discovery) for RSS sources, the documented tools/fetch_source.py subcommand for api / bridge sources (so the bridge recipes themselves are verified), and a browser-UA HEAD→GET (Chrome-138 UA, GET-retry-after-403) for webfetch. Records (id, status, fetch_method, status_code, class, action, action_reason, fetched_at) to state/source_health.json (schema_version 2, 12-run bounded history). The derived action ∈ none | needs-bridge | needs-demote is what the Ops dashboard Health panel floats: only the unsolved problems. Run by the source-health.yml GitHub Action on Sundays at 04:30 UTC, on workflow_dispatch, and at the end of every daily / weekly routine run (Phase 5 / Phase 4) so the snapshot is fresh every fire.
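Two of the SSRF defences named for tools/fetch_source.py (https-only and the resolved-IP deny list) can be illustrated with the standard library. This is a sketch, not the bridge's actual code: the function names are hypothetical, the exact deny-list categories are an assumption, and DNS resolution is left to the caller so the logic stays testable offline.

```python
import ipaddress
from typing import List, Optional
from urllib.parse import urlsplit


def is_denied_ip(address: str) -> bool:
    """True for addresses that should never be fetched (assumed categories)."""
    ip = ipaddress.ip_address(address)
    return (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_multicast or ip.is_reserved or ip.is_unspecified
    )


def check_target(url: str, resolved_ips: List[str]) -> Optional[str]:
    """Return a refusal reason, or None when the target looks fetchable."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return "scheme is not https"
    if not parts.hostname:
        return "URL has no hostname"
    if not resolved_ips:
        return "host did not resolve"
    for address in resolved_ips:
        if is_denied_ip(address):
            return f"resolved address {address} is on the deny list"
    return None


print(check_target("https://example.com/feed", ["93.184.216.34"]))
print(check_target("https://internal.example/", ["169.254.169.254"]))
```

Redirect re-validation would amount to running the same check again on every redirect hop before following it; the body-size cap is not shown here.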
docs/ — operator-facing documentation
System reference for operators, contributors, and curious readers. Pure docs — nothing here is loaded by the prompt at runtime (that material lives under prompts/).
- docs/architecture.md: this file. End-to-end map of every component.
- docs/operating.md: operator runbook: GitHub App setup, Pages enablement, ops dashboard, sub-agent capability ceiling, troubleshooting.
- docs/analytics.md: public-facing privacy disclosure (what we measure, what we don't).
.github/workflows/ — CI
- auto-merge-claude.yml: triggers on push to claude/**. The only path by which claude/** commits land on main; fast-forwards when the feature branch is a strict descendant, and falls back to a regular merge with auto-resolution for state/*.json (--ours) and sources/sources.json (--theirs) on a true divergence. Deletes the feature branch on success. Belongs to the publishing chain; do not edit unless you understand the resolution rules in docs/operating.md.
- deploy-site.yml: triggers on push to main whenever the site inputs change. Runs site/build.py, uploads the bundle to GitHub Pages.
- source-health.yml: v2.47. Weekly cron (Sundays 04:30 UTC) + workflow_dispatch. Runs tools/source_health.py HEAD-only against every active source, commits state/source_health.json directly to main (state/* sits in the auto-merge auto-resolution allowlist, so a concurrent claude/* push won't race). Independent of the daily routine.
The three workflows are independent. The site is a consumer of the agent's output and never writes back.
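The auto-resolution rule used by auto-merge-claude.yml on a true divergence can be expressed as a small lookup. The workflow itself is YAML and shell; this Python sketch only models the decision, and the names `resolution_for` and `plan` are hypothetical. Paths outside the two rules are treated as needing manual attention, which is an assumption about what the workflow does with them.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Optional, Tuple


def resolution_for(path: str) -> Optional[str]:
    """Return 'ours', 'theirs', or None when no auto-resolution rule applies."""
    p = PurePosixPath(path)
    # state/*.json resolves with --ours (direct children of state/ only).
    if p.parent == PurePosixPath("state") and p.suffix == ".json":
        return "ours"
    # sources/sources.json resolves with --theirs.
    if p == PurePosixPath("sources/sources.json"):
        return "theirs"
    return None


def plan(conflicted: Iterable[str]) -> Tuple[Dict[str, str], List[str]]:
    """Split conflicted paths into auto-resolvable and unresolved ones."""
    auto: Dict[str, str] = {}
    unresolved: List[str] = []
    for path in conflicted:
        side = resolution_for(path)
        if side is None:
            unresolved.append(path)
        else:
            auto[path] = side
    return auto, unresolved


auto, unresolved = plan(
    ["state/run_log.json", "sources/sources.json", "briefs/2026-06-20.md"]
)
print(auto, unresolved)
```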
site/ — the public reader
A stdlib-only Python static-site generator (site/build.py) emits a real
HTML page for every URL — home, every brief, every per-item block, every
entity page (CVE, actor, campaign, incident, tool, advisory, annual
report, research, technique — all rendered through one render_entity_page
function), every source page, every tag and region index, the operations
dashboard, the about pages. JavaScript only enhances (topbar search
autocomplete via data/search.json, GitHub-stars badge, brief-page filter
chips, theme cycle, copy-link, SPA-redirect bootstrap on /). With JS
disabled the site is fully readable. See site/README.md
for the internal layout. The site is read-only with respect to the rest
of the repo:
- It only reads briefs/, state/, sources/, README.md, docs/*.md, prompts/*.md (including CHANGELOG.md), and site/taxonomy.yaml.
- It writes nothing back: its build artifact lives entirely under site/_site/ (gitignored locally; force-pushed to the gh-pages branch by the CI workflow).
- Every entity is canonical at /entities/<key>/. The legacy /cves/<id>/ and /topics/<key>/ URLs are HTML meta-refresh redirect stubs: they still resolve, search engines see the canonical, and internal links (CVE pills, brief-page reference blocks, search results) point at the canonical directly without a redirect hop. Type-filtered overviews live at /cves/ (type=cve) and /topics/ (type≠cve); the unified overview at /entities/ has the same Ops-style KPI strip + type-distribution donut + recent-coverage sparkline as every entity page.
- It emits eleven RSS feeds: three main feeds (/feed.xml, daily, last 30; /feed-weekly.xml, weekly, last 30; /feed-items.xml, per-item granular, last 50) plus eight sector slices (v2.47): /feed-public-sector.xml, /feed-healthcare.xml, /feed-finance.xml, /feed-energy.xml, /feed-ot-ics.xml, /feed-defense.xml, /feed-telco.xml, /feed-education.xml. Each slice is the per-item feed filtered by the relevant Sector / Tags so subscribers can subscribe to the slice they care about. All eleven use the actual git-commit timestamp of the underlying brief as <pubDate> (not midnight-of-brief-date). All eleven are listed on /feeds/.
- The unified search index at _site/data/search.json covers briefs, entities (every type), and sources.
- The operations dashboard at /ops/ is rendered server-side from state/run_log.json at build time. The same SVG chart primitives (_ops_svg_sparkline / _ops_svg_bars / _ops_svg_donut / _ops_svg_heatmap / _ops_kpi_tile) power the entity pages and the CVE / topic / entity overview KPI strips.
- v2.47: /trends/ cross-brief threat-class trend dashboard (8 cohort sparklines bucketed by ISO week: ransomware, actively-exploited vulnerabilities, public-sector, OT/ICS, supply-chain, AI-abuse, Switzerland+Europe, nation-state); /feeds/ single discovery page for all 11 RSS feeds; per-item delta <details> block inside each brief item whose CVE / topic key has more than one appearance in covered_items.json; "Editorial choices" <details> block at the bottom of each daily brief surfacing items dropped from § 8 Verification Notes; horizontal actor-timeline strip on entity pages of type actor / campaign / incident / tool / annual-report.
- v2.47: data/site.json.github.{url,stars} populated at build time via best-effort GitHub API fetch; the topbar's wireGithubBadge() in assets/js/app.js consumes it to render a live star count next to the GitHub icon. Build never fails on the fetch (silently degrades to icon-only when unreachable / rate-limited).
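For readers unfamiliar with meta-refresh redirect stubs, the sketch below shows what such a legacy page could look like. The markup is an assumption rather than the output of site/build.py, the function name `redirect_stub` is hypothetical, and the entity key in the example is made up.

```python
from html import escape


def redirect_stub(canonical_path: str) -> str:
    """Render a tiny HTML page that forwards to the canonical entity URL."""
    target = escape(canonical_path, quote=True)
    return (
        "<!doctype html>\n"
        '<html lang="en"><head><meta charset="utf-8">\n'
        # The canonical link tells crawlers which URL to index.
        f'<link rel="canonical" href="{target}">\n'
        # The meta refresh forwards browsers without any JavaScript.
        f'<meta http-equiv="refresh" content="0; url={target}">\n'
        "<title>Redirecting</title></head>\n"
        f'<body><p>Moved to <a href="{target}">{target}</a>.</p></body></html>\n'
    )


# Hypothetical key; the real key format is defined by the build.
page = redirect_stub("/entities/cve-2025-0001/")
print(page)
```

The visible link in the body keeps the stub usable even if a client ignores the refresh header.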
site/taxonomy.yaml is the controlled vocabulary
for every metadata-footer value (themes / sectors / regions / nexus /
cve_types / cve_vectors / cve_auth / cve_status / sections). The build
refuses any post-cut-over item using a value not in this file.
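The controlled-vocabulary check can be sketched as follows. It assumes the taxonomy has already been loaded into a dict of sets (the real loader lives in site/build.py and is not reproduced here); the function name `unknown_values`, the cut-over date, and the vocabulary values are hypothetical.

```python
from datetime import date
from typing import Dict, List, Set


def unknown_values(
    item_date: date,
    footer_values: Dict[str, List[str]],
    taxonomy: Dict[str, Set[str]],
    cutover: date,
) -> Dict[str, List[str]]:
    """Return values not in the taxonomy, grouped by vocabulary name."""
    # Items dated before the cut-over are exempt from validation.
    if item_date < cutover:
        return {}
    problems: Dict[str, List[str]] = {}
    for group, values in footer_values.items():
        allowed = taxonomy.get(group, set())
        bad = [value for value in values if value not in allowed]
        if bad:
            problems[group] = bad
    return problems


# Hypothetical vocabulary and cut-over date, for illustration only.
taxonomy = {"regions": {"ch", "eu", "global"}, "sectors": {"healthcare", "finance"}}
cutover = date(2026, 1, 1)
item = {"regions": ["ch", "mars"], "sectors": ["finance"]}

print(unknown_values(date(2026, 6, 20), item, taxonomy, cutover))
print(unknown_values(date(2025, 12, 31), item, taxonomy, cutover))
```

A build following this rule would fail when the first call returns a non-empty dict and pass older items through untouched.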
Data flow per routine run
┌──────────────┐  preflight   ┌──────────────────────────────┐
│  routine     │─────────────▶│ load sources.json (active)   │
│  fires       │              │ load past 7 days of briefs   │
│  (operator-  │              │ load covered_items.json      │
│  scheduled)  │              │ load cves_seen.json          │
└──────────┬───┘              └───────────────────┬──────────┘
           │                                      │
           ▼ spawn 4 sub-agents in parallel       │
┌──────────────────────────────┐                  │
│ S1 active threats / vulns    │                  │
│ S2 CH/EU & public sector     │                  │
│ S3 research & journalism     │                  │
│ S4 incidents & disclosures   │                  │
└──────────┬───────────────────┘                  │
           │ flexible Markdown returns            │
           ▼                                      │
┌──────────────────────────────┐                  │
│ verify (two-source / CERT)   │                  │
│ dedup vs preflight context   │                  │
│ rank, apply deep-dive        │                  │
│ category-rotation rule       │                  │
└──────────┬───────────────────┘                  │
           ▼                                      │
┌──────────────────────────────┐                  │
│ Write briefs/YYYY-MM-DD.md   │                  │
│ (with prompt-version badge)  │                  │
└──────────┬───────────────────┘                  │
           ▼                                      │
┌──────────────────────────────────────────────────────────────┐
│ Phase 5: update state/covered_items.json,                    │
│ state/cves_seen.json, state/deep_dive_history.json,          │
│ state/run_log.json (full sub-agent allocation +              │
│ fetch_failures + verification_iterations +                   │
│ verification_residual_count; Ops dashboard depends on this), │
│ sources/sources.json (last-seen, demotions, candidates)      │
└──────────┬───────────────────────────────────────────────────┘
           ▼
┌──────────────────────────────┐
│ Phase 5.5: self-check gate   │  python3 tools/check_brief.py
│ (institutionalised script):  │  ─ state JSON parses
│                              │  ─ every brief CVE in cves_seen
│                              │  ─ core sections vs covered
│                              │    appearances heuristic
│                              │  ─ every UPDATE has inline cite
│                              │  ─ every H3 has a v2 footer
│                              │    (Source ≥1 link + Tags + Region)
│                              │  ─ CVE entries carry CVE / Vector /
│                              │    Auth / Status
│                              │  ─ multi-CVE: shared CVSS or
│                              │    per-CVE breakdown
│                              │  ─ primary-source quality (NVD /
│                              │    CERT as sole primary → WARN)
│                              │  ─ tools/fetch_source.py used for
│                              │    CISA + NCSC.ch when 403 hit
│                              │  ─ run_log.json today fully
│                              │    populated (Ops dashboard)
│                              │  ─ ≥1 source fetched today
│                              │  ─ heuristic IOC scan
│                              │  ─ taxonomy validation
│                              │  ─ site/test_build.py passes
│ exit != 0 → abort the rest   │
│ of the run; brief stays on   │
│ disk; next run rebuilds      │
└──────────┬───────────────────┘
           ▼
┌──────────────────────────────┐
│ Phase 5.7: verification      │  cti-verification (Opus) loop
│ sub-agent loop (≤5 iters):   │  ─ runs only after Phase 5.5 = 0
│ truth gate                   │  ─ each iteration: receive report,
│   ─ every URL fetched        │    apply fixes, re-update state,
│   ─ every claim grounded     │    re-run check_brief.py, then
│ editorial-quality gate       │    re-spawn fresh verifier
│   ─ relevance to CH/EU SOC   │  ─ CLEAN verdict ⇒ proceed to commit
│   ─ vendor advisory ≻ NVD/   │  ─ iteration 5 NEEDS_FIXES ⇒
│     CERT as primary          │    fail-open, residuals logged
│   ─ drop low-relevance       │
│   ─ deepen unclear items     │
│     (≤3 follow-up subagents) │
│   ─ surface contradictions   │
│   ─ pursue missed angles     │
│ gate to publish; cap is      │
│ safety valve, not goal       │
└──────────┬───────────────────┘
           ▼
┌──────────────────────────────┐
│ git commit + push            │  push to claude/<name> branch only;
│                              │  auto-merge-claude.yml promotes to main;
│                              │  deploy-site.yml rebuilds gh-pages.
└──────────────────────────────┘
The agent never bypasses any of these phases — Phase 0 is a hard prerequisite for Phase 1, Phase 5 (state update) is a hard prerequisite for Phase 5.5 (self-check gate), which is a hard prerequisite for Phase 5.7 (verification sub-agent), which is a hard prerequisite for Phase 6 (commit). If a phase fails, the prompt instructs the agent to stop and surface the error rather than silently continuing.
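The hard-prerequisite chain can be pictured as a sequential runner that stops at the first failing phase. In the real system the ordering is enforced by the prompt, not by code, so this is only an illustration; `run_phases`, `PhaseFailed`, and the stand-in steps are hypothetical.

```python
from typing import Callable, List, Tuple


class PhaseFailed(RuntimeError):
    """Raised when a phase reports failure; later phases never start."""


def run_phases(phases: List[Tuple[str, Callable[[], bool]]]) -> List[str]:
    """Run phases in order and stop at the first one that returns False."""
    completed: List[str] = []
    for name, step in phases:
        if not step():
            raise PhaseFailed(f"{name} failed; completed so far: {completed}")
        completed.append(name)
    return completed


calls: List[str] = []


def step(name: str, result: bool) -> Callable[[], bool]:
    """Build a stand-in phase that records its call and returns `result`."""
    def run() -> bool:
        calls.append(name)
        return result
    return run


demo = [
    ("Phase 5", step("Phase 5", True)),
    ("Phase 5.5", step("Phase 5.5", False)),  # simulated self-check failure
    ("Phase 5.7", step("Phase 5.7", True)),
    ("Phase 6", step("Phase 6", True)),
]

try:
    run_phases(demo)
except PhaseFailed as exc:
    print(exc)
print(calls)
```

In this run the simulated Phase 5.5 failure means the verifier and the commit step are never invoked, which mirrors the "stop and surface the error" instruction.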
Adding a new component
A safe pattern for extending the system without affecting the agent:
- Site-only feature (new view, new search facet). Edit
site/. The agent's run is untouched. - New data field (e.g. add a
severitytocovered_items.json). Update the prompt's Phase 5 instructions, then re-flow the new field throughsite/build.pyand the renderers. Old briefs stay valid because the field is optional. - New source category. Edit
sources/sources.json(add the entry) and the category list inprompts/daily-cti-brief.mdPhase 1 (so a sub-agent picks it up). The site's category filter picks it up on the next build automatically. - New routine (e.g. monthly horizon scan). Add a prompt in
prompts/, create a new Claude Code routine pointing at it, and add a parallel workflow in.github/workflows/if you want CI to react to its output.
Anything more invasive (new state file, new repo layout) — write down the reasoning in the commit message and bump the prompt version with a CHANGELOG entry explaining the why before making the change. The agent's prompts are the load-bearing part of the system; small contract changes are easy to ship by accident and hard to roll back.