init: scaffold psyc — defensive CTI routing & evidence-sealing platform

Stage-1 vertical slice: Pydantic Case model, SQLAlchemy Core persistence, URLhaus Scoutline fetcher, FastAPI/Jinja cockpit (cases list + detail), flat Typer CLI, Result[T, E] type module, structlog config. Architecture in docs/dossier.md; 12-fold style guide in docs/style.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-14 12:43:47 +02:00
commit e04c6c96d8
30 changed files with 8271 additions and 0 deletions
--- a/docs/archive/waypoints_firstpass.md
+++ b/docs/archive/waypoints_firstpass.md
@@ -0,0 +1,221 @@
+# Review — API-Eligible Cyber Threat Reporting & Escalation Platforms (Draft v1)
+
+**Reviewer:** Claude (Opus 4.7, 1M context)
+**Review date:** 2026-05-13
+**Document reviewed:** `waypoints.md` (first draft)
+**Verdict:** Strong bones. Tone-perfect for white-hat defensive work — machine-to-machine, no vigilante framing. Publishable as an internal whitepaper after the critical fixes below.
+
+---
+
+## 1. What's Already Solid
+
+Don't change these — they're load-bearing and correct.
+
+- **Section 1.1 vs 1.2 split** (normal vs imminent harm) — exactly the right hinge for routing decisions.
+- **Section 8 (never-submit list)** — covers GDPR / exploitation amplification / credential leakage failure modes well.
+- **Section 9 normalized object** — the right abstraction. Transform-to-target instead of N bespoke pipelines.
+- **Section 10 architecture sentence** — the whole project on one line: *Sensors → OpenCTI → TheHive/IRIS → routing engine → MISP + abuse APIs + CERT/AIS → sanitized public.*
+
+---
+
+## 2. Critical Fixes (do these before this leaves draft)
+
+### 2.1 Geography mismatch — CISA AIS at #1 is US-only
+
+For European-focused work, **MISP via CIRCL.lu** (Luxembourg) or the **ENISA CSIRTs Network** is the workhorse. CISA AIS does not cover EU institutions.
+
+**Action:** Swap priorities #1 ↔ #2 (MISP first, AIS second). Add a row for **CERT-EU** specifically for European institutions.
+
+### 2.2 National CERTs are referenced generically but never named
+
+The doc says "National CERT/CSIRT" everywhere but never resolves it to an actionable receiver.
+
+**Action:** Add a small table after Section 1:
+
+| Country | Receiver              | Channel                                  |
+|---------|-----------------------|------------------------------------------|
+| DE      | BSI / CERT-Bund       | reports@cert-bund.de, MISP community     |
+| FR      | ANSSI / CERT-FR       | TAXII feed                               |
+| UK      | NCSC-UK               | structured email + early-warning service |
+| NL      | NCSC-NL               | MISP                                     |
+| ES      | CCN-CERT, INCIBE-CERT | MISP                                     |
+| EU      | CERT-EU, Europol EC3  | TLP-tagged MISP                          |
+
+The routing engine should pick the right one based on victim country.
+
+> **Note on Europol EC3:** they handle *criminal cases*, not first-call technical sharing. Route through your national CERT first; EC3 receives via national channels for cross-border coordination.
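+
+A minimal sketch of that lookup, assuming the victim country arrives as an ISO 3166-1 alpha-2 code. `Receiver`, `RECEIVERS`, and `pick_receiver` are hypothetical names, and only a subset of the table rows is copied in:
+
+```python
+from typing import NamedTuple, Optional
+
+
+class Receiver(NamedTuple):
+    name: str
+    channel: str
+
+
+# Subset of the table above; values are copied verbatim, not verified here.
+RECEIVERS = {
+    "DE": Receiver("BSI / CERT-Bund", "reports@cert-bund.de, MISP community"),
+    "FR": Receiver("ANSSI / CERT-FR", "TAXII feed"),
+    "NL": Receiver("NCSC-NL", "MISP"),
+}
+
+
+def pick_receiver(victim_country: str) -> Optional[Receiver]:
+    """Return the national receiver, or None so a human picks the route."""
+    return RECEIVERS.get(victim_country.strip().upper())
+```
+
+Returning `None` for an unmapped country, rather than guessing a fallback, keeps the decision with the human reviewer.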
+
+### 2.3 Domain registrar abuse is missing from Section 1.3
+
+Cloudflare is covered, but registrars (Namecheap, Tucows, GoDaddy) and registries (EURid for `.eu`, DENIC for `.de`) are often the faster takedown path.
+
+**Action:** Add to the malicious-infrastructure flow:
+*registrar abuse contact from WHOIS → registrar abuse API/email → registry as escalation.*
+
+### 2.4 Severity scale `A|B|C|D|E` is unusual and undefined
+
+Either define it inline or replace with the standard `low|medium|high|critical` (CVSS-style) or NIS2 severity categories for EU consistency. Receivers will normalize anyway — but defining it lets the routing engine make automatic decisions.
+
+### 2.5 Normalized object missing an `actor` block
+
+You have `victim` but no `actor`. Add:
+
+```json
+"actor": {
+  "name": "Adira",
+  "aliases": [],
+  "campaign": "",
+  "confidence": "A1|A2|B1|B2|C2|C3|D|E|F"
+}
+```
+
+This field connects the doc to the project mission and lets the routing matrix differentiate actor-specific sightings from generic abuse reports.
+
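+(`A1`–`F6` is the Admiralty Code range: a source-reliability letter `A`–`F` paired with an information-credibility digit `1`–`6`. It is a widely used CTI convention. If that's too much, fall back to `low|medium|high`.)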
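+
+A small validator sketch for that field, assuming the standard Admiralty layout of one reliability letter (`A`–`F`) followed by one credibility digit (`1`–`6`). `parse_admiralty` is a hypothetical helper name:
+
+```python
+import re
+from typing import Tuple
+
+# One source-reliability letter followed by one information-credibility digit.
+_ADMIRALTY = re.compile(r"[A-F][1-6]")
+
+
+def parse_admiralty(code: str) -> Tuple[str, int]:
+    """Split a code such as 'B2' into (reliability, credibility)."""
+    normalized = code.strip().upper()
+    if not _ADMIRALTY.fullmatch(normalized):
+        raise ValueError(f"not an Admiralty code: {code!r}")
+    return normalized[0], int(normalized[1])
+```
+
+Rejecting bare letters such as `F` at parse time forces the report writer to state both halves of the rating.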
+
+### 2.6 PII at submission time is a GDPR landmine
+
+Section 9 has `observables.emails: []`. Submitting victim email addresses to AbuseIPDB or VirusTotal is a personal-data transfer under GDPR.
+
+**Action:** Add a pre-submission sanitizer step that:
+
+- Hashes / redacts emails to `local-part-hash@domain` when destination is public
+- Strips PII from URLs (tokens, query params containing identifiers)
+- Keeps raw originals only in `evidence.raw_evidence_location` (internal-only storage)
+
+This belongs in the doc *before* the normalized-object section, not as an afterthought.
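+
+The first two sanitizer steps could be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `redact_email` and `strip_url_pii` are hypothetical names, the keyed hash (HMAC-SHA256) is an assumption made so that local parts cannot be recovered by guessing, and the URL helper simply drops userinfo, query, and fragment instead of inspecting individual parameters.
+
+```python
+import hashlib
+import hmac
+from urllib.parse import urlsplit, urlunsplit
+
+
+def redact_email(address: str, key: bytes) -> str:
+    """Replace the local part with a keyed hash, keeping the domain."""
+    local, sep, domain = address.rpartition("@")
+    if not sep or not local or not domain:
+        raise ValueError(f"not an email address: {address!r}")
+    digest = hmac.new(key, local.lower().encode("utf-8"), hashlib.sha256)
+    return f"{digest.hexdigest()[:16]}@{domain.lower()}"
+
+
+def strip_url_pii(url: str) -> str:
+    """Drop userinfo, query string, and fragment from a URL.
+
+    IPv6 literal hosts are not handled in this sketch.
+    """
+    parts = urlsplit(url)
+    host = parts.hostname or ""
+    if parts.port is not None:
+        host = f"{host}:{parts.port}"
+    return urlunsplit((parts.scheme, host, parts.path, "", ""))
+```
+
+The same key yields the same hash, so receivers can still correlate repeated sightings of one address without learning it.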
+
+---
+
+## 3. High-Value Additions
+
+### 3.1 TLP enforcement at the routing layer
+
+Nothing in the current schema *prevents* TLP:RED data being routed to a TLP:CLEAR destination.
+
+**Action:** Add a routing precondition: `submission.tlp <= destination.max_tlp_allowed`.
+
+- CISA AIS rejects TLP:RED
+- Cloudflare doesn't care
+- Spamhaus has its own rules
+- MISP communities each have their own ceiling
+
+Encode the ceiling per destination in the routing matrix.
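+
+The precondition needs an ordering on TLP labels to be checkable. A minimal sketch, assuming the TLP 2.0 labels ordered from least to most restrictive; the destination names and ceilings in `MAX_TLP_ALLOWED` are hypothetical placeholders, not the real policies of any receiver:
+
+```python
+from enum import IntEnum
+
+
+class TLP(IntEnum):
+    CLEAR = 0
+    GREEN = 1
+    AMBER = 2
+    AMBER_STRICT = 3
+    RED = 4
+
+
+# Hypothetical ceilings; the real values belong in the routing matrix.
+MAX_TLP_ALLOWED = {
+    "public_blocklist": TLP.CLEAR,
+    "trusted_misp_community": TLP.AMBER,
+}
+
+
+def may_route(submission_tlp: TLP, destination: str) -> bool:
+    """True only if submission.tlp <= destination.max_tlp_allowed."""
+    ceiling = MAX_TLP_ALLOWED.get(destination)
+    if ceiling is None:
+        return False  # unknown destination: deny by default
+    return submission_tlp <= ceiling
+```
+
+Denying unknown destinations by default means a missing matrix entry fails closed rather than leaking restricted data.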
+
+### 3.2 STIX 2.1 as the serialization
+
+Right now the doc implies *internal object → bespoke transform per API*. Cheaper and more standard:
+
+**internal object → STIX 2.1 bundle → minor adapter per destination**
+
+OpenCTI and CISA AIS are STIX-native, MISP imports and exports STIX, and most other CTI tools speak it too. One serializer beats thirteen, and you get free interop with anything that already speaks STIX.
+
+### 3.3 Rate-limit budgets
+
+Many of these APIs have strict limits:
+
+- AbuseIPDB free tier: 1000 reports/day
+- VirusTotal public API: 4 req/min
+- Spamhaus: per-submitter quotas
+- Cloudflare: per-account rate limits
+
+Without a token-bucket per destination, high-confidence submissions get silently dropped during bursts.
+
+**Action:** Add a `destination_quota` field to the routing matrix and an enforcement layer.
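+
+One way the enforcement layer could look is a token bucket per destination. This is a sketch with a hypothetical `TokenBucket` class; the clock is injectable so the behaviour can be tested without sleeping, and a caller that gets `False` is expected to queue the submission rather than drop it:
+
+```python
+import time
+from typing import Callable
+
+
+class TokenBucket:
+    """Allow bursts up to `capacity`, refilled at `refill_per_second`."""
+
+    def __init__(
+        self,
+        capacity: float,
+        refill_per_second: float,
+        clock: Callable[[], float] = time.monotonic,
+    ) -> None:
+        self.capacity = capacity
+        self.refill_per_second = refill_per_second
+        self._clock = clock
+        self._tokens = capacity
+        self._updated = clock()
+
+    def try_acquire(self, cost: float = 1.0) -> bool:
+        """Spend `cost` tokens if available; never blocks."""
+        now = self._clock()
+        elapsed = now - self._updated
+        self._updated = now
+        self._tokens = min(
+            self.capacity, self._tokens + elapsed * self.refill_per_second
+        )
+        if self._tokens >= cost:
+            self._tokens -= cost
+            return True
+        return False
+```
+
+For a limit like 4 requests per minute, a bucket with `capacity=4` and `refill_per_second=4 / 60` would be one plausible configuration.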
+
+### 3.4 Feedback loop is missing
+
+When you submit to URLhaus, you can poll for status. When you submit to MISP, you get sightings. When you submit to Cloudflare, you get a case number. These should flow back into your OpenCTI graph as evidence-of-effectiveness.
+
+Without this, you're operating open-loop — you don't know which destinations actually act on your reports.
+
+**Action:** Add a Section 11 "Receipt and Effectiveness Tracking" that defines:
+
+- Per-destination receipt schema (case ID, ack timestamp, outcome status)
+- Polling cadence per destination
+- A success metric per destination type (takedowns confirmed, sightings count, classification adopted)
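+
+A possible shape for the receipt and its polling rule, as a sketch only. `Receipt`, `Outcome`, and `next_poll_at` are hypothetical names, and the three outcome values are placeholders for whatever each destination actually reports:
+
+```python
+from dataclasses import dataclass
+from datetime import datetime, timedelta, timezone
+from enum import Enum
+from typing import Optional
+
+
+class Outcome(Enum):
+    PENDING = "pending"
+    ACTED = "acted"
+    REJECTED = "rejected"
+
+
+@dataclass(frozen=True)
+class Receipt:
+    destination: str
+    case_id: str
+    acked_at: datetime
+    outcome: Outcome = Outcome.PENDING
+
+
+def next_poll_at(
+    receipt: Receipt,
+    cadence: timedelta,
+    last_polled: Optional[datetime] = None,
+) -> Optional[datetime]:
+    """When to poll again, or None once the outcome is final."""
+    if receipt.outcome is not Outcome.PENDING:
+        return None
+    return (last_polled or receipt.acked_at) + cadence
+```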
+
+### 3.5 NoMoreRansom (NMR)
+
+Ransomware.live is listed under monitoring, but if a decryptor research effort produces anything, NMR is the destination.
+
+**Action:** Add to the routing matrix:
+
+| Evidence type                  | First API destination          | Second destination   | Internal system        |
+|-------------------------------|--------------------------------|----------------------|------------------------|
+| Ransomware decryptor evidence | NoMoreRansom (private channel) | Victim CERT chain    | OpenCTI internal only  |
+
+NMR coordinates so victims can decrypt before the adversary sees the fix — *never* publish a working decryptor publicly first.
+
+---
+
+## 4. Nice-to-Have
+
+### 4.1 Submitter identity & signing
+
+- Register a stable submitter handle with MISP / MalwareBazaar / AbuseIPDB — not a personal account.
+- Sign internal objects with a project PGP key before they leave the system.
+- CIRCL and other major MISP communities weight trust by submitter history.
+
+### 4.2 Audit log requirement
+
+Every external submission writes an immutable row:
+
+```
+(timestamp, destination, payload_hash, submitter_identity, tlp, response_id, outcome)
+```
+
+Legal cover, debugging, and the feedback loop in 3.4 all need this.
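+
+An append-only table is one way to approximate "immutable". The sketch below assumes SQLite and uses triggers to reject `UPDATE` and `DELETE`; `open_log` and `record_submission` are hypothetical names. Triggers do not stop someone with file access from dropping the table, so this is a guard against accidents, not tamper-proofing.
+
+```python
+import hashlib
+import json
+import sqlite3
+from datetime import datetime, timezone
+
+SCHEMA = """
+CREATE TABLE IF NOT EXISTS submissions (
+    timestamp TEXT NOT NULL,
+    destination TEXT NOT NULL,
+    payload_hash TEXT NOT NULL,
+    submitter_identity TEXT NOT NULL,
+    tlp TEXT NOT NULL,
+    response_id TEXT,
+    outcome TEXT
+);
+CREATE TRIGGER IF NOT EXISTS submissions_no_update
+BEFORE UPDATE ON submissions
+BEGIN SELECT RAISE(ABORT, 'audit log is append-only'); END;
+CREATE TRIGGER IF NOT EXISTS submissions_no_delete
+BEFORE DELETE ON submissions
+BEGIN SELECT RAISE(ABORT, 'audit log is append-only'); END;
+"""
+
+
+def open_log(path: str = ":memory:") -> sqlite3.Connection:
+    conn = sqlite3.connect(path)
+    conn.executescript(SCHEMA)
+    return conn
+
+
+def record_submission(conn, destination, payload, submitter_identity, tlp,
+                      response_id=None, outcome=None) -> str:
+    """Append one audit row and return the SHA-256 of the payload."""
+    # Canonical JSON so the same payload always hashes the same way.
+    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
+    payload_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
+    timestamp = datetime.now(timezone.utc).isoformat()
+    with conn:  # commits on success, rolls back on error
+        conn.execute(
+            "INSERT INTO submissions VALUES (?, ?, ?, ?, ?, ?, ?)",
+            (timestamp, destination, payload_hash, submitter_identity,
+             tlp, response_id, outcome),
+        )
+    return payload_hash
+```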
+
+### 4.3 NIS2 callout for critical-infra reporting
+
+EU NIS2 requires regulated entities to send an early warning within 24 hours of becoming aware of a significant incident, followed by an incident notification within 72 hours. If detections involve essential/important entity sectors, the routing engine should flag NIS2 obligation regardless of receiver choice.
+
+### 4.4 Section ordering
+
+Sections 8 (data handling) and 9 (normalized object) are foundations, not appendices. Move them up to Sections 3–4. Currently a reader hits the platform list before knowing what *not* to send.
+
+### 4.5 Confidence convention
+
+`low|medium|high` is fine, but production CTI commonly uses the **Admiralty Code** (`A1`, `B2`, etc., describing source reliability × information credibility) or estimative language. Mention the convention even if you don't fully adopt it.
+
+---
+
+## 5. Implementation Notes (Blue48 Hookup)
+
+This doc is the spec for two components in the agent stack:
+
+1. **`report_writer` agent** outputs Section 9's normalized object as its canonical format.
+2. **A routing engine** (extension of `report_writer`, or a 7th agent) consumes that object, applies the matrix in Section 6, and fans out via API adapters.
+
+Agents stop at *"produce the normalized object."* Human review reads it, decides "yes, ship this to MISP and Cloudflare," and clicks. The routing engine then runs the API calls, captures receipts, and feeds them back to OpenCTI.
+
+### 5.1 Suggested initial adapters (Block G priority)
+
+1. MISP (PyMISP)
+2. AbuseIPDB
+3. URLhaus
+4. Cloudflare Abuse Reports
+5. urlscan.io
+
+These five cover ~80% of common evidence types in the routing matrix.
+
+### 5.2 Secrets handling
+
+Every adapter needs API credentials. They must:
+
+- Live in `.env` (already excluded from image via `.dockerignore`)
+- Be passed at container runtime via `env_file`, never baked into the image
+- Be rotatable on a schedule (the audit log in 4.2 helps prove non-overlap)
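+
+On the adapter side, reading credentials from the process environment and failing fast when one is missing could look like the sketch below. `require_secret` and `MissingSecretError` are hypothetical names, and so is any variable name passed to them, such as `ABUSEIPDB_API_KEY`:
+
+```python
+import os
+
+
+class MissingSecretError(RuntimeError):
+    """Raised when a required credential is absent from the environment."""
+
+
+def require_secret(name: str) -> str:
+    """Return the named environment variable, or raise if unset or blank."""
+    value = os.environ.get(name, "").strip()
+    if not value:
+        # Only the variable name is reported, never a value.
+        raise MissingSecretError(f"environment variable {name} is not set")
+    return value
+```
+
+Calling this once at adapter start-up surfaces a missing key before any submission is attempted.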
+
+---
+
+## 6. Summary
+
+| Category     | Count | Notes                                                                                |
+|--------------|------:|--------------------------------------------------------------------------------------|
+| Critical     |     6 | Geography, CERT mapping, registrar abuse, severity scale, actor block, PII sanitizer |
+| High-value   |     5 | TLP enforcement, STIX 2.1, rate limits, feedback loop, NoMoreRansom                  |
+| Nice-to-have |     5 | Signing, audit log, NIS2, ordering, Admiralty Code                                   |
+
+After the critical fixes, this is a publishable internal whitepaper and a clear spec for the routing engine. Good draft.