m17hr1l e04c6c96d8 init: scaffold psyc — defensive CTI routing & evidence-sealing platform

Stage-1 vertical slice: Pydantic Case model, SQLAlchemy Core persistence,
URLhaus Scoutline fetcher, FastAPI/Jinja cockpit (cases list + detail),
flat Typer CLI, Result[T, E] type module, structlog config.
Architecture in docs/dossier.md; 12-fold style guide in docs/style.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-14 12:43:47 +02:00


Detailed Review v2 — API-Eligible Cyber Threat Reporting & Escalation Platforms

Reviewer: Claude (Opus 4.7, 1M context)
Review date: 2026-05-13
Document reviewed: waypoints.md (first draft)
Companion to: waypoints_firstpass.md (v1 executive summary)
Scope of this v2: section-by-section findings, cross-cutting gaps, missing categories, revised schema, implementation priorities for blue48.

0. Method

I re-read the draft three times against the following lenses:

Factual / API accuracy — does each platform actually do what's claimed?
Operational correctness — would the routing actually work in practice, or break on first contact with reality?
Legal / compliance — GDPR, NIS2, MLAT, jurisdiction, chain of custody
Threat-model coverage — does this serve the actual project goal (campaign disruption, not individual attribution)?
OPSEC of the reporter — what does the adversary learn from each submission?

Findings below carry confidence tags: [verified], [likely current], [verify before relying on].

1. Section-by-Section Findings

1.1 — Section 1: Recommended Reporting Order

1.1.1 In Scenario 1.1 (normal credible threat), going to the victim first is correct in most cases, but flag the exception.

Insider-attack scenarios reverse this: notifying a victim org whose own admin/employee is the threat actor warns the attacker. For credential-leak cases involving privileged accounts, route CERT-first and let CERT decide whether to notify the victim org's leadership or its security contact. Add a 1.1.bis for "victim contact may itself be compromised."

1.1.2 Scenario 1.2 (imminent harm) is missing a specific decision point.

If the imminent harm is to critical infrastructure (energy, water, healthcare, finance), in EU jurisdictions the NIS2 Directive requires regulated entities to send an early warning within 24 hours of becoming aware of a significant incident, followed by a fuller incident notification within 72 hours. Your routing engine should detect "victim sector ∈ NIS2 essential/important entity list" and either:

Route the report so the victim can fulfill their NIS2 obligation, OR
(If victim is unreachable) report directly via the relevant national CERT's NIS2 channel, which exists separately from generic CSIRT contact paths
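
The decision point above could be sketched roughly as follows. The function name, the `victim_reachable` flag, and the returned route labels are illustrative assumptions and do not come from the draft; `nis2_category` uses the same three values as the field proposed for the submission object.

```python
from typing import Literal

Nis2Category = Literal["essential", "important", "n/a"]


def nis2_route(nis2_category: Nis2Category, victim_reachable: bool) -> str:
    """Pick a route label for an imminent-harm case (labels are hypothetical)."""
    if nis2_category == "n/a":
        # Not a NIS2-regulated entity: keep the normal Scenario 1.2 flow.
        return "standard_imminent_harm"
    if victim_reachable:
        # Hand the report to the victim so it can meet its own NIS2 obligation.
        return "victim_for_nis2_reporting"
    # Victim unreachable: go to the national CERT's NIS2 channel directly.
    return "national_cert_nis2_channel"
```

Returning a label rather than calling an adapter keeps the decision testable and lets the human-review UI show why a route was chosen.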

1.1.3 Scenario 1.3 missing receiver categories:

Hosting providers (not just CDNs). Cloudflare is a CDN; the actual origin server is somewhere else (Hetzner, OVH, AWS, DigitalOcean, etc.). A Cloudflare-only report leaves the origin running. Add hosting provider abuse as a parallel step, not after CDN.
Domain registrars via WHOIS-extracted abuse contact, plus registry escalation for ccTLDs (DENIC for .de, AFNIC for .fr, EURid for .eu, Nominet for .uk)
Certificate authorities for compromised cert revocation (Let's Encrypt revoke API for ACME-issued certs; commercial CA abuse contacts for the rest)
DNS providers independent of registrar (Cloudflare DNS, Quad9, Google Public DNS abuse contacts — for blocking, not takedown)

1.1.4 The implicit ordering bias.

The draft optimizes for legal-defensibility (talk to the receiver who can act) but doesn't optimize for operational speed-to-mitigation. For phishing kits with active credential harvesting, the fastest mitigation is often: parallel-fan-out to (CDN, hosting, registrar, browser-block-list providers) simultaneously, then notify CERT as record-keeping. The doc reads as serial when in practice it should be parallel.
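
A minimal sketch of the parallel fan-out, assuming each receiver is wrapped in an async adapter that takes the URL and returns a receipt string. The adapter names and the two fake adapters are placeholders for illustration; no real API is called here.

```python
import asyncio
from typing import Awaitable, Callable

Submitter = Callable[[str], Awaitable[str]]


async def fan_out(url: str, receivers: dict[str, Submitter]) -> dict[str, str]:
    """Submit to every receiver concurrently; one failure does not block the rest."""
    names = list(receivers)
    results = await asyncio.gather(
        *(receivers[name](url) for name in names), return_exceptions=True
    )
    return {
        name: f"error: {res}" if isinstance(res, Exception) else res
        for name, res in zip(names, results)
    }


async def _fake_ok(url: str) -> str:  # stand-in for a working adapter
    return "accepted"


async def _fake_down(url: str) -> str:  # stand-in for an unavailable receiver
    raise RuntimeError("503")


outcome = asyncio.run(
    fan_out("https://phish.example/login", {"cdn": _fake_ok, "registrar": _fake_down})
)
```

The CERT notification can then run after `fan_out` returns, with the per-receiver outcomes attached as the record.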

1.2 — Section 2: Tier-1 API Reporting Platforms

1.2.1 Missing platforms that belong in Tier 1:

Platform	Why Tier-1	API style
abuse.ch ThreatFox	IOC graph, sibling to URLhaus/MalwareBazaar, accepts indicator submissions with kill-chain context	REST + Auth-Key
abuse.ch YARAify	YARA rule sharing + scanning. Direct fit since `detection_author` emits YARA	REST + Auth-Key
AlienVault OTX (now LevelBlue Labs OTX)	One of the largest free CTI communities. Pulses for sharing, pull API for consumption. Major omission from current draft.	REST + DirectConnect API
CIRCL Hashlookup	Fast hash reputation lookup, free, EU-hosted	REST
Shadowserver	Free network exposure / vulnerability scanning reports. Subscribe by ASN/CIDR/contact. The draft has it under "monitoring" but Shadowserver also accepts submissions and runs important takedown campaigns.	REST API

1.2.2 Reorder by jurisdictional fit:

The current #1 (CISA AIS) is US-government-tied. For Europe-focused work the right Tier-1 priorities are roughly:

MISP (CIRCL communities, plus ENISA CSIRTs Network communities)
OpenCTI (your own knowledge graph)
AlienVault OTX (broad reach, low friction)
CISA AIS (only if US-victim cases or US-relevant indicators)
Cloudflare / hosting abuse APIs
Spamhaus
URLhaus / MalwareBazaar / ThreatFox
AbuseIPDB
urlscan.io
Netcraft

1.2.3 Per-row corrections in the existing table:

CISA AIS — "STIX/TAXII bidirectional" — be specific: STIX 2.1 over TAXII 2.1, with the AIS Profile (a restricted subset of STIX). Submitting non-AIS-Profile STIX gets rejected. [verified]
Cloudflare Abuse Reports API — also requires noting that high-volume submitters can apply to be a Trusted Reporter which gets faster SLAs. [likely current]
VirusTotal API — public submissions are visible to all VT Premium customers (incl. potentially the adversary). The draft doesn't flag this — it's a critical OPSEC point. Use VT Private Scanning for sensitive samples. [verified]
PhishTank — community-vetted. As of late 2024 / early 2025 there were reports of reduced moderation activity. [verify before relying on]. Netcraft is the more reliable phishing-takedown channel today.
Google Web Risk — access truly is gated by Google customer engineering review; not a 5-minute API key signup. Apply early. [verified]

1.3 — Section 3: Per-Platform Notes

3.1 CISA AIS: Add: requires sponsorship from a federal agency or a signed AIS Sharing Agreement, plus the connector software (typically TAXII client). Onboarding measured in weeks, not days. The draft makes it sound like a sign-up form.

3.2 MISP: Missing:

ZeroMQ for real-time push (worth using if you want sub-second propagation to your own consumers)
Distinction between events (point-in-time intelligence) and feeds (continuous streams; better for IOC bulk delivery)
"Create a community" vs "Join a community" tradeoff — joining CIRCL's communities is the lowest-friction entry; creating your own is high-effort and pointless until you have multiple sharing partners
TLP-marking enforcement is not automatic at the MISP level — your client must respect TLP before publishing onward

3.3 OpenCTI: Missing:

The connector framework: ~80+ pre-built connectors (MITRE ATT&CK, MISP, CrowdStrike, Recorded Future, etc.) — most of your enrichment needs are already solved
The Workbench feature for analyst review before publishing
Filigran (the company behind OpenCTI) hosts a managed cloud version if you don't want to operate it yourself

3.4 Cloudflare Abuse Reports API: Missing:

API token requires Account.Abuse Reports permission — won't work with read-only tokens
Rate limits documented separately from the abuse API itself
For Cloudflare-hosted Workers (their serverless), abuse reports go to a different channel
Trusted Reporter program (mentioned above) — apply once you have submission history

3.5 Spamhaus: Missing the lists distinction:

DBL = Domain Block List (domains)
SBL = Spamhaus Block List (IPs)
XBL = Exploits Block List (IPs of hijacked or malware-infected hosts)
ZRD = Zero Reputation Domains (newly registered)
Each list has different submission criteria. Wrong-list submissions get rejected. Your routing engine needs a list-selector.
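
A list-selector could start as small as the sketch below. It only encodes the one-line descriptions above; the function name and the two boolean flags are assumptions, and the real per-list acceptance criteria are Spamhaus's own and need checking before relying on this mapping.

```python
def select_spamhaus_list(
    observable_type: str,
    newly_registered: bool = False,
    compromised_host: bool = False,
) -> str:
    """Map an observable to a Spamhaus list name (simplified, hypothetical rules)."""
    if observable_type == "domain":
        # Newly registered domains go to ZRD, everything else to DBL.
        return "ZRD" if newly_registered else "DBL"
    if observable_type == "ip":
        # Hijacked or infected hosts go to XBL, other abusive IPs to SBL.
        return "XBL" if compromised_host else "SBL"
    # Anything else has no Spamhaus destination in this sketch.
    raise ValueError(f"no Spamhaus list for observable type {observable_type!r}")
```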

3.6 AbuseIPDB: Missing:

The 23-category taxonomy (SSH brute force, port scan, web app attack, phishing, etc.) — your evidence type must map to an AbuseIPDB category code or the submission is low-utility
Free tier: 1000 reports/day, 100 IP checks/min. Paid tiers scale
Single-reporter submissions have low weight; reputation requires multiple corroborating submitters. Send to AbuseIPDB after sending to other corroborators
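
The category mapping might look like the sketch below. The evidence-type keys are invented for illustration, and the numeric codes are the AbuseIPDB category IDs as I recall them; verify them against the published category list before use.

```python
from typing import Optional

# Evidence type -> AbuseIPDB category IDs (assumed values, verify before use).
ABUSEIPDB_CATEGORIES: dict[str, tuple[int, ...]] = {
    "phishing": (7,),
    "port_scan": (14,),
    "ssh_brute_force": (18, 22),  # Brute-Force + SSH
    "web_app_attack": (21,),
}


def categories_for(evidence_type: str) -> Optional[str]:
    """Return a comma-separated category string, or None if there is no mapping."""
    codes = ABUSEIPDB_CATEGORIES.get(evidence_type)
    if codes is None:
        # No mapping: the caller should hold the report for human review.
        return None
    return ",".join(str(code) for code in codes)
```

Returning `None` instead of a catch-all category keeps unmapped evidence out of the low-utility bucket the note above warns about.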

3.7 URLhaus: Missing:

Submission auth-key required (free, sign up)
Manual review for high-confidence flags
2024+ stricter format requirements
Linkage to MalwareBazaar — submit the URL to URLhaus, the sample to MalwareBazaar, link by hash

3.8 MalwareBazaar: Missing:

File size limits (~250MB last I checked)
Office macro / Windows installer formats need specific tags
Tag taxonomy is community-driven; non-canonical tags reduce utility
The "Avoid" line about legal-share is correct but vague. Specifically: do not upload samples obtained under NDA, samples from incidents where the victim hasn't consented, or samples that may contain victim PII (e.g., crafted payloads with the victim's name)

3.9 PhishTank: As noted above, declining. Verify status; consider deprioritizing.

3.10 urlscan.io: Missing:

Visibility settings: public, unlisted, private (private = paid)
Public scans are searchable by everyone — including the adversary monitoring for their kits being analyzed
The Search API is invaluable for retrohunts: "show me every scan in the last 30 days that loaded resource X"
Bulk submission via UUID-tagged customagent field for tracking your submission cohort

3.11 Google Web Risk: Missing:

GCP project + Web Risk API enabled prerequisite
Submissions evaluated by Google Safe Browsing pipeline; latency hours-to-days
Successful submissions show up in Chrome / Firefox / Safari Safe Browsing warnings — massive amplification. Use only for high-confidence URLs

3.12 VirusTotal: Missing:

Public API: 4 lookups/min, 500/day, 15.5k/month
Premium API: rate limits negotiated
File submission privacy: anyone with VT Intelligence can see your sample. Critical OPSEC point not in draft.
VT Private Scanning for sensitive samples
VT Hunting (YARA livehunt) for ongoing detection

3.13 Netcraft: Missing:

Strong takedown-execution record — Netcraft actually does the takedown work, not just reporting
Free tier exists for low-volume reporters
Strongest at brand-protection / phishing
They prefer evidence package format: source URL + screenshot + redirect chain + landing page HTML

1.4 — Section 4: Internal Case / Incident Routing Platforms

1.4.1 Missing platforms:

Platform	Best for	Why missing matters
Wazuh	Open-source SIEM with TheHive integration	Many SOCs use it; integrates cleanly with this stack
Microsoft Sentinel	Cloud SIEM with Logic Apps automation	Major enterprise platform — leaving it out makes the doc feel non-enterprise
Splunk SOAR (formerly Phantom)	Commercial SOAR	Major in enterprise SOCs
Cortex XSOAR	Commercial SOAR (Palo Alto)	Same
Shuffle	Open-source SOAR	Free alternative to XSOAR/Phantom
Tracecat	Newer open-source SOAR	Younger but actively developed
n8n	General workflow automation	Not security-specific but widely used as a glue layer

1.4.2 TheHive 5 vs 4: Be explicit — TheHive 4 reached EOL, TheHive 5 is current. Code examples should target TheHive 5 API.

1.5 — Section 5: Monitoring (Not Primary Reporting)

1.5.1 Missing high-value monitoring sources:

Source	What it gives you	API
AlienVault OTX	Largest free pulse community, IOC subscriptions	REST DirectConnect
CIRCL Passive DNS / Passive SSL	Historical DNS / cert lookups; EU-hosted	REST
PhishStats	Phishing URL stream	REST + RSS
DNSDumpster / SecurityTrails / BinaryEdge	Recon/asset-discovery DBs	REST (mostly paid for bulk)
GreyNoise	Benign-scanner classification — reduces false positives in IP reporting by tagging known internet-noise sources	REST
Spamhaus DNSBL queries	Free DNSBL lookups	DNS protocol
Maltrail	Open-source malicious-traffic detection feeds	Static feed download
CT log monitors (crt.sh, Censys CT)	New-cert issuance for your monitored domains — catches phishing-domain registrations	REST

1.5.2 GreyNoise specifically deserves a callout.

Reporting an IP that GreyNoise classifies as a benign scanner (Shodan, Censys, security researchers) can get you blacklisted from AbuseIPDB and embarrasses you with CERTs. Always GreyNoise-check before submitting an IP report. This is a one-line API call that prevents a class of bad submissions.
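
As a sketch of that pre-check, the filter below takes the lookup as an injected callable so that no particular GreyNoise client or endpoint is assumed. The `"benign"` label and the function names are assumptions for illustration; the real response fields should be taken from the GreyNoise documentation.

```python
from typing import Callable, Iterable


def split_reportable_ips(
    ips: Iterable[str], classify: Callable[[str], str]
) -> tuple[list[str], list[str]]:
    """Return (reportable, held_back); IPs classified as benign are held back."""
    reportable: list[str] = []
    held_back: list[str] = []
    for ip in ips:
        if classify(ip) == "benign":
            held_back.append(ip)  # known scanner: do not report
        else:
            reportable.append(ip)
    return reportable, held_back


# Stand-in lookup table used instead of a live API call.
_fake_lookup = {"198.51.100.7": "benign", "203.0.113.9": "malicious"}
ok, held = split_reportable_ips(
    ["198.51.100.7", "203.0.113.9", "192.0.2.1"],
    lambda ip: _fake_lookup.get(ip, "unknown"),
)
```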

1.5.3 Shadowserver placement.

Currently in Section 5 (monitoring only) but Shadowserver also runs active sinkholing and takedown campaigns with global reach. They accept tip-offs and IOC contributions. Move them up to Tier 1 receivers, or at least call out the bidirectional relationship.

1.6 — Section 6: Practical Routing Matrix

1.6.1 Missing rows:

Evidence type	First	Second	Internal
Compromised TLS certificate	CT log monitor sighting → CA revocation request	Cloudflare/host if cert is in use	OpenCTI / TheHive
Mobile app malware	Google Play / Apple App Review submission	VirusTotal sample upload	OpenCTI
Cryptocurrency wallet (laundering)	Chainalysis / TRM (commercial) or on-chain analysis	OFAC SDN if sanctioned	Internal restricted case
Open-source supply-chain attack	Registry security (security@npmjs.com, security@pypi.org)	GitHub Security Lab	OpenCTI / TheHive
Compromised GitHub repo / leaked secret	GitHub Security Advisory + vendor-specific revoke API (e.g., AWS IAM)	Victim org	Internal restricted
Tor hidden service hosting malware	Document only (no takedown for .onion); push IOCs to MISP	n/a	OpenCTI
Sanctions-evasion crypto	OFAC SDN reporting (US) / EU FSF reporting	National FIU	Internal restricted
CSAM (legally separate)	NCMEC CyberTipline (US) / IWF (UK) / INHOPE (international)	National police	Stop processing immediately, preserve under legal hold
2FA-bypass phishing kit (adversary-in-the-middle)	Browser vendor reports (Chrome / Firefox / Safari Trust & Safety)	Affected service	OpenCTI

1.6.2 Cloudflare-proxied abuse needs a follow-up step.

Current row says: First → Cloudflare API; Second → Netcraft / PhishTank. Missing: Third → origin host abuse contact. A plain HTTP request to the proxied site does not reveal the origin; the hosting provider can often be identified from Cloudflare's reply to the abuse report, or by cross-referencing certificate transparency and passive DNS records. Without this step, takedown leaves the origin alive and the attacker just provisions a new CDN front-end.

1.6.3 The "Leaked credentials/API keys" row is dangerously thin.

"Victim first → CERT if severe → Internal IR case" is missing the revocation step, which is more time-critical than reporting. If you find a leaked AWS access key, the first action is to have the account owner deactivate or delete it (aws iam update-access-key --status Inactive, or aws iam delete-access-key, run with the owner's permission). For keys exposed in public GitHub repositories, GitHub secret scanning notifies AWS, which then quarantines the key automatically. If you find leaked OAuth tokens for GitHub/Slack/etc., the relevant vendor has an automated revocation pathway. Add the revocation step before victim notification.

1.7 — Section 7: Minimum Viable API Stack

The current MVP list (10 items) is too heavy for "minimum viable." A genuine MVP for a new white-hat group is closer to:

OpenCTI — your knowledge graph (or, if too heavy, just MISP for both)
MISP via CIRCL community — free, EU-hosted, broad reach
AlienVault OTX — free, broadest reach for indicator sharing
AbuseIPDB — free tier, easy
URLhaus + MalwareBazaar + ThreatFox (the abuse.ch trio — same auth-key, three destinations)
urlscan.io — free tier, evidence generation
National CERT direct email + GPG — non-API, but mandatory

That's 7 things, of which 5 are pure free signups. Tackle Cloudflare, Netcraft, Spamhaus, and Google Web Risk after you have throughput in those 7.

The current MVP includes TheHive — that's case management, not external reporting. Move it out of "API stack" since it's internal infrastructure.

1.8 — Section 8: Data Handling Rules

1.8.1 "Never submit publicly" — additions:

Insider-threat allegations without verification
Attribution claims about specific named individuals (the hard line we settled on earlier)
Government / classified material
PHI (US HIPAA scope)
PCI scope financial data
Children's data (COPPA US; GDPR Article 8 EU)
Biometric data
Trade secrets / source code
Material from unauthorized intrusion (even if you got to it via OSINT, "I downloaded their leaked DB" makes you a recipient of stolen goods in some jurisdictions)

1.8.2 "Safe to submit" — additions:

YARA rules (especially to YARAify)
Sigma rules (to SigmaHQ via PR)
Mutex names, named-pipe signatures (good Sysmon detections)
Persistence registry keys
Scheduled task names
TLS fingerprints (JA3, JA4)
HTTP user-agent strings observed in C2
ASN block ranges associated with adversary infrastructure
STIX patterns (shared over TAXII where the destination supports it)
ATT&CK technique IDs (always)

1.8.3 Missing entire section: "Sanitize before submitting"

Strip URL query parameters that may contain victim tokens / session IDs
Hash email local-parts when target destination is public (a72b91…@example.com)
Redact internal hostnames from samples
Strip x-forwarded-for / source IP from log excerpts that name your honeypot
Replace victim-org names with role descriptors (<european_bank>) unless the submission is to a destination where the victim has consented or the receiver is trusted (CERT)
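
Two of these steps are easy to sketch with the standard library. The function names and the `key` parameter are assumptions; a keyed hash (HMAC) is used here instead of a bare hash because short local-parts are easy to guess, which is a design choice of this sketch rather than something the draft specifies.

```python
import hashlib
import hmac
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Drop the query string and fragment, which may carry tokens or session IDs."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def hash_local_part(email: str, key: bytes, digest_chars: int = 6) -> str:
    """Replace the local-part with a short keyed digest, keeping the domain."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local:
        raise ValueError(f"not an email address: {email!r}")
    digest = hmac.new(key, local.lower().encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{digest[:digest_chars]}…@{domain}"
```

The same address always maps to the same digest under one key, so receivers can still correlate repeated sightings without learning the original local-part.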

1.9 — Section 9: Recommended Submission Object

1.9.1 Schema gaps (additions in bold):

{
  "case_id": "WG-2026-000001",
  "schema_version": "1.0",
  "tlp": "AMBER",                            // use TLP 2.0 values: CLEAR/GREEN/AMBER/AMBER+STRICT/RED
  "tlp_marking_definition_ref": "marking-definition--...",  // STIX-compatible
  "severity": "low|medium|high|critical",   // replace A-E with standard
  "confidence": "low|medium|high",          // or Admiralty A1-F6
  "language": "en",                         // i18n
  "first_observed": "2026-05-13T10:00:00Z", // top-level
  "last_observed":  "2026-05-13T11:30:00Z",
  "valid_from":     "2026-05-13T10:00:00Z", // STIX-style validity window
  "valid_until":    "2026-08-13T10:00:00Z",
  "threat_type": "phishing|malware|ransomware|credential_exposure|iab|botnet|vulnerability_exploitation",

  "victim": {
    "organization": "",
    "domain": "",
    "country": "",
    "sector": "",
    "nis2_category": "essential|important|n/a",   // for EU NIS2 routing
    "consent_to_name_publicly": false             // sanitization gate
  },

  "actor": {
    "name": "Adira",
    "aliases": [],
    "campaign": "",
    "confidence": "A1|A2|...|F6"
  },

  "kill_chain": ["recon|weapon|deliver|exploit|install|c2|action"],
  "attack_techniques": ["T1566.001", "T1059.003"],

  "source": {
    "category": "forum|leak_site|telegram|honeypot|sensor|osint|tip",
    "first_seen": "",
    "last_seen": "",
    "collection_method": "lawful_osint_or_partner_feed",
    "burn_sensitivity": "low|medium|high"        // affects sanitization aggressiveness
  },

  "observables": {
    "ips": [],
    "domains": [],
    "urls": [],
    "hashes": [],
    "emails": [],
    "wallets": [],
    "cves": [],
    "yara_rules": [],
    "sigma_rules": [],
    "mutexes": [],
    "named_pipes": [],
    "scheduled_tasks": [],
    "registry_keys": [],
    "user_agents": [],
    "tls_fingerprints": [],                     // JA3/JA4
    "certificates": [],                         // CT log entries / SHA256 of cert
    "asn_blocks": [],
    "process_names": []
  },

  "pattern_relationships": [
    {"source": "domain:example.com", "type": "resolves_to", "target": "ipv4:1.2.3.4", "first_seen": "..."}
  ],

  "evidence": {
    "summary": "",
    "sanitized_screenshots": [],
    "raw_evidence_location": "internal_restricted_storage",
    "detonation_results": [],                   // sandbox report references
    "memory_artifacts": []                      // forensic, internal only
  },

  "timeline": [
    {"ts": "...", "event": "..."}
  ],

  "indicators_of_compromise": [],               // observables flagged as actively malicious

  "recommended_actions": [],

  "routing": {
    "primary_destinations": [],
    "secondary_destinations": [],
    "public_disclosure_allowed": false,
    "embargo_until": null,                      // timed disclosure
    "coordinated_with": []                      // who else has been told (CERT case IDs etc)
  },

  "audit": {
    "submitted_to": [],                         // append-only history of submissions
    "feedback_received": [],                    // ack IDs, takedown confirmations
    "submitter_identity": "wg-handle@misp",     // which submitter handle was used
    "signed_with": "PGP fingerprint",
    "object_sha256": ""                         // tamper-detect on the object itself
  }
}

1.9.2 Other schema concerns:

case_id format WG-2026-000001 is fine, but reserve a 2-char org prefix to avoid collision if you ever federate with another working group
tlp should use TLP 2.0 spec values (CLEAR, GREEN, AMBER, AMBER+STRICT, RED); TLP 1.0 used WHITE instead of CLEAR and had no AMBER+STRICT
Severity / confidence mismatch in v1: severity used A-E, confidence used words. Standardize.
Add a per-object hash so the routing engine can detect tampering between produce-time and submit-time
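
One way to compute the per-object hash is over a canonical JSON encoding, as sketched below. Excluding the `audit` block (which holds `object_sha256` itself and keeps growing) is an assumption of this sketch, as is the function name.

```python
import hashlib
import json


def object_sha256(obj: dict) -> str:
    """SHA-256 over a canonical JSON form of the object, ignoring the audit block."""
    body = {key: value for key, value in obj.items() if key != "audit"}
    canonical = json.dumps(
        body, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

The routing engine would store the digest at produce-time and recompute it just before submit-time; a mismatch means the object changed in between.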

1.10 — Section 10: Final Recommendation

1.10.1 The architecture sentence is missing the feedback edge.

Current: Sensors → OpenCTI → TheHive/IRIS → routing engine → MISP + abuse APIs + CERT/AIS → sanitized public reporting

Better: Sensors → OpenCTI → TheHive/IRIS → routing engine → MISP + abuse APIs + CERT/AIS → sanitized public reporting → receipts and outcomes back to OpenCTI → effectiveness scoring → re-prioritization

Without the feedback edge, you can't tell which destinations are worth maintaining.

1.10.2 Missing entirely: closing checklist for "we're ready to submit."

A final checklist before any external submission fires:

[ ] TLP enforced (object.tlp <= destination.max_tlp)
[ ] Sanitization pass complete (PII stripped per destination policy)
[ ] GreyNoise check (if observables include IPs)
[ ] Quota available (rate-limit budget not exceeded)
[ ] Submitter identity registered with destination
[ ] Object signed
[ ] Audit row written
[ ] Human approver clicked yes (for non-automated tier)

This belongs as Section 11 or as the closing block of Section 10.
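
The first checklist item reduces to an ordering over the TLP 2.0 labels. A minimal sketch follows, with the function name as an assumption; unknown labels such as the TLP 1.0 value WHITE raise an error, so the check fails closed.

```python
# TLP 2.0 labels from least to most restrictive.
TLP_ORDER = ["CLEAR", "GREEN", "AMBER", "AMBER+STRICT", "RED"]


def tlp_allows(object_tlp: str, destination_max_tlp: str) -> bool:
    """True if the object's TLP does not exceed the destination's ceiling."""
    # list.index raises ValueError for labels outside TLP 2.0.
    return TLP_ORDER.index(object_tlp.upper()) <= TLP_ORDER.index(
        destination_max_tlp.upper()
    )
```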

2. Cross-Cutting Gaps (Not Tied to Any Section)

2.1 — OPSEC for the Reporters Themselves

Not in the doc at all. If your group is reporting Adira to authorities, Adira may notice: adversaries can read open MISP communities, URLhaus is public, and any paying VirusTotal Premium customer (which may include the adversary) can see submitted samples.

Required additions:

Submission identity registry: which handle is used on which platform, who has access, rotation schedule
Account-creation OPSEC: don't use personal accounts on submission platforms; create a project handle, use a project email, register with project-owned phone/2FA
Network OPSEC for collection: if you're scraping leak sites or monitoring the adversary's infrastructure, route through a VPN or research-purpose proxy — never the same network as your submission identity
PGP for CERT comms: every national CERT publishes a PGP key. Every email submission to a CERT should be signed and encrypted. Untouched in the draft.

2.2 — Burnt Source Protection

If you have a private collection source (honeypot, infiltrated channel, tipped-off insider), publishing IOCs from it can burn the source. Specifically:

A unique honeypot fingerprint (banner, response timing, listening port) lets the adversary identify which sample came from your honeypot
Publishing a sample with a unique build artifact (your sandbox's hostname in a DNS query, a timestamp matching your detonation window) reveals your detonation infrastructure
Reporting a forum URL while it's still live tips off the forum operator that it's being watched

The doc needs a burn-sensitivity tier on each observable, and a sanitization step that aggressively scrubs source-identifying artifacts before any external submission.

2.3 — Adversary Observability of Your Submissions

Tier each receiver by who can see your submission:

Receiver	Adversary visibility
MISP private community	trusted community only
MISP public community / OTX public pulse	anyone with an account
URLhaus	public — adversary can monitor
MalwareBazaar	public — adversary can detect their sample was uploaded
VirusTotal public submission	every VT Premium customer (incl. potentially adversary)
VT Private Scanning	only your team (paid)
AbuseIPDB	public reputation visible
Cloudflare Abuse Reports	only Cloudflare and the reported asset owner
CERT direct (GPG-encrypted)	only the CERT

The routing engine should display this visibility for each destination during human review.

2.4 — Chain of Custody / Legal Admissibility

If any of this material may end up in a criminal proceeding, chain of custody matters. Specifically:

The raw evidence must be preserved unmodified, with hashes recorded at acquisition time
Any transformation (sanitization, normalization) must be traceable back to the preserved original: the routing engine logs the input hash, the transform applied, and the output hash
The submitter identity for each external submission is logged
Witnesses (multi-party access logs) are preferred for high-value evidence

The current evidence.raw_evidence_location field is a placeholder; it needs structure: storage path, hash, acquisition timestamp, acquirer identity.
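
The transform logging described above could be sketched as follows. The record fields, the `apply_logged` helper, and the actor string are hypothetical names chosen for illustration; a real implementation would persist the records in the signed, append-only audit log rather than a Python list.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class TransformRecord:
    transform: str
    input_sha256: str
    output_sha256: str
    actor: str
    at: str  # ISO 8601 timestamp in UTC


def apply_logged(
    data: bytes,
    transform: str,
    fn: Callable[[bytes], bytes],
    actor: str,
    log: list,
) -> bytes:
    """Apply fn to a copy of the evidence and record input and output hashes."""
    output = fn(data)
    log.append(
        TransformRecord(
            transform=transform,
            input_sha256=hashlib.sha256(data).hexdigest(),
            output_sha256=hashlib.sha256(output).hexdigest(),
            actor=actor,
            at=datetime.now(timezone.utc).isoformat(),
        )
    )
    return output
```

Because the raw bytes are never modified, anyone holding the original can recompute the input hash and replay the named transform to confirm the output hash.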

2.5 — Amplification Risk

Publishing IOCs publicly amplifies awareness — which is good for defenders but bad if:

The IOC includes a compromised legitimate site (you damage the site owner's reputation)
The IOC is for a piece of infrastructure that's about to be used in a sting operation by LE
The IOC reveals an investigation technique still under embargo

A publish-readiness review belongs in Section 1 of the doc, not in the closing checklist.

2.6 — Failure Modes / Retries

What happens when:

URLhaus rejects a submission (malformed, low-confidence flag, duplicate)?
MISP is down for maintenance?
Cloudflare returns 503?
Your submitter identity gets rate-limited?
An API token is revoked mid-batch?

The doc has no resilience layer. Recommend:

Idempotent submission with client-generated IDs (so retries don't double-submit)
Per-destination retry policy (exponential backoff with jitter)
Dead-letter queue for permanent failures — surface in human-review UI
Per-submitter quota tracking, with auto-failover to backup submitter if available
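
The first two recommendations can be sketched together. The exception classes, the `send(payload, key)` adapter signature, and the default limits are assumptions for illustration; the sleep function is injectable so the policy can be tested without waiting.

```python
import random
import time
import uuid
from typing import Any, Callable


class TransientError(Exception):
    """Retryable failure, e.g. a 503 or a rate limit."""


class PermanentError(Exception):
    """Non-retryable failure; the caller should dead-letter the submission."""


def submit_with_retry(
    send: Callable[[dict, str], Any],
    payload: dict,
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Retry transient failures with exponential backoff and full jitter."""
    key = str(uuid.uuid4())  # one idempotency key reused across all attempts
    for attempt in range(max_attempts):
        try:
            return send(payload, key)
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: let the caller dead-letter it
            sleep(random.uniform(0, base_delay * 2**attempt))
```

`PermanentError` is deliberately not caught, so a rejected submission goes straight to the dead-letter queue instead of being retried.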

2.7 — Versioning and Maintenance

The doc has no version number, no changelog, no maintainer field, no review cadence. For a living spec like this:

---
schema_version: "1.0"  # quoted so YAML keeps it a string, matching the JSON object
last_reviewed: 2026-05-13
next_review_due: 2026-08-13
maintainer: <project lead>
changelog:
  - 2026-05-13: initial draft
---

API surfaces of these platforms change (Cloudflare deprecations, VT pricing changes, abuse.ch tag taxonomy updates). A quarterly re-validation cadence is sane.

2.8 — Multi-Language Submissions

Many national CERTs prefer or require local language for narrative fields (BSI German, ANSSI French, CCN-CERT Spanish). The submission object's language field (added above) plus a translation step in the routing engine handles this. Currently absent.

3. Missing Categories Entirely

3.1 — Hosting Provider Abuse Channels (most lack true REST APIs)

Provider	Channel	API?
AWS	abuse@amazonaws.com + form	No public REST; AWS responds to email
Google Cloud	https://support.google.com/cloud/answer/2417620	Form-only
Azure	https://msrc.microsoft.com/report/abuse	Form + email
DigitalOcean	abuse@digitalocean.com	Email + status REST
Hetzner	abuse@hetzner.com + form	Form
OVH	abuse@ovh.net + Anti-abuse REST API	Yes
Linode (Akamai)	abuse@linode.com	Email
Vultr	abuse@vultr.com	Email

Treat email-based providers as a different submission class (template + GPG-signed email, with parsed-receipt detection). Worth a Section 11 in the doc.

3.2 — Cryptocurrency / Sanctions

Chainalysis Reactor — commercial, gold standard for on-chain investigations
TRM Labs — commercial alternative
CipherTrace (Mastercard) — commercial
OFAC SDN reporting — for US-sanctioned wallets
EU Financial Sanctions Files (FSF) — for EU sanctions
National FIUs — Financial Intelligence Units, country-specific
Free / open: GraphSense (open-source on-chain analytics), Etherscan (manual)

3.3 — Mobile / App Store

Google Play Protect submissions (for Android malware)
Apple App Review report (for malicious iOS apps)
APKMirror reports (for repackaged apps)
F-Droid security contacts (for compromised FOSS apps)

3.4 — Open-Source Supply Chain

PyPI: security@pypi.org
npm: security@npmjs.com + vendor-side automatic revocation of leaked tokens
crates.io: help@crates.io
RubyGems: security@rubygems.org
Maven Central: central@sonatype.org
GitHub Security Lab (research collaboration)
OpenSSF Vulnerability Disclosure (cross-ecosystem coordination)
Sigstore (provenance verification, longer-term)

3.5 — Certificate Authorities

Let's Encrypt: ACME revocation API for ACME-issued certs
Sectigo / DigiCert / GlobalSign / Entrust: abuse contacts in CA/Browser Forum compliance docs
CT log monitors for detection (crt.sh, Censys CT, Google CT)

3.6 — Tor / Dark Web

Limited takedown leverage for .onion services, but worth documenting:

Document via Tor Project's abuse handling page (limited leverage)
Contribute IOCs to DarkOwl, Recorded Future, Flashpoint (commercial dark-web monitoring) if you have access
Push to MISP with tor tag for community awareness

3.7 — CSAM (Legally Separate Pathway)

If CSAM is encountered during collection, stop processing immediately. CSAM has separate legal handling rules:

NCMEC CyberTipline (US)
IWF (Internet Watch Foundation) (UK)
INHOPE (international hotline network)
Possessing CSAM is illegal even for research; do not attempt to verify, document, or share. Report first, then preserve or delete the material only as the receiving hotline or law enforcement directs, and record each step.

This deserves a Section 12 with a hard stop: "if encountered, halt and report via the channels below; do not include in any other submission flow."

4. Missing Platforms Worth Adding (Quick List)

Free / Open

AlienVault OTX (huge omission)
ThreatFox
YARAify
CIRCL Hashlookup
CIRCL Passive DNS / Passive SSL
Maltrail feeds
crt.sh / Censys CT
GreyNoise community tier
Spamhaus DNSBL queries
PhishStats

Commercial / Paid (worth listing for completeness)

Recorded Future
Mandiant Advantage (now Google Threat Intelligence)
CrowdStrike Falcon Intelligence
Sekoia.io
Flashpoint
DomainTools (passive DNS / WHOIS history)
RiskIQ (now Microsoft Defender Threat Intelligence)
Anomali ThreatStream

Intelligence Communities (membership-based)

FIRST.org (CSIRT global community)
Trusted Introducer (European CSIRT trust framework)
M3AAWG (Messaging, Malware, Mobile Anti-Abuse Working Group)
APWG (Anti-Phishing Working Group)
Cyber Threat Alliance (commercial CTI sharing)
ENISA CSIRTs Network

5. Implementation Priorities for Blue48

In our agent stack, this doc translates to concrete work:

5.1 — Block G additions (when we get there)

report_writer agent outputs the v2 normalized object (Section 1.9.1 above) as canonical format
New routing_engine component (extension of report_writer, or a 7th agent) — consumes the object, applies routing matrix, fans out via API adapters
Adapter priority order for blue48 v1.0:
1. MISP (PyMISP)
2. AlienVault OTX (REST)
3. AbuseIPDB (REST + category mapping)
4. URLhaus + MalwareBazaar + ThreatFox (shared abuse.ch auth-key)
5. urlscan.io (REST, with private-by-default visibility)
6. Cloudflare Abuse Reports
7. GPG-signed email to BSI / CERT-Bund (since the user is in DE)

5.2 — Schema work

config/submission_schema.json — JSON Schema for the v2 normalized object
config/routing_matrix.yaml — declarative rules: evidence type → destinations, with TLP ceilings and quotas
core/sanitize.py — pre-submission scrubbing per destination policy
core/audit.py — append-only log of every submission, signed
core/tlp.py — TLP 2.0 enforcement

5.3 — Pre-submission gates (before any adapter fires)

1. Schema valid?
2. TLP <= destination ceiling?
3. Sanitization complete?
4. GreyNoise check passes (for IPs)?
5. Quota available?
6. Submitter identity registered with destination?
7. Object signed?
8. Audit row written?
9. Human approver yes (for non-auto tier)?

If any fail → drop into human-review queue with the reason. Never silently skip.
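
A gate runner that never skips silently might look like the sketch below. The gate names and the `(name, check)` pair shape are assumptions; each check stands in for one of the nine gates above. A check that raises is treated as a failure, and every failure reason is collected for the human-review queue.

```python
from typing import Callable

Gate = tuple[str, Callable[[dict], bool]]


def run_gates(obj: dict, gates: list[Gate]) -> list[str]:
    """Run every gate and return the failure reasons (empty list means go)."""
    failures: list[str] = []
    for name, check in gates:
        try:
            passed = check(obj)
        except Exception as exc:  # a crashing gate counts as a failed gate
            failures.append(f"{name}: {exc}")
            continue
        if not passed:
            failures.append(name)
    return failures
```

All gates run even after the first failure, so the reviewer sees the full list of reasons in one pass.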

5.4 — Failure / retry layer

Per-destination idempotency keys (client-generated)
Exponential backoff with jitter
Dead-letter queue for permanent failures, surfaced in data/dlq/
Per-submitter quota tracking with auto-failover

6. Summary of v2 Findings

Category	Count	Action
Section-by-section corrections	38	Fold into the draft
New cross-cutting sections needed	8	Add as Sections 11–18
Missing platform categories	7	Each warrants a sub-section
Missing free/open platforms (Tier 1)	5	Add to Section 2
Schema field gaps	17	Adopt v2 schema above
Pre-submission gates not defined	9	Add as closing checklist

After folding these in, the document becomes a publishable internal whitepaper and a complete spec for the blue48 routing engine. The first draft was a confident outline; the v2 turns it into a working manual.

If useful, I can next:

(a) Generate config/submission_schema.json (JSON Schema for the v2 normalized object) into ~/blue48/config/
(b) Generate config/routing_matrix.yaml (declarative routing rules) into ~/blue48/config/
(c) Scaffold agents/routing_engine.py with adapter stubs for the seven Block-G priority destinations
(d) Re-issue this review as suggested edits inline against the original (so you can accept/reject diff-style)

Pick any subset and I'll ship.
