Blue48 / Adira Hunt — Consolidated Dossier

Compiled: 2026-05-13

Auto-merged from the individual source files in this directory. Each source file remains authoritative; this is for single-pane reading.

Architecture sentence

Sensors
→ Scoutline
→ Proofline
→ Mapline
→ Classifyline
→ Sealine
→ Routeline
→ Ledgerline
→ Publishline
→ Trainline
→ Blue48 Operations Cockpit

Core principle

Validate the signal, protect the evidence, route only what each destination is authorized to receive, and prove every external action through an immutable ledger.

Blue48 Reporting and API Escalation Architecture v2 (from routeline.md)
Blue48 Worker Mesh Architecture (from hivemap.md)
Blue48 IntelMiner and LoRA Training Data Pipeline (from intelminer.md)
Blue48 Operations Cockpit — GUI / UI-UX Concept (from blue48_operations_cockpit_ui_ux.md)
API-Eligible Cyber Threat Reporting & Escalation Platforms (from waypoints.md)
Review — API-Eligible Cyber Threat Reporting & Escalation Platforms (Draft v1) (from waypoints_firstpass.md)
Detailed Review v2 — API-Eligible Cyber Threat Reporting & Escalation Platforms (from waypoints_scalpel.md)

Blue48 Reporting and API Escalation Architecture v2

Source: routeline.md

Document type: Project record / operational architecture
Scope: API-eligible reporting platforms, routing order, evidence handling, CERT mapping, abuse routing, receipts, and audit controls
Status: Draft v2

1. Purpose

This document defines how Blue48 routes defensive cyber-intelligence to the correct recipients using structured APIs, trusted communities, CERT/CSIRT channels, abuse-reporting endpoints, and authority-sealed evidence packages.

The platform is designed for lawful white-hat operations. It should not amplify stolen data, expose victims prematurely, or interact with criminal actors.

Core principle:

Validate the signal, protect the evidence, route only what each destination is authorized to receive, and prove every external action through an immutable ledger.

2. Recommended Reporting Order

Normal cases

Victim Security Team
→ National CERT / CSIRT
→ Sector ISAC / trusted community
→ Law enforcement cyber unit when criminal evidence exists
→ Provider / registrar / abuse APIs
→ Public sanitized report after mitigation

Imminent harm or critical infrastructure

National CERT / CSIRT
→ Victim Security Team
→ Law enforcement cyber unit
→ Sector ISAC / regulator
→ Provider / registrar / abuse APIs
→ Trusted CTI community
→ Public sanitized report after mitigation or clearance

Malicious infrastructure

Hosting provider / CDN / cloud abuse desk
→ Registrar abuse contact
→ Registry escalation
→ National CERT / CSIRT
→ Law enforcement when warranted
→ Trusted CTI community

Mass exploitation

Affected vendor
→ National CERT / CSIRT
→ Affected sectors / ISACs
→ MISP / trusted CTI community
→ Public advisory after coordinated mitigation

3. Authority-Sealed Evidence Handling

Blue48 does not treat full evidence protection as “sanitization.” Sensitive evidence should be preserved, encrypted, and routed only to authorized recipients.

Use the term:

Authority-Sealed Evidence Handling

Purpose:

preserve high-value evidence
prevent uncontrolled internal access
prevent accidental redistribution
allow victims or authorities to decrypt when authorized
destroy local plaintext and unwrapped keys after successful sealing

4. Evidence Protection Models

Model A: Authority public-key encryption

Authorities or victims provide public encryption keys.

Evidence collected
→ sensitive evidence packaged
→ package encrypted with authority/victim public key
→ encrypted package submitted
→ local plaintext destroyed
→ only recipient can decrypt

This is the cleanest model because Blue48 never holds the recipient's private key and therefore cannot decrypt the package once the local plaintext is destroyed.

Model B: One-time evidence key wrapped for recipients

Generate random evidence key
→ encrypt evidence with evidence key
→ wrap evidence key to recipient public keys
→ submit encrypted package
→ destroy local plaintext
→ destroy local unwrapped evidence key

Package example:

{
  "evidence_package_id": "uuid",
  "encrypted_evidence": "ciphertext-reference",
  "wrapped_keys": [
    {
      "recipient": "CERT-Bund",
      "key_id": "authority-key-id",
      "wrapped_key": "encrypted-evidence-key"
    },
    {
      "recipient": "Victim Security Team",
      "key_id": "victim-key-id",
      "wrapped_key": "encrypted-evidence-key"
    }
  ],
  "metadata": {
    "tlp": "AMBER",
    "severity": "critical",
    "created_at": "2026-05-13T00:00:00Z",
    "retention_policy": "plaintext destroyed after encryption"
  }
}

5. Destination Minimization

Authority-sealed evidence handling does not mean every platform receives full evidence. Public and semi-public APIs should receive only the minimum necessary payload.

Destination type	Payload
CERT / law enforcement	Encrypted full evidence package when authorized.
Victim security team	Encrypted full or partial evidence package.
Trusted MISP community	TLP-filtered STIX indicators and context.
Provider / registrar abuse API	Minimal abuse report with infrastructure evidence.
URLhaus / MalwareBazaar	Malware URL/hash/sample only when legally allowed.
AbuseIPDB	IP, category, timestamp, short comment.
VirusTotal	Hash, URL, or sample only when policy allows.
Public report	Sanitized narrative, no raw sensitive evidence.

6. Priority Platform Order

For European or global operations, use MISP and national CERT routing before US-specific AIS.

Priority	Platform / route	Role
1	MISP / CIRCL / trusted MISP communities	Primary CTI sharing backbone.
2	National CERT / CSIRT	Country-specific authority route.
3	CERT-EU / ENISA CSIRTs Network	EU institutions and European coordination.
4	CISA AIS	US-relevant machine-to-machine indicator sharing.
5	OpenCTI	Internal graph and knowledge base, not necessarily an external reporting destination.
6	Provider / CDN / cloud abuse APIs	Infrastructure mitigation and takedown.
7	Registrar / registry abuse channels	Domain suspension or escalation.
8	abuse.ch, URLhaus, MalwareBazaar	Malware URL/sample ecosystem reporting.
9	AbuseIPDB / Spamhaus / PhishTank / urlscan.io	Public/semi-public abuse and phishing ecosystem reporting.
10	Public advisory channels	Sanitized reporting after mitigation.

7. CERT / CSIRT Routing Map

The routing engine should pick receivers based on victim country, sector, and legal jurisdiction.

Country / region	Receiver	Channel type
Germany	BSI / CERT-Bund	Structured email, trusted channels, MISP community where available.
France	ANSSI / CERT-FR	CERT channel, structured reporting, TAXII/MISP where available.
United Kingdom	NCSC-UK	Structured reporting, early-warning services, official channels.
Netherlands	NCSC-NL	CERT channel, trusted community/MISP where available.
Spain	CCN-CERT / INCIBE-CERT	Public-sector/private-sector split, CERT channels, MISP where available.
EU institutions	CERT-EU	EU institutional route.
EU criminal coordination	Europol EC3	Usually via national CERT/law-enforcement channels, not first-call technical sharing.
United States	CISA / FBI IC3 / FBI field office	CISA for technical reporting, IC3/FBI for crime reporting.

Implementation note:

Europol EC3 should not be treated as the first technical receiver. Route through the relevant national CERT or law-enforcement channel first unless a formal coordination channel exists.
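
The country part of this lookup can be sketched as a plain table. This is a minimal sketch, assuming ISO 3166-1 alpha-2 country codes as keys; `CERT_ROUTES` and `pick_primary_receiver` are hypothetical names, and sector and jurisdiction handling are left out.

```python
from typing import Optional

# Hypothetical lookup table; receivers are copied from the routing map above.
# Keys are assumed to be ISO 3166-1 alpha-2 country codes.
CERT_ROUTES = {
    "DE": "BSI / CERT-Bund",
    "FR": "ANSSI / CERT-FR",
    "GB": "NCSC-UK",
    "NL": "NCSC-NL",
    "ES": "CCN-CERT / INCIBE-CERT",
    "US": "CISA / FBI IC3 / FBI field office",
}


def pick_primary_receiver(country_code: str) -> Optional[str]:
    """Return the national receiver for a victim country, or None if unmapped.

    Unmapped countries return None so the case can go to human review
    instead of being routed to a guessed receiver.
    """
    return CERT_ROUTES.get(country_code.strip().upper())
```

Europol EC3 is deliberately absent from the table, in line with the implementation note above.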

8. Severity and Class Mapping

Blue48 may keep an internal class model, but every outbound report should state a standard external severity.

Internal class	Meaning	External severity
A	Imminent harm / attack likely underway	Critical
B	Credible planned attack	High
C	Confirmed exposure	High / Medium
D	Campaign intelligence	Medium / High
E	Weak signal / watchlist	Low / Monitor

9. Normalized Case Object

All workers should read and write the same normalized case object.

{
  "case_id": "B48-2026-000001",
  "summary": "Short defensive summary",
  "classification": {
    "class": "A | B | C | D | E",
    "severity": "low | medium | high | critical",
    "tlp": "RED | AMBER | GREEN | CLEAR",
    "incident_type": "access_sale | ransomware | credential_leak | phishing | malware | exploit | botnet | data_leak"
  },
  "confidence": {
    "level": "low | medium | high",
    "source_reliability": "A | B | C | D | E | F | unknown",
    "information_credibility": "1 | 2 | 3 | 4 | 5 | 6 | unknown"
  },
  "victim": {
    "name": "",
    "domain": "",
    "country": "",
    "sector": "",
    "critical_infrastructure": false
  },
  "actor": {
    "name": "",
    "aliases": [],
    "campaign": "",
    "confidence": "low | medium | high"
  },
  "observables": {
    "domains": [],
    "ips": [],
    "urls": [],
    "hashes": [],
    "cves": [],
    "wallets": [],
    "emails": []
  },
  "evidence": {
    "raw_evidence_location": "internal-only-reference",
    "sealed_package_id": "",
    "payload_hash": "",
    "plaintext_destroyed": false,
    "local_unwrapped_key_destroyed": false
  },
  "routing": {
    "recommended_routes": [],
    "blocked_routes": [],
    "human_approval_required": true
  }
}

10. TLP Enforcement

Every destination must define a maximum allowed TLP.

Routing precondition:

submission.tlp <= destination.max_tlp_allowed

TLP levels are compared in order of increasing restriction: CLEAR < GREEN < AMBER < RED.

Example destination policy:

Destination	Max TLP	Payload type
CERT / CSIRT trusted route	RED or AMBER depending on channel	Sealed evidence package.
Victim security team	RED or AMBER depending on identity verification	Sealed package or controlled extract.
MISP trusted community	AMBER or GREEN depending on sharing group	STIX/MISP event.
Public MISP community	GREEN or CLEAR	Public-safe indicators.
AbuseIPDB	CLEAR	Minimal IP abuse report.
URLhaus	CLEAR / GREEN depending on policy	Malicious URL report.
VirusTotal	CLEAR only unless legal approval exists	Hash/URL/sample where permitted.
Public advisory	CLEAR	Sanitized intelligence.
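
The routing precondition can be sketched as a rank comparison, assuming the ordering CLEAR < GREEN < AMBER < RED. `TLP_ORDER` and `tlp_allows` are hypothetical names.

```python
# Rank of each TLP label, from least to most restrictive.
TLP_ORDER = {"CLEAR": 0, "GREEN": 1, "AMBER": 2, "RED": 3}


def tlp_allows(submission_tlp: str, max_tlp_allowed: str) -> bool:
    """Return True when the submission may be sent to the destination.

    Unknown labels raise KeyError, so a typo in policy fails closed
    instead of silently allowing a route.
    """
    return TLP_ORDER[submission_tlp] <= TLP_ORDER[max_tlp_allowed]
```

For example, `tlp_allows("AMBER", "CLEAR")` is false, so an AMBER submission would be blocked from a CLEAR-only destination such as AbuseIPDB.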

11. API-Eligible Destination Categories

11.1 CTI sharing and knowledge platforms

Platform	Purpose	Integration style
MISP	Threat intelligence sharing and communities.	REST API / PyMISP / STIX.
OpenCTI	Internal CTI graph and knowledge management.	GraphQL API / STIX.
CISA AIS	US-relevant automated indicator sharing.	TAXII/STIX-style exchange.

11.2 Abuse and takedown

Platform	Purpose	Integration style
Cloudflare Abuse Reports	Report abuse behind Cloudflare services.	Abuse Reports API / portal.
Registrar abuse channels	Domain abuse escalation.	API where available, otherwise structured email/WHOIS abuse contact.
Registry escalation	Escalation for TLD-level issues.	Registry-specific process.
URLhaus	Malware URL reporting.	API submission.
MalwareBazaar	Malware sample/hash ecosystem.	API submission/query.
AbuseIPDB	IP reputation and abuse reports.	API.
Spamhaus	Spam, botnet, and malicious infrastructure reporting.	Submission portal/API where available.
PhishTank	Phishing URL reporting.	API/community workflow.
urlscan.io	URL scan and malicious page evidence.	API submission.
Google Web Risk	Unsafe URL submission where permitted.	Restricted API.
VirusTotal	URL/file/hash enrichment and submission.	API, policy controlled.
Netcraft	Phishing and abuse reporting.	API/enterprise options and reporting channels.

11.3 Internal case-management

Platform	Purpose
TheHive	Security case management and observables.
DFIR-IRIS	Incident response case management.
ServiceNow SIR	Enterprise incident response workflow.
Jira Service Management	Case routing and task management.

12. Registrar and Registry Abuse Flow

For malicious domains, phishing portals, C2 domains, impersonation infrastructure, and ransomware-related web infrastructure:

Identify domain
→ identify hosting provider and CDN/proxy
→ identify registrar from WHOIS/RDAP
→ report to hosting/CDN abuse
→ report to registrar abuse
→ escalate to registry if registrar fails or emergency applies
→ notify CERT if critical or cross-border

Payload should include:

domain
abuse type
timestamp
evidence hash
screenshot hash or sealed evidence reference
safe reproduction summary
victim impersonated, if relevant
requested action

Avoid sending raw credentials, stolen data, or private victim details to registrar/registry channels unless legally justified.
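
A minimal sketch of a payload builder that enforces this list with an allow-list, so a field such as raw credentials cannot be attached by accident. The field names and the function `build_abuse_report` are hypothetical; real registrar or provider APIs define their own schemas.

```python
from datetime import datetime, timezone

# Hypothetical field names mirroring the payload list above.
ALLOWED_FIELDS = {
    "domain",
    "abuse_type",
    "timestamp",
    "evidence_hash",
    "evidence_reference",      # screenshot hash or sealed evidence reference
    "reproduction_summary",
    "victim_impersonated",
    "requested_action",
}


def build_abuse_report(**fields) -> dict:
    """Build a minimized abuse report, rejecting any field not on the allow-list."""
    unexpected = set(fields) - ALLOWED_FIELDS
    if unexpected:
        raise ValueError(f"fields not allowed in abuse reports: {sorted(unexpected)}")
    # Drop empty optional fields, e.g. victim_impersonated when not relevant.
    report = {key: value for key, value in fields.items() if value is not None}
    report.setdefault("timestamp", datetime.now(timezone.utc).isoformat())
    return report
```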

13. Rate Limits and Queueing

Every destination should define a quota object.

{
  "destination": "AbuseIPDB",
  "quota": {
    "limit": 1000,
    "period": "day",
    "priority_policy": "critical_first",
    "backoff": "exponential"
  }
}

RateLimiter responsibilities:

prevent dropped submissions
queue low-priority submissions during bursts
reserve budget for critical cases
retry transient failures
record rate-limit errors in Ledger
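
The queueing and reserve behaviour can be sketched as follows. This is an in-memory sketch only: `QuotaQueue`, `reserved_critical`, and `backoff_delay` are hypothetical names, the reserve size is an assumed tuning knob, and period resets and Ledger writes are left out.

```python
import heapq
from dataclasses import dataclass, field
from typing import Optional

SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class QuotaQueue:
    limit: int               # submissions allowed per period
    reserved_critical: int   # budget that only critical cases may spend
    used: int = 0
    _heap: list = field(default_factory=list)
    _seq: int = 0            # tie-breaker so equal severities stay first-in, first-out

    def enqueue(self, case_id: str, severity: str) -> None:
        heapq.heappush(self._heap, (SEVERITY_RANK[severity], self._seq, case_id))
        self._seq += 1

    def next_submission(self) -> Optional[str]:
        """Return the next case to submit, or None if it must keep waiting."""
        if not self._heap:
            return None
        rank, _, case_id = self._heap[0]
        remaining = self.limit - self.used
        # Non-critical work may not dip into the reserved budget.
        floor = 0 if rank == 0 else self.reserved_critical
        if remaining <= floor:
            return None
        heapq.heappop(self._heap)
        self.used += 1
        return case_id


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 300.0) -> float:
    """Exponential backoff in seconds for retrying transient failures."""
    return min(cap, base * (2 ** attempt))
```

Submissions that cannot be sent stay in the queue rather than being dropped, which matches the first responsibility in the list above.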

14. Receipt and Effectiveness Tracking

ReceiptCollector and Watcher should capture feedback from every destination.

Receipt schema:

{
  "case_id": "B48-2026-000001",
  "destination": "Cloudflare Abuse API",
  "submitted_at": "2026-05-13T00:00:00Z",
  "acknowledged_at": "2026-05-13T00:05:00Z",
  "receipt_id": "external-case-id",
  "status": "submitted | acknowledged | rejected | actioned | closed",
  "outcome": "pending | takedown_confirmed | duplicate | no_action | escalated"
}

Success metrics:

Destination type	Success metric
CERT / CSIRT	Acknowledgement, case opened, mitigation guidance issued.
Provider / registrar	Infrastructure suspended, blocked, or investigated.
MISP	Event accepted, sightings, correlations.
URLhaus / MalwareBazaar	URL/sample accepted, classified, distributed.
Public report	Defenders consume advisory, no sensitive data leak.

15. Immutable Audit Log

Every external submission or destructive action must write an immutable record.

Audit row:

(timestamp, case_id, destination, payload_hash, submitter_identity, tlp, response_id, outcome)

Also audit:

evidence sealing
recipient key addition/removal
plaintext destruction
local key destruction
route approval
route blocking
public publication
dataset export
policy modification
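
One possible way to make an append-only log tamper-evident is to chain each entry to the hash of the previous one. Hash chaining is an assumption of this sketch, not something the design prescribes, and `append_record` and `verify_chain` are hypothetical names. The list stands in for a database table.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    # Canonical JSON so the same record always hashes to the same value.
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(ledger: list, record: dict) -> dict:
    """Append an audit record linked to the hash of the previous entry."""
    prev_hash = ledger[-1]["entry_hash"] if ledger else GENESIS_HASH
    entry = {
        "record": record,
        "prev_hash": prev_hash,
        "entry_hash": _digest(prev_hash, record),
    }
    ledger.append(entry)
    return entry


def verify_chain(ledger: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in ledger:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["entry_hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["entry_hash"]
    return True
```

A chain like this detects modification after the fact. It does not prevent it, so database-level append-only permissions would still be needed.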

16. Public Reporting Rules

Public reports may include:

sector trend
country/region if safe
actor or campaign if already public or properly attributed
TTPs
CVEs
defensive recommendations
sanitized IOCs
non-sensitive timeline

Public reports must not include:

raw credentials
stolen data
direct links to stolen data
live access details
internal screenshots
private victim communications
exact criminal-source links
exploit instructions
anything that increases victim harm before mitigation

17. Initial Adapter Build Order

Recommended Block G adapter priority:

MISP via PyMISP
AbuseIPDB
URLhaus
Cloudflare Abuse Reports
urlscan.io
MalwareBazaar
Registrar abuse structured email/RDAP helper
VirusTotal enrichment/submission with strict policy guard
OpenCTI internal graph integration
TheHive or DFIR-IRIS case export

These cover the most common evidence and routing cases while keeping legal risk manageable.

18. Secrets Handling

Every adapter needs credentials.

Rules:

credentials live in .env or secret manager
credentials are injected at runtime
credentials are never baked into container images
credentials are rotatable
credentials are scoped per adapter
every API call writes to the Ledger
failed authentication events are logged and alerted
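
Runtime injection with per-adapter scoping can be sketched as an environment lookup. The variable naming scheme `BLUE48_<ADAPTER>_API_KEY` and the names `load_adapter_credential` and `MissingCredential` are assumptions of this sketch.

```python
import os
from typing import Mapping, Optional


class MissingCredential(RuntimeError):
    """Raised when an adapter's credential is not present at runtime."""


def load_adapter_credential(
    adapter: str, environ: Optional[Mapping[str, str]] = None
) -> str:
    """Read one adapter's credential from the environment.

    Each adapter has its own variable, so a leaked key exposes one
    destination only and can be rotated independently.
    """
    env = os.environ if environ is None else environ
    var_name = f"BLUE48_{adapter.upper()}_API_KEY"  # hypothetical naming scheme
    value = env.get(var_name, "").strip()
    if not value:
        # The message names the variable but never echoes a secret value.
        raise MissingCredential(f"{var_name} is not set")
    return value
```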

19. Summary

The v2 architecture changes the platform from a list of reporting sites into an operational routing system.

The most important revisions are:

MISP and national CERTs are prioritized over CISA AIS for European/global work.
CERT routing is country-specific.
Registrar and registry abuse flows are included.
Sensitive evidence is protected through authority-sealed encryption, not casual sanitization.
Public and semi-public APIs receive minimized payloads only.
TLP enforcement, rate limits, receipts, and immutable audit logs are mandatory.

Blue48 Worker Mesh Architecture

Source: hivemap.md

Document type: Project record / technical architecture
Scope: Worker names, responsibilities, interfaces, data flow, human review boundaries
Status: Draft v1

1. Purpose

Blue48 should not rely on one large, expensive, opaque model to perform all cyber-intelligence operations. The platform should be built as a mesh of small, specialized workers.

Each worker performs one narrow function, writes structured output, and passes a normalized case object to the next stage. Heavy models are reserved for judgment-heavy tasks such as confidence scoring, routing explanations, public report drafting, and training-example generation.

Core principle:

Small workers produce traceable outputs. Humans approve sensitive decisions. The Ledger proves what happened.

2. High-Level Flow

Scoutline
→ Proofline
→ Mapline
→ Classifyline
→ Sealine
→ Routeline
→ Ledgerline
→ Publishline
→ Trainline

Operator version:

Detect → Validate → Map → Classify → Seal Evidence → Route → Submit → Track → Archive → Learn

3. Worker Lines

Line	Purpose
Scoutline	Finds, fetches, parses, and deduplicates lawful intelligence sources.
Proofline	Validates claims, checks indicators, measures freshness, and scores confidence.
Mapline	Resolves victims, actors, sectors, jurisdictions, CERT routes, and affected products.
Classifyline	Assigns severity, TLP, incident type, and operational class.
Sealine	Packages evidence, encrypts it for authorized recipients, and destroys local plaintext/key material when policy allows.
Routeline	Selects destinations, builds payloads, enforces destination policy, and submits reports.
Ledgerline	Records immutable audit events, receipts, outcomes, and follow-up status.
Publishline	Produces sanitized public intelligence only after mitigation and approval.
Trainline	Converts lawful, reviewed intelligence into LoRA-ready training data.

4. Core Worker Set

The first conceptual worker set is:

Scout → Verifier → Mapper → Classifier → Sealer → Router → Courier → Ledger

Support workers:

Watcher → Archivist → Publisher

Operational sentence:

Scout detects.
Verifier confirms.
Mapper identifies.
Classifier prioritizes.
Sealer protects.
Router decides.
Courier submits.
Ledger proves.
Watcher follows up.
Archivist forgets safely.
Publisher informs.

5. Granular Worker Breakdown

5.1 Scoutline

Worker	Job	Model requirement
SourcePlanner	Maintains the approved source list, collection schedules, and source eligibility.	None / rules
Crawler	Discovers new pages, feeds, advisories, reports, APIs, and datasets.	None
Fetcher	Downloads pages, PDFs, JSON, RSS, STIX/TAXII, MISP events, and API responses.	None
Parser	Extracts title, date, author, body, tables, indicators, and metadata.	Rules / small model
Deduper	Detects duplicate reports, reposted IOCs, syndicated articles, and repeated claims.	Embeddings / rules
SourceRanker	Scores the source based on trust, history, origin, and license status.	Rules / small model
Signalizer	Converts parsed content into candidate intelligence signals.	Small/medium model

Output:

{
  "signal_id": "uuid",
  "source_type": "advisory | cti_report | abuse_feed | ransomware_monitor | public_blog | misp_event",
  "summary": "short defensive summary",
  "observed_at": "2026-05-13T00:00:00Z",
  "raw_evidence_location": "internal-only-reference"
}

5.2 Proofline

Worker	Job
Correlator	Checks whether the same signal appears across multiple independent sources.
IOCChecker	Validates domains, IPs, hashes, URLs, wallet addresses, emails, and CVEs.
FreshnessChecker	Determines whether the signal is current, stale, repeated, or resurfaced.
ClaimChecker	Labels language as confirmed, claimed, observed, rumored, or speculative.
ConfidenceScorer	Produces final confidence and optional Admiralty Code values.

Output:

{
  "confidence": "low | medium | high",
  "source_reliability": "A | B | C | D | E | F | unknown",
  "information_credibility": "1 | 2 | 3 | 4 | 5 | 6 | unknown",
  "claim_status": "confirmed | claimed | observed | rumored | speculative",
  "freshness": "new | recent | stale | resurfaced"
}

5.3 Mapline

Worker	Job
EntityResolver	Maps organization names, domains, subsidiaries, brands, and aliases.
GeoResolver	Maps victim country, jurisdiction, national CERT, and cross-border implications.
SectorMapper	Maps victim sector and critical-infrastructure status.
ActorMapper	Maps actor names, aliases, ransomware brands, campaigns, and confidence.
CVEResolver	Maps vulnerabilities to CVEs, affected products, KEV status, and exploit relevance.

Output:

{
  "victim": {
    "name": "",
    "domain": "",
    "country": "",
    "sector": "",
    "critical_infrastructure": false
  },
  "actor": {
    "name": "",
    "aliases": [],
    "campaign": "",
    "confidence": "low | medium | high"
  },
  "jurisdiction": {
    "primary_cert": "",
    "law_enforcement_route": "",
    "sector_isac": ""
  }
}

5.4 Classifyline

Worker	Job
Classifier	Assigns incident type, severity, internal class, and response SLA.
TLPGuard	Ensures data is never routed to a destination whose maximum allowed TLP is lower than the data's TLP marking.
DestinationPolicyGuard	Blocks inappropriate, illegal, excessive, or sensitive submissions.

Internal class mapping:

Internal class	Meaning	External severity
A	Imminent harm or attack likely underway	Critical
B	Credible planned attack	High
C	Confirmed exposure	High / Medium
D	Campaign intelligence	Medium / High
E	Weak signal or watchlist item	Low / Monitor

Output:

{
  "class": "A | B | C | D | E",
  "severity": "low | medium | high | critical",
  "tlp": "RED | AMBER | GREEN | CLEAR",
  "incident_type": "ransomware | credential_leak | access_sale | phishing | malware | exploit | botnet | data_leak",
  "policy_blocks": []
}

5.5 Sealine

Sealine replaces the old primary concept of “sanitization.” The objective is not to destroy useful evidence, but to protect it.

Worker	Job
EvidencePackager	Collects sensitive evidence, hashes it, and packages it with metadata.
Sealer	Encrypts evidence for authorized recipients using public-key or hybrid encryption.
KeyBurner	Destroys local unwrapped evidence keys after successful sealing.
RetentionGuard	Enforces retention, deletion, plaintext destruction, and crypto-erasure policy.

Sealine principle:

Preserve the truth. Seal the sensitive evidence. Route only what each recipient is authorized to receive.

Output:

{
  "sealed_evidence": {
    "package_id": "uuid",
    "encryption": "age | PGP | CMS | hybrid",
    "recipient_keys": [
      {
        "recipient": "CERT-Bund",
        "key_id": "authority-key-id",
        "wrapped_key": "encrypted-evidence-key"
      }
    ],
    "payload_hash": "sha256",
    "plaintext_destroyed": true,
    "local_unwrapped_key_destroyed": true
  }
}

5.6 Routeline

Worker	Job
RoutePlanner	Chooses destination order based on victim, country, sector, severity, TLP, and evidence type.
PayloadBuilder	Builds destination-specific payloads: sealed package, STIX bundle, MISP event, abuse report, or public-safe extract.
Redactor	Minimizes public/semi-public outputs only. Redactor does not replace Sealer.
Courier	Submits through API, portal, structured email, or secure upload.
RateLimiter	Enforces destination quotas, retries, and backoff.
ReceiptCollector	Captures case IDs, acknowledgements, API responses, and status URLs.

Example route object:

{
  "routes": [
    {
      "destination": "CERT-Bund",
      "type": "authority",
      "payload": "sealed_evidence_package",
      "priority": 1,
      "max_tlp_allowed": "RED"
    },
    {
      "destination": "MISP trusted community",
      "type": "cti_sharing",
      "payload": "stix_indicators",
      "priority": 2,
      "max_tlp_allowed": "AMBER"
    },
    {
      "destination": "Cloudflare Abuse API",
      "type": "provider_abuse",
      "payload": "minimized_abuse_report",
      "priority": 3,
      "max_tlp_allowed": "CLEAR"
    }
  ]
}

5.7 Ledgerline

Worker	Job
Ledger	Creates immutable audit records for all external submissions and destructive actions.
Watcher	Polls outcomes: takedown status, MISP sightings, CERT acknowledgement, provider response.
Archivist	Handles retention, sealed package lifecycle, legal holds, and crypto-erasure confirmation.

Ledger record:

{
  "timestamp": "2026-05-13T00:00:00Z",
  "case_id": "B48-2026-000001",
  "destination": "CERT-Bund",
  "payload_hash": "sha256",
  "submitter_identity": "blue48-official-handle",
  "tlp": "AMBER",
  "response_id": "external-case-id",
  "outcome": "submitted | acknowledged | rejected | actioned"
}

5.8 Publishline

Worker	Job
Publisher	Produces public-safe intelligence reports after mitigation and approval.

Publisher may include:

sector trend
actor trend
CVEs
TTPs
defensive recommendations
sanitized IOCs
non-sensitive timelines

Publisher must not include:

raw credentials
stolen data
victim secrets
live access details
exact criminal-source links
unmitigated exploit paths

6. Which Workers Need Models?

Worker	Model need
SourcePlanner	None / rules
Crawler / Fetcher	None
Parser	Rules / small model
Deduper	Embeddings / rules
Signalizer	Small or medium model
ClaimChecker	Small or medium model
ConfidenceScorer	Medium model
EntityResolver	Rules + embeddings
ActorMapper	Small or medium model
Classifier	Small or medium model
RoutePlanner	Rules first, model second
PayloadBuilder	Small model
Publisher	Medium or large model
ExampleBuilder	Medium model
QualityGate	Medium model + rules

Heavy models should be reserved for:

ConfidenceScorer
Classifier
Publisher
ExampleBuilder
QualityGate

7. Human Review Boundaries

Human approval is required before:

sending sealed evidence to any external destination
contacting law enforcement or CERTs with sensitive evidence
publishing a public advisory
destroying plaintext evidence
destroying local unwrapped evidence keys
exporting a training dataset
modifying routing policy
modifying recipient keys

Two-person control should be required for:

sending TLP:RED or highly sensitive packages
deleting evidence
changing authority recipient keys
publishing named-victim reports
exporting training data based on internal cases

8. MVP Worker Build Order

Initial worker implementation priority:

SourcePlanner
Fetcher
Parser
Deduper
Signalizer
IOCChecker
EntityResolver
GeoResolver
Classifier
EvidencePackager
Sealer
RoutePlanner
Courier
Ledger
ReceiptCollector
IntelMiner

Minimum operational chain:

Fetcher → Parser → Signalizer → IOCChecker → EntityResolver → Classifier → Sealer → RoutePlanner → Courier → Ledger
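
The chain can be sketched as a list of functions that each take and return the normalized case object. The two workers below are stubs with placeholder rules, and `run_chain` and the `trace` field are assumptions of this sketch.

```python
def classify(case: dict) -> dict:
    # Stub: a real Classifier would apply policy, not a single rule.
    critical = case["victim"]["critical_infrastructure"]
    severity = "critical" if critical else "medium"
    return {**case, "classification": {"severity": severity}}


def plan_routes(case: dict) -> dict:
    # Stub: a real RoutePlanner would consult destination policy and TLP.
    is_critical = case["classification"]["severity"] == "critical"
    routes = ["national_cert"] if is_critical else []
    routing = {"recommended_routes": routes, "human_approval_required": True}
    return {**case, "routing": routing}


def run_chain(case: dict, workers: list) -> dict:
    """Pass the case through each (name, worker) pair, recording the order."""
    for name, worker in workers:
        case = worker(case)
        case = {**case, "trace": [*case.get("trace", []), name]}
    return case


result = run_chain(
    {"case_id": "B48-2026-000001", "victim": {"critical_infrastructure": True}},
    [("Classifier", classify), ("RoutePlanner", plan_routes)],
)
```

Because every worker returns a new dictionary and leaves its input unchanged, each stage's output can be logged or replayed on its own.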

9. Technical Notes

Recommended implementation style:

Component	Recommendation
Worker runtime	Python services, Celery, Temporal, Prefect, or lightweight queue workers
Message format	JSON normalized case object
Interop format	STIX 2.1 where useful
Storage	PostgreSQL + object storage
Search	OpenSearch or Meilisearch
CTI graph	OpenCTI or MISP integration
Audit	append-only ledger table
Secrets	.env, secret manager, runtime injection only
UI	Blue48 Operations Cockpit

10. Summary

Blue48 should operate as a worker mesh, not a monolithic AI agent.

The system should use small deterministic workers where possible, small models where useful, and larger models only for judgment-heavy steps. Sensitive evidence is handled by Sealine, not casually rendered or distributed. Routing and public reporting are controlled by policy guards, human review, and immutable audit logging.

Blue48 IntelMiner and LoRA Training Data Pipeline

Source: intelminer.md

Document type: Project record / technical concept
Scope: Lawful intelligence collection, training-data preparation, LoRA dataset format, quality gates, safety boundaries
Status: Draft v1

1. Purpose

IntelMiner is the Blue48 worker responsible for collecting lawful defensive cyber-intelligence and converting it into reviewed, license-safe, LoRA-ready training examples.

IntelMiner does not train models to hack. It prepares training data for defensive tasks such as indicator extraction, routing, severity classification, evidence handling, and safe report writing.

Core mission:

IntelMiner collects lawful defensive cyber-intelligence from approved online sources and transforms it into reviewed, license-safe, LoRA-ready JSONL examples for specialized defensive models.

2. What IntelMiner Should Learn From

Allowed source categories:

national CERT advisories
CISA, ENISA, NCSC, CERT-EU, BSI, ANSSI, and similar public advisories
CVE, NVD, and exploited-vulnerability catalogs
public vendor threat reports
public malware-analysis reports
public ransomware trend reports from lawful monitors
MISP events where the license and sharing group permit reuse
abuse.ch datasets where permitted
public IOCs and defensive detection content
public incident writeups
internally written reports approved for training
synthetic examples written by analysts

Restricted or excluded source categories:

raw stolen data
raw credentials
private victim communications
criminal-forum content obtained without authorization
confidential CTI provider content without training rights
TLP:RED material
material with unknown or incompatible license
content that teaches exploitation, persistence, credential abuse, ransomware operation, or evasion

3. IntelMiner Worker Chain

SourcePlanner
→ Collector
→ LicenseChecker
→ ContentParser
→ Chunker
→ Labeler
→ ExampleBuilder
→ QualityGate
→ ReviewerQueue
→ DatasetWriter

4. Worker Responsibilities

Worker	Responsibility
SourcePlanner	Defines approved sources, update schedules, license expectations, and collection priority.
Collector	Pulls data from APIs, RSS, advisories, STIX/TAXII, MISP, GitHub, PDFs, and public reports.
LicenseChecker	Determines whether the material may be used for training. Blocks unknown or restricted content.
ContentParser	Extracts text, IOCs, dates, actors, CVEs, TTPs, victim sectors, and source metadata.
Chunker	Splits long content into training-sized units while preserving context.
Labeler	Assigns task labels such as IOC extraction, routing, classification, report writing, and evidence handling.
ExampleBuilder	Converts chunks into instruction/input/output training examples.
QualityGate	Removes unsafe, duplicated, mislabeled, low-confidence, or license-problematic examples.
ReviewerQueue	Sends candidates to human reviewers. Nothing enters the final dataset without approval.
DatasetWriter	Exports approved examples as versioned JSONL datasets.

5. Training Tasks

The LoRA adapters should learn defensive operations only.

Task	Purpose
ioc_extraction	Extract domains, IPs, URLs, hashes, emails, wallets, CVEs, and file names.
ttp_mapping	Map report language to MITRE ATT&CK-style techniques.
severity_classification	Classify weak signal, credible threat, confirmed exposure, campaign intelligence, or imminent harm.
routing_decision	Decide which reporting destinations are appropriate and in what order.
evidence_handling	Decide whether evidence must be sealed, minimized, excluded, or internally retained.
actor_normalization	Normalize actor names, aliases, ransomware brands, and campaigns.
source_reliability	Estimate source reliability and information credibility.
report_drafting	Draft structured victim, CERT, provider, MISP, or public reports.
public_publishing	Produce sanitized public intelligence after mitigation.

Do not train examples for:

exploitation steps
credential abuse
phishing construction
malware deployment
ransomware operations
evasion
stealth
persistence
unauthorized forum access
instructions for obtaining stolen data

6. Recommended LoRA Strategy

Do not start by training one large mixed LoRA. Start with small task-specific adapters.

Recommended adapter order:

Priority	Adapter	Reason
1	lora-router	Central to the project and easier to evaluate objectively.
2	lora-ioc-extractor	High utility, clear labels, measurable precision and recall.
3	lora-evidence-handler	Helps enforce safe handling decisions.
4	lora-report-writer	Drafts structured notifications after reviewed facts exist.
5	lora-actor-normalizer	Improves actor and campaign mapping.
6	lora-public-publisher	Produces public-safe summaries after mitigation.

Training should begin only after enough reviewed examples exist:

1,000+ reviewed examples for a single narrow task, or
3,000–10,000 mixed examples across several tasks.

Until then, use rules, retrieval, embeddings, and human-reviewed prompts.

7. JSONL Training Format

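Each line of a JSONL file should contain exactly one training example. The structures shown in this document are pretty-printed across multiple lines for readability only.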

Standard structure:

{
  "task": "routing_decision",
  "instruction": "Given a defensive cyber-intelligence signal, choose the correct reporting destinations and order.",
  "input": {},
  "output": {},
  "metadata": {
    "source_type": "public_advisory | vendor_report | synthetic | internal_approved",
    "tlp": "CLEAR | GREEN | AMBER",
    "license": "approved",
    "reviewed": true,
    "policy_version": "v1",
    "dataset_version": "dataset-router-v0.1"
  }
}
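
A minimal Python sketch of how a DatasetWriter-style step might validate one example against this structure and serialize it as a single JSONL line. The function name `to_jsonl_line`, the required-field tuple, and the exact checks are assumptions for illustration, not part of the Blue48 codebase.

```python
import json

# Top-level keys taken from the standard structure above.
REQUIRED_TOP_LEVEL = ("task", "instruction", "input", "output", "metadata")
# TLP values listed in the standard structure's metadata.
ALLOWED_TLP = {"CLEAR", "GREEN", "AMBER"}


def to_jsonl_line(example: dict) -> str:
    """Validate one training example and serialize it as a single line."""
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in example]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    metadata = example["metadata"]
    if metadata.get("reviewed") is not True:
        raise ValueError("example has not been human-reviewed")
    if metadata.get("tlp") not in ALLOWED_TLP:
        raise ValueError(f"tlp not allowed for training: {metadata.get('tlp')!r}")
    # Without `indent`, json.dumps escapes newlines inside strings,
    # so the result is always exactly one line.
    return json.dumps(example, ensure_ascii=False, sort_keys=True)
```

The returned string can be appended to a `.jsonl` file followed by a single newline character.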

8. Example: IOC Extraction

{
  "task": "ioc_extraction",
  "instruction": "Extract defensive indicators from the cyber threat report. Return JSON only.",
  "input": "A phishing campaign used login-example[.]com and delivered payload hash 44d88612fea8a8f36de82e1278abb02f. The actor referenced CVE-2024-12345.",
  "output": {
    "domains": ["login-example.com"],
    "hashes": ["44d88612fea8a8f36de82e1278abb02f"],
    "cves": ["CVE-2024-12345"],
    "ips": [],
    "urls": []
  },
  "metadata": {
    "source_type": "synthetic",
    "tlp": "CLEAR",
    "license": "approved",
    "reviewed": true
  }
}
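
The extraction shown above can be approximated with rules before any adapter exists. The sketch below is a regex baseline in Python. The patterns are deliberately simple, the hash pattern covers MD5-length values only, and `refang` and `extract_iocs` are hypothetical helper names.

```python
import re

# Simple illustrative patterns; production extraction needs stricter validation.
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\b", re.IGNORECASE
)
MD5_RE = re.compile(r"\b[a-f0-9]{32}\b", re.IGNORECASE)
CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
URL_RE = re.compile(r"\bhttps?://\S+", re.IGNORECASE)


def refang(text: str) -> str:
    """Undo common defanging so indicators match the patterns."""
    return text.replace("[.]", ".").replace("hxxp", "http")


def _unique(matches) -> list:
    """Return sorted, de-duplicated matches."""
    return sorted(set(matches))


def extract_iocs(report: str) -> dict:
    """Extract indicators into the output shape used in the example above."""
    text = refang(report)
    return {
        "domains": _unique(m.lower() for m in DOMAIN_RE.findall(text)),
        "hashes": _unique(m.lower() for m in MD5_RE.findall(text)),
        "cves": _unique(m.upper() for m in CVE_RE.findall(text)),
        "ips": _unique(IPV4_RE.findall(text)),
        "urls": _unique(URL_RE.findall(text)),
    }
```

A baseline like this gives a reference point for the precision and recall of a later lora-ioc-extractor.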

9. Example: Routing Decision

{
  "task": "routing_decision",
  "instruction": "Given a defensive cyber-intelligence signal, choose the correct reporting destinations and order.",
  "input": {
    "incident_type": "access_sale",
    "victim_country": "DE",
    "sector": "energy",
    "critical_infrastructure": true,
    "confidence": "high",
    "tlp": "AMBER"
  },
  "output": {
    "severity": "critical",
    "routes": [
      "CERT-Bund",
      "victim_security_team",
      "sector_isac",
      "law_enforcement_cyber_unit",
      "misp_trusted_community"
    ],
    "evidence_handling": "authority_sealed_package"
  },
  "metadata": {
    "reviewed": true,
    "policy_version": "v1"
  }
}
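
Until a router adapter exists, the same decision can be expressed as an explicit rule. The Python sketch below encodes only the critical-infrastructure case from the example above and returns `None` for everything else, so that unmatched signals go to an analyst. `NATIONAL_CERTS`, `plan_routes`, and the `national_cert` fallback label are hypothetical names.

```python
from typing import Optional

# Hypothetical country-to-CERT lookup; only the DE entry comes from the example.
NATIONAL_CERTS = {"DE": "CERT-Bund"}


def plan_routes(signal: dict) -> Optional[dict]:
    """Return a routing proposal, or None when no rule matches."""
    is_critical = (
        signal.get("critical_infrastructure") is True
        and signal.get("confidence") == "high"
    )
    if not is_critical:
        return None  # no rule matched: leave the decision to an analyst
    cert = NATIONAL_CERTS.get(signal.get("victim_country"), "national_cert")
    return {
        "severity": "critical",
        "routes": [
            cert,
            "victim_security_team",
            "sector_isac",
            "law_enforcement_cyber_unit",
            "misp_trusted_community",
        ],
        "evidence_handling": "authority_sealed_package",
    }
```

Rule output of this kind can also serve as a comparison target when evaluating a trained router.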

10. Example: Evidence Handling

{
  "task": "evidence_handling",
  "instruction": "Decide how evidence should be handled before external submission.",
  "input": {
    "evidence_type": "stolen_credentials",
    "destination": "public_abuse_api",
    "contains_pii": true,
    "tlp": "RED"
  },
  "output": {
    "submit_raw": false,
    "handling": "do_not_send_raw_to_public_api",
    "allowed_payload": "metadata_only",
    "sealed_package_required": true,
    "authorized_recipients": ["victim_security_team", "national_cert"]
  },
  "metadata": {
    "reviewed": true
  }
}

11. Dataset Metadata

Every example should include metadata.

Field	Purpose
`task`	Training task category.
`source_type`	Origin category of the example.
`source_id`	Internal reference to source document.
`license`	Approved, restricted, unknown, or rejected.
`tlp`	CLEAR, GREEN, AMBER, or RED.
`reviewed`	Human approval status.
`reviewer_id`	Internal reviewer identity or role ID.
`policy_version`	Version of handling policy used.
`dataset_version`	Versioned dataset name.
`safety_flags`	Unsafe content or sensitive material flags.
`dedupe_hash`	Used to prevent duplicate examples.
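
One possible way to compute `dedupe_hash`, sketched in Python. It assumes duplicates are defined by identical task, instruction, input, and output regardless of metadata. That definition, the function name, and the choice of SHA-256 over a canonical JSON form are assumptions.

```python
import hashlib
import json

# Fields that define example identity in this sketch; metadata is excluded
# so that re-review or re-versioning does not change the hash.
IDENTITY_FIELDS = ("task", "instruction", "input", "output")


def dedupe_hash(example: dict) -> str:
    """Return a stable hex digest for the content of one example."""
    core = {key: example.get(key) for key in IDENTITY_FIELDS}
    canonical = json.dumps(
        core, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Sorting keys before hashing means two examples that differ only in key order produce the same digest.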

12. QualityGate Rules

QualityGate must reject examples that contain:

raw credentials
raw stolen data
private victim information
live access details
exploit chains
malware deployment steps
phishing instructions
evasion or persistence guidance
incompatible license
unknown provenance
duplicated content
unreviewed TLP:RED or confidential content

QualityGate should flag for human review when:

source license is ambiguous
actor attribution is uncertain
victim identity is named
sample contains personal data
output teaches operationally sensitive details
example conflicts with policy
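
The two rule lists above can be sketched as a three-way decision in Python. The sketch assumes upstream detectors have already written flag names into `safety_flags`; the flag names, the mapping of license values to outcomes, and treating a missing `source_id` as unknown provenance are all assumptions.

```python
# Hypothetical flag names mirroring the reject list above.
REJECT_FLAGS = frozenset({
    "raw_credentials", "raw_stolen_data", "private_victim_information",
    "live_access_details", "exploit_chain", "malware_deployment_steps",
    "phishing_instructions", "evasion_or_persistence",
})
# Hypothetical flag names mirroring the human-review list above.
REVIEW_FLAGS = frozenset({
    "actor_attribution_uncertain", "victim_named", "personal_data",
    "operationally_sensitive", "policy_conflict",
})


def quality_gate(metadata: dict, seen_hashes: set) -> str:
    """Return 'reject', 'needs_review', or 'pass' for one candidate."""
    flags = set(metadata.get("safety_flags", []))
    license_status = metadata.get("license", "unknown")
    if flags & REJECT_FLAGS:
        return "reject"
    if license_status == "rejected" or not metadata.get("source_id"):
        return "reject"  # incompatible license or unknown provenance
    if metadata.get("dedupe_hash") in seen_hashes:
        return "reject"  # duplicated content
    if metadata.get("tlp") == "RED" and not metadata.get("reviewed"):
        return "reject"  # unreviewed TLP:RED material
    if license_status != "approved" or flags & REVIEW_FLAGS:
        return "needs_review"  # ambiguous license or a review-only flag
    return "pass"
```

A `pass` result here only means the candidate may enter the ReviewerQueue; it does not replace human approval.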

13. Dataset Builder UI Requirements

IntelMiner should be visible in the Blue48 Operations Cockpit.

Screens:

Screen	Purpose
Dataset Sources	Manage approved sources, license status, and collection schedules.
Training Candidate Queue	Review generated examples before approval.
Example Review	Edit, approve, reject, or mark examples unsafe.
Dataset Builder	Export versioned JSONL datasets with train/validation split.
Dataset Audit	Track source, reviewer, license, and policy version.

Candidate fields:

Field	Meaning
Task	IOC extraction, routing, classification, etc.
Source	advisory, blog, report, synthetic, internal.
License	approved, restricted, unknown, rejected.
Quality score	Estimated usefulness.
Safety flag	safe, needs review, reject.
Reviewer status	pending, approved, rejected.

14. Dataset Versioning

Datasets should be versioned clearly:

dataset-router-v0.1
dataset-ioc-extractor-v0.3
dataset-evidence-handler-v0.2
dataset-report-writer-v0.2

Each export should include:

dataset name
version
date
number of examples
task distribution
source distribution
license distribution
reviewer count
rejected example count
train/validation split
policy version
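
The export fields listed above could be collected into a manifest written next to each JSONL file. The Python sketch below assumes each approved example carries `source_type`, `license`, and `reviewer_id` in its metadata; the function name, key names, and default validation fraction are hypothetical.

```python
from collections import Counter


def build_manifest(name, version, export_date, policy_version,
                   approved, rejected_count, validation_fraction=0.1):
    """Summarize one export; `approved` is a list of example dicts."""
    total = len(approved)
    validation = round(total * validation_fraction)
    meta = [example["metadata"] for example in approved]
    return {
        "dataset_name": name,
        "version": version,
        "date": export_date,
        "example_count": total,
        "task_distribution": dict(Counter(ex["task"] for ex in approved)),
        "source_distribution": dict(Counter(m["source_type"] for m in meta)),
        "license_distribution": dict(Counter(m["license"] for m in meta)),
        "reviewer_count": len({m["reviewer_id"] for m in meta}),
        "rejected_example_count": rejected_count,
        "split": {"train": total - validation, "validation": validation},
        "policy_version": policy_version,
    }
```

Passing the export date in as a parameter keeps the manifest reproducible when an export is regenerated.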

15. Human Review Requirements

Human approval is required before examples become training data.

Reviewers should check:

factual correctness
source license
safety boundaries
absence of raw sensitive data
correct label
useful expected output
no attacker-enabling content

Two-person review is recommended for:

internal case-derived examples
sensitive incident examples
actor attribution examples
routing examples involving law enforcement or critical infrastructure
examples derived from TLP:AMBER material

TLP:RED material should not be used for LoRA training unless an explicit legal, operational, and governance policy exists.

16. Summary

IntelMiner is the bridge between Blue48 operations and future specialized defensive models.

It should collect only lawful and approved data, check license and safety constraints, build structured examples, require human review, and export versioned JSONL datasets. The first LoRA should likely be lora-router, followed by lora-ioc-extractor and lora-evidence-handler.

Blue48 Operations Cockpit — GUI / UI-UX Concept

Source: blue48_operations_cockpit_ui_ux.md

Document type: Project record / technical concept
Scope: GUI, operator workflow, worker observability, evidence handling, routing review, and IntelMiner dataset operations
Status: Draft v1

1. Purpose

The Blue48 Operations Cockpit is the human-facing command center for the worker mesh.

The GUI must let operators see, review, approve, seal, route, audit, and publish cyber-intelligence cases without losing control of sensitive evidence or outbound submissions.

The core principle is:

The system may automate collection, enrichment, packaging, and routing, but humans must clearly see the chain of reasoning, evidence status, risk level, and outbound submissions before anything sensitive leaves the platform.

The GUI should be an operational cockpit first, not a decorative dashboard.

2. Core Control Surfaces

The product should be designed around six main control surfaces:

Control Surface	Primary Question Answered
Cases	What is happening?
Evidence	What is protected?
Routing	Where will it go?
Workers	What produced this result?
Ledger	What can we prove happened?
Trainline	What can become safe training data?

These six areas should drive navigation, permissions, and MVP scope.

Recommended sidebar navigation:

OPERATIONS
- Mission Control
- Case Queue
- Worker Mesh
- Routing Review
- Receipts

EVIDENCE
- Evidence Vault
- Sealed Packages
- Retention

INTELLIGENCE
- Reports
- MISP / STIX Events
- Public Advisories

TRAINING
- IntelMiner
- Training Candidates
- Dataset Builder

SYSTEM
- Integrations
- Policy Engine
- Ledger
- Admin

Minimal route structure:

/dashboard
/cases
/cases/:id
/cases/:id/evidence
/cases/:id/routing
/receipts
/workers
/trainline/candidates

4. Mission Control

Mission Control is the landing dashboard.

Its purpose is to show what is happening right now.

Key Widgets

Widget	Shows
Active Signals	New unreviewed leads from Scoutline
Critical Queue	Imminent harm / critical infrastructure cases
Pending Human Review	Cases waiting for analyst approval
Sealed Evidence Packages	Evidence encrypted and ready for authority handoff
Outbound Reports	Reports waiting to be sent
Receipts / Acknowledgements	CERT, MISP, abuse API, and provider responses
Worker Health	Workers running, degraded, failed, paused, or stopped
Rate Limits	API quota usage per destination
Legal / TLP Warnings	Items blocked by policy guard

Suggested Layout

┌──────────────────────────────────────────────────────────────┐
│ Blue48 Operations Cockpit                                    │
├──────────────┬──────────────┬──────────────┬────────────────┤
│ Critical     │ Pending      │ Sealed       │ Submitted      │
│ Cases        │ Review       │ Packages     │ Reports        │
├──────────────┴──────────────┴──────────────┴────────────────┤
│ Live Worker Mesh Timeline                                    │
├──────────────────────────────┬───────────────────────────────┤
│ Priority Case Queue          │ Destination / API Health       │
├──────────────────────────────┴───────────────────────────────┤
│ Recent Receipts and Outcomes                                  │
└──────────────────────────────────────────────────────────────┘

5. Case Queue

The Case Queue is the main daily-use screen.

Each row represents one signal or incident candidate.

Recommended Columns

Column	Meaning
Case ID	Unique internal case identifier
Class	A/B/C/D/E or Critical/High/Medium/Low
TLP	RED / AMBER / GREEN / CLEAR
Confidence	Low / Medium / High or Admiralty Code
Victim	Organization, domain, or unknown
Country	Used for CERT routing
Sector	Healthcare, finance, energy, government, etc.
Incident Type	Access sale, ransomware, phishing, credential leak, botnet, exploit, data leak
Actor	Known group / suspected actor / unknown
Current Worker	Worker currently responsible for the case
Next Action	Review, seal, route, submit, wait, archive
Deadline	SLA based on severity
Owner	Assigned analyst

Example row:

[CRITICAL] [TLP:AMBER] DE energy provider | access sale | high confidence | Sealer ready | Review required

Filters

The queue should support filters for:

severity
class
TLP
country
sector
actor
source type
confidence
pending approval
failed submission
critical infrastructure only
worker state

6. Case Detail View

The Case Detail View is where analysts work on a single case.

Recommended tabs:

Overview | Evidence | Timeline | Worker Output | Routing | Reports | Receipts | Audit

Overview Tab

The Overview tab should show:

case summary
severity
class
confidence
affected entity
actor
jurisdiction
recommended route
current state
required approval

Example:

Case: B48-2026-000184
Type: Initial Access Sale
Severity: Critical
TLP: AMBER
Confidence: High
Victim Country: Germany
Sector: Energy
Recommended Route:
1. CERT-Bund
2. Victim Security Team
3. Sector ISAC
4. MISP Trusted Community

7. Evidence View

The Evidence View is where the Sealer concept appears in the GUI.

Raw sensitive evidence should not be rendered by default.

The UI should show evidence status instead of exposing raw contents.

Evidence Status Labels

Status	Meaning
Unsealed	Evidence exists internally but has not been authority-sealed
Sealed	Evidence has been encrypted for selected authorized recipients
Plaintext Destroyed	Local plaintext copy has been removed
Local Key Destroyed	Local unwrapped evidence key has been removed
Recipient Decryptable	Selected authority or victim can decrypt the package
Public-Safe Extract Available	Redacted/minimized metadata is available for public or semi-public destinations

Evidence Display Model

Evidence Package
├── Metadata preview: visible
├── Sensitive content: locked by default
├── Hashes: visible
├── Recipient keys: visible
├── Local decryption access: unavailable after key destruction
└── Chain of custody: visible

Evidence Actions

Recommended actions:

Seal Evidence
Add Recipient Key
Verify Package Hash
Destroy Local Plaintext
Destroy Local Unwrapped Key
Generate Public-Safe Extract
Request Human Approval

The UI should make the trust state obvious:

Raw evidence: locked
Sealed package: ready
Local plaintext: destroyed
Local key: destroyed
Recipient: CERT-Bund can decrypt
Public extract: available
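
The trust-state lines above can be derived from a small state record rather than written by hand. A Python sketch follows; the `EvidenceState` class, its field names, and the wording of the negative labels ("not sealed", "present", "not available") are assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EvidenceState:
    """Hypothetical snapshot of one evidence package's trust state."""
    sealed: bool = False
    plaintext_destroyed: bool = False
    local_key_destroyed: bool = False
    recipients: Tuple[str, ...] = ()
    public_extract_available: bool = False

    def trust_labels(self) -> list:
        """Render the status lines shown to the operator."""
        # Raw content is never rendered by default, so it is always locked.
        labels = ["Raw evidence: locked"]
        labels.append("Sealed package: " + ("ready" if self.sealed else "not sealed"))
        labels.append("Local plaintext: "
                      + ("destroyed" if self.plaintext_destroyed else "present"))
        labels.append("Local key: "
                      + ("destroyed" if self.local_key_destroyed else "present"))
        for recipient in self.recipients:
            labels.append(f"Recipient: {recipient} can decrypt")
        labels.append("Public extract: "
                      + ("available" if self.public_extract_available else "not available"))
        return labels
```

Deriving the labels from state keeps the display and the underlying custody record from drifting apart.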

8. Worker Mesh View

The Worker Mesh View is the observability screen for the processing pipeline.

It should show the worker topology and the health of each worker.

Worker Lines

Scoutline
 SourcePlanner → Crawler → Fetcher → Parser → Deduper → SourceRanker

Proofline
 Signalizer → Correlator → IOCChecker → ClaimChecker → ConfidenceScorer

Mapline
 EntityResolver → GeoResolver → SectorMapper → ActorMapper → CVEResolver

Sealine
 EvidencePackager → Sealer → KeyBurner → RetentionGuard

Routeline
 RoutePlanner → PayloadBuilder → Courier → ReceiptCollector

Trainline
 IntelMiner → LicenseChecker → Chunker → Labeler → QualityGate → DatasetWriter

Worker Tile Fields

Each worker tile should show:

Field	Meaning
Status	Healthy, degraded, failed, paused, or stopped
Queue Depth	Number of waiting jobs
Last Run	Most recent execution timestamp
Error Count	Recent failures
Average Processing Time	Performance indicator
Model / API Used	Which model, API, or rule engine was used
Cost Estimate	Optional model/API cost estimate
Last Output Sample	Small safe preview of output
Controls	Retry, pause, resume, open logs

The goal is to prevent the system from becoming a black box.

9. Routing Review Screen

The Routing Review Screen is where humans approve outbound reports.

It should show recommended destinations, payload types, policy decisions, and blocks.

Example:

Recommended destinations:
✓ CERT-Bund — sealed evidence package
✓ Victim Security Team — sealed evidence package
✓ MISP Trusted Community — TLP:AMBER STIX indicators
✓ Cloudflare Abuse — minimized abuse report
✕ VirusTotal — blocked: contains sensitive sample / TLP too high

Destination Fields

Field	Purpose
Destination	CERT, MISP, provider, abuse API, registrar, law enforcement, victim
Payload Type	Sealed package, STIX bundle, minimized abuse report, advisory draft
Max TLP Allowed	Prevents over-sharing
Required Auth	API key, PGP, portal, structured email, OIDC
Rate-Limit Budget	Whether submission can happen now
Policy Status	Allowed, blocked, or needs approval
Legal Status	Safe, review required, or blocked
Expected Receipt	Case ID, acknowledgement, or status URL
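
The Max TLP Allowed field implies an ordering over TLP labels. The Python sketch below shows one way to evaluate it, using the four labels found in this document. The per-destination ceilings in `MAX_TLP` are hypothetical values chosen to mirror the blocked VirusTotal route in the example above.

```python
# Higher rank means more restricted sharing.
TLP_RANK = {"CLEAR": 0, "GREEN": 1, "AMBER": 2, "RED": 3}


def tlp_policy_status(payload_tlp: str, max_tlp_allowed: str) -> str:
    """Return 'blocked' when the payload exceeds the destination ceiling."""
    if TLP_RANK[payload_tlp] > TLP_RANK[max_tlp_allowed]:
        return "blocked"
    return "allowed"


# Hypothetical per-destination ceilings, for illustration only.
MAX_TLP = {"MISP Trusted Community": "AMBER", "VirusTotal": "CLEAR"}

statuses = {
    destination: tlp_policy_status("AMBER", ceiling)
    for destination, ceiling in MAX_TLP.items()
}
```

An `allowed` result covers the TLP check only; the route still requires human approval before submission.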

Actions

Recommended actions:

Approve Selected Routes
Block Route
Require Legal Review
Send to Sealer
Send to Redactor
Submit Now

The interface should never provide one broad dangerous action such as Send Everything.

10. Report Builder

The Report Builder creates destination-specific outputs.

Report Templates

Template	Used For
Victim Notification	Direct affected organization
CERT Notification	National CERT / CSIRT
Law Enforcement Referral	Criminal activity
Provider Abuse Report	Hosting, CDN, registrar, cloud, email provider
MISP Event	CTI sharing
Public Advisory	Sanitized public report
Training Example	LoRA dataset candidate

Recommended Layout

Left pane: structured case data
Right pane: generated report preview

Warning Flags

The builder should warn when a draft:

contains PII
contains raw credentials
contains TLP:RED material
contains victim name
contains exploit detail
contains unsealed evidence
exceeds destination TLP allowance
targets a public or semi-public platform with sensitive content

11. IntelMiner / Trainline UI

The IntelMiner and Trainline UI should be separate from active operations.

This prevents analysts from confusing live cases with training candidates.

Dataset Sources Screen

Shows:

Field	Meaning
Source Name	Human-readable source name
URL / API	Collection endpoint
License Status	Approved, restricted, unknown, rejected
Allowed for Training	Yes / no / review required
Last Collected	Most recent collection timestamp
Document Count	Number of collected documents
Failure Rate	Recent collection reliability

Training Candidate Queue

Each candidate should show:

Field	Meaning
Task	IOC extraction, routing, classification, report writing, actor normalization
Source	Advisory, blog, report, synthetic, internal
License	Approved, restricted, unknown, rejected
Quality Score	Estimated usefulness
Safety Flag	Safe, needs review, reject
Reviewer Status	Pending, approved, rejected, edited

Example Review Screen

The reviewer should see:

Instruction
Input
Expected Output
Metadata
Source License
Safety Flags

Actions:

Approve
Reject
Edit
Send Back to Labeler
Mark as Unsafe
Export to JSONL

Dataset Builder

The Dataset Builder should show:

examples by task
token counts
train/validation split
duplicates
class imbalance
rejected examples
export version

Example dataset versions:

dataset-router-v0.1
dataset-ioc-extractor-v0.3
dataset-report-writer-v0.2
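
One way to keep the train/validation split stable across exports is to assign each example by a hash bucket. The Python sketch below assumes every example carries a hexadecimal `dedupe_hash`; the bucket method, the function name, and the default percentage are assumptions, not a method stated in the source.

```python
def assign_split(dedupe_hash: str, validation_percent: int = 10) -> str:
    """Assign an example to 'train' or 'validation' from its hash.

    The first eight hex characters are mapped to a bucket from 0 to 99,
    so the same example always lands in the same split.
    """
    bucket = int(dedupe_hash[:8], 16) % 100
    return "validation" if bucket < validation_percent else "train"
```

Because the assignment depends only on the hash, adding new examples to a later dataset version does not move existing examples between splits.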

12. Roles and Permissions

The GUI requires strict role-based access control.

Role	Can Do
Viewer	Read dashboards and public-safe summaries
Analyst	Review signals, enrich cases, draft reports
Sealer Officer	Seal evidence and manage recipient keys
Router Officer	Approve destinations and routing decisions
Legal Reviewer	Approve sensitive or cross-border submissions
Admin	Manage users, integrations, policies, and configuration
Dataset Curator	Approve training examples and exports
Auditor	Read ledger and export compliance logs

Two-Person Control

Critical actions should require two-person approval:

send sealed evidence
submit to law enforcement
publish a public advisory
destroy plaintext
destroy local unwrapped evidence keys
export a training dataset
modify routing policy
modify recipient keys
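
The two-person rule above can be sketched as a check on distinct approver identities. In the Python sketch below, the action identifiers and the function name are hypothetical, and whether the requester may count as one of the two approvers is left to policy.

```python
# Hypothetical identifiers for the critical actions listed above.
TWO_PERSON_ACTIONS = frozenset({
    "send_sealed_evidence",
    "submit_to_law_enforcement",
    "publish_public_advisory",
    "destroy_plaintext",
    "destroy_local_unwrapped_keys",
    "export_training_dataset",
    "modify_routing_policy",
    "modify_recipient_keys",
})


def is_authorized(action: str, approver_ids) -> bool:
    """Require two distinct approvers for critical actions, one otherwise."""
    distinct_approvers = set(approver_ids)
    required = 2 if action in TWO_PERSON_ACTIONS else 1
    return len(distinct_approvers) >= required
```

Counting distinct identities prevents one user from satisfying the rule by approving twice.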

13. Case State Machine

Every case should follow a clear state machine.

Normal States

NEW_SIGNAL
→ PARSED
→ VERIFIED
→ MAPPED
→ CLASSIFIED
→ REVIEW_REQUIRED
→ EVIDENCE_PACKAGED
→ SEALED
→ ROUTE_PROPOSED
→ APPROVED_FOR_SUBMISSION
→ SUBMITTED
→ ACKNOWLEDGED
→ ACTIONED
→ ARCHIVED

Error / Block States

BLOCKED_BY_TLP
BLOCKED_BY_POLICY
NEEDS_LEGAL_REVIEW
DESTINATION_RATE_LIMITED
SUBMISSION_FAILED
INSUFFICIENT_CONFIDENCE
DUPLICATE_CASE

The state machine should be visible in the Case Detail View.
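
A minimal Python sketch of the state machine as a transition check. It assumes normal states advance strictly one step at a time and that any non-archived normal state may move into a block state. Recovery from a block state and case reopening are not specified above, so they are not modeled here.

```python
NORMAL_STATES = [
    "NEW_SIGNAL", "PARSED", "VERIFIED", "MAPPED", "CLASSIFIED",
    "REVIEW_REQUIRED", "EVIDENCE_PACKAGED", "SEALED", "ROUTE_PROPOSED",
    "APPROVED_FOR_SUBMISSION", "SUBMITTED", "ACKNOWLEDGED", "ACTIONED",
    "ARCHIVED",
]
BLOCK_STATES = frozenset({
    "BLOCKED_BY_TLP", "BLOCKED_BY_POLICY", "NEEDS_LEGAL_REVIEW",
    "DESTINATION_RATE_LIMITED", "SUBMISSION_FAILED",
    "INSUFFICIENT_CONFIDENCE", "DUPLICATE_CASE",
})


def can_transition(current: str, target: str) -> bool:
    """Return True if moving from `current` to `target` is allowed."""
    if current not in NORMAL_STATES:
        return False  # leaving a block state is not modeled in this sketch
    if target in BLOCK_STATES:
        return current != "ARCHIVED"
    if target in NORMAL_STATES:
        # Only the immediate successor is a valid forward step.
        return NORMAL_STATES.index(target) == NORMAL_STATES.index(current) + 1
    return False
```

Rejecting skipped states is what prevents, for example, a case from reaching SUBMITTED without passing through SEALED and APPROVED_FOR_SUBMISSION.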

14. UI / UX Principles

Make Risk Visible

Every screen should answer:

What is the severity?
What is the confidence?
Who is affected?
What data is sensitive?
Who can decrypt it?
What will be sent?
Where will it be sent?
What policy allows or blocks this?

Make Automation Interruptible

Analysts must be able to:

pause a worker
block a route
downgrade confidence
require legal review
mark as duplicate
prevent publication
reopen a case

Make Evidence Status Obvious

Use labels such as:

Raw evidence: locked
Sealed package: ready
Local plaintext: destroyed
Local key: destroyed
Recipient: CERT-Bund can decrypt
Public extract: available

Avoid Dangerous UX Patterns

Avoid:

one-click “send all” actions
hidden payloads
unclear TLP labels
buried warnings
irreversible actions without confirmation
publishing controls mixed with private reporting controls
exposing raw evidence by default

15. Minimal MVP GUI

Do not build everything at once.

The first useful MVP should include:

Mission Control
Case Queue
Case Detail
Evidence Sealing View
Routing Review
Courier Receipts
Worker Health
IntelMiner Dataset Queue

MVP Routes

/dashboard
/cases
/cases/:id
/cases/:id/evidence
/cases/:id/routing
/receipts
/workers
/trainline/candidates

This MVP is enough to operate safely while keeping the scope manageable.

16. Recommended Technical Stack

Layer	Recommendation
Frontend	React / Next.js
UI Components	shadcn/ui + Tailwind
Charts	Recharts
Workflow Graph	React Flow
Tables	TanStack Table
Backend API	FastAPI
Worker Orchestration	Celery, Temporal, or Prefect
Database	PostgreSQL
Search	OpenSearch or Meilisearch
Graph Intelligence	OpenCTI / Neo4j optional
Object Storage	S3-compatible encrypted storage
Audit Log	Append-only PostgreSQL table or immutability layer
Auth	OIDC / Keycloak
Realtime Updates	WebSockets or Server-Sent Events

React Flow is especially useful for the Worker Mesh screen.

17. Visual Identity

The design should feel:

calm
operational
serious
high-trust
defensive
readable under pressure

Avoid:

cyberpunk styling
hacker neon
gamification
aggressive animation
cluttered dashboards

Recommended style:

Dark mode by default
High-contrast severity labels
Muted blue/gray base
Red only for critical
Amber for warnings
Green for completed/safe
Clear TLP badges
Large readable tables
Minimal animations

Recommended UI language:

Evidence protected.
Route blocked by policy.
Human approval required.
Recipient can decrypt.
Local key destroyed.
Submission acknowledged.

18. Final Operating Model

The GUI should support this operational chain:

Detect
→ Validate
→ Classify
→ Seal Evidence
→ Review Routes
→ Submit Reports
→ Track Receipts
→ Archive Safely
→ Publish Sanitized Intelligence
→ Build Reviewed Training Data

The core cockpit should keep humans in control of five things:

1. Cases — what is happening?
2. Evidence — what is protected?
3. Routing — where will it go?
4. Workers — what produced this result?
5. Ledger — what can we prove happened?

The training workspace adds the sixth:

6. Trainline — what can become safe training data?

19. Summary

The Blue48 GUI should be an operations cockpit, not a passive dashboard.

It must provide:

live case visibility
worker observability
authority-sealed evidence control
human routing approval
TLP and policy enforcement
receipt and outcome tracking
immutable audit visibility
safe IntelMiner training-data review

The first MVP should focus on daily operational safety and decision control before advanced analytics or public-reporting features are added.

API-Eligible Cyber Threat Reporting & Escalation Platforms

Source: waypoints.md

Project purpose: Build a white-hat defensive reporting workflow that can push credible pre-incident or incident intelligence to the right receivers through APIs or structured machine-to-machine channels.

Scope: This document focuses on platforms that support API-based reporting, submission, alert ingestion, or structured intelligence sharing. It excludes direct interaction with criminal forums and excludes sources that only provide manual web forms unless they are still operationally important as a fallback.

Last reviewed: 2026-05-13

1. Recommended Reporting Order

1.1 Normal credible threat against a named organization

1. Victim security contact / VDP / security.txt
2. National CERT / CSIRT
3. Sector ISAC / ISAO
4. Law enforcement cyber unit
5. Infrastructure provider abuse API
6. Threat-intelligence sharing platform
7. Sanitized public advisory

1.2 Imminent harm or critical infrastructure

1. National CERT / CSIRT
2. Victim security team
3. Law enforcement cyber unit
4. Sector regulator / ISAC
5. Infrastructure provider abuse API
6. Trusted CTI community
7. Public advisory only after mitigation or authority clearance

1.3 Malicious infrastructure, phishing, malware, or botnet indicators

1. Platform-specific reporting API (examples: Cloudflare Abuse Reports API, Spamhaus Submission API, AbuseIPDB, URLhaus, MalwareBazaar, PhishTank, urlscan.io, Google Web Risk, VirusTotal)
2. CERT / CSIRT
3. Affected victim
4. MISP / OpenCTI / trusted CTI sharing
5. Public sanitized report
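
The three orderings above can be held as data so that a route planner always proposes the next unnotified destination. A Python sketch follows; the scenario keys, destination identifiers, and function name are hypothetical labels for the entries in sections 1.1 to 1.3.

```python
# Hypothetical identifiers; the order follows sections 1.1, 1.2, and 1.3.
REPORTING_ORDER = {
    "credible_threat": [
        "victim_security_contact", "national_cert_csirt", "sector_isac_isao",
        "law_enforcement_cyber_unit", "provider_abuse_api",
        "threat_intel_sharing_platform", "sanitized_public_advisory",
    ],
    "imminent_harm": [
        "national_cert_csirt", "victim_security_team",
        "law_enforcement_cyber_unit", "sector_regulator_isac",
        "provider_abuse_api", "trusted_cti_community",
        "public_advisory_after_clearance",
    ],
    "malicious_infrastructure": [
        "platform_reporting_api", "cert_csirt", "affected_victim",
        "trusted_cti_sharing", "public_sanitized_report",
    ],
}


def next_destination(scenario: str, notified: set):
    """Return the first destination in order that has not been notified."""
    for destination in REPORTING_ORDER[scenario]:
        if destination not in notified:
            return destination
    return None  # every destination in the order has been notified
```

Keeping the order in one table makes a change of policy a data edit rather than a code change.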

2. Tier-1 API Reporting Platforms

These are the strongest fits for automated defensive reporting because they accept machine-readable submissions or support structured threat sharing.

Priority	Platform	Best for	API / submission capability	Use in workflow
1	CISA Automated Indicator Sharing (AIS)	Sharing cyber threat indicators with U.S. government and AIS participants	STIX/TAXII bidirectional indicator sharing	High-confidence indicators, especially campaigns, exploited infrastructure, malware IOCs
2	MISP	Community and private threat-intelligence sharing	REST API, PyMISP, event and attribute creation	Share vetted IOCs, TTPs, victim-agnostic campaign intelligence
3	OpenCTI	Internal or consortium CTI knowledge base	GraphQL API and connectors	Normalize, enrich, and route intelligence before external disclosure
4	Cloudflare Abuse Reports API	Abuse hosted behind or involving Cloudflare	API supports submitting abuse reports, viewing report details, and listing reports	Phishing, malware, abusive hosting, malicious domains using Cloudflare services
5	Spamhaus Submission Portal API	Malicious IPs, domains, URLs, suspicious email content	REST API for suspicious IP/domain/URL/email reports	Reputation/blocklist contribution and takedown-support evidence
6	AbuseIPDB	Malicious IP reputation	API for reporting and checking abusive IP addresses	Scanner, brute-force, spam, probing, attack-source IP reporting
7	URLhaus / abuse.ch	Malware distribution URLs	Community API for downloading and submitting malware URLs	Active malware URL reporting and malware-distribution tracking
8	MalwareBazaar / abuse.ch	Malware sample exchange	Community API for sample upload/download and bulk queries	Malware sample submission and hash enrichment
9	PhishTank	Phishing URL verification	API for phishing URL status checks; community submission workflow	Phishing verification and enrichment
10	urlscan.io	URL detonation, phishing/malware page evidence	Submission API to scan URLs and retrieve results	Safe screenshot/evidence generation, IOC enrichment
11	Google Web Risk Submission API	Unsafe URL submission to Google Safe Browsing ecosystem	Submission API for suspected unsafe URLs; access requires sales/customer-engineer approval	High-scale malicious URL reporting
12	VirusTotal API	File, URL, domain, IP enrichment and submission	API for file upload, URL scan, reports, and comments	Enrichment and submission to multi-vendor analysis ecosystem
13	Netcraft Report API	Phishing, malware, suspicious URLs, emails, files	API for automated threat reporting	Brand abuse, phishing, takedown-support reporting

3. Platform Notes

3.1 CISA Automated Indicator Sharing (AIS)

Type: Government-backed indicator sharing
Best for: High-confidence cyber threat indicators and defensive measures
API style: STIX/TAXII
Good submissions: IPs, domains, URLs, hashes, malware indicators, campaign indicators
Avoid: Victim-identifying details unless necessary and authorized

Operational fit: Use for campaign-level and infrastructure-level reporting, especially when the intelligence may protect multiple organizations.

Source: https://www.cisa.gov/how-automated-indicator-sharing-ais-works

3.2 MISP

Type: Open-source threat-intelligence sharing platform
Best for: Structured CTI sharing inside trusted communities
API style: REST API; PyMISP client
Good submissions: Events, attributes, galaxies, taxonomies, TLP-tagged indicators, sightings
Avoid: Raw stolen data, credentials, or victim-sensitive artifacts without permission

Operational fit: Use as the main trusted-community sharing layer.

3.3 OpenCTI

Type: Threat-intelligence platform / knowledge graph
Best for: Internal CTI normalization, enrichment, and case-to-intel routing
API style: GraphQL API
Good submissions: STIX-like entities, observables, reports, relationships, indicators, malware, threat actors
Avoid: Treating OpenCTI itself as the final external reporting destination unless connected to a sharing community

Operational fit: Use as your central intelligence brain before pushing to MISP, CERTs, providers, or reports.

3.4 Cloudflare Abuse Reports API

Type: Infrastructure provider abuse reporting
Best for: Phishing, malware, abuse involving Cloudflare-protected assets
API style: REST API
Good submissions: URLs, domains, abuse category, evidence, contact details
Avoid: Large stolen datasets; provide proof and context instead

Operational fit: Use whenever malicious infrastructure resolves through or is protected by Cloudflare.

3.5 Spamhaus Submission Portal API

Type: Reputation and abuse-intelligence reporting
Best for: Malicious IPs, domains, URLs, suspicious email content
API style: REST API
Good submissions: IPs, domains, URLs, suspicious raw email/source evidence
Avoid: Unverified mass submissions; maintain high-confidence standards

Operational fit: Use for reliable contribution to reputation systems and anti-abuse communities.

3.6 AbuseIPDB

Type: IP reputation and abuse reporting
Best for: Attack-source IP reporting
API style: REST API
Good submissions: Brute force, scanning, spam, exploitation attempts, abusive traffic categories
Avoid: Reporting shared NAT/VPN/cloud IPs without strong evidence

Operational fit: Use as an automated destination for source-IP abuse reports, especially from honeypots, firewalls, and SIEM detections.

3.7 URLhaus / abuse.ch

Type: Malware URL exchange
Best for: Active malware distribution URLs
API style: Community API with Auth-Key
Good submissions: URLs directly serving malware payloads
Avoid: Generic phishing pages that do not distribute malware

Operational fit: Use when you can verify a URL is actively distributing malware.

Source: https://urlhaus.abuse.ch/api/

3.8 MalwareBazaar / abuse.ch

Type: Malware sample exchange
Best for: Malware samples, hashes, family tracking
API style: Community API with Auth-Key
Good submissions: Malware samples and related metadata
Avoid: Benign files, sensitive internal documents, or samples that cannot be legally shared

Operational fit: Use after malware handling review, with strict legal and operational controls.

Source: https://bazaar.abuse.ch/api/

3.9 PhishTank

Type: Community phishing clearing house
Best for: Phishing URL verification and community validation
API style: HTTP POST lookup API
Good submissions: Suspected phishing URLs
Avoid: URLs containing victim credentials, tokens, or private data in query strings

Operational fit: Use for phishing intelligence enrichment and community verification.

3.10 urlscan.io

Type: URL scanning and investigation platform
Best for: URL detonation, phishing evidence, page screenshots, redirects, IP/domain enrichment
API style: Submission API and search API
Good submissions: Suspicious URLs, phishing pages, malicious landing pages
Avoid: Private internal URLs or sensitive tokens; set scan visibility carefully

Operational fit: Use before provider reporting to create structured, shareable evidence.

3.11 Google Web Risk Submission API

Type: Unsafe URL submission into Google’s protection ecosystem
Best for: High-scale phishing/malware URL submissions
API style: Submission API; restricted access
Good submissions: Suspected unsafe URLs that should be evaluated for Safe Browsing protection
Avoid: Assuming access is automatic; Google says access requires contacting sales or a customer engineer

Operational fit: Use when your group has enough volume and quality control to justify access.

Source: https://docs.cloud.google.com/web-risk/docs/submission-api

3.12 VirusTotal API

Type: Multi-vendor malware and URL analysis ecosystem
Best for: URL/file submission, enrichment, analysis reports, community comments
API style: REST API
Good submissions: Suspicious files, URLs, domains, IPs, hashes
Avoid: Uploading confidential files, customer data, private source code, or stolen materials

Operational fit: Use for enrichment and multi-vendor visibility. Use private scanning options if available and appropriate.

3.13 Netcraft Report API

Type: Phishing, malware, suspicious URL/email/file reporting
Best for: Phishing and takedown-support reporting
API style: Report API
Good submissions: Malicious URLs, suspicious emails, files, phishing evidence
Avoid: Low-confidence or privacy-sensitive submissions

Operational fit: Use for high-confidence phishing and brand-abuse reporting, especially where takedown support matters.

4. Internal Case / Incident Routing Platforms

These platforms are not external public reporting destinations, but they are useful for receiving your detections through APIs and routing them into a proper case workflow.

Platform	Best for	API capability	Workflow role
TheHive	SOC alert-to-case management	TheHive 5 API supports alert creation	Convert signals into triaged investigations
DFIR-IRIS	Collaborative incident response	Alerts API and general API key support	Internal IR case management
ServiceNow SIR	Enterprise security incident response	REST API to write to Security Incident Import table	Enterprise escalation and tracking
Jira Service Management Incidents	Incident workflow automation	Incidents REST API	Lightweight or engineering-driven incident coordination

5. Platforms Useful for Monitoring but Not Primary API Reporting Destinations

Platform	API status	Recommendation
Ransomware.live	API available for ransomware victim/group intelligence	Use for monitoring and enrichment, not as the main reporting destination
Shadowserver	RESTful API for report data access; no STIX/TAXII currently	Use for inbound network exposure/threat reports and enrichment
Have I Been Pwned	API for breach account lookups	Use for exposure checks, not submitting new breach reports unless separately arranged
OpenPhish	No public lookup API; offers feed/module model	Use feed or email/manual reporting fallback
Microsoft Defender submissions	Portal and Microsoft Graph threat-submission resources for some Defender scenarios	Use when operating within a Microsoft tenant or Defender workflow

6. Practical Routing Matrix

Evidence type	First API destination	Second destination	Internal system
Malicious IP scanning/brute force	AbuseIPDB	Spamhaus if relevant	TheHive / DFIR-IRIS
Malware distribution URL	URLhaus	Google Web Risk / VirusTotal / urlscan.io	MISP / OpenCTI
Malware sample	MalwareBazaar	VirusTotal	TheHive / OpenCTI
Phishing URL	PhishTank / Netcraft / urlscan.io	Google Web Risk / Cloudflare if hosted/proxied there	MISP / TheHive
Cloudflare-proxied abuse	Cloudflare Abuse Reports API	Netcraft / PhishTank if phishing	Internal case platform
Suspicious email infrastructure	Spamhaus Submission API	AbuseIPDB for IPs	MISP / OpenCTI
Campaign-level indicators	CISA AIS / MISP	CERT/CSIRT	OpenCTI
Ransomware victim claim	Victim + CERT first	MISP only sanitized indicators	OpenCTI / TheHive
Leaked credentials/API keys	Victim first	CERT if severe	Internal IR case only
Critical infrastructure threat	CERT/CSIRT first	Victim + law enforcement	Internal restricted case
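
The matrix above can be held as a plain lookup table inside the routing engine. The following is a minimal sketch, not the project's implementation: the evidence-type keys (`malicious_ip`, `malware_url`, and so on) and the `Route` type are hypothetical names, and only four rows are transcribed.

```python
from typing import NamedTuple


class Route(NamedTuple):
    first: tuple[str, ...]
    second: tuple[str, ...]
    internal: tuple[str, ...]


# Rows transcribed from the matrix above. The dictionary keys are
# hypothetical identifiers chosen for this sketch.
ROUTING_MATRIX: dict[str, Route] = {
    "malicious_ip": Route(("AbuseIPDB",), ("Spamhaus",), ("TheHive", "DFIR-IRIS")),
    "malware_url": Route(
        ("URLhaus",),
        ("Google Web Risk", "VirusTotal", "urlscan.io"),
        ("MISP", "OpenCTI"),
    ),
    "malware_sample": Route(("MalwareBazaar",), ("VirusTotal",), ("TheHive", "OpenCTI")),
    "leaked_credentials": Route(("Victim",), ("CERT",), ("Internal IR case",)),
}


def route_for(evidence_type: str) -> Route:
    """Return the route for an evidence type, or raise if it is unmapped."""
    try:
        return ROUTING_MATRIX[evidence_type]
    except KeyError:
        raise ValueError(f"no route defined for {evidence_type!r}") from None
```

Raising on an unmapped evidence type, rather than falling back to a default destination, keeps unknown material out of external channels until a human has classified it.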

7. Minimum Viable API Stack

For a new white-hat group, start with:

MISP — trusted sharing and structured IOC exchange.
OpenCTI — central intelligence normalization and knowledge graph.
TheHive or DFIR-IRIS — case management and triage.
AbuseIPDB — automated IP abuse reporting.
URLhaus — malware URL submission.
MalwareBazaar — malware sample submission, only with legal controls.
urlscan.io — URL evidence generation.
Cloudflare Abuse Reports API — infrastructure abuse reports.
Spamhaus Submission Portal API — IP/domain/URL/email reputation reporting.
CISA AIS or national CERT sharing route — campaign-level indicator sharing.

8. Data Handling Rules

Never submit publicly

Raw credentials
API keys or session cookies
Stolen databases
Internal screenshots that identify victims without consent
Exploit instructions
Live access details
Private source code
Sensitive personal data

Safe to submit when verified

IP addresses
Domains
URLs, if they do not contain tokens or PII
File hashes
Malware samples, only where legally allowed
Timestamps
Actor handles
Campaign labels
CVEs
MITRE ATT&CK techniques
Sanitized screenshots
Provider-neutral technical context

9. Recommended Submission Object

Use this normalized object internally before transforming to each API schema.

{
  "case_id": "WG-2026-000001",
  "tlp": "AMBER",
  "severity": "A|B|C|D|E",
  "confidence": "low|medium|high",
  "threat_type": "phishing|malware|ransomware|credential_exposure|iab|botnet|vulnerability_exploitation",
  "victim": {
    "organization": "",
    "domain": "",
    "country": "",
    "sector": ""
  },
  "source": {
    "category": "forum|leak_site|telegram|honeypot|sensor|osint|tip",
    "first_seen": "",
    "last_seen": "",
    "collection_method": "lawful_osint_or_partner_feed"
  },
  "observables": {
    "ips": [],
    "domains": [],
    "urls": [],
    "hashes": [],
    "emails": [],
    "wallets": [],
    "cves": []
  },
  "evidence": {
    "summary": "",
    "sanitized_screenshots": [],
    "raw_evidence_location": "internal_restricted_storage"
  },
  "recommended_actions": [],
  "routing": {
    "primary_destinations": [],
    "secondary_destinations": [],
    "public_disclosure_allowed": false
  }
}
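
Before the object is transformed for any destination, the enumerated fields can be checked against the values listed in the object above. This is a sketch using only the standard library; the function name `validation_errors` and the `ALLOWED` table are assumptions made for illustration, and a Pydantic model would be an equally reasonable choice.

```python
# Allowed values are copied from the "a|b|c" enumerations in the object above.
ALLOWED: dict[tuple[str, ...], set[str]] = {
    ("severity",): {"A", "B", "C", "D", "E"},
    ("confidence",): {"low", "medium", "high"},
    ("threat_type",): {
        "phishing", "malware", "ransomware", "credential_exposure",
        "iab", "botnet", "vulnerability_exploitation",
    },
    ("source", "category"): {
        "forum", "leak_site", "telegram", "honeypot", "sensor", "osint", "tip",
    },
}


def validation_errors(obj: dict) -> list[str]:
    """Return one message per enumerated field whose value is not allowed."""
    errors: list[str] = []
    for path, allowed in ALLOWED.items():
        node = obj
        for key in path:
            node = node.get(key) if isinstance(node, dict) else None
        if not isinstance(node, str) or node not in allowed:
            errors.append(f"{'.'.join(path)}: {node!r} is not one of {sorted(allowed)}")
    return errors
```

An empty list means the enumerated fields are usable; anything else should stop the object before it reaches an adapter.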

10. Final Recommendation

The most practical API-driven architecture is:

Sensors / CTI sources → OpenCTI → TheHive or DFIR-IRIS → routing engine → MISP + provider abuse APIs + CERT/AIS channels → sanitized public reporting

This keeps the group legally safer, avoids amplifying criminal material, and creates a repeatable path from early warning to real defensive action.

Review — API-Eligible Cyber Threat Reporting & Escalation Platforms (Draft v1)

Source: waypoints_firstpass.md

Reviewer: Claude (Opus 4.7, 1M context)
Review date: 2026-05-13
Document reviewed: waypoints.md (first draft)
Verdict: Strong bones. Tone-perfect for white-hat defensive work: machine-to-machine, no vigilante framing. Publishable as an internal whitepaper after the critical fixes below.

1. What's Already Solid

Don't change these — they're load-bearing and correct.

Section 1.1 vs 1.2 split (normal vs imminent harm) — exactly the right hinge for routing decisions.
Section 8 (never-submit list) — covers GDPR / exploitation amplification / credential leakage failure modes well.
Section 9 normalized object — the right abstraction. Transform-to-target instead of N bespoke pipelines.
Section 10 architecture sentence — the whole project on one line: Sensors → OpenCTI → TheHive/IRIS → routing engine → MISP + abuse APIs + CERT/AIS → sanitized public.

2. Critical Fixes (do these before this leaves draft)

2.1 Geography mismatch — CISA AIS at #1 is US-only

For European-focused work, MISP via CIRCL.lu (Luxembourg) or the ENISA CSIRTs Network is the workhorse. CISA AIS does not cover EU institutions.

Action: Swap priorities #1 ↔ #2 (MISP first, AIS second). Add a row for CERT-EU specifically for European institutions.

2.2 National CERTs are referenced generically but never named

The doc says "National CERT/CSIRT" everywhere but never resolves it to an actionable receiver.

Action: Add a small table after Section 1:

Country	Receiver	Channel
DE	BSI / CERT-Bund	reports@cert-bund.de, MISP community
FR	ANSSI / CERT-FR	TAXII feed
UK	NCSC-UK	structured email + early-warning service
NL	NCSC-NL	MISP
ES	CCN-CERT, INCIBE-CERT	MISP
EU	CERT-EU, Europol EC3	TLP-tagged MISP

The routing engine should pick the right one based on victim country.
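
A sketch of that selection, assuming the victim country arrives as a two-letter code matching the table's first column. The names `NATIONAL_RECEIVERS` and `receivers_for` are hypothetical, and returning an empty list for an unmapped country (so the case goes to manual triage) is a design assumption rather than something the draft specifies.

```python
# Receivers copied from the table above; channels are omitted for brevity.
NATIONAL_RECEIVERS: dict[str, list[str]] = {
    "DE": ["BSI / CERT-Bund"],
    "FR": ["ANSSI / CERT-FR"],
    "UK": ["NCSC-UK"],
    "NL": ["NCSC-NL"],
    "ES": ["CCN-CERT", "INCIBE-CERT"],
}


def receivers_for(victim_country: str) -> list[str]:
    """Return national receivers for a country code, or [] for manual triage."""
    code = victim_country.strip().upper()
    return list(NATIONAL_RECEIVERS.get(code, []))
```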

Note on Europol EC3: they handle criminal cases, not first-call technical sharing. Route through your national CERT first; EC3 receives via national channels for cross-border coordination.

2.3 Domain registrar abuse is missing from Section 1.3

Cloudflare is covered, but registrars (Namecheap, Tucows, GoDaddy) and ccTLD registries (EURid for .eu, DENIC for .de) are often the faster takedown path.

Action: Add to the malicious-infrastructure flow: registrar abuse contact from WHOIS → registrar abuse API/email → registry as escalation.

2.4 Severity scale `A|B|C|D|E` is unusual and undefined

Either define it inline or replace with the standard low|medium|high|critical (CVSS-style) or NIS2 severity categories for EU consistency. Receivers will normalize anyway — but defining it lets the routing engine make automatic decisions.

2.5 Normalized object missing an `actor` block

You have victim but no actor. Add:

"actor": {
  "name": "Adira",
  "aliases": [],
  "campaign": "",
  "confidence": "A1|A2|...|F6"
}

This field connects the doc to the project mission and lets the routing matrix differentiate actor-specific sightings from generic abuse reports.

(A1–F6 is the Admiralty Code, the de-facto CTI standard: a source-reliability letter A–F paired with an information-credibility digit 1–6. If that's too much, fall back to low|medium|high.)

2.6 Missing PII sanitizer

Section 9 has observables.emails: []. Submitting victim email addresses to AbuseIPDB or VirusTotal is a personal-data transfer under GDPR.

Action: Add a pre-submission sanitizer step that:

Hashes / redacts emails to local-part-hash@domain when destination is public
Strips PII from URLs (tokens, query params containing identifiers)
Keeps raw originals only in evidence.raw_evidence_location (internal-only storage)

This belongs in the doc before the normalized-object section, not as an afterthought.
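
The first two sanitizer steps can be sketched with the standard library alone. The function names are hypothetical. A plain SHA-256 of the local part is used here for brevity; since common local parts are guessable, a keyed hash (HMAC with a project secret) would probably be the safer choice in production.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def redact_email(address: str, digest_chars: int = 12) -> str:
    """Replace the local part with a truncated SHA-256 digest, keep the domain."""
    local, _, domain = address.rpartition("@")
    if not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    digest = hashlib.sha256(local.lower().encode("utf-8")).hexdigest()[:digest_chars]
    return f"{digest}@{domain.lower()}"


def strip_url_identifiers(url: str) -> str:
    """Drop userinfo, query string and fragment, where tokens usually sit."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literal: restore the brackets urlsplit removed
        host = f"[{host}]"
    if parts.port:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, "", ""))
```

Dropping the whole query string is deliberately blunt: it loses some context, but it avoids having to guess which parameters carry identifiers. Path segments can still contain tokens and need separate review.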

3. High-Value Additions

3.1 TLP enforcement at the routing layer

Nothing in the current schema prevents TLP:RED data being routed to a TLP:CLEAR destination.

Action: Add a routing precondition: submission.tlp <= destination.max_tlp_allowed.

CISA AIS rejects TLP:RED
Cloudflare doesn't care
Spamhaus has its own rules
MISP communities each have their own ceiling

Encode the ceiling per destination in the routing matrix.
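
The precondition is easy to express once TLP labels have an explicit order. In this sketch the destination names and their ceilings are hypothetical placeholders; real ceilings have to come from each destination's own terms.

```python
# TLP 2.0 labels, least to most restrictive.
TLP_ORDER = ["CLEAR", "GREEN", "AMBER", "AMBER+STRICT", "RED"]
TLP_RANK = {label: rank for rank, label in enumerate(TLP_ORDER)}

# Hypothetical destinations and ceilings, for illustration only.
MAX_TLP_ALLOWED = {
    "public_blocklist": "CLEAR",
    "trusted_cert_community": "AMBER",
    "internal_case": "RED",
}


def may_route(submission_tlp: str, destination: str) -> bool:
    """True when submission.tlp <= destination.max_tlp_allowed.

    Destinations without a configured ceiling are denied by default.
    """
    ceiling = MAX_TLP_ALLOWED.get(destination)
    if ceiling is None:
        return False
    return TLP_RANK[submission_tlp.upper()] <= TLP_RANK[ceiling]
```

Denying unknown destinations means a new adapter cannot receive anything until someone has consciously assigned it a ceiling.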

3.2 STIX 2.1 as the serialization

Right now the doc implies internal object → bespoke transform per API. Cheaper and more standard:

internal object → STIX 2.1 bundle → minor adapter per destination

MISP, OpenCTI, CISA AIS, and most CTI tools are STIX-native. One serializer beats thirteen, and you get free interop with anything that already speaks STIX.
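
To show what the intermediate bundle looks like, here is a hand-built STIX 2.1 Indicator for a single URL observable, using only the standard library. The helper names are hypothetical. In practice a maintained library such as the OASIS `stix2` package would likely be preferable, since it validates objects on construction.

```python
import json
import uuid
from datetime import datetime, timezone


def stix_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC timestamp with millisecond precision."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


def url_indicator(url: str, first_seen: datetime) -> dict:
    """Build a STIX 2.1 Indicator object for one URL observable."""
    now = stix_timestamp(datetime.now(timezone.utc))
    escaped = url.replace("\\", "\\\\").replace("'", "\\'")
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "pattern": f"[url:value = '{escaped}']",
        "pattern_type": "stix",
        "valid_from": stix_timestamp(first_seen),
    }


def bundle(objects: list[dict]) -> dict:
    """Wrap STIX objects in a bundle ready for json.dumps()."""
    return {"type": "bundle", "id": f"bundle--{uuid.uuid4()}", "objects": objects}
```

Each destination adapter would then start from `json.dumps(bundle([...]))` and adjust only what that destination requires.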

3.3 Rate-limit budgets

Many of these APIs have strict limits:

AbuseIPDB free tier: 1000 reports/day
VirusTotal public API: 4 req/min
Spamhaus: per-submitter quotas
Cloudflare: per-account rate limits

Without a token-bucket per destination, high-confidence submissions get silently dropped during bursts.

Action: Add a destination_quota field to the routing matrix and an enforcement layer.
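
A token bucket per destination could look like the sketch below. The class and parameter names are assumptions; the quota numbers passed in would come from the `destination_quota` field.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow `capacity` submissions per `period_s` seconds, refilled continuously."""

    def __init__(self, capacity: int, period_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = float(capacity)
        self.refill_per_s = capacity / period_s
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; otherwise report that the caller must wait."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_s)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # the caller should queue the submission, not drop it
```

The important behaviour is in the last line: when the bucket is empty, the submission goes back on a queue ordered by confidence, so a burst delays reports instead of losing them.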

3.4 Feedback loop is missing

When you submit to URLhaus, you can poll for status. When you submit to MISP, you get sightings. When you submit to Cloudflare, you get a case number. These should flow back into your OpenCTI graph as evidence-of-effectiveness.

Without this, you're operating open-loop — you don't know which destinations actually act on your reports.

Action: Add a Section 11 "Receipt and Effectiveness Tracking" that defines:

Per-destination receipt schema (case ID, ack timestamp, outcome status)
Polling cadence per destination
A success metric per destination type (takedowns confirmed, sightings count, classification adopted)
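
One possible shape for the receipt and a simple effectiveness metric, as a sketch. The `Outcome` values and the `actioned_rate` metric are assumptions chosen for illustration; each destination reports status in its own vocabulary and would need mapping onto them.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class Outcome(str, Enum):
    PENDING = "pending"
    ACKNOWLEDGED = "acknowledged"
    ACTIONED = "actioned"      # e.g. takedown confirmed, classification adopted
    REJECTED = "rejected"


@dataclass(frozen=True)
class Receipt:
    destination: str
    case_id: str                   # identifier returned by the destination
    acked_at: Optional[datetime]   # None until the destination acknowledges
    outcome: Outcome = Outcome.PENDING


def actioned_rate(receipts: list[Receipt], destination: str) -> float:
    """Share of a destination's receipts that ended in a confirmed action."""
    mine = [r for r in receipts if r.destination == destination]
    if not mine:
        return 0.0
    return sum(r.outcome is Outcome.ACTIONED for r in mine) / len(mine)
```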

3.5 NoMoreRansom (NMR)

Ransomware.live is listed under monitoring, but if a decryptor research effort produces anything, NMR is the destination.

Action: Add to the routing matrix:

Evidence type	First API destination	Second destination	Internal system
Ransomware decryptor evidence	NoMoreRansom (private channel)	Victim CERT chain	OpenCTI internal only

NMR coordinates so victims can decrypt before the adversary sees the fix — never publish a working decryptor publicly first.

4. Nice-to-Have

4.1 Submitter identity & signing

Register a stable submitter handle with MISP / MalwareBazaar / AbuseIPDB — not a personal account.
Sign internal objects with a project PGP key before they leave the system.
CIRCL and other major MISP communities weight trust by submitter history.

4.2 Audit log requirement

Every external submission writes an immutable row:

(timestamp, destination, payload_hash, submitter_identity, tlp, response_id, outcome)

Legal cover, debugging, and the feedback loop in 3.4 all need this.
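
A sketch of building that row. The function names are hypothetical. Note that the code only produces the tuple; immutability is a property of where the row is stored (an append-only table or write-once storage), which this example does not provide.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def payload_hash(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so equal payloads hash equally."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_row(destination: str, payload: dict, submitter_identity: str, tlp: str,
              response_id: str, outcome: str,
              now: Optional[datetime] = None) -> tuple:
    """Return (timestamp, destination, payload_hash, submitter_identity,
    tlp, response_id, outcome) for one external submission."""
    ts = (now or datetime.now(timezone.utc)).isoformat()
    return (ts, destination, payload_hash(payload), submitter_identity,
            tlp, response_id, outcome)
```

Storing the hash rather than the payload keeps sensitive content out of the log while still letting you prove later exactly what was sent.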

4.3 NIS2 callout for critical-infra reporting

EU NIS2 requires regulated entities to send an early warning within 24h of becoming aware of a significant incident, followed by a fuller incident notification within 72h. If detections involve essential/important entity sectors, the routing engine should flag the NIS2 obligation regardless of receiver choice.

4.4 Section ordering

Sections 8 (data handling) and 9 (normalized object) are foundations, not appendices. Move them up to Sections 3–4. Currently a reader hits the platform list before knowing what not to send.

4.5 Confidence convention

low|medium|high is fine, but production CTI commonly uses the Admiralty Code (A1, B2, etc., describing source reliability × information credibility) or estimative language. Mention the convention even if you don't fully adopt it.

5. Implementation Notes (Blue48 Hookup)

This doc is the spec for two components in the agent stack:

report_writer agent outputs Section 9's normalized object as its canonical format.
A routing engine (extension of report_writer, or a 7th agent) consumes that object, applies the matrix in Section 6, and fans out via API adapters.

Agents stop at "produce the normalized object." Human review reads it, decides "yes, ship this to MISP and Cloudflare," and clicks. The routing engine then runs the API calls, captures receipts, and feeds them back to OpenCTI.

5.1 Suggested initial adapters (Block G priority)

MISP (PyMISP)
AbuseIPDB
URLhaus
Cloudflare Abuse Reports
urlscan.io

These five cover ~80% of common evidence types in the routing matrix.

5.2 Secrets handling

Every adapter needs API credentials. They must:

Live in .env (already excluded from image via .dockerignore)
Be passed at container runtime via env_file, never baked into the image
Be rotatable on a schedule (the audit log in 4.2 helps prove non-overlap)
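
On the adapter side, this can be as small as reading the credential from the environment and failing fast when it is absent. The helper name and the example variable name `ABUSEIPDB_API_KEY` are hypothetical.

```python
import os
from collections.abc import Mapping


class MissingSecret(RuntimeError):
    """Raised when an adapter credential is not present in the environment."""


def adapter_secret(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the credential stored under `name`, e.g. "ABUSEIPDB_API_KEY"."""
    value = env.get(name, "").strip()
    if not value:
        raise MissingSecret(f"{name} is not set; pass it at container runtime via env_file")
    return value
```

Failing at startup is preferable to discovering a missing key on the first high-priority submission.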

6. Summary

Category	Count	Notes
Critical	6	Geography, CERT mapping, registrar abuse, severity scale, actor block, PII sanitizer
High-value	5	TLP enforcement, STIX 2.1, rate limits, feedback loop, NoMoreRansom
Nice-to-have	5	Signing, audit log, NIS2, ordering, Admiralty Code

After the critical fixes, this is a publishable internal whitepaper and a clear spec for the routing engine. Good draft.

Detailed Review v2 — API-Eligible Cyber Threat Reporting & Escalation Platforms

Source: waypoints_scalpel.md

Reviewer: Claude (Opus 4.7, 1M context)
Review date: 2026-05-13
Document reviewed: waypoints.md (first draft)
Companion to: waypoints_firstpass.md (v1 executive summary)
Scope of this v2: section-by-section findings, cross-cutting gaps, missing categories, revised schema, implementation priorities for blue48.

0. Method

I re-read the draft three times against the following lenses:

Factual / API accuracy — does each platform actually do what's claimed?
Operational correctness — would the routing actually work in practice, or break on first contact with reality?
Legal / compliance — GDPR, NIS2, MLAT, jurisdiction, chain of custody
Threat-model coverage — does this serve the actual project goal (campaign disruption, not individual attribution)?
OPSEC of the reporter — what does the adversary learn from each submission?

Findings below carry confidence tags: [verified], [likely current], [verify before relying on].

1. Section-by-Section Findings

1.1 — Section 1: Recommended Reporting Order

1.1.1 In Scenario 1.1 (normal credible threat), going to the victim first is correct in 90% of cases — but flag the exception.

Insider-attack scenarios reverse this: notifying a victim org whose own admin/employee is the threat actor warns the attacker. For credential-leak cases involving privileged accounts, route CERT-first and let CERT decide whether to notify the victim org's leadership or its security contact. Add a 1.1.bis for "victim contact may itself be compromised."

1.1.2 Scenario 1.2 (imminent harm) is missing a specific decision point.

If the imminent harm is to critical infrastructure (energy, water, healthcare, finance), in EU jurisdictions the NIS2 Directive requires regulated entities to send an early warning within 24 hours of becoming aware of a significant incident. Your routing engine should detect "victim sector ∈ NIS2 essential/important entity list" and either:

Route the report so the victim can fulfill their NIS2 obligation, OR
(If victim is unreachable) report directly via the relevant national CERT's NIS2 channel, which exists separately from generic CSIRT contact paths
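
The decision point can be sketched as a small function. The sector set below contains only the four examples named above and is not the full NIS2 annex list; the return labels are hypothetical identifiers for the routing engine.

```python
# Only the example sectors from the text above; the real NIS2 annexes
# cover more sectors and also depend on entity size.
NIS2_FLAGGED_SECTORS = {"energy", "water", "healthcare", "finance"}


def nis2_route(victim_sector: str, victim_reachable: bool) -> str:
    """Pick the NIS2-aware path for an imminent-harm case."""
    if victim_sector.strip().lower() not in NIS2_FLAGGED_SECTORS:
        return "standard"
    if victim_reachable:
        return "notify_victim_for_nis2"
    return "national_cert_nis2_channel"
```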

1.1.3 Scenario 1.3 missing receiver categories:

Hosting providers (not just CDNs). Cloudflare is a CDN; the actual origin server is somewhere else (Hetzner, OVH, AWS, DigitalOcean, etc.). A Cloudflare-only report leaves the origin running. Add hosting provider abuse as a parallel step, not after CDN.
Domain registrars via WHOIS-extracted abuse contact, plus registry escalation for ccTLDs (DENIC for .de, AFNIC for .fr, EURid for .eu, Nominet for .uk)
Certificate authorities for compromised cert revocation (Let's Encrypt revoke API for ACME-issued certs; commercial CA abuse contacts for the rest)
DNS providers independent of registrar (Cloudflare DNS, Quad9, Google Public DNS abuse contacts — for blocking, not takedown)

1.1.4 The implicit ordering bias.

The draft optimizes for legal-defensibility (talk to the receiver who can act) but doesn't optimize for operational speed-to-mitigation. For phishing kits with active credential harvesting, the fastest mitigation is often: parallel-fan-out to (CDN, hosting, registrar, browser-block-list providers) simultaneously, then notify CERT as record-keeping. The doc reads as serial when in practice it should be parallel.
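
A parallel fan-out might look like the following sketch. The `Adapter` type and the destination names in the usage note are hypothetical; each real adapter would wrap one destination's API and return whatever receipt ID that destination issues.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable

# An adapter takes the normalized object and returns a receipt ID.
Adapter = Callable[[dict], str]


def fan_out(submission: dict, adapters: dict[str, Adapter],
            timeout_s: float = 30.0) -> dict[str, str]:
    """Submit to every adapter concurrently; one failure does not block the rest.

    Raises TimeoutError if some adapters have not finished after timeout_s.
    """
    results: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(adapters))) as pool:
        futures = {pool.submit(fn, submission): name for name, fn in adapters.items()}
        for future in as_completed(futures, timeout=timeout_s):
            name = futures[future]
            try:
                results[name] = f"ok:{future.result()}"
            except Exception as exc:  # record the failure and keep going
                results[name] = f"error:{exc}"
    return results
```

Called with adapters for the CDN, hosting provider, registrar, and block-list provider, this returns one result per destination, which can then be attached to the CERT notification as the record of what was already sent.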

1.2 — Section 2: Tier-1 API Reporting Platforms

1.2.1 Missing platforms that belong in Tier 1:

Platform	Why Tier-1	API style
abuse.ch ThreatFox	IOC graph, sibling to URLhaus/MalwareBazaar, accepts indicator submissions with kill-chain context	REST + Auth-Key
abuse.ch YARAify	YARA rule sharing + scanning. Direct fit since `detection_author` emits YARA	REST + Auth-Key
AlienVault OTX (now LevelBlue Labs OTX)	One of the largest free CTI communities. Pulses for sharing, pull API for consumption. Major omission from current draft.	REST + DirectConnect API
CIRCL Hashlookup	Fast hash reputation lookup, free, EU-hosted	REST
Shadowserver	Free network exposure / vulnerability scanning reports. Subscribe by ASN/CIDR/contact. The draft has it under "monitoring" but Shadowserver also accepts submissions and runs important takedown campaigns.	REST API

1.2.2 Reorder by jurisdictional fit:

The current #1 (CISA AIS) is US-government-tied. For Europe-focused work the right Tier-1 priorities are roughly:

MISP (CIRCL communities, plus ENISA CSIRTs Network communities)
OpenCTI (your own knowledge graph)
AlienVault OTX (broad reach, low friction)
CISA AIS (only if US-victim cases or US-relevant indicators)
Cloudflare / hosting abuse APIs
Spamhaus
URLhaus / MalwareBazaar / ThreatFox
AbuseIPDB
urlscan.io
Netcraft

1.2.3 Per-row corrections in the existing table:

CISA AIS — "STIX/TAXII bidirectional" — be specific: STIX 2.1 over TAXII 2.1, with the AIS Profile (a restricted subset of STIX). Submitting non-AIS-Profile STIX gets rejected. [verified]
Cloudflare Abuse Reports API — also requires noting that high-volume submitters can apply to be a Trusted Reporter which gets faster SLAs. [likely current]
VirusTotal API — public submissions are visible to all VT Premium customers (incl. potentially the adversary). The draft doesn't flag this — it's a critical OPSEC point. Use VT Private Scanning for sensitive samples. [verified]
PhishTank — community-vetted. As of late 2024 / early 2025 there were reports of reduced moderation activity. [verify before relying on]. Netcraft is the more reliable phishing-takedown channel today.
Google Web Risk — access truly is gated by Google customer engineering review; not a 5-minute API key signup. Apply early. [verified]

1.3 — Section 3: Per-Platform Notes

3.1 CISA AIS: Add: requires sponsorship from a federal agency or a signed AIS Sharing Agreement, plus the connector software (typically TAXII client). Onboarding measured in weeks, not days. The draft makes it sound like a sign-up form.

3.2 MISP: Missing:

ZeroMQ for real-time push (worth using if you want sub-second propagation to your own consumers)
Distinction between events (point-in-time intelligence) and feeds (continuous streams; better for IOC bulk delivery)
"Create a community" vs "Join a community" tradeoff — joining CIRCL's communities is the lowest-friction entry; creating your own is high-effort and pointless until you have multiple sharing partners
TLP-marking enforcement is not automatic at the MISP level — your client must respect TLP before publishing onward

3.3 OpenCTI: Missing:

The connector framework: ~80+ pre-built connectors (MITRE ATT&CK, MISP, CrowdStrike, Recorded Future, etc.) — most of your enrichment needs are already solved
The Workbench feature for analyst review before publishing
Filigran (the company behind OpenCTI) hosts a managed cloud version if you don't want to operate it yourself

3.4 Cloudflare Abuse Reports API: Missing:

API token requires Account.Abuse Reports permission — won't work with read-only tokens
Rate limits documented separately from the abuse API itself
For Cloudflare-hosted Workers (their serverless), abuse reports go to a different channel
Trusted Reporter program (mentioned above) — apply once you have submission history

3.5 Spamhaus: Missing the lists distinction:

DBL = Domain Block List (domains)
SBL = Spamhaus Block List (IPs)
XBL = Exploits Block List (IPs of hijacked or malware-infected hosts)
ZRD = Zero Reputation Domains (newly registered)
Each list has different submission criteria. Wrong-list submissions get rejected. Your routing engine needs a list-selector.

3.6 AbuseIPDB: Missing:

The 23-category taxonomy (SSH brute force, port scan, web app attack, phishing, etc.) — your evidence type must map to an AbuseIPDB category code or the submission is low-utility
Free tier: 1000 reports/day, 100 IP checks/min. Paid tiers scale
Single-reporter submissions have low weight; reputation requires multiple corroborating submitters. Send to AbuseIPDB after sending to other corroborators
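
A sketch of the evidence-type mapping. The category IDs below are taken from memory of AbuseIPDB's public category list and should be verified against the current documentation before use; the evidence-type keys are hypothetical.

```python
# Category IDs to verify against AbuseIPDB's published category list.
ABUSEIPDB_CATEGORIES: dict[str, list[int]] = {
    "ssh_brute_force": [18, 22],   # Brute-Force, SSH
    "port_scan": [14],             # Port Scan
    "web_app_attack": [21],        # Web App Attack
    "phishing": [7],               # Phishing
}


def categories_param(evidence_type: str) -> str:
    """Return the comma-separated category string for a report request."""
    ids = ABUSEIPDB_CATEGORIES.get(evidence_type)
    if not ids:
        raise ValueError(f"no AbuseIPDB category mapped for {evidence_type!r}")
    return ",".join(str(i) for i in ids)
```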

3.7 URLhaus: Missing:

Submission auth-key required (free, sign up)
Manual review for high-confidence flags
2024+ stricter format requirements
Linkage to MalwareBazaar — submit the URL to URLhaus, the sample to MalwareBazaar, link by hash

3.8 MalwareBazaar: Missing:

File size limits (~250MB last I checked)
Office macro / Windows installer formats need specific tags
Tag taxonomy is community-driven; non-canonical tags reduce utility
The "Avoid" line about legal-share is correct but vague. Specifically: do not upload samples obtained under NDA, samples from incidents where the victim hasn't consented, or samples that may contain victim PII (e.g., crafted payloads with the victim's name)

3.9 PhishTank: As noted above, declining. Verify status; consider deprioritizing.

3.10 urlscan.io: Missing:

Visibility settings: public, unlisted, private (private = paid)
Public scans are searchable by everyone — including the adversary monitoring for their kits being analyzed
The Search API is invaluable for retrohunts: "show me every scan in the last 30 days that loaded resource X"
Bulk submission via UUID-tagged customagent field for tracking your submission cohort

3.11 Google Web Risk: Missing:

GCP project + Web Risk API enabled prerequisite
Submissions evaluated by Google Safe Browsing pipeline; latency hours-to-days
Successful submissions show up in Chrome / Firefox / Safari Safe Browsing warnings — massive amplification. Use only for high-confidence URLs

3.12 VirusTotal: Missing:

Public API: 4 lookups/min, 500/day, 15.5k/month
Premium API: rate limits negotiated
File submission privacy: anyone with VT Intelligence can see your sample. Critical OPSEC point not in draft.
VT Private Scanning for sensitive samples
VT Hunting (YARA livehunt) for ongoing detection

3.13 Netcraft: Missing:

Strong takedown-execution record — Netcraft actually does the takedown work, not just reporting
Free tier exists for low-volume reporters
Strongest at brand-protection / phishing
They prefer evidence package format: source URL + screenshot + redirect chain + landing page HTML

1.4 — Section 4: Internal Case / Incident Routing Platforms

1.4.1 Missing platforms:

Platform	Best for	Why missing matters
Wazuh	Open-source SIEM with TheHive integration	Many SOCs use it; integrates cleanly with this stack
Microsoft Sentinel	Cloud SIEM with Logic Apps automation	Major enterprise platform — leaving it out makes the doc feel non-enterprise
Splunk SOAR (formerly Phantom)	Commercial SOAR	Major in enterprise SOCs
Cortex XSOAR	Commercial SOAR (Palo Alto)	Same
Shuffle	Open-source SOAR	Free alternative to XSOAR/Phantom
Tracecat	Newer open-source SOAR	Younger but actively developed
n8n	General workflow automation	Not security-specific but widely used as a glue layer

1.4.2 TheHive 5 vs 4: Be explicit — TheHive 4 reached EOL, TheHive 5 is current. Code examples should target TheHive 5 API.

1.5 — Section 5: Monitoring (Not Primary Reporting)

1.5.1 Missing high-value monitoring sources:

Source	What it gives you	API
AlienVault OTX	Largest free pulse community, IOC subscriptions	REST DirectConnect
CIRCL Passive DNS / Passive SSL	Historical DNS / cert lookups; EU-hosted	REST
PhishStats	Phishing URL stream	REST + RSS
DNSDumpster / SecurityTrails / BinaryEdge	Recon/asset-discovery DBs	REST (mostly paid for bulk)
GreyNoise	Benign-scanner classification — reduces false positives in IP reporting by tagging known internet-noise sources	REST
Spamhaus DNSBL queries	Free DNSBL lookups	DNS protocol
Maltrail	Open-source malicious-traffic detection feeds	Static feed download
CT log monitors (crt.sh, Censys CT)	New-cert issuance for your monitored domains — catches phishing-domain registrations	REST

1.5.2 GreyNoise specifically deserves a callout.

Reporting an IP that GreyNoise classifies as a benign scanner (Shodan, Censys, security researchers) can get you blacklisted from AbuseIPDB and embarrasses you with CERTs. Always GreyNoise-check before submitting an IP report. This is a one-line API call that prevents a class of bad submissions.
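
The gate itself can stay independent of any particular client. In this sketch `classify` is a hypothetical callable wrapping whichever GreyNoise lookup you use, and the label values ("benign", "malicious", "unknown") are an assumption about what that wrapper returns.

```python
from typing import Callable


def should_report_ip(ip: str, classify: Callable[[str], str]) -> bool:
    """Return False for IPs the classifier labels as benign scanners.

    `classify` is expected to return a label such as "benign",
    "malicious" or "unknown" for the given IP.
    """
    return classify(ip).strip().lower() != "benign"
```

Injecting the lookup keeps the gate testable offline and lets the cached result be stored with the case as evidence that the check was made.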

1.5.3 Shadowserver placement.

Currently in Section 5 (monitoring only) but Shadowserver also runs active sinkholing and takedown campaigns with global reach. They accept tip-offs and IOC contributions. Move them up to Tier 1 receivers, or at least call out the bidirectional relationship.

1.6 — Section 6: Practical Routing Matrix

1.6.1 Missing rows:

Evidence type	First	Second	Internal
Compromised TLS certificate	CT log monitor sighting → CA revocation request	Cloudflare/host if cert is in use	OpenCTI / TheHive
Mobile app malware	Google Play / Apple App Review submission	VirusTotal sample upload	OpenCTI
Cryptocurrency wallet (laundering)	Chainalysis / TRM (commercial) or on-chain analysis	OFAC SDN if sanctioned	Internal restricted case
Open-source supply-chain attack	Registry security (security@npmjs.com, security@pypi.org)	GitHub Security Lab	OpenCTI / TheHive
Compromised GitHub repo / leaked secret	GitHub Security Advisory + vendor-specific revoke API (e.g., AWS IAM)	Victim org	Internal restricted
Tor hidden service hosting malware	Document only (no takedown for .onion); push IOCs to MISP	n/a	OpenCTI
Sanctions-evasion crypto	OFAC SDN reporting (US) / EU FSF reporting	National FIU	Internal restricted
CSAM (legally separate)	NCMEC CyberTipline (US) / IWF (UK) / INHOPE (international)	National police	Stop processing immediately, preserve under legal hold
Phishing-resistant kit / 2FA bypass	Browser vendor reports (Chrome / Firefox / Safari Trust & Safety)	Affected service	OpenCTI

1.6.2 Cloudflare-proxied abuse needs a follow-up step.

Current row says: First → Cloudflare API; Second → Netcraft / PhishTank. Missing: Third → origin host abuse contact. A request sent through Cloudflare does not reveal the origin, so identify the host from Cloudflare's response to the abuse report (which may name the hosting provider) or via certificate transparency and passive DNS cross-reference. Without this, takedown leaves the origin alive and the attacker just provisions a new CDN front-end.

1.6.3 The "Leaked credentials/API keys" row is dangerously thin.

"Victim first → CERT if severe → Internal IR case" — missing the revocation step, which is more time-critical than reporting. If you find a leaked AWS access key, the first action is aws iam delete-access-key via the affected account (with permission) or trigger AWS's automatic key-revocation by submitting to GitHub Secret Scanning. If you find leaked OAuth tokens for GitHub/Slack/etc., the relevant vendor has an automated revocation pathway. Add the revocation step before victim notification.

1.7 — Section 7: Minimum Viable API Stack

The current MVP list (10 items) is too heavy for "minimum viable." A genuine MVP for a new white-hat group is closer to:

OpenCTI — your knowledge graph (or, if too heavy, just MISP for both)
MISP via CIRCL community — free, EU-hosted, broad reach
AlienVault OTX — free, broadest reach for indicator sharing
AbuseIPDB — free tier, easy
URLhaus + MalwareBazaar + ThreatFox (the abuse.ch trio — same auth-key, three destinations)
urlscan.io — free tier, evidence generation
National CERT direct email + GPG — non-API, but mandatory

That's 7 things, of which 5 are pure free signups. Tackle Cloudflare/Netcraft/Spamhaus/Google Web Risk after you have throughput in those 7.

The current MVP includes TheHive — that's case management, not external reporting. Move it out of "API stack" since it's internal infrastructure.

1.8 — Section 8: Data Handling Rules

1.8.1 "Never submit publicly" — additions:

Insider-threat allegations without verification
Attribution claims about specific named individuals (the hard line we settled on earlier)
Government / classified material
PHI (US HIPAA scope)
PCI scope financial data
Children's data (COPPA US; GDPR Article 8 EU)
Biometric data
Trade secrets / source code
Material from unauthorized intrusion (even if you got to it via OSINT, "I downloaded their leaked DB" makes you a recipient of stolen goods in some jurisdictions)

1.8.2 "Safe to submit" — additions:

YARA rules (especially to YARAify)
Sigma rules (to SigmaHQ via PR)
Mutex names, named-pipe signatures (good Sysmon detections)
Persistence registry keys
Scheduled task names
TLS fingerprints (JA3, JA4)
HTTP user-agent strings observed in C2
ASN block ranges associated with adversary infrastructure
STIX patterns (TAXII is only the transport)
ATT&CK technique IDs (always)

1.8.3 Missing entire section: "Sanitize before submitting"

Strip URL query parameters that may contain victim tokens / session IDs
Hash email local-parts when target destination is public (a72b91…@example.com)
Redact internal hostnames from samples
Strip x-forwarded-for / source IP from log excerpts that name your honeypot
Replace victim-org names with role descriptors (<european_bank>) unless the submission is to a destination where the victim has consented or the receiver is trusted (CERT)
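
Two of these steps, header stripping and victim-name replacement, can be sketched with regular expressions. The function name and the `org_aliases` mapping are hypothetical; real log formats vary, so treat the header pattern as a starting point.

```python
import re

# Matches a whole X-Forwarded-For header line, including its newline.
XFF_LINE = re.compile(r"^x-forwarded-for:.*$\n?", re.IGNORECASE | re.MULTILINE)


def sanitize_excerpt(text: str, org_aliases: dict[str, str]) -> str:
    """Remove X-Forwarded-For lines and swap victim names for role descriptors.

    org_aliases maps a victim name to a descriptor, e.g.
    {"AcmeBank": "european_bank"} turns "AcmeBank" into "<european_bank>".
    """
    text = XFF_LINE.sub("", text)
    for name, descriptor in org_aliases.items():
        replacement = f"<{descriptor}>"
        text = re.sub(re.escape(name), lambda _m: replacement, text,
                      flags=re.IGNORECASE)
    return text
```

The replacement is case-insensitive so that hostnames such as portal.acmebank.example are caught along with the display name.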

1.9 — Section 9: Recommended Submission Object

1.9.1 Schema gaps (additions relative to the draft object in Section 9; the // annotations are explanatory and are not valid JSON):

{
  "case_id": "WG-2026-000001",
  "schema_version": "1.0",
  "tlp": "AMBER",                            // use TLP 2.0 values: CLEAR/GREEN/AMBER/AMBER+STRICT/RED
  "tlp_marking_definition_ref": "marking-definition--...",  // STIX-compatible
  "severity": "low|medium|high|critical",   // replace A-E with standard
  "confidence": "low|medium|high",          // or Admiralty A1-F6
  "language": "en",                         // i18n
  "first_observed": "2026-05-13T10:00:00Z", // top-level
  "last_observed":  "2026-05-13T11:30:00Z",
  "valid_from":     "2026-05-13T10:00:00Z", // STIX-style validity window
  "valid_until":    "2026-08-13T10:00:00Z",
  "threat_type": "phishing|malware|ransomware|credential_exposure|iab|botnet|vulnerability_exploitation",

  "victim": {
    "organization": "",
    "domain": "",
    "country": "",
    "sector": "",
    "nis2_category": "essential|important|n/a",   // for EU NIS2 routing
    "consent_to_name_publicly": false             // sanitization gate
  },

  "actor": {
    "name": "Adira",
    "aliases": [],
    "campaign": "",
    "confidence": "A1|A2|...|F6"
  },

  "kill_chain": ["recon|weapon|deliver|exploit|install|c2|action"],
  "attack_techniques": ["T1566.001", "T1059.003"],

  "source": {
    "category": "forum|leak_site|telegram|honeypot|sensor|osint|tip",
    "first_seen": "",
    "last_seen": "",
    "collection_method": "lawful_osint_or_partner_feed",
    "burn_sensitivity": "low|medium|high"        // affects sanitization aggressiveness
  },

  "observables": {
    "ips": [],
    "domains": [],
    "urls": [],
    "hashes": [],
    "emails": [],
    "wallets": [],
    "cves": [],
    "yara_rules": [],
    "sigma_rules": [],
    "mutexes": [],
    "named_pipes": [],
    "scheduled_tasks": [],
    "registry_keys": [],
    "user_agents": [],
    "tls_fingerprints": [],                     // JA3/JA4
    "certificates": [],                         // CT log entries / SHA256 of cert
    "asn_blocks": [],
    "process_names": []
  },

  "pattern_relationships": [
    {"source": "domain:example.com", "type": "resolves_to", "target": "ipv4:1.2.3.4", "first_seen": "..."}
  ],

  "evidence": {
    "summary": "",
    "sanitized_screenshots": [],
    "raw_evidence_location": "internal_restricted_storage",
    "detonation_results": [],                   // sandbox report references
    "memory_artifacts": []                      // forensic, internal only
  },

  "timeline": [
    {"ts": "...", "event": "..."}
  ],

  "indicators_of_compromise": [],               // observables flagged as actively malicious

  "recommended_actions": [],

  "routing": {
    "primary_destinations": [],
    "secondary_destinations": [],
    "public_disclosure_allowed": false,
    "embargo_until": null,                      // timed disclosure
    "coordinated_with": []                      // who else has been told (CERT case IDs etc)
  },

  "audit": {
    "submitted_to": [],                         // append-only history of submissions
    "feedback_received": [],                    // ack IDs, takedown confirmations
    "submitter_identity": "wg-handle@misp",     // which submitter handle was used
    "signed_with": "PGP fingerprint",
    "object_sha256": ""                         // tamper-detect on the object itself
  }
}

1.9.2 Other schema concerns:

case_id format WG-2026-000001 is fine, but reserve a 2-char org prefix to avoid collision if you ever federate with another working group
tlp should use TLP 2.0 spec values (CLEAR, GREEN, AMBER, AMBER+STRICT, RED); TLP 1.0 used WHITE where 2.0 uses CLEAR and had no AMBER+STRICT
Severity / confidence mismatch in v1: severity used A-E, confidence used words. Standardize.
Add a per-object hash so the routing engine can detect tampering between produce-time and submit-time
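
One way the per-object hash could work, as a sketch under stated assumptions: the object is serialized with sorted keys and compact separators (a simple canonical form, not RFC 8785), and the audit.object_sha256 field itself is excluded from the hash. The function names are hypothetical.

```python
import copy
import hashlib
import json


def object_sha256(case: dict) -> str:
    """Hash a submission object, ignoring the field that stores the hash."""
    body = copy.deepcopy(case)
    # Assumes "audit", when present, is a dict as in the schema above.
    body.get("audit", {}).pop("object_sha256", None)
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_untampered(case: dict) -> bool:
    """True only if a hash was recorded and it still matches the object."""
    recorded = case.get("audit", {}).get("object_sha256", "")
    return bool(recorded) and recorded == object_sha256(case)
```

The producer would set audit.object_sha256 at produce-time and the routing engine would call is_untampered() just before submit-time.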

1.10 — Section 10: Final Recommendation

1.10.1 The architecture sentence is missing the feedback edge.

Current: Sensors → OpenCTI → TheHive/IRIS → routing engine → MISP + abuse APIs + CERT/AIS → sanitized public reporting

Better: Sensors → OpenCTI → TheHive/IRIS → routing engine → MISP + abuse APIs + CERT/AIS → sanitized public reporting → receipts and outcomes back to OpenCTI → effectiveness scoring → re-prioritization

Without the feedback edge, you can't tell which destinations are worth maintaining.

1.10.2 Missing entirely: closing checklist for "we're ready to submit."

A final checklist before any external submission fires:

[ ] TLP enforced (object.tlp <= destination.max_tlp)
[ ] Sanitization pass complete (PII stripped per destination policy)
[ ] GreyNoise check (if observables include IPs)
[ ] Quota available (rate-limit budget not exceeded)
[ ] Submitter identity registered with destination
[ ] Object signed
[ ] Audit row written
[ ] Human approver clicked yes (for non-automated tier)

This belongs as Section 11 or as the closing block of Section 10.
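
The first checklist item (object.tlp <= destination.max_tlp) needs an ordering over TLP labels. A minimal sketch, assuming the labels are ranked from least to most restrictive; the names TLP_ORDER and tlp_allows are hypothetical.

```python
# Assumed ranking: least restrictive first.
TLP_ORDER = ["CLEAR", "GREEN", "AMBER", "AMBER+STRICT", "RED"]
TLP_RANK = {label: rank for rank, label in enumerate(TLP_ORDER)}


def tlp_allows(object_tlp: str, destination_max_tlp: str) -> bool:
    """True if the object's TLP does not exceed the destination's ceiling."""
    try:
        return TLP_RANK[object_tlp.upper()] <= TLP_RANK[destination_max_tlp.upper()]
    except KeyError as exc:
        # Unknown labels (including the legacy WHITE) are rejected, not guessed.
        raise ValueError(f"unknown TLP label: {exc.args[0]}") from None
```

Rejecting unknown labels outright keeps a mistyped or legacy marking from being routed under a more permissive default.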

2. Cross-Cutting Gaps (Not Tied to Any Section)

2.1 — OPSEC for the Reporters Themselves

Not in the doc at all. If your group is reporting Adira to authorities, Adira may notice: they can read open MISP communities, they can read URLhaus (public), and as a paying customer they could see uploaded samples through VirusTotal Premium.

Required additions:

Submission identity registry: which handle is used on which platform, who has access, rotation schedule
Account-creation OPSEC: don't use personal accounts on submission platforms; create a project handle, use a project email, register with project-owned phone/2FA
Network OPSEC for collection: if you're scraping leak sites or monitoring the adversary's infrastructure, route through a VPN or research-purpose proxy — never the same network as your submission identity
PGP for CERT comms: every national CERT publishes a PGP key. Every email submission to a CERT should be signed and encrypted. Untouched in the draft.

2.2 — Burnt Source Protection

If you have a private collection source (honeypot, infiltrated channel, insider tipster), publishing IOCs from it can burn the source. Specifically:

A unique honeypot fingerprint (banner, response timing, listening port) lets the adversary identify which sample came from your honeypot
Publishing a sample with a unique build artifact (your sandbox's hostname in a DNS query, a timestamp matching your detonation window) reveals your detonation infrastructure
Reporting a forum URL while it's still live tips off the forum operator that it's being watched

The doc needs a burn-sensitivity tier on each observable, and a sanitization step that aggressively scrubs source-identifying artifacts before any external submission.

2.3 — Adversary Observability of Your Submissions

Tier each receiver by who can see your submission:

Receiver	Adversary visibility
MISP private community	trusted community only
MISP public community / OTX public pulse	anyone with an account
URLhaus	public — adversary can monitor
MalwareBazaar	public — adversary can detect their sample was uploaded
VirusTotal public submission	every VT Premium customer (incl. potentially adversary)
VT Private Scanning	only your team (paid)
AbuseIPDB	public reputation visible
Cloudflare Abuse Reports	only Cloudflare and the reported asset owner
CERT direct (GPG-encrypted)	only the CERT

The routing engine should display this visibility for each destination during human review.

2.4 — Chain of Custody / Legal Admissibility

If any of this material may end up in a criminal proceeding, chain of custody matters. Specifically:

The raw evidence must be preserved unmodified, with hashes recorded at acquisition time
Any transformation (sanitization, normalization) must be traceable back to the unmodified original: the routing engine logs the input hash, the transform applied, and the output hash
The submitter identity for each external submission is logged
Witnesses (multi-party access logs) are preferred for high-value evidence

The current evidence.raw_evidence_location field is a placeholder; it needs structure: storage path, hash, acquisition timestamp, acquirer identity.
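
The input hash / transform / output hash logging described above could look like the following sketch. The record fields and function names are assumptions for illustration, not the project's actual audit format.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List


@dataclass(frozen=True)
class TransformRecord:
    transform: str       # name of the transform applied
    input_sha256: str    # hash of the bytes before the transform
    output_sha256: str   # hash of the bytes after the transform
    applied_at: str      # UTC timestamp, ISO 8601
    operator: str        # who or what ran the transform


def apply_logged(data: bytes, name: str, fn: Callable[[bytes], bytes],
                 operator: str, log: List[TransformRecord]) -> bytes:
    """Apply fn to data and append a custody record to log."""
    out = fn(data)
    log.append(TransformRecord(
        transform=name,
        input_sha256=hashlib.sha256(data).hexdigest(),
        output_sha256=hashlib.sha256(out).hexdigest(),
        applied_at=datetime.now(timezone.utc).isoformat(),
        operator=operator,
    ))
    return out


def chain_is_intact(log: List[TransformRecord], raw: bytes, final: bytes) -> bool:
    """Check that the log links the raw evidence to the final output."""
    if not log:
        return raw == final
    if log[0].input_sha256 != hashlib.sha256(raw).hexdigest():
        return False
    if log[-1].output_sha256 != hashlib.sha256(final).hexdigest():
        return False
    # Each step must start from exactly what the previous step produced.
    return all(a.output_sha256 == b.input_sha256 for a, b in zip(log, log[1:]))
```

The raw bytes are never modified in place; each step yields a new value, so the preserved original can always be re-hashed against the first record.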

2.5 — Amplification Risk

Publishing IOCs publicly amplifies awareness — which is good for defenders but bad if:

The IOC includes a compromised legitimate site (you damage the site owner's reputation)
The IOC is for a piece of infrastructure that's about to be used in a sting operation by LE
The IOC reveals an investigation technique still under embargo

A publish-readiness review belongs in Section 1 of the doc, not in the closing checklist.

2.6 — Failure Modes / Retries

What happens when:

URLhaus rejects a submission (malformed, low-confidence flag, duplicate)?
MISP is down for maintenance?
Cloudflare returns 503?
Your submitter identity gets rate-limited?
An API token is revoked mid-batch?

The doc has no resilience layer. Recommend:

Idempotent submission with client-generated IDs (so retries don't double-submit)
Per-destination retry policy (exponential backoff with jitter)
Dead-letter queue for permanent failures — surface in human-review UI
Per-submitter quota tracking, with auto-failover to backup submitter if available
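
A minimal sketch of the first three recommendations, assuming each adapter exposes a send(payload, idempotency_key=...) callable and signals errors with the two hypothetical exception classes below. The "full jitter" variant of exponential backoff is used here as one reasonable choice.

```python
import random
import time
import uuid


class TransientFailure(Exception):
    """Retryable: 503, timeout, rate limit."""


class PermanentFailure(Exception):
    """Not retryable: malformed, rejected, revoked token."""


def backoff_delays(attempts, base=1.0, cap=60.0, rng=random.random):
    """Full-jitter delays: rng() scaled by min(cap, base * 2**n)."""
    for n in range(attempts):
        yield rng() * min(cap, base * 2 ** n)


def submit_with_retry(send, payload, attempts=5, sleep=time.sleep, dead_letters=None):
    """Retry transient failures; park everything else in the dead-letter list."""
    key = str(uuid.uuid4())  # generated once, reused on every retry
    for attempt, delay in enumerate(backoff_delays(attempts), start=1):
        try:
            return send(payload, idempotency_key=key)
        except PermanentFailure:
            break
        except TransientFailure:
            if attempt < attempts:
                sleep(delay)
    if dead_letters is not None:
        dead_letters.append({"idempotency_key": key, "payload": payload})
    return None
```

Because the key is fixed before the first attempt, a destination that honors idempotency keys sees a retry as the same submission rather than a duplicate.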

2.7 — Versioning and Maintenance

The doc has no version number, no changelog, no maintainer field, no review cadence. A living spec like this should carry front matter along these lines:

---
schema_version: 1.0
last_reviewed: 2026-05-13
next_review_due: 2026-08-13
maintainer: <project lead>
changelog:
  - 2026-05-13: initial draft
---

API surfaces of these platforms change (Cloudflare deprecations, VT pricing changes, abuse.ch tag taxonomy updates). A quarterly re-validation cadence is sane.

2.8 — Multi-Language Submissions

Many national CERTs prefer or require local language for narrative fields (BSI German, ANSSI French, CCN-CERT Spanish). The submission object's language field (added above) plus a translation step in the routing engine handles this. Currently absent.

3. Missing Categories Entirely

3.1 — Hosting Provider Abuse Channels (most lack true REST APIs)

Provider	Channel	API?
AWS	abuse@amazonaws.com + form	No public REST; AWS responds to email
Google Cloud	https://support.google.com/cloud/answer/2417620	Form-only
Azure	https://msrc.microsoft.com/report/abuse	Form + email
DigitalOcean	abuse@digitalocean.com	Email + status REST
Hetzner	abuse@hetzner.com + form	Form
OVH	abuse@ovh.net + Anti-abuse REST API	Yes
Linode (Akamai)	abuse@linode.com	Email
Vultr	abuse@vultr.com	Email

Treat email-based providers as a different submission class (template + GPG-signed email, with parsed-receipt detection). Worth a dedicated section in the doc.

3.2 — Cryptocurrency / Sanctions

Chainalysis Reactor — commercial, gold standard for on-chain investigations
TRM Labs — commercial alternative
CipherTrace (Mastercard) — commercial
OFAC SDN reporting — for US-sanctioned wallets
EU Financial Sanctions Files (FSF) — for EU sanctions
National FIUs — Financial Intelligence Units, country-specific
Free / open: GraphSense (open-source on-chain analytics), Etherscan (manual)

3.3 — Mobile / App Store

Google Play Protect submissions (for Android malware)
Apple App Review report (for malicious iOS apps)
APKMirror reports (for repackaged apps)
F-Droid security contacts (for compromised FOSS apps)

3.4 — Open-Source Supply Chain

PyPI: security@pypi.org
npm: security@npmjs.com + automatic revocation of leaked tokens
crates.io: help@crates.io
RubyGems: security@rubygems.org
Maven Central: central@sonatype.org
GitHub Security Lab (research collaboration)
OpenSSF Vulnerability Disclosure (cross-ecosystem coordination)
Sigstore (provenance verification, longer-term)

3.5 — Certificate Authorities

Let's Encrypt: ACME revocation API for ACME-issued certs
Sectigo / DigiCert / GlobalSign / Entrust: abuse contacts in CA/Browser Forum compliance docs
CT log monitors for detection (crt.sh, Censys CT, Google CT)

3.6 — Tor / Dark Web

Limited takedown leverage for .onion services, but worth documenting:

Document via Tor Project's abuse handling page (limited leverage)
Contribute IOCs to DarkOwl, Recorded Future, Flashpoint (commercial dark-web monitoring) if you have access
Push to MISP with tor tag for community awareness

3.7 — CSAM (Legally Separate Pathway)

If CSAM is encountered during collection, stop processing immediately. CSAM has separate legal handling rules:

NCMEC CyberTipline (US)
IWF (Internet Watch Foundation) (UK)
INHOPE (international hotline network)
Possessing CSAM is illegal even for research; do not attempt to verify, document, or share. Report and delete from your systems under documented legal hold.

This deserves a Section 12 with a hard stop: "if encountered, halt and report via the channels below; do not include in any other submission flow."

4. Missing Platforms Worth Adding (Quick List)

Free / Open

AlienVault OTX (huge omission)
ThreatFox
YARAify
CIRCL Hashlookup
CIRCL Passive DNS / Passive SSL
Maltrail feeds
crt.sh / Censys CT
GreyNoise community tier
Spamhaus DNSBL queries
PhishStats

Commercial / Paid (worth listing for completeness)

Recorded Future
Mandiant Advantage (now Google Threat Intelligence)
CrowdStrike Falcon Intelligence
Sekoia.io
Flashpoint
DomainTools (passive DNS / WHOIS history)
RiskIQ (now Microsoft Defender Threat Intelligence)
Anomali ThreatStream

Intelligence Communities (membership-based)

FIRST.org (CSIRT global community)
Trusted Introducer (European CSIRT trust framework)
M3AAWG (Messaging, Malware, Mobile Anti-Abuse Working Group)
APWG (Anti-Phishing Working Group)
Cyber Threat Alliance (commercial CTI sharing)
ENISA CSIRTs Network

5. Implementation Priorities for Blue48

In our agent stack, this doc translates to concrete work:

5.1 — Block G additions (when we get there)

report_writer agent outputs the v2 normalized object (Section 1.9.1 above) as canonical format
New routing_engine component (extension of report_writer, or a 7th agent) — consumes the object, applies routing matrix, fans out via API adapters
Adapter priority order for blue48 v1.0:
1. MISP (PyMISP)
2. AlienVault OTX (REST)
3. AbuseIPDB (REST + category mapping)
4. URLhaus + MalwareBazaar + ThreatFox (shared abuse.ch auth-key)
5. urlscan.io (REST, with private-by-default visibility)
6. Cloudflare Abuse Reports
7. GPG-signed email to BSI / CERT-Bund (since the user is in DE)

5.2 — Schema work

config/submission_schema.json — JSON Schema for the v2 normalized object
config/routing_matrix.yaml — declarative rules: evidence type → destinations, with TLP ceilings and quotas
core/sanitize.py — pre-submission scrubbing per destination policy
core/audit.py — append-only log of every submission, signed
core/tlp.py — TLP 2.0 enforcement

5.3 — Pre-submission gates (before any adapter fires)

1. Schema valid?
2. TLP <= destination ceiling?
3. Sanitization complete?
4. GreyNoise check passes (for IPs)?
5. Quota available?
6. Submitter identity registered with destination?
7. Object signed?
8. Audit row written?
9. Human approver yes (for non-auto tier)?

If any fail → drop into human-review queue with the reason. Never silently skip.
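
The gate sequence above can be sketched as an ordered list of checks that stops at the first failure and records the reason. All names here are hypothetical, and the two example gates are simplified stand-ins for the real schema and quota checks.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple

# A check takes (case, destination) and returns (ok, reason_if_failed).
Check = Callable[[dict, dict], Tuple[bool, str]]


class GateResult(NamedTuple):
    passed: bool
    failed_gate: Optional[str] = None
    reason: Optional[str] = None


def run_gates(case: dict, destination: dict, gates: List[Tuple[str, Check]]) -> GateResult:
    """Run gates in order and stop at the first failure."""
    for name, check in gates:
        ok, reason = check(case, destination)
        if not ok:
            return GateResult(False, name, reason)
    return GateResult(True)


def dispatch(case, destination, gates, send, review_queue) -> GateResult:
    """Fire the adapter only if every gate passes; otherwise queue for review."""
    result = run_gates(case, destination, gates)
    if result.passed:
        send(case, destination)
    else:
        # Never silently skip: keep which gate failed and why.
        review_queue.append({
            "case_id": case.get("case_id"),
            "destination": destination.get("name"),
            "gate": result.failed_gate,
            "reason": result.reason,
        })
    return result


def schema_valid(case, destination):
    missing = [k for k in ("case_id", "tlp") if k not in case]
    return (not missing, f"missing fields: {missing}")


def quota_available(case, destination):
    return (destination.get("quota_remaining", 0) > 0, "rate-limit budget exhausted")
```

Keeping the gates as data (a list of name and check pairs) lets the order in 5.3 be changed or extended without touching the dispatcher.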

5.4 — Failure / retry layer

Per-destination idempotency keys (client-generated)
Exponential backoff with jitter
Dead-letter queue for permanent failures, surfaced in data/dlq/
Per-submitter quota tracking with auto-failover

6. Summary of v2 Findings

Category	Count	Action
Section-by-section corrections	38	Fold into the draft
New cross-cutting sections needed	8	Add as Sections 11–18
Missing platform categories	7	Each warrants a sub-section
Missing free/open platforms (Tier 1)	5	Add to Section 2
Schema field gaps	17	Adopt v2 schema above
Pre-submission gates not defined	9	Add as closing checklist

After folding these in, the document becomes a publishable internal whitepaper and a complete spec for the blue48 routing engine. The first draft was a confident outline; the v2 turns it into a working manual.
