m17hr1l e04c6c96d8 init: scaffold psyc — defensive CTI routing & evidence-sealing platform

Stage-1 vertical slice: Pydantic Case model, SQLAlchemy Core persistence,
URLhaus Scoutline fetcher, FastAPI/Jinja cockpit (cases list + detail),
flat Typer CLI, Result[T, E] type module, structlog config.
Architecture in docs/dossier.md; 12-fold style guide in docs/style.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-14 12:43:47 +02:00

Blue48 Worker Mesh Architecture

Document type: Project record / technical architecture
Scope: Worker names, responsibilities, interfaces, data flow, human review boundaries
Status: Draft v1

1. Purpose

Blue48 should not rely on one large, expensive, opaque model to perform all cyber-intelligence operations. The platform should be built as a mesh of small, specialized workers.

Each worker performs one narrow function, writes structured output, and passes a normalized case object to the next stage. Heavy models are reserved for judgment-heavy tasks such as confidence scoring, routing explanations, public report drafting, and training-example generation.

Core principle:

Small workers produce traceable outputs. Humans approve sensitive decisions. The Ledger proves what happened.

2. High-Level Flow

Scoutline
→ Proofline
→ Mapline
→ Classifyline
→ Sealine
→ Routeline
→ Ledgerline
→ Publishline
→ Trainline

Operator version:

Detect → Validate → Map → Classify → Seal Evidence → Route → Submit → Track → Archive → Learn

3. Worker Lines

Line	Purpose
Scoutline	Finds, fetches, parses, and deduplicates lawful intelligence sources.
Proofline	Validates claims, checks indicators, measures freshness, and scores confidence.
Mapline	Resolves victims, actors, sectors, jurisdictions, CERT routes, and affected products.
Classifyline	Assigns severity, TLP, incident type, and operational class.
Sealine	Packages evidence, encrypts it for authorized recipients, and destroys local plaintext/key material when policy allows.
Routeline	Selects destinations, builds payloads, enforces destination policy, and submits reports.
Ledgerline	Records immutable audit events, receipts, outcomes, and follow-up status.
Publishline	Produces sanitized public intelligence only after mitigation and approval.
Trainline	Converts lawful, reviewed intelligence into LoRA-ready training data.

4. Core Worker Set

The first conceptual worker set is:

Scout → Verifier → Mapper → Classifier → Sealer → Router → Courier → Ledger

Support workers:

Watcher → Archivist → Publisher

Operational sentence:

Scout detects.
Verifier confirms.
Mapper identifies.
Classifier prioritizes.
Sealer protects.
Router decides.
Courier submits.
Ledger proves.
Watcher follows up.
Archivist forgets safely.
Publisher informs.

5. Granular Worker Breakdown

5.1 Scoutline

Worker	Job	Model requirement
SourcePlanner	Maintains the approved source list, collection schedules, and source eligibility.	None / rules
Crawler	Discovers new pages, feeds, advisories, reports, APIs, and datasets.	None
Fetcher	Downloads pages, PDFs, JSON, RSS, STIX/TAXII, MISP events, and API responses.	None
Parser	Extracts title, date, author, body, tables, indicators, and metadata.	Rules / small model
Deduper	Detects duplicate reports, reposted IOCs, syndicated articles, and repeated claims.	Embeddings / rules
SourceRanker	Scores the source based on trust, history, origin, and license status.	Rules / small model
Signalizer	Converts parsed content into candidate intelligence signals.	Small/medium model

Output:

{
  "signal_id": "uuid",
  "source_type": "advisory | cti_report | abuse_feed | ransomware_monitor | public_blog | misp_event",
  "summary": "short defensive summary",
  "observed_at": "2026-05-13T00:00:00Z",
  "raw_evidence_location": "internal-only-reference"
}
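
The signal object above could be produced along the following lines. This is a minimal sketch, not the project's implementation: it assumes a rule-based duplicate key (a SHA-256 over normalized text) stands in for the "Embeddings / rules" Deduper, and the names `content_key`, `make_signal`, and `seen` are hypothetical.

```python
import hashlib
import uuid
from typing import Optional

# Source types copied from the signal object above.
SOURCE_TYPES = {
    "advisory", "cti_report", "abuse_feed",
    "ransomware_monitor", "public_blog", "misp_event",
}

def content_key(text: str) -> str:
    """Hypothetical duplicate key: SHA-256 over case- and whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def make_signal(source_type: str, summary: str, observed_at: str,
                evidence_ref: str, seen: set) -> Optional[dict]:
    """Return a signal dict, or None when the same summary was already seen."""
    if source_type not in SOURCE_TYPES:
        raise ValueError(f"unknown source_type: {source_type}")
    key = content_key(summary)
    if key in seen:
        return None
    seen.add(key)
    return {
        "signal_id": str(uuid.uuid4()),
        "source_type": source_type,
        "summary": summary,
        "observed_at": observed_at,
        "raw_evidence_location": evidence_ref,
    }
```

An exact-hash key only catches reposts that differ in case or spacing; syndicated or paraphrased reports would still need the embedding-based comparison the table mentions.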

5.2 Proofline

Worker	Job
Correlator	Checks whether the same signal appears across multiple independent sources.
IOCChecker	Validates domains, IPs, hashes, URLs, wallet addresses, emails, and CVEs.
FreshnessChecker	Determines whether the signal is current, stale, repeated, or resurfaced.
ClaimChecker	Labels language as confirmed, claimed, observed, rumored, or speculative.
ConfidenceScorer	Produces final confidence and optional Admiralty Code values.

Output:

{
  "confidence": "low | medium | high",
  "source_reliability": "A | B | C | D | E | F | unknown",
  "information_credibility": "1 | 2 | 3 | 4 | 5 | 6 | unknown",
  "claim_status": "confirmed | claimed | observed | rumored | speculative",
  "freshness": "new | recent | stale | resurfaced"
}
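
Because downstream workers depend on these enumerated values, a small validator at the Proofline boundary may be useful. The sketch below only checks the fields and values listed in the output object above; the function name `validate_proof` is hypothetical.

```python
# Allowed values copied from the Proofline output object above.
ALLOWED = {
    "confidence": {"low", "medium", "high"},
    "source_reliability": {"A", "B", "C", "D", "E", "F", "unknown"},
    "information_credibility": {"1", "2", "3", "4", "5", "6", "unknown"},
    "claim_status": {"confirmed", "claimed", "observed", "rumored", "speculative"},
    "freshness": {"new", "recent", "stale", "resurfaced"},
}

def validate_proof(proof: dict) -> list:
    """Return a list of problems; an empty list means the object is well-formed."""
    problems = []
    for field, allowed in ALLOWED.items():
        if field not in proof:
            problems.append(f"missing field: {field}")
        elif proof[field] not in allowed:
            problems.append(f"invalid {field}: {proof[field]!r}")
    return problems
```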

5.3 Mapline

Worker	Job
EntityResolver	Maps organization names, domains, subsidiaries, brands, and aliases.
GeoResolver	Maps victim country, jurisdiction, national CERT, and cross-border implications.
SectorMapper	Maps victim sector and critical-infrastructure status.
ActorMapper	Maps actor names, aliases, ransomware brands, campaigns, and confidence.
CVEResolver	Maps vulnerabilities to CVEs, affected products, KEV status, and exploit relevance.

Output:

{
  "victim": {
    "name": "",
    "domain": "",
    "country": "",
    "sector": "",
    "critical_infrastructure": false
  },
  "actor": {
    "name": "",
    "aliases": [],
    "campaign": "",
    "confidence": "low | medium | high"
  },
  "jurisdiction": {
    "primary_cert": "",
    "law_enforcement_route": "",
    "sector_isac": ""
  }
}

5.4 Classifyline

Worker	Job
Classifier	Assigns incident type, severity, internal class, and response SLA.
TLPGuard	Ensures TLP-marked data is never routed to a destination that is not cleared to receive that TLP level.
DestinationPolicyGuard	Blocks inappropriate, illegal, excessive, or sensitive submissions.

Internal class mapping:

Internal class	Meaning	External severity
A	Imminent harm or attack likely underway	Critical
B	Credible planned attack	High
C	Confirmed exposure	High / Medium
D	Campaign intelligence	Medium / High
E	Weak signal or watchlist item	Low / Monitor

Output:

{
  "class": "A | B | C | D | E",
  "severity": "low | medium | high | critical",
  "tlp": "RED | AMBER | GREEN | CLEAR",
  "incident_type": "ransomware | credential_leak | access_sale | phishing | malware | exploit | botnet | data_leak",
  "policy_blocks": []
}
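
The class-to-severity table can be enforced as a consistency check on the Classifyline output. This is a sketch under one stated assumption: "Low / Monitor" is reduced to `low`, because the `severity` field above has no `monitor` value. The names `CLASS_SEVERITY` and `severity_matches_class` are hypothetical.

```python
# Allowed external severities per internal class, read off the mapping table.
# Assumption: "Low / Monitor" maps to "low" only, since the output enum
# has no "monitor" value.
CLASS_SEVERITY = {
    "A": {"critical"},
    "B": {"high"},
    "C": {"high", "medium"},
    "D": {"medium", "high"},
    "E": {"low"},
}

def severity_matches_class(internal_class: str, severity: str) -> bool:
    """True when the severity is one the table permits for this internal class."""
    return severity in CLASS_SEVERITY.get(internal_class, set())
```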

5.5 Sealine

Sealine replaces the old primary concept of “sanitization.” The objective is not to destroy useful evidence, but to protect it.

Worker	Job
EvidencePackager	Collects sensitive evidence, hashes it, and packages it with metadata.
Sealer	Encrypts evidence for authorized recipients using public-key or hybrid encryption.
KeyBurner	Destroys local unwrapped evidence keys after successful sealing.
RetentionGuard	Enforces retention, deletion, plaintext destruction, and crypto-erasure policy.

Sealine principle:

Preserve the truth. Seal the sensitive evidence. Route only what each recipient is authorized to receive.

Output:

{
  "sealed_evidence": {
    "package_id": "uuid",
    "encryption": "age | PGP | CMS | hybrid",
    "recipient_keys": [
      {
        "recipient": "CERT-Bund",
        "key_id": "authority-key-id",
        "wrapped_key": "encrypted-evidence-key"
      }
    ],
    "payload_hash": "sha256",
    "plaintext_destroyed": true,
    "local_unwrapped_key_destroyed": true
  }
}
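
A minimal sketch of the EvidencePackager step follows. It covers only hashing and metadata; encryption and key wrapping are left to an external tool (age, PGP, CMS) and are not shown. Which bytes are hashed (plaintext or sealed payload) is a policy choice this document does not specify, so the function simply hashes whatever bytes it is given. The name `package_evidence` is hypothetical.

```python
import hashlib
import uuid

ENCRYPTION_SCHEMES = {"age", "PGP", "CMS", "hybrid"}

def package_evidence(payload: bytes, encryption: str, recipients: list) -> dict:
    """Build the sealed_evidence metadata for an already-prepared payload."""
    if encryption not in ENCRYPTION_SCHEMES:
        raise ValueError(f"unsupported encryption scheme: {encryption}")
    return {
        "sealed_evidence": {
            "package_id": str(uuid.uuid4()),
            "encryption": encryption,
            "recipient_keys": recipients,
            "payload_hash": hashlib.sha256(payload).hexdigest(),
            # Both flags start as False; in this sketch they are flipped only
            # after KeyBurner and RetentionGuard report success.
            "plaintext_destroyed": False,
            "local_unwrapped_key_destroyed": False,
        }
    }
```

Starting the destruction flags at `False` keeps the record honest: the package never claims plaintext or keys are gone before the destructive step has actually been approved and carried out.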

5.6 Routeline

Worker	Job
RoutePlanner	Chooses destination order based on victim, country, sector, severity, TLP, and evidence type.
PayloadBuilder	Builds destination-specific payloads: sealed package, STIX bundle, MISP event, abuse report, or public-safe extract.
Redactor	Minimizes public/semi-public outputs only. Redactor does not replace Sealer.
Courier	Submits through API, portal, structured email, or secure upload.
RateLimiter	Enforces destination quotas, retries, and backoff.
ReceiptCollector	Captures case IDs, acknowledgements, API responses, and status URLs.

Example route object:

{
  "routes": [
    {
      "destination": "CERT-Bund",
      "type": "authority",
      "payload": "sealed_evidence_package",
      "priority": 1,
      "max_tlp_allowed": "RED"
    },
    {
      "destination": "MISP trusted community",
      "type": "cti_sharing",
      "payload": "stix_indicators",
      "priority": 2,
      "max_tlp_allowed": "AMBER"
    },
    {
      "destination": "Cloudflare Abuse API",
      "type": "provider_abuse",
      "payload": "minimized_abuse_report",
      "priority": 3,
      "max_tlp_allowed": "CLEAR"
    }
  ]
}
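
The route object lends itself to a rules-first planner: drop destinations whose `max_tlp_allowed` is below the case TLP, then order the rest by `priority`. The sketch below assumes TLP labels can be treated as a simple linear ranking, which simplifies the real sharing rules (TLP:RED, for instance, is restricted to named recipients). The names `TLP_RANK` and `plan_routes` are hypothetical.

```python
# Simplifying assumption: TLP labels form a linear order.
TLP_RANK = {"CLEAR": 0, "GREEN": 1, "AMBER": 2, "RED": 3}

def plan_routes(case_tlp: str, routes: list) -> tuple:
    """Return (allowed routes sorted by priority, policy block messages)."""
    allowed, blocks = [], []
    for route in routes:
        if TLP_RANK[case_tlp] <= TLP_RANK[route["max_tlp_allowed"]]:
            allowed.append(route)
        else:
            blocks.append(
                f"{route['destination']}: TLP:{case_tlp} exceeds "
                f"TLP:{route['max_tlp_allowed']}"
            )
    allowed.sort(key=lambda r: r["priority"])
    return allowed, blocks

# The three destinations from the example route object, in arbitrary order.
ROUTES = [
    {"destination": "Cloudflare Abuse API", "priority": 3, "max_tlp_allowed": "CLEAR"},
    {"destination": "CERT-Bund", "priority": 1, "max_tlp_allowed": "RED"},
    {"destination": "MISP trusted community", "priority": 2, "max_tlp_allowed": "AMBER"},
]
```

For a TLP:AMBER case, `plan_routes("AMBER", ROUTES)` keeps CERT-Bund and the MISP community in priority order and reports the abuse API as blocked; the block messages can feed the `policy_blocks` field of the Classifyline output.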

5.7 Ledgerline

Worker	Job
Ledger	Creates immutable audit records for all external submissions and destructive actions.
Watcher	Polls outcomes: takedown status, MISP sightings, CERT acknowledgement, provider response.
Archivist	Handles retention, sealed package lifecycle, legal holds, and crypto-erasure confirmation.

Ledger record:

{
  "timestamp": "2026-05-13T00:00:00Z",
  "case_id": "B48-2026-000001",
  "destination": "CERT-Bund",
  "payload_hash": "sha256",
  "submitter_identity": "blue48-official-handle",
  "tlp": "AMBER",
  "response_id": "external-case-id",
  "outcome": "submitted | acknowledged | rejected | actioned"
}
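
One way to make ledger records tamper-evident is to chain them by hash, so that changing any earlier record invalidates every later entry. The document only requires immutable audit records; hash chaining is an assumption of this sketch, not a stated design decision, and `append_record`, `verify`, `prev_hash`, and `entry_hash` are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first entry

def _digest(prev_hash: str, record: dict) -> str:
    """Hash the previous entry hash together with a canonical JSON encoding."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def append_record(ledger: list, record: dict) -> dict:
    """Append a record, linking it to the hash of the previous entry."""
    prev_hash = ledger[-1]["entry_hash"] if ledger else GENESIS
    entry = {
        "record": record,
        "prev_hash": prev_hash,
        "entry_hash": _digest(prev_hash, record),
    }
    ledger.append(entry)
    return entry

def verify(ledger: list) -> bool:
    """Recompute the chain; False if any record or link was altered."""
    prev_hash = GENESIS
    for entry in ledger:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["entry_hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["entry_hash"]
    return True
```

In a PostgreSQL deployment the same idea would sit on top of an append-only table; the chain detects edits, while database permissions prevent them.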

5.8 Publishline

Worker	Job
Publisher	Produces public-safe intelligence reports after mitigation and approval.

Publisher may include:

sector trend
actor trend
CVEs
TTPs
defensive recommendations
sanitized IOCs
non-sensitive timelines

Publisher must not include:

raw credentials
stolen data
victim secrets
live access details
exact criminal-source links
unmitigated exploit paths

6. Which Workers Need Models?

Worker	Model need
SourcePlanner	None / rules
Crawler / Fetcher	None
Parser	Rules / small model
Deduper	Embeddings / rules
Signalizer	Small or medium model
ClaimChecker	Small or medium model
ConfidenceScorer	Medium model
EntityResolver	Rules + embeddings
ActorMapper	Small or medium model
Classifier	Small or medium model
RoutePlanner	Rules first, model second
PayloadBuilder	Small model
Publisher	Medium or large model
ExampleBuilder	Medium model
QualityGate	Medium model + rules

Heavy models should be reserved for:

ConfidenceScorer
Classifier
Publisher
ExampleBuilder
QualityGate

7. Human Review Boundaries

Human approval is required before:

sending sealed evidence to any external destination
contacting law enforcement or CERTs with sensitive evidence
publishing a public advisory
destroying plaintext evidence
destroying local unwrapped evidence keys
exporting a training dataset
modifying routing policy
modifying recipient keys

Two-person control should be required for:

sending TLP:RED or highly sensitive packages
deleting evidence
changing authority recipient keys
publishing named-victim reports
exporting training data based on internal cases
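
These boundaries can be expressed as a small approval gate that workers consult before acting. The sketch below is illustrative only: the action identifiers are hypothetical labels for the items in the two lists above, and treating unlisted actions as needing no approval is an assumption (a stricter deployment might fail closed instead).

```python
# Hypothetical action identifiers for the "human approval" list.
SINGLE_APPROVAL = {
    "send_sealed_evidence", "contact_authority_with_evidence",
    "publish_advisory", "destroy_plaintext", "destroy_local_keys",
    "export_training_dataset", "modify_routing_policy", "modify_recipient_keys",
}

# Hypothetical action identifiers for the "two-person control" list.
TWO_PERSON = {
    "send_tlp_red_package", "delete_evidence",
    "change_authority_recipient_keys", "publish_named_victim_report",
    "export_internal_case_training_data",
}

def approvals_required(action: str) -> int:
    """Number of distinct human approvers an action needs."""
    if action in TWO_PERSON:
        return 2
    if action in SINGLE_APPROVAL:
        return 1
    return 0  # assumption: unlisted actions need no approval

def is_authorized(action: str, approvers) -> bool:
    """True when enough distinct approvers have signed off on the action."""
    return len(set(approvers)) >= approvals_required(action)
```

Counting distinct approver identities, rather than approval events, is what keeps one operator from satisfying two-person control by approving twice.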

8. MVP Worker Build Order

Initial worker implementation priority:

SourcePlanner
Fetcher
Parser
Deduper
Signalizer
IOCChecker
EntityResolver
GeoResolver
Classifier
EvidencePackager
Sealer
RoutePlanner
Courier
Ledger
ReceiptCollector
IntelMiner

Minimum operational chain:

Fetcher → Parser → Signalizer → IOCChecker → EntityResolver → Classifier → Sealer → RoutePlanner → Courier → Ledger
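
The minimum chain can be sketched as a list of functions that each take and return the normalized case object. Everything below is a placeholder: the `stub` workers stand in for real implementations, and `run_chain`, `stub`, and the `trace` field are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List, Tuple

Case = Dict[str, object]
Worker = Callable[[Case], Case]

def run_chain(case: Case, workers: List[Tuple[str, Worker]]) -> Case:
    """Pass the case through each worker in order and record which stages ran."""
    trace = []
    for name, worker in workers:
        # Shallow copy so a worker does not mutate the caller's top-level dict.
        case = worker(dict(case))
        trace.append(name)
    case["trace"] = trace
    return case

def stub(field: str, value: object) -> Worker:
    """Build a placeholder worker that sets a single field on the case."""
    def worker(case: Case) -> Case:
        case[field] = value
        return case
    return worker

# Stage names follow the minimum operational chain; fields and values are placeholders.
CHAIN = [
    ("Fetcher", stub("raw", "fetched-bytes")),
    ("Parser", stub("parsed", True)),
    ("Signalizer", stub("signal_id", "uuid")),
    ("IOCChecker", stub("iocs_valid", True)),
    ("EntityResolver", stub("victim", {"name": ""})),
    ("Classifier", stub("class", "E")),
    ("Sealer", stub("sealed", True)),
    ("RoutePlanner", stub("routes", [])),
    ("Courier", stub("submitted", False)),
    ("Ledger", stub("ledger_written", True)),
]
```

Calling `run_chain({"case_id": "B48-2026-000001"}, CHAIN)` returns the case with one field per stage plus a `trace` list naming the ten stages in order. In a queue-based runtime each tuple would become a task, but the contract (case in, case out) stays the same.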

9. Technical Notes

Recommended implementation style:

Component	Recommendation
Worker runtime	Python services, Celery, Temporal, Prefect, or lightweight queue workers
Message format	JSON normalized case object
Interop format	STIX 2.1 where useful
Storage	PostgreSQL + object storage
Search	OpenSearch or Meilisearch
CTI graph	OpenCTI or MISP integration
Audit	append-only ledger table
Secrets	`.env`, secret manager, runtime injection only
UI	Blue48 Operations Cockpit

10. Summary

Blue48 should operate as a worker mesh, not a monolithic AI agent.

The system should use small deterministic workers where possible, small models where useful, and larger models only for judgment-heavy steps. Sensitive evidence is handled by Sealine, not casually rendered or distributed. Routing and public reporting are controlled by policy guards, human review, and immutable audit logging.
