# Blue48 Worker Mesh Architecture

**Document type:** Project record / technical architecture
**Scope:** Worker names, responsibilities, interfaces, data flow, human review boundaries
**Status:** Draft v1

---

## 1. Purpose

Blue48 should not rely on one large, expensive, opaque model to perform all cyber-intelligence operations. Instead, the platform should be built as a mesh of small, specialized workers.

Each worker performs one narrow function, writes structured output, and passes a normalized case object to the next stage. Heavy models are reserved for judgment-heavy tasks such as confidence scoring, routing explanations, public report drafting, and training-example generation.

Core principle:

> Small workers produce traceable outputs. Humans approve sensitive decisions. The Ledger proves what happened.
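
The hand-off described above can be sketched in a few lines of Python. This is a minimal illustration, not the Blue48 implementation: the `Case` dataclass, the `traced` wrapper, and the two toy workers are hypothetical names introduced here to show one narrow function per worker plus a trace of who touched the case.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Case:
    """Hypothetical normalized case object passed from worker to worker."""
    case_id: str
    data: dict = field(default_factory=dict)
    trace: list = field(default_factory=list)  # worker names, in execution order


Worker = Callable[[Case], Case]


def traced(name: str, fn: Callable[[dict], dict]) -> Worker:
    """Wrap a narrow function so its output is merged into the case and recorded."""
    def run(case: Case) -> Case:
        case.data.update(fn(case.data))
        case.trace.append(name)
        return case
    return run


def run_mesh(case: Case, workers: list[Worker]) -> Case:
    """Pass the case through each worker in order."""
    for worker in workers:
        case = worker(case)
    return case


# Toy stand-ins for Parser and Classifier (illustrative logic only).
parser = traced("Parser", lambda d: {"title": d["raw"].strip().title()})
classifier = traced(
    "Classifier",
    lambda d: {"severity": "high" if "ransomware" in d["title"].lower() else "low"},
)

result = run_mesh(Case("B48-2026-000001", {"raw": "  ransomware claim  "}), [parser, classifier])
print(result.trace)             # ['Parser', 'Classifier']
print(result.data["severity"])  # high
```

Keeping the trace on the case object itself is one simple way to make each output attributable to a worker; a production system would more likely emit these events to the Ledger.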

---

## 2. High-Level Flow

```text
Scoutline
→ Proofline
→ Mapline
→ Classifyline
→ Sealine
→ Routeline
→ Ledgerline
→ Publishline
→ Trainline
```

Operator version:

```text
Detect → Validate → Map → Classify → Seal Evidence → Route → Submit → Track → Archive → Learn
```

---

## 3. Worker Lines

| Line | Purpose |
|---|---|
| **Scoutline** | Finds, fetches, parses, and deduplicates lawful intelligence sources. |
| **Proofline** | Validates claims, checks indicators, measures freshness, and scores confidence. |
| **Mapline** | Resolves victims, actors, sectors, jurisdictions, CERT routes, and affected products. |
| **Classifyline** | Assigns severity, TLP, incident type, and operational class. |
| **Sealine** | Packages evidence, encrypts it for authorized recipients, and destroys local plaintext/key material when policy allows. |
| **Routeline** | Selects destinations, builds payloads, enforces destination policy, and submits reports. |
| **Ledgerline** | Records immutable audit events, receipts, outcomes, and follow-up status. |
| **Publishline** | Produces sanitized public intelligence only after mitigation and approval. |
| **Trainline** | Converts lawful, reviewed intelligence into LoRA-ready training data. |

---

## 4. Core Worker Set

The first conceptual worker set is:

```text
Scout → Verifier → Mapper → Classifier → Sealer → Router → Courier → Ledger
```

Support workers:

```text
Watcher → Archivist → Publisher
```

Operational sentence:

```text
Scout detects.
Verifier confirms.
Mapper identifies.
Classifier prioritizes.
Sealer protects.
Router decides.
Courier submits.
Ledger proves.
Watcher follows up.
Archivist forgets safely.
Publisher informs.
```

---

## 5. Granular Worker Breakdown

### 5.1 Scoutline

| Worker | Job | Model requirement |
|---|---|---|
| **SourcePlanner** | Maintains the approved source list, collection schedules, and source eligibility. | None / rules |
| **Crawler** | Discovers new pages, feeds, advisories, reports, APIs, and datasets. | None |
| **Fetcher** | Downloads pages, PDFs, JSON, RSS, STIX/TAXII, MISP events, and API responses. | None |
| **Parser** | Extracts title, date, author, body, tables, indicators, and metadata. | Rules / small model |
| **Deduper** | Detects duplicate reports, reposted IOCs, syndicated articles, and repeated claims. | Embeddings / rules |
| **SourceRanker** | Scores the source based on trust, history, origin, and license status. | Rules / small model |
| **Signalizer** | Converts parsed content into candidate intelligence signals. | Small/medium model |

Output:

```json
{
  "signal_id": "uuid",
  "source_type": "advisory | cti_report | abuse_feed | ransomware_monitor | public_blog | misp_event",
  "summary": "short defensive summary",
  "observed_at": "2026-05-13T00:00:00Z",
  "raw_evidence_location": "internal-only-reference"
}
```
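
As an illustration of the rules-only side of Deduper, the sketch below drops signals whose summaries are identical after whitespace and case normalization. The `fingerprint` and `dedupe` helpers are assumptions made for this example; the embedding-based matching mentioned in the table is not shown.

```python
import hashlib
import re


def fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe(signals: list[dict]) -> list[dict]:
    """Keep the first signal seen for each summary fingerprint."""
    seen: set[str] = set()
    unique = []
    for signal in signals:
        key = fingerprint(signal["summary"])
        if key not in seen:
            seen.add(key)
            unique.append(signal)
    return unique


signals = [
    {"signal_id": "1", "summary": "Phishing kit targets bank customers"},
    {"signal_id": "2", "summary": "phishing  kit targets BANK customers "},
    {"signal_id": "3", "summary": "New advisory for VPN appliance"},
]
unique = dedupe(signals)
print([s["signal_id"] for s in unique])  # ['1', '3']
```

Exact-match fingerprints only catch verbatim reposts; paraphrased or syndicated copies would need the fuzzier comparison the table assigns to embeddings.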

---

### 5.2 Proofline

| Worker | Job |
|---|---|
| **Correlator** | Checks whether the same signal appears across multiple independent sources. |
| **IOCChecker** | Validates domains, IPs, hashes, URLs, wallet addresses, emails, and CVEs. |
| **FreshnessChecker** | Determines whether the signal is current, stale, repeated, or resurfaced. |
| **ClaimChecker** | Labels language as confirmed, claimed, observed, rumored, or speculative. |
| **ConfidenceScorer** | Produces final confidence and optional Admiralty Code values. |

Output:

```json
{
  "confidence": "low | medium | high",
  "source_reliability": "A | B | C | D | E | F | unknown",
  "information_credibility": "1 | 2 | 3 | 4 | 5 | 6 | unknown",
  "claim_status": "confirmed | claimed | observed | rumored | speculative",
  "freshness": "new | recent | stale | resurfaced"
}
```
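
One way ConfidenceScorer could fold the two Admiralty values into the `confidence` field is sketched below. The thresholds are assumptions chosen for illustration, not a Blue48 policy. In the Admiralty system, `F` and `6` mean the value cannot be judged, so this sketch treats them like `unknown`.

```python
# Rank for source reliability; lower is better. "F" and "unknown" are
# deliberately absent because they carry no usable judgment.
RELIABILITY_RANK = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
JUDGED_CREDIBILITY = {"1", "2", "3", "4", "5"}


def score_confidence(source_reliability: str, information_credibility: str) -> str:
    """Return low / medium / high from the weaker of the two Admiralty values."""
    rank = RELIABILITY_RANK.get(source_reliability)
    if rank is None or information_credibility not in JUDGED_CREDIBILITY:
        return "low"  # unjudged inputs never raise confidence
    worst = max(rank, int(information_credibility))
    if worst <= 2:      # assumed threshold for "high"
        return "high"
    if worst <= 3:      # assumed threshold for "medium"
        return "medium"
    return "low"


print(score_confidence("A", "2"))        # high
print(score_confidence("B", "3"))        # medium
print(score_confidence("A", "unknown"))  # low
```

Taking the weaker of the two values keeps a trusted source from lifting an implausible claim, and vice versa.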

---

### 5.3 Mapline

| Worker | Job |
|---|---|
| **EntityResolver** | Maps organization names, domains, subsidiaries, brands, and aliases. |
| **GeoResolver** | Maps victim country, jurisdiction, national CERT, and cross-border implications. |
| **SectorMapper** | Maps victim sector and critical-infrastructure status. |
| **ActorMapper** | Maps actor names, aliases, ransomware brands, campaigns, and confidence. |
| **CVEResolver** | Maps vulnerabilities to CVEs, affected products, KEV status, and exploit relevance. |

Output:

```json
{
  "victim": {
    "name": "",
    "domain": "",
    "country": "",
    "sector": "",
    "critical_infrastructure": false
  },
  "actor": {
    "name": "",
    "aliases": [],
    "campaign": "",
    "confidence": "low | medium | high"
  },
  "jurisdiction": {
    "primary_cert": "",
    "law_enforcement_route": "",
    "sector_isac": ""
  }
}
```

---

### 5.4 Classifyline

| Worker | Job |
|---|---|
| **Classifier** | Assigns incident type, severity, internal class, and response SLA. |
| **TLPGuard** | Ensures data is never routed to a destination that is not permitted to receive its TLP level. |
| **DestinationPolicyGuard** | Blocks inappropriate, illegal, excessive, or sensitive submissions. |

Internal class mapping:

| Internal class | Meaning | External severity |
|---|---|---|
| **A** | Imminent harm or attack likely underway | Critical |
| **B** | Credible planned attack | High |
| **C** | Confirmed exposure | High / Medium |
| **D** | Campaign intelligence | Medium / High |
| **E** | Weak signal or watchlist item | Low / Monitor |
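
The class mapping above can be enforced as a consistency check between the `class` and `severity` fields. The sketch below is an assumption about how such a check might look. In particular, it treats "Monitor" as a handling note and accepts only `low` for class E, because `monitor` is not one of the severity values in the output schema.

```python
# Severities permitted for each internal class, taken from the table above.
ALLOWED_SEVERITY = {
    "A": {"critical"},
    "B": {"high"},
    "C": {"high", "medium"},
    "D": {"medium", "high"},
    "E": {"low"},  # assumption: "Monitor" is not a severity value
}


def check_classification(internal_class: str, severity: str) -> list[str]:
    """Return a list of policy problems; an empty list means consistent."""
    allowed = ALLOWED_SEVERITY.get(internal_class)
    if allowed is None:
        return [f"unknown class: {internal_class}"]
    if severity not in allowed:
        return [f"severity '{severity}' not allowed for class {internal_class}"]
    return []


print(check_classification("A", "critical"))  # []
print(check_classification("E", "high"))
```

Returning a list rather than raising lets the result be appended directly to a field such as `policy_blocks`.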

Output:

```json
{
  "class": "A | B | C | D | E",
  "severity": "low | medium | high | critical",
  "tlp": "RED | AMBER | GREEN | CLEAR",
  "incident_type": "ransomware | credential_leak | access_sale | phishing | malware | exploit | botnet | data_leak",
  "policy_blocks": []
}
```

---

### 5.5 Sealine

Sealine replaces “sanitization” as the primary concept for handling sensitive evidence. The objective is to protect useful evidence, not to destroy it.

| Worker | Job |
|---|---|
| **EvidencePackager** | Collects sensitive evidence, hashes it, and packages it with metadata. |
| **Sealer** | Encrypts evidence for authorized recipients using public-key or hybrid encryption. |
| **KeyBurner** | Destroys local unwrapped evidence keys after successful sealing. |
| **RetentionGuard** | Enforces retention, deletion, plaintext destruction, and crypto-erasure policy. |

Sealine principle:

> Preserve the truth. Seal the sensitive evidence. Route only what each recipient is authorized to receive.

Output:

```json
{
  "sealed_evidence": {
    "package_id": "uuid",
    "encryption": "age | PGP | CMS | hybrid",
    "recipient_keys": [
      {
        "recipient": "CERT-Bund",
        "key_id": "authority-key-id",
        "wrapped_key": "encrypted-evidence-key"
      }
    ],
    "payload_hash": "sha256",
    "plaintext_destroyed": true,
    "local_unwrapped_key_destroyed": true
  }
}
```
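
The hashing and packaging step of EvidencePackager can be sketched with the standard library alone. Encryption is deliberately left out: it should be delegated to a vetted tool such as those named in the `encryption` field. The function names and the extra `size_bytes` field are hypothetical, and the sketch hashes whatever bytes it is given, since the source does not say whether `payload_hash` covers plaintext or ciphertext.

```python
import hashlib
import hmac
import uuid


def package_evidence(payload: bytes, recipients: list[str]) -> dict:
    """Build a manifest with a SHA-256 digest of the payload bytes."""
    return {
        "package_id": str(uuid.uuid4()),
        "payload_hash": hashlib.sha256(payload).hexdigest(),
        "recipients": sorted(recipients),
        "size_bytes": len(payload),  # illustrative field, not in the schema above
    }


def verify_package(manifest: dict, payload: bytes) -> bool:
    """Check that the payload still matches the digest recorded in the manifest."""
    actual = hashlib.sha256(payload).hexdigest()
    return hmac.compare_digest(manifest["payload_hash"], actual)


manifest = package_evidence(b"example evidence bytes", ["CERT-Bund"])
print(verify_package(manifest, b"example evidence bytes"))  # True
print(verify_package(manifest, b"tampered bytes"))          # False
```

Recording the digest before sealing gives the Ledger a stable value to reference even after local plaintext has been destroyed.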

---

### 5.6 Routeline

| Worker | Job |
|---|---|
| **RoutePlanner** | Chooses destination order based on victim, country, sector, severity, TLP, and evidence type. |
| **PayloadBuilder** | Builds destination-specific payloads: sealed package, STIX bundle, MISP event, abuse report, or public-safe extract. |
| **Redactor** | Minimizes content in public and semi-public outputs only. It does not replace Sealer. |
| **Courier** | Submits through API, portal, structured email, or secure upload. |
| **RateLimiter** | Enforces destination quotas, retries, and backoff. |
| **ReceiptCollector** | Captures case IDs, acknowledgements, API responses, and status URLs. |

Example route object:

```json
{
  "routes": [
    {
      "destination": "CERT-Bund",
      "type": "authority",
      "payload": "sealed_evidence_package",
      "priority": 1,
      "max_tlp_allowed": "RED"
    },
    {
      "destination": "MISP trusted community",
      "type": "cti_sharing",
      "payload": "stix_indicators",
      "priority": 2,
      "max_tlp_allowed": "AMBER"
    },
    {
      "destination": "Cloudflare Abuse API",
      "type": "provider_abuse",
      "payload": "minimized_abuse_report",
      "priority": 3,
      "max_tlp_allowed": "CLEAR"
    }
  ]
}
```
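
A route object like the one above lends itself to a simple guard: drop every destination whose `max_tlp_allowed` is below the case's TLP, then order the rest by `priority`. The sketch below assumes the four-level ordering CLEAR < GREEN < AMBER < RED; the `allowed_routes` helper is a hypothetical name.

```python
# TLP levels ordered from least to most restricted.
TLP_LEVEL = {"CLEAR": 0, "GREEN": 1, "AMBER": 2, "RED": 3}


def allowed_routes(case_tlp: str, routes: list[dict]) -> list[dict]:
    """Keep routes cleared for the case's TLP, sorted by ascending priority."""
    level = TLP_LEVEL[case_tlp]
    permitted = [r for r in routes if TLP_LEVEL[r["max_tlp_allowed"]] >= level]
    return sorted(permitted, key=lambda r: r["priority"])


routes = [
    {"destination": "Cloudflare Abuse API", "priority": 3, "max_tlp_allowed": "CLEAR"},
    {"destination": "CERT-Bund", "priority": 1, "max_tlp_allowed": "RED"},
    {"destination": "MISP trusted community", "priority": 2, "max_tlp_allowed": "AMBER"},
]

print([r["destination"] for r in allowed_routes("AMBER", routes)])
# ['CERT-Bund', 'MISP trusted community']
```

An unknown TLP label raises `KeyError` here, which fails closed: nothing is routed until the label is fixed.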

---

### 5.7 Ledgerline

| Worker | Job |
|---|---|
| **Ledger** | Creates immutable audit records for all external submissions and destructive actions. |
| **Watcher** | Polls outcomes: takedown status, MISP sightings, CERT acknowledgement, provider response. |
| **Archivist** | Handles retention, sealed package lifecycle, legal holds, and crypto-erasure confirmation. |

Ledger record:

```json
{
  "timestamp": "2026-05-13T00:00:00Z",
  "case_id": "B48-2026-000001",
  "destination": "CERT-Bund",
  "payload_hash": "sha256",
  "submitter_identity": "blue48-official-handle",
  "tlp": "AMBER",
  "response_id": "external-case-id",
  "outcome": "submitted | acknowledged | rejected | actioned"
}
```
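
The source does not specify how immutability is achieved. One common technique, shown here purely as an assumption, is a hash chain in which each entry commits to the previous entry's hash, so any later modification is detectable. The `Ledger` class below is an in-memory sketch, not a storage design.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash used before the first entry


def _entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry hash together with a canonical JSON record."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class Ledger:
    """Append-only, hash-chained list of audit records (in memory only)."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, record: dict) -> dict:
        prev_hash = self._entries[-1]["entry_hash"] if self._entries else GENESIS
        entry = {
            "record": dict(record),
            "prev_hash": prev_hash,
            "entry_hash": _entry_hash(prev_hash, record),
        }
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain and report whether every link still matches."""
        prev = GENESIS
        for entry in self._entries:
            expected = _entry_hash(prev, entry["record"])
            if entry["prev_hash"] != prev or entry["entry_hash"] != expected:
                return False
            prev = entry["entry_hash"]
        return True


ledger = Ledger()
ledger.append({"case_id": "B48-2026-000001", "outcome": "submitted"})
ledger.append({"case_id": "B48-2026-000001", "outcome": "acknowledged"})
print(ledger.verify())  # True
```

A hash chain makes tampering evident but does not prevent it; preventing it still depends on storage-level controls such as an append-only table with restricted write access.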

---

### 5.8 Publishline

| Worker | Job |
|---|---|
| **Publisher** | Produces public-safe intelligence reports after mitigation and approval. |

Publisher may include:

- sector trend
- actor trend
- CVEs
- TTPs
- defensive recommendations
- sanitized IOCs
- non-sensitive timelines

Publisher must not include:

- raw credentials
- stolen data
- victim secrets
- live access details
- exact criminal-source links
- unmitigated exploit paths

---

## 6. Which Workers Need Models?

| Worker | Model need |
|---|---|
| SourcePlanner | None / rules |
| Crawler / Fetcher | None |
| Parser | Rules / small model |
| Deduper | Embeddings / rules |
| Signalizer | Small or medium model |
| ClaimChecker | Small or medium model |
| ConfidenceScorer | Medium model |
| EntityResolver | Rules + embeddings |
| ActorMapper | Small or medium model |
| Classifier | Small or medium model |
| RoutePlanner | Rules first, model second |
| PayloadBuilder | Small model |
| Publisher | Medium or large model |
| ExampleBuilder | Medium model |
| QualityGate | Medium model + rules |

Heavier models (medium or larger in the table above) should be reserved for:

```text
ConfidenceScorer
Classifier
Publisher
ExampleBuilder
QualityGate
```

---

## 7. Human Review Boundaries

Human approval is required before:

- sending sealed evidence to any external destination
- contacting law enforcement or CERTs with sensitive evidence
- publishing a public advisory
- destroying plaintext evidence
- destroying local unwrapped evidence keys
- exporting a training dataset
- modifying routing policy
- modifying recipient keys

Two-person control should be required for:

- sending TLP:RED or highly sensitive packages
- deleting evidence
- changing authority recipient keys
- publishing named-victim reports
- exporting training data based on internal cases
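
The two lists above translate naturally into an approval gate that counts distinct approvers per action. The action identifiers below are hypothetical labels for the listed items, only a subset is shown, and unknown actions are rejected so that the gate fails closed.

```python
# Number of distinct human approvers required per action (illustrative subset).
REQUIRED_APPROVALS = {
    "send_sealed_evidence": 1,
    "publish_public_advisory": 1,
    "destroy_plaintext_evidence": 1,
    "export_training_dataset": 1,
    "modify_routing_policy": 1,
    "send_tlp_red_package": 2,
    "delete_evidence": 2,
    "change_authority_recipient_keys": 2,
    "publish_named_victim_report": 2,
}


def is_approved(action: str, approvers: list[str]) -> bool:
    """Return True only if enough distinct people approved a known action."""
    required = REQUIRED_APPROVALS.get(action)
    if required is None:
        return False  # unknown actions are never auto-approved
    return len(set(approvers)) >= required


print(is_approved("delete_evidence", ["alice", "alice"]))  # False
print(is_approved("delete_evidence", ["alice", "bob"]))    # True
```

Counting distinct approvers, rather than approvals, is what makes the two-person rule meaningful: the same operator approving twice does not satisfy it.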

---

## 8. MVP Worker Build Order

Initial worker implementation priority:

1. SourcePlanner
2. Fetcher
3. Parser
4. Deduper
5. Signalizer
6. IOCChecker
7. EntityResolver
8. GeoResolver
9. Classifier
10. EvidencePackager
11. Sealer
12. RoutePlanner
13. Courier
14. Ledger
15. ReceiptCollector
16. IntelMiner

Minimum operational chain:

```text
Fetcher → Parser → Signalizer → IOCChecker → EntityResolver → Classifier → Sealer → RoutePlanner → Courier → Ledger
```
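
A quick check, sketched below, confirms that the minimum operational chain is consistent with the build order and shows how far down the priority list work must progress before the chain is complete. The helper name is hypothetical; both lists are copied from this section.

```python
BUILD_ORDER = [
    "SourcePlanner", "Fetcher", "Parser", "Deduper", "Signalizer", "IOCChecker",
    "EntityResolver", "GeoResolver", "Classifier", "EvidencePackager", "Sealer",
    "RoutePlanner", "Courier", "Ledger", "ReceiptCollector", "IntelMiner",
]

MINIMUM_CHAIN = [
    "Fetcher", "Parser", "Signalizer", "IOCChecker", "EntityResolver",
    "Classifier", "Sealer", "RoutePlanner", "Courier", "Ledger",
]


def steps_until_operational(build_order: list[str], chain: list[str]) -> int:
    """Return how many build steps are needed before every chain worker exists."""
    positions = [build_order.index(w) for w in chain]  # ValueError if a worker is missing
    if positions != sorted(positions):
        raise ValueError("chain order conflicts with build order")
    return max(positions) + 1


print(steps_until_operational(BUILD_ORDER, MINIMUM_CHAIN))  # 14
```

With the lists as written, the chain becomes complete once Ledger, item 14 in the build order, is implemented.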

---

## 9. Technical Notes

Recommended implementation style:

| Component | Recommendation |
|---|---|
| Worker runtime | Python services, Celery, Temporal, Prefect, or lightweight queue workers |
| Message format | JSON normalized case object |
| Interop format | STIX 2.1 where useful |
| Storage | PostgreSQL + object storage |
| Search | OpenSearch or Meilisearch |
| CTI graph | OpenCTI or MISP integration |
| Audit | Append-only ledger table |
| Secrets | `.env` or secret manager; runtime injection only |
| UI | Blue48 Operations Cockpit |

---

## 10. Summary

Blue48 should operate as a worker mesh, not a monolithic AI agent.

The system should use small deterministic workers where possible, small models where useful, and larger models only for judgment-heavy steps. Sensitive evidence is handled by Sealine, not casually rendered or distributed. Routing and public reporting are controlled by policy guards, human review, and immutable audit logging.