init: scaffold psyc — defensive CTI routing & evidence-sealing platform
Stage-1 vertical slice: Pydantic Case model, SQLAlchemy Core persistence, URLhaus Scoutline fetcher, FastAPI/Jinja cockpit (cases list + detail), flat Typer CLI, Result[T, E] type module, structlog config. Architecture in docs/dossier.md; 12-fold style guide in docs/style.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
451
docs/archive/hivemap.md
Normal file
@@ -0,0 +1,451 @@
# Blue48 Worker Mesh Architecture

**Document type:** Project record / technical architecture
**Scope:** Worker names, responsibilities, interfaces, data flow, human review boundaries
**Status:** Draft v1

---

## 1. Purpose

Blue48 should not rely on one large, expensive, opaque model to perform all cyber-intelligence operations. The platform should be built as a mesh of small, specialized workers.

Each worker performs one narrow function, writes structured output, and passes a normalized case object to the next stage. Heavy models are reserved for judgment-heavy tasks such as confidence scoring, routing explanations, public report drafting, and training-example generation.

Core principle:

> Small workers produce traceable outputs. Humans approve sensitive decisions. The Ledger proves what happened.

---

## 2. High-Level Flow

```text
Scoutline
→ Proofline
→ Mapline
→ Classifyline
→ Sealine
→ Routeline
→ Ledgerline
→ Publishline
→ Trainline
```
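
The hand-off between lines can be pictured as plain function composition over a case object. The sketch below is only an illustration under stated assumptions: the case is a plain `dict`, `scoutline` and `proofline` are hypothetical stand-ins for whole worker lines, and the `trace` field is an invented name used to show that each step leaves a traceable mark. A real deployment would more likely hand cases over through a queue than through direct calls.

```python
from typing import Callable

# Assumption: a case is a plain dict here; the platform may use a typed model.
Case = dict
Line = Callable[[Case], Case]


def scoutline(case: Case) -> Case:
    # Hypothetical stand-in for a whole line: attach a candidate summary.
    return {**case, "summary": "short defensive summary"}


def proofline(case: Case) -> Case:
    # Hypothetical stand-in for a whole line: attach a confidence label.
    return {**case, "confidence": "low"}


def run_mesh(case: Case, lines: list[Line]) -> Case:
    """Pass the case through each line in order and record which lines ran."""
    for line in lines:
        case = line(case)
        # "trace" is an illustrative field, not part of the documented schema.
        case = {**case, "trace": [*case.get("trace", []), line.__name__]}
    return case


result = run_mesh({"case_id": "B48-2026-000001"}, [scoutline, proofline])
print(result["trace"])  # ['scoutline', 'proofline']
```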

Operator version:

```text
Detect → Validate → Map → Classify → Seal Evidence → Route → Submit → Track → Archive → Learn
```

---

## 3. Worker Lines

| Line | Purpose |
|---|---|
| **Scoutline** | Finds, fetches, parses, and deduplicates lawful intelligence sources. |
| **Proofline** | Validates claims, checks indicators, measures freshness, and scores confidence. |
| **Mapline** | Resolves victims, actors, sectors, jurisdictions, CERT routes, and affected products. |
| **Classifyline** | Assigns severity, TLP, incident type, and operational class. |
| **Sealine** | Packages evidence, encrypts it for authorized recipients, and destroys local plaintext/key material when policy allows. |
| **Routeline** | Selects destinations, builds payloads, enforces destination policy, and submits reports. |
| **Ledgerline** | Records immutable audit events, receipts, outcomes, and follow-up status. |
| **Publishline** | Produces sanitized public intelligence only after mitigation and approval. |
| **Trainline** | Converts lawful, reviewed intelligence into LoRA-ready training data. |

---

## 4. Core Worker Set

The first conceptual worker set is:

```text
Scout → Verifier → Mapper → Classifier → Sealer → Router → Courier → Ledger
```

Support workers:

```text
Watcher → Archivist → Publisher
```

Operational sentence:

```text
Scout detects.
Verifier confirms.
Mapper identifies.
Classifier prioritizes.
Sealer protects.
Router decides.
Courier submits.
Ledger proves.
Watcher follows up.
Archivist forgets safely.
Publisher informs.
```

---

## 5. Granular Worker Breakdown

### 5.1 Scoutline

| Worker | Job | Model requirement |
|---|---|---|
| **SourcePlanner** | Maintains the approved source list, collection schedules, and source eligibility. | None / rules |
| **Crawler** | Discovers new pages, feeds, advisories, reports, APIs, and datasets. | None |
| **Fetcher** | Downloads pages, PDFs, JSON, RSS, STIX/TAXII, MISP events, and API responses. | None |
| **Parser** | Extracts title, date, author, body, tables, indicators, and metadata. | Rules / small model |
| **Deduper** | Detects duplicate reports, reposted IOCs, syndicated articles, and repeated claims. | Embeddings / rules |
| **SourceRanker** | Scores the source based on trust, history, origin, and license status. | Rules / small model |
| **Signalizer** | Converts parsed content into candidate intelligence signals. | Small/medium model |

Output:

```json
{
  "signal_id": "uuid",
  "source_type": "advisory | cti_report | abuse_feed | ransomware_monitor | public_blog | misp_event",
  "summary": "short defensive summary",
  "observed_at": "2026-05-13T00:00:00Z",
  "raw_evidence_location": "internal-only-reference"
}
```
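
To make the Deduper row more concrete, here is a minimal sketch of the rules-only variant: signals whose summaries are identical after whitespace and case normalization are treated as duplicates. The function names `fingerprint` and `dedupe` are hypothetical, and keying on `summary` alone is an assumption made for illustration; the embeddings-based variant named in the table is not shown.

```python
import hashlib
import re


def fingerprint(text: str) -> str:
    """Hash a whitespace- and case-normalized copy of the text."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe(signals: list[dict]) -> list[dict]:
    """Keep the first signal seen for each summary fingerprint."""
    seen: set[str] = set()
    unique = []
    for signal in signals:
        key = fingerprint(signal["summary"])
        if key not in seen:
            seen.add(key)
            unique.append(signal)
    return unique


signals = [
    {"signal_id": "a", "summary": "Phishing kit  reported"},
    {"signal_id": "b", "summary": "phishing kit reported"},
    {"signal_id": "c", "summary": "Botnet C2 observed"},
]
unique = dedupe(signals)
print([s["signal_id"] for s in unique])  # ['a', 'c']
```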

---

### 5.2 Proofline

| Worker | Job |
|---|---|
| **Correlator** | Checks whether the same signal appears across multiple independent sources. |
| **IOCChecker** | Validates domains, IPs, hashes, URLs, wallet addresses, emails, and CVEs. |
| **FreshnessChecker** | Determines whether the signal is current, stale, repeated, or resurfaced. |
| **ClaimChecker** | Labels language as confirmed, claimed, observed, rumored, or speculative. |
| **ConfidenceScorer** | Produces final confidence and optional Admiralty Code values. |

Output:

```json
{
  "confidence": "low | medium | high",
  "source_reliability": "A | B | C | D | E | F | unknown",
  "information_credibility": "1 | 2 | 3 | 4 | 5 | 6 | unknown",
  "claim_status": "confirmed | claimed | observed | rumored | speculative",
  "freshness": "new | recent | stale | resurfaced"
}
```
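
Because downstream lines depend on these enumerations, a structural check on the Proofline output may be useful before the case moves on. The sketch below assumes the allowed values are exactly those listed in the output object above; `validate_proof` and `admiralty_code` are hypothetical helper names, and combining the two ratings into a string such as `B2` follows the usual Admiralty notation rather than anything this document specifies.

```python
from typing import Optional

# Allowed values copied from the Proofline output object above.
ALLOWED = {
    "confidence": {"low", "medium", "high"},
    "source_reliability": {"A", "B", "C", "D", "E", "F", "unknown"},
    "information_credibility": {"1", "2", "3", "4", "5", "6", "unknown"},
    "claim_status": {"confirmed", "claimed", "observed", "rumored", "speculative"},
    "freshness": {"new", "recent", "stale", "resurfaced"},
}


def validate_proof(output: dict) -> list[str]:
    """Return a list of problems; an empty list means the object is well-formed."""
    problems = []
    for field, allowed in ALLOWED.items():
        if field not in output:
            problems.append(f"missing field: {field}")
        elif output[field] not in allowed:
            problems.append(f"invalid {field}: {output[field]!r}")
    return problems


def admiralty_code(output: dict) -> Optional[str]:
    """Join reliability and credibility (e.g. 'B2'); None if either is unknown."""
    reliability = output["source_reliability"]
    credibility = output["information_credibility"]
    if "unknown" in (reliability, credibility):
        return None
    return reliability + credibility


proof = {
    "confidence": "medium",
    "source_reliability": "B",
    "information_credibility": "2",
    "claim_status": "claimed",
    "freshness": "recent",
}
print(validate_proof(proof), admiralty_code(proof))
```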

---

### 5.3 Mapline

| Worker | Job |
|---|---|
| **EntityResolver** | Maps organization names, domains, subsidiaries, brands, and aliases. |
| **GeoResolver** | Maps victim country, jurisdiction, national CERT, and cross-border implications. |
| **SectorMapper** | Maps victim sector and critical-infrastructure status. |
| **ActorMapper** | Maps actor names, aliases, ransomware brands, campaigns, and confidence. |
| **CVEResolver** | Maps vulnerabilities to CVEs, affected products, KEV status, and exploit relevance. |

Output:

```json
{
  "victim": {
    "name": "",
    "domain": "",
    "country": "",
    "sector": "",
    "critical_infrastructure": false
  },
  "actor": {
    "name": "",
    "aliases": [],
    "campaign": "",
    "confidence": "low | medium | high"
  },
  "jurisdiction": {
    "primary_cert": "",
    "law_enforcement_route": "",
    "sector_isac": ""
  }
}
```

---

### 5.4 Classifyline

| Worker | Job |
|---|---|
| **Classifier** | Assigns incident type, severity, internal class, and response SLA. |
| **TLPGuard** | Ensures TLP-marked data is never routed to a destination that is not permitted to receive that TLP level. |
| **DestinationPolicyGuard** | Blocks inappropriate, illegal, excessive, or sensitive submissions. |

Internal class mapping:

| Internal class | Meaning | External severity |
|---|---|---|
| **A** | Imminent harm or attack likely underway | Critical |
| **B** | Credible planned attack | High |
| **C** | Confirmed exposure | High / Medium |
| **D** | Campaign intelligence | Medium / High |
| **E** | Weak signal or watchlist item | Low / Monitor |

Output:

```json
{
  "class": "A | B | C | D | E",
  "severity": "low | medium | high | critical",
  "tlp": "RED | AMBER | GREEN | CLEAR",
  "incident_type": "ransomware | credential_leak | access_sale | phishing | malware | exploit | botnet | data_leak",
  "policy_blocks": []
}
```
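
The internal class mapping implies a consistency rule between `class` and `severity` that can be checked mechanically. The sketch below is one possible reading, not a defined policy: `ALLOWED_SEVERITY` and `check_classification` are hypothetical names, and class E is mapped to `low` only, on the assumption that "Monitor" is a handling note rather than a value of the `severity` field.

```python
# Assumption: derived from the internal class mapping table above.
ALLOWED_SEVERITY = {
    "A": {"critical"},
    "B": {"high"},
    "C": {"high", "medium"},
    "D": {"medium", "high"},
    "E": {"low"},  # "Monitor" is treated as a handling note, not a severity.
}


def check_classification(obj: dict) -> list[str]:
    """Return problems where severity disagrees with the internal class."""
    internal_class = obj.get("class")
    allowed = ALLOWED_SEVERITY.get(internal_class)
    if allowed is None:
        return [f"unknown class: {internal_class!r}"]
    severity = obj.get("severity")
    if severity not in allowed:
        return [f"severity {severity!r} not allowed for class {internal_class}"]
    return []


print(check_classification({"class": "A", "severity": "critical"}))  # []
print(check_classification({"class": "E", "severity": "high"}))
```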

---

### 5.5 Sealine

Sealine replaces the old primary concept of “sanitization.” The objective is not to destroy useful evidence, but to protect it.

| Worker | Job |
|---|---|
| **EvidencePackager** | Collects sensitive evidence, hashes it, and packages it with metadata. |
| **Sealer** | Encrypts evidence for authorized recipients using public-key or hybrid encryption. |
| **KeyBurner** | Destroys local unwrapped evidence keys after successful sealing. |
| **RetentionGuard** | Enforces retention, deletion, plaintext destruction, and crypto-erasure policy. |

Sealine principle:

> Preserve the truth. Seal the sensitive evidence. Route only what each recipient is authorized to receive.

Output:

```json
{
  "sealed_evidence": {
    "package_id": "uuid",
    "encryption": "age | PGP | CMS | hybrid",
    "recipient_keys": [
      {
        "recipient": "CERT-Bund",
        "key_id": "authority-key-id",
        "wrapped_key": "encrypted-evidence-key"
      }
    ],
    "payload_hash": "sha256",
    "plaintext_destroyed": true,
    "local_unwrapped_key_destroyed": true
  }
}
```
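
The hashing half of EvidencePackager can be sketched with the standard library alone. This is an illustration under assumptions, not the platform's packaging format: `package_evidence` and the `manifest` field are hypothetical, and deriving `payload_hash` from a sorted manifest of per-file SHA-256 digests is one of several reasonable choices. Encryption, key wrapping, and plaintext destruction are deliberately left out because they depend on the chosen scheme (`age`, PGP, CMS, or hybrid).

```python
import hashlib
import json
import uuid


def package_evidence(files: dict[str, bytes]) -> dict:
    """Hash each evidence item, then hash the sorted manifest of those digests."""
    manifest = {
        name: hashlib.sha256(data).hexdigest()
        for name, data in sorted(files.items())
    }
    # Sorting keys makes the payload hash independent of insertion order.
    manifest_bytes = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return {
        "package_id": str(uuid.uuid4()),
        "manifest": manifest,  # illustrative field, not in the documented schema
        "payload_hash": hashlib.sha256(manifest_bytes).hexdigest(),
    }


package = package_evidence({"note.txt": b"observed at 00:00Z", "empty.bin": b""})
print(package["payload_hash"])
```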

---

### 5.6 Routeline

| Worker | Job |
|---|---|
| **RoutePlanner** | Chooses destination order based on victim, country, sector, severity, TLP, and evidence type. |
| **PayloadBuilder** | Builds destination-specific payloads: sealed package, STIX bundle, MISP event, abuse report, or public-safe extract. |
| **Redactor** | Minimizes public/semi-public outputs only. Redactor does not replace Sealer. |
| **Courier** | Submits through API, portal, structured email, or secure upload. |
| **RateLimiter** | Enforces destination quotas, retries, and backoff. |
| **ReceiptCollector** | Captures case IDs, acknowledgements, API responses, and status URLs. |

Example route object:

```json
{
  "routes": [
    {
      "destination": "CERT-Bund",
      "type": "authority",
      "payload": "sealed_evidence_package",
      "priority": 1,
      "max_tlp_allowed": "RED"
    },
    {
      "destination": "MISP trusted community",
      "type": "cti_sharing",
      "payload": "stix_indicators",
      "priority": 2,
      "max_tlp_allowed": "AMBER"
    },
    {
      "destination": "Cloudflare Abuse API",
      "type": "provider_abuse",
      "payload": "minimized_abuse_report",
      "priority": 3,
      "max_tlp_allowed": "CLEAR"
    }
  ]
}
```
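
The `max_tlp_allowed` field suggests a simple guard: a route is eligible only if the case's TLP label is no more restrictive than what the destination may receive. The sketch below assumes that reading and treats the four labels as a linear ordering from `CLEAR` to `RED`, which is a simplification of TLP semantics; `TLP_RANK` and `allowed_routes` are hypothetical names.

```python
# Assumption: labels ordered from least to most restrictive.
TLP_RANK = {"CLEAR": 0, "GREEN": 1, "AMBER": 2, "RED": 3}


def allowed_routes(case_tlp: str, routes: list[dict]) -> list[dict]:
    """Keep routes cleared for the case TLP, ordered by priority."""
    level = TLP_RANK[case_tlp]
    permitted = [r for r in routes if TLP_RANK[r["max_tlp_allowed"]] >= level]
    return sorted(permitted, key=lambda r: r["priority"])


# Routes taken from the example route object above (fields trimmed).
routes = [
    {"destination": "Cloudflare Abuse API", "priority": 3, "max_tlp_allowed": "CLEAR"},
    {"destination": "CERT-Bund", "priority": 1, "max_tlp_allowed": "RED"},
    {"destination": "MISP trusted community", "priority": 2, "max_tlp_allowed": "AMBER"},
]

amber = allowed_routes("AMBER", routes)
print([r["destination"] for r in amber])  # ['CERT-Bund', 'MISP trusted community']
```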

---

### 5.7 Ledgerline

| Worker | Job |
|---|---|
| **Ledger** | Creates immutable audit records for all external submissions and destructive actions. |
| **Watcher** | Polls outcomes: takedown status, MISP sightings, CERT acknowledgement, provider response. |
| **Archivist** | Handles retention, sealed package lifecycle, legal holds, and crypto-erasure confirmation. |

Ledger record:

```json
{
  "timestamp": "2026-05-13T00:00:00Z",
  "case_id": "B48-2026-000001",
  "destination": "CERT-Bund",
  "payload_hash": "sha256",
  "submitter_identity": "blue48-official-handle",
  "tlp": "AMBER",
  "response_id": "external-case-id",
  "outcome": "submitted | acknowledged | rejected | actioned"
}
```
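
One common way to make an append-only record tamper-evident is to chain each entry to the hash of the previous one. The document does not say how Ledger achieves immutability, so the following is purely an illustrative assumption: `append_record`, `verify`, `prev_hash`, and `entry_hash` are hypothetical names, and an in-memory list stands in for the ledger table.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _entry_hash(prev_hash: str, record: dict) -> str:
    body = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(ledger: list[dict], record: dict) -> dict:
    """Append a record linked to the hash of the previous entry."""
    prev_hash = ledger[-1]["entry_hash"] if ledger else GENESIS
    entry = {
        "record": record,
        "prev_hash": prev_hash,
        "entry_hash": _entry_hash(prev_hash, record),
    }
    ledger.append(entry)
    return entry


def verify(ledger: list[dict]) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in ledger:
        expected = _entry_hash(prev_hash, entry["record"])
        if entry["prev_hash"] != prev_hash or entry["entry_hash"] != expected:
            return False
        prev_hash = entry["entry_hash"]
    return True


ledger: list[dict] = []
append_record(ledger, {"case_id": "B48-2026-000001", "outcome": "submitted"})
append_record(ledger, {"case_id": "B48-2026-000001", "outcome": "acknowledged"})
print(verify(ledger))  # True
```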

---

### 5.8 Publishline

| Worker | Job |
|---|---|
| **Publisher** | Produces public-safe intelligence reports after mitigation and approval. |

Publisher may include:

- sector trend
- actor trend
- CVEs
- TTPs
- defensive recommendations
- sanitized IOCs
- non-sensitive timelines

Publisher must not include:

- raw credentials
- stolen data
- victim secrets
- live access details
- exact criminal-source links
- unmitigated exploit paths

---

## 6. Which Workers Need Models?

| Worker | Model need |
|---|---|
| SourcePlanner | None / rules |
| Crawler / Fetcher | None |
| Parser | Rules / small model |
| Deduper | Embeddings / rules |
| Signalizer | Small or medium model |
| ClaimChecker | Small or medium model |
| ConfidenceScorer | Medium model |
| EntityResolver | Rules + embeddings |
| ActorMapper | Small or medium model |
| Classifier | Small or medium model |
| RoutePlanner | Rules first, model second |
| PayloadBuilder | Small model |
| Publisher | Medium or large model |
| ExampleBuilder | Medium model |
| QualityGate | Medium model + rules |

Heavy models should be reserved for:

```text
ConfidenceScorer
Classifier
Publisher
ExampleBuilder
QualityGate
```

---

## 7. Human Review Boundaries

Human approval is required before:

- sending sealed evidence to any external destination
- contacting law enforcement or CERTs with sensitive evidence
- publishing a public advisory
- destroying plaintext evidence
- destroying local unwrapped evidence keys
- exporting a training dataset
- modifying routing policy
- modifying recipient keys

Two-person control should be required for:

- sending TLP:RED or highly sensitive packages
- deleting evidence
- changing authority recipient keys
- publishing named-victim reports
- exporting training data based on internal cases
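
The two lists above can be expressed as an approval gate that workers consult before acting. The sketch below is only one way to encode them: the action identifiers are invented labels for the listed items, `required_approvals` and `is_approved` are hypothetical names, and the rule that a requester cannot approve their own action is an added assumption rather than something stated here.

```python
# Hypothetical action identifiers for the items in the two lists above.
TWO_PERSON_ACTIONS = {
    "send_tlp_red_package",
    "delete_evidence",
    "change_authority_recipient_keys",
    "publish_named_victim_report",
    "export_internal_case_training_data",
}
SINGLE_APPROVAL_ACTIONS = {
    "send_sealed_evidence",
    "contact_authority_with_sensitive_evidence",
    "publish_public_advisory",
    "destroy_plaintext_evidence",
    "destroy_unwrapped_keys",
    "export_training_dataset",
    "modify_routing_policy",
    "modify_recipient_keys",
}


def required_approvals(action: str) -> int:
    """Two approvers for two-person actions, one for other gated actions."""
    if action in TWO_PERSON_ACTIONS:
        return 2
    if action in SINGLE_APPROVAL_ACTIONS:
        return 1
    return 0


def is_approved(action: str, requester: str, approvers: set[str]) -> bool:
    """Count distinct approvers, excluding the requester (an assumption)."""
    independent = approvers - {requester}
    return len(independent) >= required_approvals(action)


print(is_approved("delete_evidence", "alice", {"alice", "bob"}))  # False
print(is_approved("delete_evidence", "alice", {"bob", "carol"}))  # True
```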

---

## 8. MVP Worker Build Order

Initial worker implementation priority:

1. SourcePlanner
2. Fetcher
3. Parser
4. Deduper
5. Signalizer
6. IOCChecker
7. EntityResolver
8. GeoResolver
9. Classifier
10. EvidencePackager
11. Sealer
12. RoutePlanner
13. Courier
14. Ledger
15. ReceiptCollector
16. IntelMiner

Minimum operational chain:

```text
Fetcher → Parser → Signalizer → IOCChecker → EntityResolver → Classifier → Sealer → RoutePlanner → Courier → Ledger
```

---

## 9. Technical Notes

Recommended implementation style:

| Component | Recommendation |
|---|---|
| Worker runtime | Python services, Celery, Temporal, Prefect, or lightweight queue workers |
| Message format | JSON normalized case object |
| Interop format | STIX 2.1 where useful |
| Storage | PostgreSQL + object storage |
| Search | OpenSearch or Meilisearch |
| CTI graph | OpenCTI or MISP integration |
| Audit | append-only ledger table |
| Secrets | `.env`, secret manager, runtime injection only |
| UI | Blue48 Operations Cockpit |

---

## 10. Summary

Blue48 should operate as a worker mesh, not a monolithic AI agent.

The system should use small deterministic workers where possible, small models where useful, and larger models only for judgment-heavy steps. Sensitive evidence is handled by Sealine, not casually rendered or distributed. Routing and public reporting are controlled by policy guards, human review, and immutable audit logging.