# psyc Python Style Guide
**Established:** 2026-05-14 (Day 2 of the hackathon)
**Status:** Live — all new code must follow this; existing code retrofitted in the same commit.
This guide is the output of a 12-fold style review. It exists so the codebase reads consistently end-to-end and there's never a question of which idiom to use.
---
## 1. Optional values — `Optional[X]`, not `X | None`
```python
from typing import Optional
def get_case(case_id: str) -> Optional[Case]:
...
def find_actor(
name: str,
country: Optional[str] = None,
) -> Actor:
...
```
**Rationale:** explicit, name carries meaning, easy to grep for nullable returns.
---
## 2. Collection generics — `List[X]`, `Dict[K, V]`, not `list[X]`, `dict[K, V]`
```python
from typing import List, Dict
tags: List[str] = []
quotas: Dict[str, int] = {}
def route(case: Case) -> List[Destination]:
...
```
**Rationale:** uppercase typing forms read as type hints, not as runtime constructors. Pair with **rule 1** — both come from `typing`.
---
## 3. Pydantic mutable defaults — `Field(default_factory=...)`, always
```python
from pydantic import BaseModel, Field
class Case(BaseModel):
tags: List[str] = Field(default_factory=list)
routing: Routing = Field(default_factory=Routing)
observables: Dict[str, List[str]] = Field(default_factory=dict)
```
**Rationale:** intent is explicit even though Pydantic deep-copies literal defaults. Same idiom works in `@dataclass`, so we never have to remember which framework we're in.
---
## 4. Function signature wrapping — single line, 120-char limit
```python
def fetch_and_signal(limit: Optional[int] = None, timeout: float = 30.0, user_agent: str = USER_AGENT) -> List[Case]:
...
```
If a signature still exceeds 120 chars after that, *then* hug-parens with trailing comma. Wrapping is the exception, not the default.
```toml
# pyproject.toml
[tool.ruff]
line-length = 120
```
---
## 5. Errors — `Result[T, E]` for expected failures, `raise` for genuinely exceptional ones
A "miss" or "blocked" outcome is data, not an exception. Use the `Result` type in `psyc.result`:
```python
from psyc.result import Result, Ok, Err
def get_case(case_id: str) -> Result[Case, str]:
row = db.fetchone(case_id)
if not row:
return Err(f"case not found: {case_id}")
return Ok(Case.model_validate_json(row))
# caller:
result = get_case(cid)
if isinstance(result, Err):
raise HTTPException(404, result.reason)
case = result.value
```
**Reserve `raise` for:** programmer errors, invariant violations, unrecoverable I/O. Never `raise` from a function whose failure mode is part of normal operation (lookup miss, policy block, rate limit, etc.).
---
## 6. Closed string sets — `class X(str, Enum)`
```python
from enum import Enum
class TLP(str, Enum):
RED = "RED"
AMBER = "AMBER"
GREEN = "GREEN"
CLEAR = "CLEAR"
case.classification.tlp = TLP.AMBER
if case.classification.tlp == TLP.RED:
...
```
Real Enum object — iterable, comparable, has `.value`. JSON-serializable thanks to the `str` mixin. No `Literal` aliases, no bare string constants.
---
## 7. Imports — isort blocks (stdlib / third-party / local)
```python
import csv
from datetime import datetime
from pathlib import Path
from typing import List, Optional
import httpx
import structlog
import typer
from pydantic import BaseModel, Field
from psyc import db
from psyc.models import Case, TLP
from psyc.result import Err, Ok, Result
```
Blocks separated by a single blank line. Within each block: alphabetical, `import x` before `from x import y` is **not** required — pick whatever ruff/isort defaults to.
```toml
[tool.ruff.lint.isort]
known-first-party = ["psyc"]
```
---
## 8. Conditionals — early returns / guard clauses
```python
def classify(case: Case) -> Case:
if not case.observables.urls:
return case
if case.classification.severity:
return case
case.classification.severity = Severity.MEDIUM
return case
```
The happy path stays flat. No "single exit point" dogma. No `match`/`case` unless you're actually dispatching on a tagged union — for if/elif chains, plain `if` is clearer.
---
## 9. Docstrings — module-level only, functions self-documenting
Every `.py` file starts with a one-line docstring describing what it is. Functions rely on naming + type signatures. Add an inline comment only when the **why** is non-obvious.
```python
"""Scoutline — Fetcher + Signalizer for URLhaus."""
from __future__ import annotations
# ... imports ...
def fetch_recent(timeout: float = 30.0) -> str:
...
def parse_urlhaus_csv(text: str) -> Iterable[Dict]:
...
```
**Forbidden:** function docstrings restating the signature (`"""Fetch the recent CSV. Returns: str."""` — useless).
---
## 10. Logging — `structlog` over stdlib `logging`, event names + key/value
```python
import structlog
log = structlog.get_logger(__name__)
log.info("case.ingested", case_id=case.case_id, source="urlhaus", count=len(rows))
log.warning("route.blocked", case_id=case.case_id, dest="VirusTotal", reason="tlp_red")
log.error("submit.failed", case_id=case.case_id, dest="CERT-Bund", error=str(exc))
```
**Event names:** `.` lowercase, dot-separated. **Never** interpolate values into the event name — they go in the key/value payload so the ledger and audit code can index on them.
Configuration lives in `psyc/log.py`, imported once at process start (CLI and cockpit entrypoints).
---
## 11. SQL — SQLAlchemy Core (Tables + `engine.connect()`), no ORM
```python
from sqlalchemy import Table, Column, String, MetaData, create_engine, insert, select
engine = create_engine("sqlite:///data/psyc.db", future=True)
meta = MetaData()
cases = Table(
"cases", meta,
Column("case_id", String, primary_key=True),
Column("summary", String, nullable=False),
Column("tlp", String, nullable=False),
# ...
)
with engine.begin() as conn:
conn.execute(insert(cases).values(case_id=case.case_id, summary=case.summary, tlp=case.classification.tlp.value))
with engine.connect() as conn:
row = conn.execute(select(cases).where(cases.c.case_id == case_id)).fetchone()
```
**No ORM session, no `declarative_base`, no SQLModel.** SQL stays visible as expressions, but parameter binding and dialect handling are SQLAlchemy's job. `engine.begin()` for writes (auto-commit), `engine.connect()` for reads.
---
## 12. CLI — flat Typer commands, hyphenated names
```python
import typer
app = typer.Typer(add_completion=False, help="psyc — defensive CTI routing & sealing")
@app.command("init")
def init() -> None: ...
@app.command("fetch-urlhaus")
def fetch_urlhaus(limit: int = 50) -> None: ...
@app.command("seal-pack")
def seal_pack(case_id: str) -> None: ...
@app.command("route-plan")
def route_plan(case_id: str) -> None: ...
@app.command("serve")
def serve(host: str = "127.0.0.1", port: int = 8000) -> None: ...
```
Invocation: `psyc fetch-urlhaus --limit 50`. No sub-apps, no nested namespaces. Command name = function name with underscores → hyphens.
---
## Tooling
```toml
[tool.ruff]
line-length = 120
[tool.ruff.lint]
select = ["E", "F", "I", "UP", "B"]
# UP rules that conflict with rule 1/2 are ignored:
ignore = ["UP006", "UP007", "UP035"]
[tool.ruff.lint.isort]
known-first-party = ["psyc"]
```
`UP006` / `UP007` / `UP035` would auto-rewrite `List[X]`/`Optional[X]` to lowercase / pipe forms — disabled because **rules 1 and 2** outrank them.
---
## Out of scope
This guide intentionally **does not** cover:
- async vs sync (decide per worker line; httpx clients sync inside Typer commands, async only if the cockpit demands it)
- test framework (pytest is the default; no folding required)
- exception class hierarchies (rule 5 minimizes their need; design per line as it arrives)
- API response shapes for the cockpit (REST/HTML; JSON-only routes when stage 3+ ships)
Add new folds at the bottom of this file as they come up. Don't retrofit the guide silently — each addition is a recorded decision.