GIA/artifacts/plans/17-person-enrichment-without-llm.md

# Feature Plan: Person Model Enrichment (Non-LLM First)
## Goal
Populate `Person` fields from existing message history without spending OpenAI tokens by default:
- `summary`
- `profile`
- `revealed`
- `likes`
- `dislikes`
- `sentiment`
- `timezone`
- `last_interaction`
## Problem We Are Solving
- We have high-volume message data but limited durable person intelligence.
- LLM analysis is expensive for continuous/background processing.
- We need fast, deterministic extraction first, with optional semantic ranking.
## Design Decisions
1. Config scope:
   - global defaults
   - optional group-level overrides
   - per-user overrides
2. Resolution order:
   - `user > group > global`
3. Global toggle:
   - hard kill-switch (`PERSON_ENRICHMENT_ENABLED`)
4. Per-user/group controls:
   - enable/disable enrichment
   - write mode (`proposal_required` or `direct`)
   - confidence threshold
   - max messages scanned per run
   - semantic-ranking toggle
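The `user > group > global` precedence can be resolved per field rather than per record, so a user override of one setting does not discard group or global values for the others. A minimal sketch, assuming hypothetical field names mirroring the controls above (not an existing API):

```python
from dataclasses import dataclass, fields
from typing import Optional

# Hypothetical settings shape; None means "not set at this scope".
@dataclass
class EnrichmentConfig:
    enabled: Optional[bool] = None
    write_mode: Optional[str] = None           # "proposal_required" or "direct"
    confidence_threshold: Optional[float] = None
    max_messages: Optional[int] = None
    semantic_ranking: Optional[bool] = None

def resolve(user: EnrichmentConfig, group: EnrichmentConfig,
            global_: EnrichmentConfig) -> EnrichmentConfig:
    """Merge per field with user > group > global precedence."""
    merged = EnrichmentConfig()
    for f in fields(merged):
        for scope in (user, group, global_):
            value = getattr(scope, f.name)
            if value is not None:
                setattr(merged, f.name, value)
                break
    return merged
```

The `PERSON_ENRICHMENT_ENABLED` kill-switch would sit above this resolution entirely: when it is off, no scope's settings are consulted.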
## Proposed Data Additions
- `PersonEnrichmentSettings`:
  - scope fields (`user`, optional `group`)
  - toggle/threshold/runtime limits
- `PersonSignal`:
  - normalized extracted clue
  - source references (message ids/events)
  - confidence and detector name
- `PersonUpdateProposal`:
  - pending/approved/rejected person field updates
  - reason and provenance
- Optional `PersonFieldRevision`:
  - before/after snapshots for auditability
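As a rough illustration of the shapes above, the two core records could look like this; field names beyond those listed in the plan are assumptions, and the real storage models will differ:

```python
from dataclasses import dataclass, field
from datetime import datetime

# Illustrative shapes only, not the final schema.
@dataclass
class PersonSignal:
    person_id: int
    detector: str                  # e.g. "regex:timezone" (name assumed)
    field_name: str                # target Person field
    value: str                     # normalized extracted clue
    confidence: float
    source_message_ids: list[int] = field(default_factory=list)

@dataclass
class PersonUpdateProposal:
    person_id: int
    field_name: str
    new_value: str
    reason: str                    # human-readable provenance summary
    status: str = "pending"        # pending / approved / rejected
    created_at: datetime = field(default_factory=datetime.utcnow)
```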
## Processing Flow
1. Select message window:
   - recent inbound/outbound messages per person/service
   - bounded by configurable caps
2. Fast extraction:
   - deterministic rules/regex for:
     - timezone cues
     - explicit likes/dislikes
     - self-revealed facts
     - interaction-derived sentiment hints
3. Semantic ranking (optional):
   - use Manticore-backed similarity search against classifier labels
   - rank candidate signals; do not call OpenAI in the default path
4. Signal aggregation:
   - merge repeated evidence
   - decay stale evidence
   - detect contradictions
5. Apply update:
   - `proposal_required`: create a `PersonUpdateProposal`
   - `direct`: write only above the confidence threshold and only when no signal conflicts
6. Persist audit trail:
   - record the detector/classifier source and exact message provenance
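The fast-extraction step (2) can be sketched with plain regexes. The patterns below are illustrative, not exhaustive, and the `(field, value)` tuple output is a stand-in for real `PersonSignal` records:

```python
import re

# Hypothetical deterministic detectors; real rule sets would be broader.
TIMEZONE_RE = re.compile(r"\b(?:UTC|GMT)\s*([+-]\d{1,2})\b", re.I)
LIKE_RE = re.compile(r"\bI (?:really )?(?:like|love|enjoy)\s+([^.!?,]+)", re.I)
DISLIKE_RE = re.compile(r"\bI (?:really )?(?:hate|dislike|can't stand)\s+([^.!?,]+)", re.I)

def extract_signals(text: str) -> list[tuple[str, str]]:
    """Return (field, value) clues found by the deterministic rules."""
    signals = []
    for m in LIKE_RE.finditer(text):
        signals.append(("likes", m.group(1).strip()))
    for m in DISLIKE_RE.finditer(text):
        signals.append(("dislikes", m.group(1).strip()))
    m = TIMEZONE_RE.search(text)
    if m:
        signals.append(("timezone", m.group(1).strip()))
    return signals
```

Because the rules are deterministic, every emitted clue can carry exact message provenance, which is what the audit-trail step (6) requires.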
## Field-Specific Policy
- `summary/profile`: generated from stable high-confidence aggregates only.
- `revealed`: only explicit self-disclosures.
- `likes/dislikes`: require explicit statement or repeated pattern.
- `sentiment`: rolling value with recency decay; never an absolute truth label.
- `timezone`: explicit declaration preferred; behavioral inference secondary.
- `last_interaction`: deterministic from most recent message timestamps.
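One way to realize the rolling `sentiment` value with recency decay is an exponentially weighted update, where stale prior values count for less. The half-life constant is an assumed tuning parameter, not something the plan fixes:

```python
from datetime import datetime, timedelta

# Assumed tuning parameter: how fast an old sentiment estimate loses weight.
HALF_LIFE_DAYS = 14.0

def update_sentiment(prev: float, prev_time: datetime,
                     observation: float, now: datetime) -> float:
    """Blend a new sentiment observation into the rolling value,
    discounting the previous estimate by how stale it is."""
    age_days = (now - prev_time).total_seconds() / 86400.0
    weight_prev = 0.5 ** (age_days / HALF_LIFE_DAYS)  # exponential decay
    return weight_prev * prev + (1.0 - weight_prev) * observation
```

With this shape, a fresh prior dominates and an old prior is mostly replaced by new evidence, which matches "rolling value, never an absolute truth label."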
## Rollout
1. Schema and settings models.
2. Deterministic extractor pipeline and commands.
3. Proposal queue + review flow.
4. Optional Manticore semantic ranking layer.
5. Backfill job for existing persons with safe rate limits.
## Acceptance Criteria
- Default enrichment path runs with zero OpenAI usage.
- Person updates are traceable to concrete message evidence.
- Config hierarchy behaves predictably (`user > group > global`).
- Operators can switch between proposal and direct write modes per scope.
## Out of Scope
- Cross-user shared person graph.
- Autonomous LLM-generated profile writing as default.