Increase platform abstraction cohesion
95
artifacts/plans/17-person-enrichment-without-llm.md
Normal file
@@ -0,0 +1,95 @@
# Feature Plan: Person Model Enrichment (Non-LLM First)

## Goal
Populate `Person` fields from existing message history without spending OpenAI tokens by default:
- `summary`
- `profile`
- `revealed`
- `likes`
- `dislikes`
- `sentiment`
- `timezone`
- `last_interaction`

## Problem We Are Solving
- We have high-volume message data but limited durable person intelligence.
- LLM analysis is expensive for continuous/background processing.
- We need fast, deterministic extraction first, with optional semantic ranking.

## Design Decisions
1. Config scope:
   - global defaults
   - optional group-level overrides
   - per-user overrides
2. Resolution order:
   - `user > group > global`
3. Global toggle:
   - hard kill-switch (`PERSON_ENRICHMENT_ENABLED`)
4. Per-user/group controls:
   - enable/disable enrichment
   - write mode (`proposal_required` or `direct`)
   - confidence threshold
   - max messages scanned per run
   - semantic-ranking toggle
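
The resolution order and kill-switch described above could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: `EnrichmentOverrides`, `GLOBAL_DEFAULTS`, `resolve_settings`, and every default value are hypothetical names and placeholders. The only things taken from the plan are the controls themselves, the `user > group > global` order, and the idea that `PERSON_ENRICHMENT_ENABLED` overrides everything when off.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class EnrichmentOverrides:
    """Settings for one scope. None means "inherit from the next scope"."""
    enabled: Optional[bool] = None
    write_mode: Optional[str] = None  # "proposal_required" or "direct"
    confidence_threshold: Optional[float] = None
    max_messages_per_run: Optional[int] = None
    semantic_ranking: Optional[bool] = None


# Hypothetical global defaults; the concrete values are placeholders.
GLOBAL_DEFAULTS = EnrichmentOverrides(
    enabled=True,
    write_mode="proposal_required",
    confidence_threshold=0.8,
    max_messages_per_run=500,
    semantic_ranking=False,
)


def resolve_settings(kill_switch_enabled, user=None, group=None,
                     defaults=GLOBAL_DEFAULTS):
    """Resolve each field independently in the order user > group > global.

    kill_switch_enabled stands in for PERSON_ENRICHMENT_ENABLED; when it is
    False, enrichment is disabled regardless of any per-scope override.
    """
    resolved = {}
    for f in fields(EnrichmentOverrides):
        for scope in (user, group, defaults):
            value = getattr(scope, f.name) if scope is not None else None
            if value is not None:
                resolved[f.name] = value
                break
    settings = EnrichmentOverrides(**resolved)
    if not kill_switch_enabled:
        settings = replace(settings, enabled=False)
    return settings
```

Resolving field by field, rather than picking one whole scope, lets a user override only the write mode while still inheriting the group's confidence threshold. Whether the real settings model should behave this way is an open design choice.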

## Proposed Data Additions
- `PersonEnrichmentSettings`:
  - scope fields (`user`, optional `group`)
  - toggle/threshold/runtime limits
- `PersonSignal`:
  - normalized extracted clue
  - source references (message ids/events)
  - confidence and detector name
- `PersonUpdateProposal`:
  - pending/approved/rejected person field updates
  - reason and provenance
- Optional `PersonFieldRevision`:
  - before/after snapshots for auditability
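
To make the shape of these records more concrete, here is one possible sketch using plain dataclasses. The model names come from the list above; every field name, type, and the `provenance()` helper are assumptions made for illustration, and the real models would presumably live in the project's ORM rather than in dataclasses.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional, Tuple


class ProposalStatus(str, Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass(frozen=True)
class PersonSignal:
    """One normalized clue extracted from message history (fields assumed)."""
    person_id: int
    field_name: str                       # e.g. "likes", "timezone"
    value: str                            # normalized extracted clue
    confidence: float
    detector: str                         # name of the rule that produced it
    source_message_ids: Tuple[int, ...]   # source references
    observed_at: datetime


@dataclass
class PersonUpdateProposal:
    """A pending/approved/rejected update to a single person field."""
    person_id: int
    field_name: str
    proposed_value: str
    reason: str
    signals: List[PersonSignal]
    status: ProposalStatus = ProposalStatus.PENDING

    def provenance(self) -> List[int]:
        """Sorted, de-duplicated message ids backing this proposal."""
        return sorted({mid for s in self.signals for mid in s.source_message_ids})


@dataclass(frozen=True)
class PersonFieldRevision:
    """Optional before/after snapshot kept for auditability."""
    person_id: int
    field_name: str
    before: Optional[str]
    after: str
    reason: str
```

Keeping the signals attached to the proposal means a reviewer can trace a proposed value back to specific messages without a separate lookup.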

## Processing Flow
1. Select message window:
   - recent inbound/outbound messages per person/service
   - bounded by configurable caps
2. Fast extraction:
   - deterministic rules/regex for:
     - timezone cues
     - explicit likes/dislikes
     - self-revealed facts
     - interaction-derived sentiment hints
3. Semantic ranking (optional):
   - use Manticore-backed similarity search for classifier labels
   - rank candidate signals; do not call OpenAI in default path
4. Signal aggregation:
   - merge repeated evidence
   - decay stale evidence
   - detect contradictions
5. Apply update:
   - `proposal_required`: create `PersonUpdateProposal`
   - `direct`: write only above confidence threshold and with no conflict
6. Persist audit trail:
   - record detector/classifier source and exact message provenance
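
Steps 2, 4, and 5 above can be illustrated with a small self-contained sketch. Everything here is hypothetical: the three regex detectors, the confidence formula, and the function names are placeholders chosen to show the flow, not the detectors the project will ship. Message selection, semantic ranking, evidence decay, and persistence are left out, and treating the confidence threshold as inclusive is an assumption.

```python
import re
from collections import defaultdict

# Hypothetical detectors: (detector name, target field, pattern, normalizer).
DETECTORS = [
    ("explicit_like", "likes",
     re.compile(r"\bI (?:really )?(?:love|like|enjoy) (\w+)", re.I), str.lower),
    ("explicit_dislike", "dislikes",
     re.compile(r"\bI (?:really )?(?:hate|dislike|can't stand) (\w+)", re.I), str.lower),
    ("utc_offset", "timezone",
     re.compile(r"\b((?:UTC|GMT)[+-]\d{1,2})\b", re.I), str.upper),
]


def extract_signals(messages):
    """Step 2: run every detector over (message_id, text) pairs."""
    for message_id, text in messages:
        for detector, field_name, pattern, normalize in DETECTORS:
            for match in pattern.finditer(text):
                yield {"field": field_name, "value": normalize(match.group(1)),
                       "detector": detector, "message_id": message_id}


def aggregate(signals):
    """Step 4: merge repeated evidence and flag contradictions."""
    evidence = defaultdict(set)  # (field, value) -> supporting message ids
    for s in signals:
        evidence[(s["field"], s["value"])].add(s["message_id"])
    liked = {v for f, v in evidence if f == "likes"}
    disliked = {v for f, v in evidence if f == "dislikes"}
    timezones = {v for f, v in evidence if f == "timezone"}
    candidates = []
    for (field_name, value), ids in sorted(evidence.items()):
        conflict = (value in liked and value in disliked) or (
            field_name == "timezone" and len(timezones) > 1)
        candidates.append({
            "field": field_name, "value": value, "conflict": conflict,
            "message_ids": sorted(ids),
            # Placeholder scoring: more independent messages, more confidence.
            "confidence": min(1.0, 0.5 + 0.2 * len(ids)),
        })
    return candidates


def plan_write(candidate, write_mode, threshold):
    """Step 5: decide what to do with one aggregated candidate."""
    if write_mode == "proposal_required":
        return "proposal"
    if candidate["confidence"] >= threshold and not candidate["conflict"]:
        return "write"
    return "skip"
```

Each candidate keeps its `message_ids`, so whatever is written or proposed in step 5 can be tied back to exact messages in step 6. What `direct` mode should do with a candidate that fails the check (drop it, or fall back to a proposal) is not stated in the plan; the sketch simply returns `"skip"`.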

## Field-Specific Policy
- `summary`/`profile`: generated from stable high-confidence aggregates only.
- `revealed`: only explicit self-disclosures.
- `likes`/`dislikes`: require an explicit statement or a repeated pattern.
- `sentiment`: rolling value with recency decay; never treated as an absolute truth label.
- `timezone`: explicit declaration preferred; behavioral inference secondary.
- `last_interaction`: deterministic, taken from the most recent message timestamp.
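
As one way to read the `sentiment` and `last_interaction` policies, the sketch below computes a recency-weighted mean with exponential decay and picks the latest timestamp. The half-life form of the decay, the 30-day default, the `[-1, 1]` score range, and both function names are assumptions; the plan only specifies "rolling value with recency decay".

```python
from datetime import datetime, timedelta, timezone


def rolling_sentiment(observations, now, half_life_days=30.0):
    """Recency-weighted mean of (timestamp, score) pairs, or None if empty.

    Each observation's weight halves every half_life_days, so old hints
    fade gradually instead of being dropped at a hard cutoff.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for observed_at, score in observations:
        age_days = max(0.0, (now - observed_at).total_seconds() / 86400.0)
        weight = 0.5 ** (age_days / half_life_days)
        weighted_sum += weight * score
        weight_total += weight
    return weighted_sum / weight_total if weight_total else None


def last_interaction(message_timestamps):
    """Most recent message timestamp, or None when there are no messages."""
    return max(message_timestamps, default=None)
```

For example, a score of `1.0` observed now and a score of `-1.0` observed one half-life ago get weights `1` and `0.5`, giving `(1 - 0.5) / 1.5 = 1/3`: the recent observation dominates, but the older one still pulls the value down.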

## Rollout
1. Schema and settings models.
2. Deterministic extractor pipeline and commands.
3. Proposal queue + review flow.
4. Optional Manticore semantic ranking layer.
5. Backfill job for existing persons with safe rate limits.

## Acceptance Criteria
- Default enrichment path runs with zero OpenAI usage.
- Person updates are traceable to concrete message evidence.
- Config hierarchy behaves predictably (`user > group > global`).
- Operators can switch between proposal and direct write modes per scope.

## Out of Scope
- Cross-user shared person graph.
- Autonomous LLM-generated profile writing as default.