Lightweight containerized prosody tooling + moved auth scripts + xmpp reconnect/auth stabilization
27
artifacts/plans/06-end-to-end-observability.md
Normal file
@@ -0,0 +1,27 @@
# Feature Plan: End-to-End Observability and Traceability

## Goal
Provide trace-level visibility from ingress transport event to UI delivery/ack.

## Why This Fits GIA
- Multi-hop messaging systems require correlation IDs to debug reliably.

## Scope
- Global trace IDs for message lifecycle.
- Structured logs and timeline diagnostics view.
- Basic metrics and SLA dashboards.

## Implementation
1. Inject `trace_id` at ingress/send initiation.
2. Propagate through router, persistence, websocket, command/task flows.
3. Standardize structured log schema across services.
4. Add timeline diagnostics page by trace ID and session.
5. Add core metrics: ingress latency, send latency, drop rate, retry counts.
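The first three steps above can be sketched as follows. This is a minimal illustration, not the project's implementation: the helper names (`start_trace`, `log_event`, `handle_ingress`), the record keys (`ts`, `stage`, `event`), and the use of `contextvars` for propagation are all assumptions made for the example.

```python
import contextvars
import json
import time
import uuid

# Hypothetical context variable carrying the trace ID for the current message flow.
_trace_id = contextvars.ContextVar("trace_id", default=None)


def start_trace():
    """Step 1: mint a trace ID at ingress or send initiation."""
    trace_id = uuid.uuid4().hex
    _trace_id.set(trace_id)
    return trace_id


def log_event(stage, event, **fields):
    """Step 3: emit one structured record with a fixed set of core keys."""
    record = {
        "ts": time.time(),
        "trace_id": _trace_id.get(),
        "stage": stage,
        "event": event,
        **fields,
    }
    print(json.dumps(record, sort_keys=True))
    return record


def handle_ingress(payload):
    """Step 2: every hop logs under the same trace ID (stage names are illustrative)."""
    trace_id = start_trace()
    records = [
        log_event("ingress", "received", size=len(payload)),
        log_event("router", "routed"),
        log_event("persistence", "stored"),
        log_event("websocket", "delivered"),
    ]
    return trace_id, records
```

Calling `handle_ingress("hello")` prints four JSON lines that share one `trace_id`. Where a hop crosses a process or queue boundary, the ID would presumably have to travel in the message envelope instead, since a context variable only covers in-process propagation.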

## Acceptance Criteria
- One trace ID can reconstruct full message path.
- At least 95% of critical paths emit structured trace logs.
- Operators can isolate bottleneck stage in under 2 minutes.
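One way these criteria could be checked against collected logs is sketched below. It assumes each record is a dict with `trace_id`, `stage`, and `ts` keys, which is a hypothetical schema, and it attributes the time between two consecutive records to the later stage, which is a simplification. The function names are illustrative only.

```python
def build_timeline(records, trace_id):
    """Return the records for one trace, ordered by timestamp."""
    return sorted(
        (r for r in records if r.get("trace_id") == trace_id),
        key=lambda r: r["ts"],
    )


def slowest_stage(timeline):
    """Return (stage, seconds) for the largest gap between consecutive records."""
    gaps = [
        (later["stage"], later["ts"] - earlier["ts"])
        for earlier, later in zip(timeline, timeline[1:])
    ]
    return max(gaps, key=lambda g: g[1]) if gaps else None


def coverage(critical_paths, traced_paths):
    """Fraction of critical paths that emitted at least one trace log."""
    critical = set(critical_paths)
    if not critical:
        return 1.0
    return len(critical & set(traced_paths)) / len(critical)
```

A timeline view built on `build_timeline` would address the first criterion, `coverage(...) >= 0.95` the second, and `slowest_stage` gives operators a starting point for the third.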

## Out of Scope
- Full distributed tracing vendor integration.
|
||||