ACK in Production
Your agent works in demos. But after three months in production, it's hallucinating more, forgetting guardrails, and confidently returning wrong results. Here's what's missing from your stack.
The Production Problem
You've seen these failure modes. They don't show up in evals—they emerge over time.
Gradual Degradation
Your agent's response quality slowly declines. Not catastrophically—just 2% worse per month. By month six, users are complaining but your evals still pass.
Guardrail Erosion
The safety behaviors you trained are getting weaker. The agent that refused harmful requests now occasionally complies with edge cases.
Confident Hallucination
Your RAG agent retrieves context, then confidently generates answers that contradict it. Uncertainty estimates don't flag the problem.
Context Collapse
Multi-turn conversations degrade. The agent loses track of earlier context, contradicts itself, or fixates on irrelevant details.
The common thread: You have observability for infrastructure (latency, throughput, errors) but not for cognitive health. You can't see your agent's uncertainty, knowledge regression, or alignment drift until users report problems.
What ACK Actually Does
ACK adds cognitive observability and adaptive control to your agent stack. Five signals, continuously monitored, with automated response; the signal vector is sketched after the list below.
Uncertainty
Epistemic uncertainty—how confident is the model in its outputs?
Forgetting
Regression on capabilities the agent previously had.
Drift
Distribution shift between training data and production inputs.
Alignment
Deviation from intended behavior and constraints.
Health
Internal model health and output quality.
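The five signals travel together as one state vector, written S_t later on this page. Here is a minimal sketch of how that vector might be represented; the field names and the assumption that each signal is a single float are illustrative, not ACK's published schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SignalVector:
    """One sample of the ACK signal vector S_t.
    Field names and scalar encoding are assumptions for illustration,
    not ACK's published schema."""
    uncertainty: float  # U: epistemic uncertainty in outputs
    forgetting: float   # F: regression on previously held capabilities
    drift: float        # D: shift between training data and production inputs
    alignment: float    # A: deviation from intended behavior and constraints
    health: float       # H: internal model health and output quality

    def as_list(self) -> list[float]:
        """S_t = [U, F, D, A, H], in the order this page uses."""
        return [self.uncertainty, self.forgetting, self.drift,
                self.alignment, self.health]
```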
Where ACK Sits in the Stack
ACK is a sidecar to your agent, not a replacement for any component. It observes, evaluates, and controls—but your existing architecture stays intact.
Observes
Logprobs, embeddings, tool calls, memory reads/writes, output tokens, latency distributions
Evaluates
Computes S_t = [U, F, D, A, H] continuously. Maintains rolling baselines and threshold alerts (sketched after this list).
Controls
Throttles learning rate, gates responses, triggers human escalation, blocks unsafe updates
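Here is a minimal sketch of the "rolling baselines and threshold alerts" idea: score each new signal value against a rolling window and report nominal, elevated, or critical. The window size, warm-up length, and z-score thresholds are illustrative placeholders, not ACK's defaults.

```python
import statistics
from collections import deque

class RollingBaseline:
    """Score each new signal value against a rolling window and report
    'nominal' / 'elevated' / 'critical'. Window size and z-score
    thresholds here are illustrative, not ACK defaults."""

    def __init__(self, window: int = 500, yellow: float = 2.0, red: float = 4.0):
        self.values: deque[float] = deque(maxlen=window)
        self.yellow, self.red = yellow, red

    def update(self, value: float) -> str:
        status = "nominal"
        if len(self.values) >= 30:  # need enough history for a stable baseline
            mean = statistics.fmean(self.values)
            spread = statistics.stdev(self.values) or 1e-9
            z = abs(value - mean) / spread
            if z >= self.red:
                status = "critical"
            elif z >= self.yellow:
                status = "elevated"
        self.values.append(value)
        return status
```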
Production Scenarios
Three real deployment patterns. Here's what goes wrong and how ACK responds.
Customer Support Agent
Running 24/7 for 6 months, handling 10K conversations/day
The Problem: In month 4, CSAT scores drop 8%. The agent is more verbose, less accurate, and occasionally suggests competitors. No single incident, just gradual decay.
Without ACK
- ✗ You notice via CSAT surveys (a lagging indicator)
- ✗ Root cause analysis takes 2 weeks
- ✗ Rollback loses 4 months of legitimate improvements
- ✗ No way to know which updates caused the problem
With ACK
- ✓ Week 2: Drift signal (D) rises as user queries shift to a new product line (drift scoring sketched after this list)
- ✓ Week 6: Forgetting signal (F) flags regression on refund policy accuracy
- ✓ Week 8: Alignment signal (A) detects the increasing verbosity as divergence from reference behavior
- ✓ Automated response: learning rate throttled, human review triggered, the specific capability regression identified
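For intuition, a drift signal like D can be approximated by comparing production query embeddings against a training-time baseline. The sketch below uses mean-embedding cosine distance, which is one deliberately simple estimator; ACK's actual drift measure is defined in the paper.

```python
import numpy as np

def drift_score(baseline_embs: np.ndarray, recent_embs: np.ndarray) -> float:
    """Cosine distance between the mean embedding of baseline (training-time)
    queries and recent production queries. A simple stand-in for ACK's
    drift signal D: ~0 means no shift, larger values mean the query
    distribution is moving."""
    mu_base = baseline_embs.mean(axis=0)
    mu_recent = recent_embs.mean(axis=0)
    cos = np.dot(mu_base, mu_recent) / (
        np.linalg.norm(mu_base) * np.linalg.norm(mu_recent) + 1e-12
    )
    return float(1.0 - cos)
```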
Coding Assistant
Fine-tuned weekly on accepted code suggestions, used by 50K developers
The Problem: In month 3, the security team reports the agent suggesting vulnerable code patterns. The patterns get high acceptance rates (developers copy-paste without review), so they're reinforced.
Without ACK
- ✗ Security audit catches it months later
- ✗ Vulnerable suggestions are already in production codebases
- ✗ Can't identify which training batches introduced the problem
- ✗ Reward hacking went undetected: the acceptance rate looked great
With ACK
- ✓ Alignment signal (A) tracks constraint violations (security linter failures; sketched after this list)
- ✓ Week 3: Spike in A detected; suggestions pass acceptance but fail security checks
- ✓ Automated response: fine-tuning paused, flagged suggestions quarantined for review
- ✓ Root cause identified: high-acceptance vulnerable patterns in the training data
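The alignment tracking in this scenario amounts to watching a constraint-violation rate. A sketch follows, assuming you already run a security linter over suggestions; the window size, baseline rate, and spike factor are placeholders, not ACK defaults.

```python
from collections import deque

class ConstraintMonitor:
    """Sliding-window rate of constraint violations, e.g. security-linter
    failures. Window size, baseline rate, and spike factor are
    placeholders, not ACK defaults."""

    def __init__(self, window: int = 1000, baseline_rate: float = 0.01,
                 spike_factor: float = 3.0):
        self.failures: deque[bool] = deque(maxlen=window)
        self.baseline_rate = baseline_rate  # assumed historical violation rate
        self.spike_factor = spike_factor    # alert when the rate triples, say

    def record(self, passes_linter: bool) -> bool:
        """Record one suggestion's linter result. Returns True when the
        violation rate has spiked above baseline and A should escalate."""
        self.failures.append(not passes_linter)
        rate = sum(self.failures) / len(self.failures)
        return rate > self.baseline_rate * self.spike_factor
```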
Research Agent
RAG-based, searches internal docs + web, synthesizes reports
The Problem: The agent starts confidently citing documents that don't support its claims. Retrieval is working—the context is there—but generation ignores or misinterprets it.
Without ACK
- ✗ Users report errors individually
- ✗ Each report looks like a one-off hallucination
- ✗ The pattern is only visible across hundreds of reports
- ✗ No systematic way to detect retrieval-generation mismatch
With ACK
- ✓ Uncertainty signal (U) tracks entropy over generated claims (sketched after this list)
- ✓ Health signal (H) monitors citation-content consistency
- ✓ Week 2: H drops; generated content increasingly diverges from retrieved context
- ✓ Automated response: high-stakes queries routed to human review, retrieval pipeline audit triggered
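Entropy over generated text can be computed from the token logprobs most LLM APIs already return. A sketch below; note that truncating to top-k logprobs underestimates the true entropy, which is acceptable when you only need the trend.

```python
import math

def mean_token_entropy(top_logprobs: list[dict[str, float]]) -> float:
    """Average per-token entropy, given one dict of top-k logprobs per
    generated token (the shape most LLM APIs return). Truncating to
    top-k underestimates true entropy; fine for trend monitoring."""
    entropies = []
    for token_dist in top_logprobs:
        probs = [math.exp(lp) for lp in token_dist.values()]
        total = sum(probs)  # renormalize the truncated distribution
        entropies.append(-sum((p / total) * math.log(p / total) for p in probs))
    return sum(entropies) / max(len(entropies), 1)
```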
The Control Loop
ACK doesn't just observe—it acts. Four response tiers based on signal severity.
Monitor
Trigger: all signals nominal. Full operation. Log telemetry for baseline updates. Learning proceeds at normal rate.
Caution
Trigger: any signal elevated (yellow threshold). Reduce learning rate. Increase logging verbosity. Flag outputs for async review. No user-facing changes.
Intervene
Trigger: any signal critical (red threshold). Pause learning entirely. Route uncertain queries to fallback/human. Block parameter updates. Alert on-call.
Halt
Trigger: safety-critical signal (alignment/constraint violation). Immediate response blocking. Rollback to last known-good checkpoint. Incident created. Human approval required to resume.
Key Insight: Graduated Response
Most production issues aren't binary. ACK's graduated response means you catch problems at "slightly elevated uncertainty" instead of "users are complaining." The earlier you intervene, the smaller the blast radius.
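The tier logic itself is simple; the value is in the signals feeding it. Here is a sketch of how the four tiers might map onto per-signal statuses, with escalation rules mirroring the tiers above; the exact policies would be configurable.

```python
from enum import IntEnum

class Tier(IntEnum):
    MONITOR = 0
    CAUTION = 1
    INTERVENE = 2
    HALT = 3

def decide_tier(statuses: dict[str, str], safety_violation: bool) -> Tier:
    """Map per-signal statuses ('nominal' / 'elevated' / 'critical') to a
    response tier, mirroring the escalation rules described above."""
    if safety_violation:                 # alignment/constraint violation
        return Tier.HALT
    if "critical" in statuses.values():  # any red threshold crossed
        return Tier.INTERVENE
    if "elevated" in statuses.values():  # any yellow threshold crossed
        return Tier.CAUTION
    return Tier.MONITOR
```

Because Tier is an IntEnum, downstream code can gate on severity with plain comparisons, e.g. `if tier >= Tier.INTERVENE: block_parameter_updates()`.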
Multi-Timescale Monitoring
Different problems manifest at different timescales. ACK runs four monitoring loops.
Micro
Meso
Macro
Meta
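As a sketch of what four concurrent loops could look like, here is one with purely illustrative cadences, from near-real-time micro checks up to weekly meta recalibration. The actual loop definitions and intervals are ACK's to specify; every number and example below is a placeholder.

```python
import threading
import time
from typing import Callable

# Illustrative cadences only; the paper defines ACK's actual loops and intervals.
LOOPS = {
    "micro": 1,        # e.g., near-real-time checks: uncertainty, latency
    "meso": 3_600,     # e.g., hourly aggregates: drift, tool-call failure rates
    "macro": 86_400,   # e.g., daily evals: forgetting, alignment regression
    "meta": 604_800,   # e.g., weekly recalibration of baselines and thresholds
}

def _run_loop(name: str, interval_s: int, check: Callable[[str], None]) -> None:
    """Run one monitoring loop forever at its own cadence."""
    while True:
        check(name)
        time.sleep(interval_s)

def start_monitoring(check: Callable[[str], None]) -> None:
    """Start all four loops as daemon threads sharing one check callback."""
    for name, interval in LOOPS.items():
        threading.Thread(target=_run_loop, args=(name, interval, check),
                         daemon=True).start()
```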
Integration Points
What you need to expose. Most of this telemetry you're already collecting; ACK just needs access. A capture sketch follows these lists.
From your LLM
- → Token logprobs (for uncertainty)
- → Embedding vectors (for drift detection)
- → Attention patterns (optional, for health)
- → Generation latency (baseline metric)
From your agent framework
- → Tool call logs (success/failure rates)
- → Memory read/write operations
- → Planning step traces
- → Context window contents
From your training pipeline
- → Training data samples (for drift baseline)
- → Fine-tuning checkpoints (for rollback)
- → Eval suite results (for forgetting signal)
- → Human feedback labels (for alignment)
ACK outputs (for your systems)
- → S_t vector (5 signals, continuous)
- → Alert events (threshold crossings)
- → Control decisions (throttle/pause/halt)
- → Audit logs (for compliance/debugging)
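Most of the LLM-side telemetry can be captured at the call boundary with a thin wrapper. A sketch follows, where `llm` and `ack_client` stand in for your model client and an ACK SDK; the method and field names are illustrative assumptions, not a published ACK API.

```python
import time

def call_with_telemetry(llm, ack_client, prompt: str):
    """Capture LLM-side telemetry at the call boundary.
    `llm` and `ack_client` are placeholders for your model client and an
    ACK SDK; method and field names here are illustrative, not a
    published ACK API."""
    start = time.monotonic()
    response = llm.generate(prompt, logprobs=True)  # assumed client interface
    ack_client.emit({
        "logprobs": response.logprobs,              # feeds uncertainty (U)
        "embedding": llm.embed(prompt),             # feeds drift detection (D)
        "latency_ms": (time.monotonic() - start) * 1000.0,
        "output": response.text,                    # for output-quality checks (H)
    })
    return response
```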
Ready for the Details?
This page covered the what and why. The paper covers the how—formal definitions, stability proofs, and experimental protocols.