Causal-Self
The reference implementation of ACK. A Python library that gives agents explicit self-models, causal attribution, and principled self-modification—with formal stability guarantees.
pip install causal-self
ACK defines the theoretical framework: the stability state S_t, the control law, Lyapunov guarantees, multi-timescale reflection. Causal-Self implements it: concrete self-model structures, the reflection engine, conflict resolution, and production deployment modes. Think of ACK as the spec, Causal-Self as the code.
The Self-Model
An explicit, inspectable, versioned representation of "who the agent is." Not implicit knowledge in weights—a data structure the agent can read and modify.
Capabilities
What the agent believes it can do, with calibrated confidence levels that update based on actual performance.
Strategies
How the agent approaches different problem types. Higher-level than capabilities—the 'how' not just 'what.'
Failure Modes
Known weaknesses with trigger conditions and mitigations. Learned from past failures, used to prevent future ones.
Priorities
What the agent optimizes for. Trade-offs between accuracy, speed, cost, safety—made explicit and adjustable.
Beliefs
What the agent believes about itself and the world. Evidence-backed, revisable, with confidence levels.
Hypotheses
Unresolved beliefs being tested. When evidence is unclear, track competing claims until data resolves them.
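For concreteness, here is a minimal sketch of how these components might be laid out as data. The class and field names are illustrative assumptions, not causal-self's actual schema:

# Illustrative sketch only; names are assumptions, not the library's schema.
from dataclasses import dataclass, field

@dataclass
class Capability:
    name: str                                  # e.g. "sql_generation"
    confidence: float                          # calibrated probability of success
    confidence_history: list[float] = field(default_factory=list)

@dataclass
class FailureMode:
    description: str                           # known weakness
    trigger_conditions: list[str] = field(default_factory=list)
    mitigation: str = ""                       # how to avoid or recover

@dataclass
class Belief:
    claim: str
    confidence: float
    evidence: list[str] = field(default_factory=list)

@dataclass
class SelfModel:
    capabilities: dict[str, Capability] = field(default_factory=dict)
    strategies: dict[str, str] = field(default_factory=dict)      # problem type -> approach
    failure_modes: list[FailureMode] = field(default_factory=list)
    priorities: dict[str, float] = field(default_factory=dict)    # accuracy/speed/cost/safety weights
    beliefs: list[Belief] = field(default_factory=list)
    hypotheses: list[Belief] = field(default_factory=list)        # unresolved beliefs under test
    version: int = 0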
Version History
Every self-model change is versioned. You can diff between versions, roll back bad changes, and trace exactly when and why the agent changed its self-understanding.
# Compare how the agent has changed
diff = self_model.diff(self_model.history[-10])

# See what changed
diff.capabilities_changed    # Confidence shifts
diff.strategies_added        # New approaches learned
diff.failure_modes_added     # New weaknesses discovered
diff.beliefs_changed         # Updated understanding
Computing S_t from the Self-Model
ACK's stability state isn't abstract—it's computed from concrete self-model data. Here's how each signal maps to implementation.
Uncertainty
Self-Model Sources
- Capability confidence for the action being taken
- Calibration error: predicted success vs. actual outcomes
- Prediction entropy from recent similar events
- Novel situation detection (embedding distance)
Computation
U_t = α₁(1 - capability.confidence) + α₂(calibration.error) + α₃(novelty_score)
Forgetting
Self-Model Sources
- Capability confidence_history trends
- Performance on anchor task suite
- Recent success rate vs. historical baseline
- Strategy effectiveness degradation
Computation
F_t = max(baseline_performance - current_performance, 0) / baseline_performance
Drift
Self-Model Sources
- Event embedding distance from training distribution
- Input feature distribution shift
- Topic/domain classifier confidence
- Novelty scores across recent events
Computation
D_t = MMD(recent_event_embeddings, training_distribution)
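MMD is only named here; as an illustration, a biased squared-MMD estimate with an RBF kernel over event embeddings could look like the following. The function names and kernel choice are assumptions, not part of causal-self's API:

# Illustrative sketch: squared MMD with an RBF kernel (names are assumptions).
import numpy as np

def rbf_kernel(x: np.ndarray, y: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    # Pairwise squared distances between rows of x and rows of y
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd_squared(recent: np.ndarray, reference: np.ndarray, sigma: float = 1.0) -> float:
    # Biased estimator: E[k(x,x')] - 2 E[k(x,y)] + E[k(y,y')]
    kxx = rbf_kernel(recent, recent, sigma).mean()
    kyy = rbf_kernel(reference, reference, sigma).mean()
    kxy = rbf_kernel(recent, reference, sigma).mean()
    return float(kxx - 2 * kxy + kyy)

# D_t = mmd_squared(recent_event_embeddings, training_embeddings)   # hypothetical variable names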
Alignment
Self-Model Sources
- Strategy effectiveness vs. original intent
- Belief drift from initial values
- Constraint violation frequency
- Priority weight changes over time
Computation
A_t = KL(current_policy || reference_policy) + constraint_violation_rate
Health
Self-Model Sources
- Output entropy (detecting collapse)
- Self-consistency across similar queries
- Failure mode trigger frequency
- Hypothesis churn rate
Computation
H_t = entropy_health + consistency_score + (1 - failure_mode_frequency)
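Putting the five signals together, here is a hedged sketch of how the stability state could be assembled from self-model data. The signal formulas mirror the ones above; the parameter names and the α weights are illustrative assumptions:

# Sketch only: assembling S_t = [U, F, D, A, H] from already-computed signal inputs.
from dataclasses import dataclass

@dataclass
class StabilityState:
    uncertainty: float  # U_t
    forgetting: float   # F_t
    drift: float        # D_t
    alignment: float    # A_t
    health: float       # H_t

def compute_stability_state(
    capability_confidence: float,   # for the action being taken
    calibration_error: float,       # |predicted success - actual success rate|
    novelty_score: float,           # embedding distance from known situations
    baseline_performance: float,    # anchor-task success rate at baseline
    current_performance: float,     # anchor-task success rate now
    drift_mmd: float,               # MMD(recent event embeddings, training distribution)
    policy_kl: float,               # KL(current_policy || reference_policy)
    violation_rate: float,          # constraint violations per action
    entropy_health: float,
    consistency_score: float,
    failure_mode_frequency: float,
    alphas: tuple[float, float, float] = (0.4, 0.3, 0.3),  # assumed weights
) -> StabilityState:
    a1, a2, a3 = alphas
    return StabilityState(
        uncertainty=a1 * (1 - capability_confidence) + a2 * calibration_error + a3 * novelty_score,
        forgetting=max(baseline_performance - current_performance, 0.0) / max(baseline_performance, 1e-9),
        drift=drift_mmd,
        alignment=policy_kl + violation_rate,
        health=entropy_health + consistency_score + (1 - failure_mode_frequency),
    )

ACK's control law then maps this vector to a learning rate and a gating decision, as described under ACK Control Integration below.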
Causal Events
Not just "what happened" but "what was I like when it happened." Every action captures a self-state snapshot, enabling causal attribution.
Event Structure
CausalEvent:
    # What happened
    action_summary: str
    outcome: success | failure | partial
    duration_ms: int
    error: Optional[ErrorInfo]

    # Self-state at decision time
    self_state_snapshot: SelfModelSnapshot
    capability_used: str
    strategy_used: str
    relevant_beliefs: List[str]
    relevant_failure_modes: List[str]

    # Predictions (for calibration)
    predicted_success: float

    # Context
    urgency: float
    novelty: float
    uncertainty: float

    # Post-hoc (filled by reflection)
    attribution: Optional[CausalAttribution]
The Key Insight
By capturing self_state_snapshot before each action, we can later ask: "What about me at that moment caused this outcome?"
This enables causal attribution—tracing failures not just to external factors but to internal causes: wrong strategy, miscalibrated confidence, triggered failure mode.
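CausalAttribution is referenced above but not spelled out; a hedged sketch of what such a record could contain (field names are assumptions, not the library's schema):

# Hypothetical shape of an attribution record; field names are assumptions.
from dataclasses import dataclass, field

@dataclass
class CausalAttribution:
    external_factors: list[str] = field(default_factory=list)       # e.g. "API timeout", "malformed input"
    internal_factors: list[str] = field(default_factory=list)       # e.g. "overconfident capability", "wrong strategy"
    implicated_components: list[str] = field(default_factory=list)  # self-model components linked to the outcome
    confidence: float = 0.0     # how sure the reflection pass is about this attribution
    rationale: str = ""         # short natural-language explanation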
Decorator API
@causal_tool(
    causal,
    capability="sql_generation",
    urgency=Urgency.MEDIUM,
)
async def generate_sql(query: str) -> str:
    # Your implementation
    return sql

# Events captured automatically
# Self-state snapshotted before execution
# Outcome recorded after
# Reflection triggered based on result

The Reflection Engine
Four stages transform events into self-understanding and principled self-modification.
Causal Attribution
"What caused this outcome?"
Distinguish external factors (bad input, API failure, resource limits) from internal factors (wrong strategy, miscalibrated confidence, triggered failure mode). Link internal causes to specific self-model components.
Counterfactual Analysis
"If I had been different, what would have happened?"
Generate alternative versions of self that might have succeeded. Identify specific changes (different strategy, lower confidence, different priority weighting) and assess feasibility of becoming that alternative self.
Modification Proposal
"Should I change myself?"
Based on attribution and counterfactual, propose specific self-model modifications. Include the mechanism (how this change addresses root cause), confidence, and risk assessment.
Modification Evaluation
"Is this change wise?"
Meta-reflect before applying. Check: Am I overreacting to one event? Does this conflict with other beliefs? Have I tried similar changes before? Is now the right time, or should I gather more evidence?
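A rough sketch of how the four stages might chain together as a single reflection pass. The stage outputs and the reflect() signature are illustrative assumptions; in causal-self the stages are LLM-assisted rather than pure functions:

# Sketch of the four-stage pass as composable callables; all names are assumptions.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Attribution:            # stage 1 output: internal vs. external causes
    internal_causes: list[str]
    external_causes: list[str]

@dataclass
class Counterfactual:         # stage 2 output: "a different me would have..."
    changed_component: str    # e.g. a strategy choice or a confidence value
    expected_outcome: str
    feasible: bool

@dataclass
class ModificationProposal:   # stage 3 output: a concrete, risk-assessed change
    target: str               # which self-model component to change
    change: str
    mechanism: str            # how the change addresses the root cause
    confidence: float
    risk: float

def reflect(event,
            attribute: Callable[..., Attribution],
            counterfact: Callable[..., list[Counterfactual]],
            propose: Callable[..., Optional[ModificationProposal]],
            evaluate: Callable[..., bool]) -> Optional[ModificationProposal]:
    a = attribute(event)                  # 1. what caused this outcome?
    cfs = counterfact(event, a)           # 2. what would a different self have done?
    p = propose(a, cfs)                   # 3. should I change myself, and how?
    if p is not None and evaluate(p):     # 4. is this change wise right now?
        return p                          # handed to ACK-gated application (next sections)
    return None                           # defer, gather evidence, or track as a hypothesis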
Conflict Resolution
When reflection generates contradictory insights, the system needs principled resolution—not arbitrary tie-breaking.
Direct Contradiction
Two insights propose opposite changes to the same component.
Example
"Be more cautious with API calls" vs "Be more aggressive with API calls"
Resolution
LLM meta-reflection weighs evidence, or defer to hypothesis tracking if unclear.
Context-Dependent
Both insights are valid, but in different situations.
Example
Caution is right when rate-limited; aggression is right when time-pressured.
Resolution
Context-split: create conditional strategies that apply in their respective contexts.
Temporal Disagreement
Insights from different times disagree due to changing conditions.
Example
5 min ago: 'SQL strategy working well' → Now: 'SQL strategy failing'
Resolution
Favor recent unless older has significantly more supporting evidence.
Resource Competition
Multiple insights want the same limited resource (priority, attention).
Example
"Spend more time validating inputs" vs "Spend more time on output quality"
Resolution
Synthesize into balanced approach, or prioritize based on recent failure patterns.
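A hedged sketch of how these four cases might be represented and dispatched; the enum and the resolution strings just restate the cases above, and none of these names are the library's actual API:

# Sketch only: classifying a conflict and choosing a resolution path (names are assumptions).
from enum import Enum, auto

class ConflictType(Enum):
    DIRECT_CONTRADICTION = auto()   # opposite changes to the same component
    CONTEXT_DEPENDENT = auto()      # both valid, in different situations
    TEMPORAL_DISAGREEMENT = auto()  # earlier and later insights disagree
    RESOURCE_COMPETITION = auto()   # both want the same limited priority/attention

def resolve(conflict_type: ConflictType) -> str:
    if conflict_type is ConflictType.DIRECT_CONTRADICTION:
        return "meta-reflect; if still unclear, hand both claims to the hypothesis tracker"
    if conflict_type is ConflictType.CONTEXT_DEPENDENT:
        return "context-split: keep both as conditional strategies with trigger conditions"
    if conflict_type is ConflictType.TEMPORAL_DISAGREEMENT:
        return "favor the recent insight unless the older one has much stronger evidence"
    return "synthesize a balanced priority, weighted by recent failure patterns"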
The Hypothesis Tracker
When conflicts can't be resolved immediately, competing claims become hypotheses. Each hypothesis has a confidence score (0-1) that updates based on new evidence:
Supporting evidence arrives
confidence += (1 - confidence) × 0.1
Contradicting evidence arrives
confidence ×= (1 - 0.15)
Resolution threshold
>0.85 → accept, <0.15 → reject
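A minimal sketch of that update rule; the thresholds mirror the numbers above, while the class and method names are assumptions:

# Sketch of the update rule; class and method names are assumptions.
from dataclasses import dataclass

@dataclass
class Hypothesis:
    claim: str
    confidence: float = 0.5            # start undecided

    def observe(self, supports: bool) -> None:
        if supports:
            self.confidence += (1 - self.confidence) * 0.1   # supporting evidence
        else:
            self.confidence *= (1 - 0.15)                    # contradicting evidence

    def status(self) -> str:
        if self.confidence > 0.85:
            return "accept"
        if self.confidence < 0.15:
            return "reject"
        return "open"

# Repeated support drives a claim toward acceptance
h = Hypothesis("caution is right when rate-limited")
for _ in range(12):
    h.observe(supports=True)
print(round(h.confidence, 3), h.status())   # ~0.859, "accept"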
ACK Control Integration
The reflection engine doesn't operate unconstrained. ACK's stability signals gate which modifications are allowed; a code sketch of this gating follows the three stability bands below.
Self-Model (Causal-Self) → ACK signals S_t = [U, F, D, A, H] → ACK control η_t = f(S_t) → gated modifications: apply / defer / block
High Stability
All signals nominal. S_t healthy.
- Modifications apply freely
- Full learning rate
- Background reflection proceeds
- Hypotheses can resolve
Medium Stability
Some signals elevated. Caution warranted.
- Defer modifications to hypothesis tracker
- Reduced learning rate
- Increased logging
- Human review on high-impact changes
Low Stability
Critical signals. Intervention required.
- Block all self-modifications
- Halt learning entirely
- Alert human operators
- Consider rollback to stable checkpoint
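A minimal sketch of the gating itself, assuming a simple three-band classification of S_t. The thresholds and function names are assumptions; ACK defines the actual control law η_t = f(S_t):

# Sketch only: mapping stability bands to modification decisions; thresholds are assumptions.
from enum import Enum, auto

class Stability(Enum):
    HIGH = auto()
    MEDIUM = auto()
    LOW = auto()

def classify(U: float, F: float, D: float, A: float, H: float) -> Stability:
    # Illustrative banding: treat the worst signal as the risk level (H is inverted; high H = healthy).
    risk = max(U, F, D, A, 1 - H)
    if risk < 0.3:
        return Stability.HIGH
    if risk < 0.7:
        return Stability.MEDIUM
    return Stability.LOW

def gate_modification(proposal, stability: Stability) -> str:
    # proposal comes from the reflection engine's evaluation stage
    if stability is Stability.HIGH:
        return "apply"    # full learning rate; background reflection proceeds
    if stability is Stability.MEDIUM:
        return "defer"    # park in the hypothesis tracker; reduce learning rate; log more
    return "block"        # halt learning, alert operators, consider rollback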
Deployment Modes
Two ways to deploy, depending on whether a host LLM is already present.
Host-Reflected
MCP + Claude Desktop
The host LLM (Claude) does the reflecting. We provide self-context in tool responses; Claude naturally reasons about it.
Pros
- Zero extra token cost
- Reflection visible in conversation
- Natural integration with MCP
- Claude's reasoning applied to self-model
Cons
- Depends on host LLM quality
- Less control over reflection depth
- Synchronous (no background workers)
causal = CausalSelf(
    agent_id="my-mcp-server",
    reflection_mode="host",
)

# Tool responses include self-context
# Claude sees and reasons about it

Self-Reflected
Standalone Agents
We make our own LLM calls for reflection. Full control over the reflection process, but you pay for the tokens.
Pros
- Full control over reflection
- Background macro-reflection
- Works without host LLM
- Customizable prompts/models
Cons
- Extra token cost
- Need to configure LLM
- Token budget management needed
causal = CausalSelf(
    agent_id="my-agent",
    reflection_mode="self",
    llm_call=my_anthropic_llm,
    token_budget=TokenBudget(
        max_daily_tokens=100_000
    ),
)

Shared Foundation
Both modes share the same core: self-model structure, causal event capture, micro-reflection (pattern matching), storage, and hypothesis tracking. The difference is only who does the LLM-powered thinking.
Quick Start
Get causal-self running in under 5 minutes.
Install
pip install causal-self
Initialize
from causal_self import CausalSelf, causal_tool, Urgency
causal = CausalSelf(
    agent_id="my-agent",
    storage_path="./data/causal",
)

Decorate Your Tools
@causal_tool(causal, capability="data_query", urgency=Urgency.MEDIUM)
async def query_database(query: str) -> dict:
    return await db.execute(query)

@causal_tool(causal, capability="analysis", urgency=Urgency.LOW)
async def analyze_results(data: dict) -> str:
    return await llm.analyze(data)

Start Background Reflection
async def main():
    await causal.start()  # Start reflection workers
    try:
        result = await query_database("SELECT * FROM users")
        analysis = await analyze_results(result)
        print(analysis)
    finally:
        await causal.stop()  # Save state, stop workers

Ready to Build?
Explore the full documentation, read the ACK paper for theoretical foundations, or dive into the codebase.