Uber LangEffect: Agent Side Effect Management
Key Takeaways
- LangEffect: Uber’s internal framework for tracking and reversing agent-caused side effects
- Problem: agents executing multi-step tasks create side effects that are hard to roll back atomically
- Solution: effect log + compensating transactions pattern borrowed from distributed systems
- Saga pattern applied to agent workflows: each action has a compensating undo action
- Production result: 99.2% successful rollback rate for interrupted agent tasks at Uber
Summary
Uber’s infrastructure team developed LangEffect to solve a specific production problem: agents executing complex, multi-step tasks on production systems (rider matching, driver assignment, payment processing) would sometimes fail mid-task, leaving those systems in inconsistent states. Unlike traditional software transactions, agent tasks are long-running and involve external API calls that do not support ACID rollback.
LangEffect adapts the distributed systems Saga pattern for agent workflows. The core idea is that each action an agent takes is registered in an effect log with two entries: the forward action and its compensating transaction (an undo operation). If the agent task fails or is interrupted, LangEffect executes the compensating transactions in reverse order, returning the system to a known-good state.
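The reverse-order compensation described above can be sketched in a few lines of Python. This is a minimal illustration of the general Saga idea, not LangEffect's actual code: the names `Effect`, `run_with_rollback`, and `make_step`, as well as the example step names, are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Effect:
    """One effect-log record: a forward action paired with its undo (hypothetical type)."""
    name: str
    forward: Callable[[], None]      # the action the agent takes
    compensate: Callable[[], None]   # the compensating transaction


def run_with_rollback(effects: List[Effect]) -> bool:
    """Run effects in order; on failure, compensate completed ones in reverse."""
    completed: List[Effect] = []
    for effect in effects:
        try:
            effect.forward()
        except Exception:
            # Undo the most recent completed effect first, then work backwards.
            for done in reversed(completed):
                done.compensate()
            return False
        completed.append(effect)
    return True


# Toy "system state" and a record of which compensations ran, in order.
state: List[str] = []
undone: List[str] = []


def make_step(name: str) -> Effect:
    """Build a step whose forward action adds `name` to state and whose undo removes it."""
    def forward() -> None:
        state.append(name)

    def compensate() -> None:
        state.remove(name)
        undone.append(name)

    return Effect(name, forward, compensate)


def failing_forward() -> None:
    raise RuntimeError("simulated mid-task failure")


steps = [
    make_step("assign_driver"),
    make_step("hold_payment"),
    Effect("notify_rider", failing_forward, lambda: None),
]
ok = run_with_rollback(steps)
```

In this toy run the third step fails, so the two completed steps are compensated starting with the most recent one, and `state` ends up empty again. A real system would also need compensations that are safe to retry, which this sketch does not address.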
Implementation details: agents are instrumented via a middleware layer that intercepts tool calls and registers them in the effect log before execution. Compensating transactions fall into three categories: (1) pre-specified for known operations (for example, a payment is undone by a refund), (2) generated by the LLM for novel operations (with human review for high-value compensations), and (3) flagged as “non-compensable”, requiring human intervention.
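A rough sketch of how such a middleware might classify and log tool calls is shown below. Everything here is an assumption made for illustration: the class `EffectMiddleware`, the `KNOWN_COMPENSATIONS` table, the `propose` callback standing in for the LLM, the `amount` argument, and the `review_threshold` value are hypothetical and do not come from Uber's implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, List, Optional


class CompensationKind(Enum):
    PRE_SPECIFIED = "pre_specified"
    LLM_GENERATED = "llm_generated"
    NON_COMPENSABLE = "non_compensable"


@dataclass
class LogEntry:
    tool: str
    args: Dict[str, Any]
    kind: CompensationKind
    compensation: Optional[str]  # name of the undo tool, if one exists
    needs_human: bool            # True when a person must review or intervene


# Hypothetical table mapping known operations to their undo operations.
KNOWN_COMPENSATIONS: Dict[str, str] = {"charge_payment": "refund_payment"}


class EffectMiddleware:
    def __init__(
        self,
        tools: Dict[str, Callable[..., Any]],
        propose: Callable[[str, Dict[str, Any]], Optional[str]],
        review_threshold: float = 100.0,  # assumed cutoff for "high-value"
    ) -> None:
        self.tools = tools
        self.propose = propose  # stand-in for an LLM call
        self.review_threshold = review_threshold
        self.log: List[LogEntry] = []

    def _classify(self, tool: str, args: Dict[str, Any]) -> LogEntry:
        if tool in KNOWN_COMPENSATIONS:
            return LogEntry(tool, args, CompensationKind.PRE_SPECIFIED,
                            KNOWN_COMPENSATIONS[tool], False)
        proposed = self.propose(tool, args)
        if proposed is not None:
            high_value = args.get("amount", 0) >= self.review_threshold
            return LogEntry(tool, args, CompensationKind.LLM_GENERATED,
                            proposed, high_value)
        return LogEntry(tool, args, CompensationKind.NON_COMPENSABLE, None, True)

    def call(self, tool: str, **args: Any) -> Any:
        # Register the effect before executing the tool call.
        self.log.append(self._classify(tool, args))
        return self.tools[tool](**args)


def fake_llm(tool: str, args: Dict[str, Any]) -> Optional[str]:
    """Toy replacement for an LLM: knows an undo for exactly one novel tool."""
    return "remove_credit" if tool == "issue_credit" else None


tools: Dict[str, Callable[..., Any]] = {
    "charge_payment": lambda amount: f"charged {amount}",
    "issue_credit": lambda amount: f"credited {amount}",
    "send_sms": lambda text: "sent",
}
mw = EffectMiddleware(tools, fake_llm)
mw.call("charge_payment", amount=20.0)
mw.call("issue_credit", amount=500.0)
mw.call("send_sms", text="Your driver is arriving")
kinds = [entry.kind for entry in mw.log]
```

The three calls land in the three categories in order: the payment has a pre-specified refund, the credit gets a generated compensation that is flagged for review because of its assumed high value, and the SMS is logged as non-compensable and routed to a human.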
The 99.2% rollback success rate was measured on production incidents over 6 months. The remaining 0.8% were non-compensable operations where external systems had already processed the agent’s actions beyond the point of reversal.