Feedback Loops for AI Agent Systems: From Open-Loop to Self-Improving
Most AI agent systems are open-loop: they execute a task and terminate. There is no measurement of whether the output was good, no comparison to previous outputs, and no mechanism to improve. Over time, quality drifts, costs creep up, and nobody notices until something breaks visibly.
Closed-loop agent systems measure, reflect, act, and verify. They improve with every cycle. Here is how to build them.
The Four-Phase Loop
Every feedback loop follows the same structure:
- MEASURE: Compute the gap between desired state and actual state. Be specific -- "quality is low" is not a measurement. "3 of 8 endpoints return incorrect data" is.
- REFLECT: Did last cycle's actions work? Has the target itself changed? Did anything regress? At the current rate, when will the loop terminate? This phase is mandatory.
- ACT: Address one severity tier only. Do not scope-creep into lower-priority issues mid-cycle.
- VERIFY: Re-measure. Compute the delta. Did we actually make progress?
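The four phases can be sketched as a single function. This is a minimal illustration rather than a reference implementation: the `Gap` and `CycleResult` types, the three severity tier names, and the `measure`/`act` callables are all hypothetical and would be replaced by whatever your system actually measures and changes.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical severity tiers, most urgent first.
SEVERITY_ORDER = ["critical", "major", "minor"]


@dataclass(frozen=True)
class Gap:
    description: str
    severity: str


@dataclass
class CycleResult:
    gaps_before: int
    gaps_after: int
    tier_addressed: Optional[str]
    last_cycle_worked: Optional[bool]  # None on the first cycle

    @property
    def delta(self) -> int:
        """Positive when the cycle reduced the number of open gaps."""
        return self.gaps_before - self.gaps_after


def run_cycle(
    measure: Callable[[], List[Gap]],
    act: Callable[[List[Gap]], None],
    previous: Optional[CycleResult] = None,
) -> CycleResult:
    # MEASURE: enumerate concrete gaps, not a vague quality judgment.
    gaps = measure()

    # REFLECT: did the previous cycle actually reduce the gap?
    last_cycle_worked = None if previous is None else previous.delta > 0

    # ACT: address only the highest severity tier that has open gaps.
    tier = next(
        (t for t in SEVERITY_ORDER if any(g.severity == t for g in gaps)),
        None,
    )
    if tier is not None:
        act([g for g in gaps if g.severity == tier])

    # VERIFY: re-measure and record the result so the delta can be computed.
    remaining = measure()
    return CycleResult(len(gaps), len(remaining), tier, last_cycle_worked)


# Toy usage: the "system" is just a list of open gaps.
open_gaps = [
    Gap("endpoint /users returns incorrect data", "critical"),
    Gap("endpoint /orders lacks pagination", "major"),
    Gap("typo in error message", "minor"),
]


def measure() -> List[Gap]:
    return list(open_gaps)


def act(tier_gaps: List[Gap]) -> None:
    for gap in tier_gaps:
        open_gaps.remove(gap)


first = run_cycle(measure, act)
print(first.tier_addressed, first.delta)
```

Passing the previous `CycleResult` into the next call is what lets the REFLECT step answer "did last cycle's actions work?" with data instead of memory.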
Six Categories of Loops
Completeness Loops (L1-L3)
Close the gap between what SHOULD exist and what DOES exist. Feature completeness, API surface completeness, integration completeness.
Parity Loops (L4-L6)
Close the gap between parallel artifacts that should match. Feature parity across platforms, documentation matching code, test coverage matching modules.
Standards Loops (L7-L9)
Close the gap between current quality and target quality. Design conformance, quality elevation, performance baselines.
Progress Loops (L10-L12)
Close the gap between planned milestones and actual delivery. Project fulfillment, milestone convergence, scope integrity.
Meta-Evolution Loops (L13-L15)
Close the gap between current process quality and optimal process quality. These are second-order loops -- they improve the PROCESS of improving. Change quality scoring, enhancement velocity tracking, capability growth mapping.
Safeguard Loops (L16-L18)
Close the gap between system health and stability requirements. Regression detection, debt tracking, convergence termination.
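If loops are tracked programmatically, the six categories map naturally onto a small lookup table. The sketch below assumes loop identifiers of the form "L1" through "L18", as in the headings above; the `CATEGORY_RANGES` and `category_of` names are illustrative only.

```python
from typing import Dict, Tuple

# Inclusive loop-number ranges per category, taken from the headings above.
CATEGORY_RANGES: Dict[str, Tuple[int, int]] = {
    "Completeness": (1, 3),
    "Parity": (4, 6),
    "Standards": (7, 9),
    "Progress": (10, 12),
    "Meta-Evolution": (13, 15),
    "Safeguard": (16, 18),
}


def category_of(loop_id: str) -> str:
    """Return the category name for a loop identifier such as 'L5'."""
    number = int(loop_id.lstrip("L"))
    for name, (low, high) in CATEGORY_RANGES.items():
        if low <= number <= high:
            return name
    raise ValueError(f"unknown loop id: {loop_id}")


print(category_of("L5"))
print(category_of("L14"))
```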
The Reflection Phase Is Not Optional
The most common mistake in feedback loop implementation is skipping reflection. Without it, you cannot distinguish between convergence and churn.
Four mandatory reflection questions:
- Delta accountability: Did last cycle's actions achieve their predicted effect?
- Drift detection: Has the specification itself changed since last measurement?
- Regression check: Did any previously passing requirement break?
- Trajectory assessment: At the current rate, how many cycles until termination?
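The four questions above can each be answered from two consecutive measurements. The following is one possible sketch, assuming requirements are tracked as sets of identifiers and the specification carries some version marker (for example a content hash); the `Snapshot` type and the field names in the returned report are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class Snapshot:
    spec_version: str  # assumed marker, e.g. a hash of the specification
    passing: Set[str]
    failing: Set[str]


def reflect(prev: Snapshot, curr: Snapshot, predicted_fixes: int) -> Dict[str, object]:
    # Net number of requirements fixed since the previous measurement.
    actual_fixes = len(prev.failing) - len(curr.failing)

    # Trajectory: extrapolate linearly from the latest cycle only.
    if not curr.failing:
        cycles_left = 0
    elif actual_fixes > 0:
        cycles_left = math.ceil(len(curr.failing) / actual_fixes)
    else:
        cycles_left = None  # no progress, so no finite estimate

    return {
        # Delta accountability: did the cycle deliver what was predicted?
        "delta_met": actual_fixes >= predicted_fixes,
        # Drift detection: did the specification change under us?
        "spec_drifted": prev.spec_version != curr.spec_version,
        # Regression check: previously passing requirements that now fail.
        "regressions": sorted(prev.passing & curr.failing),
        # Trajectory assessment: estimated cycles until termination.
        "cycles_to_termination": cycles_left,
    }


prev = Snapshot("v1", passing={"a", "b"}, failing={"c", "d", "e", "f"})
curr = Snapshot("v1", passing={"a", "c", "d"}, failing={"b", "e", "f"})
report = reflect(prev, curr, predicted_fixes=2)
print(report)
```

In this example two requirements were fixed but one regressed, so the net delta is 1 against a prediction of 2. That is exactly the case where reflection separates convergence from churn.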
Stall Detection
A loop with zero net progress for 3 consecutive cycles is stuck. Left alone, it runs indefinitely and consumes budget, so always pair stall detection with automatic escalation to a human or to a different strategy.
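A stall check can be as small as comparing the latest gap with the gap measured three cycles earlier. The sketch below reads "zero net progress" that way, which also catches oscillation (fix one thing, break another). That reading is an assumption, and the function names and return values are illustrative.

```python
from typing import List

STALL_CYCLES = 3  # threshold from the text above


def is_stalled(gap_history: List[int], window: int = STALL_CYCLES) -> bool:
    """gap_history holds the measured gap after each cycle, oldest first.

    The first entry is the baseline measurement, so `window` cycles
    require `window + 1` entries.
    """
    if len(gap_history) < window + 1:
        return False
    baseline = gap_history[-(window + 1)]
    return gap_history[-1] >= baseline


def next_step(gap_history: List[int]) -> str:
    if gap_history and gap_history[-1] == 0:
        return "done"
    if is_stalled(gap_history):
        return "escalate"  # hand off to a human or switch strategy
    return "continue"


print(next_step([8, 6, 5]))        # still converging
print(next_step([8, 7, 7, 7, 7]))  # three cycles without net progress
```

Checking for termination before checking for a stall matters: a loop that has sat at zero gaps for several cycles is finished, not stuck.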
Meta-Evolution: Improving the Improvement Process
The most advanced feedback pattern: loops that improve OTHER loops. The METACYCLE loop measures Change Quality Score (CQS) across all changes and adjusts process levers -- review criteria, pre-commit gates, coding conventions -- based on what the data shows.
The Enhancement Velocity Index (EVI) measures both quantity and quality: EVI = count * mean_impact_score. A team shipping 25 trivial changes has a high count but a low EVI. A team shipping 3 perfect changes has high per-change impact but also a low EVI. Only the product of the two captures real velocity.
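A short worked example makes the EVI arithmetic concrete. The impact scale (0 to 10) and the three sample teams are assumptions chosen for illustration; the text does not specify how impact scores are assigned.

```python
from typing import List


def evi(impact_scores: List[float]) -> float:
    """EVI = count * mean_impact_score, as defined above."""
    count = len(impact_scores)
    if count == 0:
        return 0.0
    mean_impact = sum(impact_scores) / count
    return count * mean_impact


# Hypothetical impact scores on an assumed 0-10 scale.
many_trivial = [1] * 25   # 25 changes, each barely matters
few_perfect = [10] * 3    # 3 changes, each maximally impactful
balanced = [8] * 12       # steady stream of solid changes

print(evi(many_trivial))  # 25.0
print(evi(few_perfect))   # 30.0
print(evi(balanced))      # 96.0
```

Note that count times mean is algebraically the sum of the impact scores, so the index rewards total delivered impact, and neither raw volume nor a handful of standout changes is enough on its own.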
All 18 loop types are detailed with implementation templates in the Protocol Playbook.