Control of Error

Definition (Montessori → ML)

  • In Montessori education, “control of error” means materials are designed so that learners can detect and correct their own mistakes without adult intervention (instant feedback, self‑correction, independence).

  • In Compitum, we design the routing process with built‑in, mechanistic signals that expose errors and enable self‑correction without an external judge model.

Mechanisms (Implemented)

  • Feasibility and Diagnostics

    • Hard constraints (capability requirements plus linear resource constraints, Ax ≤ b) enforce compliance by construction; approximate local shadow prices are reported for auditing (src/compitum/constraints.py:36).

  • Ambiguity Detection

    • A boundary test on the utility gap, softmax entropy, and calibrated uncertainty flags close calls for optional deferral (src/compitum/boundary.py:19).

  • Calibrated Uncertainty

    • Utility variance aggregates calibrated component quantiles and distance variance (src/compitum/energy.py:33).

  • Stable Adaptation

    • Lyapunov‑inspired trust‑region control (EMA + integral) caps update sizes; SPD metric updates maintain positive definiteness (src/compitum/control.py:15; src/compitum/metric.py:23,39,106).

  • Certification

    • Each decision emits a routing certificate with all signals for immediate inspection (src/compitum/router.py:25,80).
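As an illustration, the ambiguity detection above can be sketched as a test on the top‑two utility gap and the normalized softmax entropy. The function name, thresholds, and temperature below are hypothetical, not the values in src/compitum/boundary.py:

```python
import numpy as np

def boundary_flag(utilities, gap_thresh=0.05, entropy_thresh=0.9, temperature=1.0):
    """Flag a routing decision as a 'close call' when the top-two utility gap
    is small or the softmax over utilities is near-uniform (high entropy)."""
    u = np.asarray(utilities, dtype=float)
    order = np.sort(u)[::-1]
    gap = order[0] - order[1]                      # top-two utility gap
    p = np.exp((u - u.max()) / temperature)        # stable softmax
    p /= p.sum()
    # Entropy normalized by log(n) so the threshold lives in [0, 1].
    entropy = -np.sum(p * np.log(p + 1e-12)) / np.log(len(u))
    return bool(gap < gap_thresh or entropy > entropy_thresh)

print(boundary_flag([2.0, 0.1, 0.0]))    # clear winner -> False
print(boundary_flag([1.0, 0.98, 0.2]))   # near tie -> True
```

Flagged decisions would then be candidates for optional deferral rather than being rejected outright.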

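The trust‑region control can likewise be sketched as an EMA‑plus‑integral controller that caps update norms. The class name, gains, and shrink/grow factors here are illustrative assumptions, not the actual src/compitum/control.py implementation:

```python
import numpy as np

class TrustRegion:
    """EMA-plus-integral controller: shrinks the allowed update radius when a
    surrogate energy rises, grows it slowly when energy falls (sketch)."""
    def __init__(self, radius=1.0, ema_decay=0.9, ki=0.01, r_min=1e-3, r_max=10.0):
        self.radius = radius
        self.ema = 0.0
        self.integral = 0.0
        self.ema_decay = ema_decay
        self.ki = ki
        self.r_min, self.r_max = r_min, r_max

    def step(self, energy_delta, update):
        # Smoothed and accumulated energy change (positive = getting worse).
        self.ema = self.ema_decay * self.ema + (1 - self.ema_decay) * energy_delta
        self.integral += energy_delta
        signal = self.ema + self.ki * self.integral
        # Shrink aggressively on rising energy, expand gently on falling energy.
        self.radius *= 0.5 if signal > 0 else 1.05
        self.radius = float(np.clip(self.radius, self.r_min, self.r_max))
        # Cap the proposed update's norm at the current radius.
        norm = np.linalg.norm(update)
        if norm > self.radius:
            update = update * (self.radius / norm)
        return update

tr = TrustRegion(radius=1.0)
capped = tr.step(1.0, np.array([3.0, 4.0]))  # energy rose -> radius halves, update rescaled
```

Bounding each update's size is what keeps adaptation stable; the separate SPD projection in src/compitum/metric.py then keeps the metric positive definite.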
Formal Properties (Operational)

  • Detect: decisions flagged as boundary cases exhibit higher average regret than unflagged decisions.

  • Comply: constraint violation rate ≈ 0 by construction (report empirical rate).

  • Correct: after drift/ambiguity spikes, trust‑region updates reduce a surrogate energy within K steps in practice.

  • Certify: certificates contain sufficient fields to audit detection and correction per decision.
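The Detect property admits a direct operational test: compare mean regret on flagged versus unflagged decisions. A minimal sketch (the helper name is hypothetical):

```python
import numpy as np

def detect_gap(regret, flagged):
    """Mean regret on boundary-flagged decisions minus mean regret on
    unflagged decisions; a positive gap supports the Detect property."""
    regret = np.asarray(regret, dtype=float)
    flagged = np.asarray(flagged, dtype=bool)
    return regret[flagged].mean() - regret[~flagged].mean()

print(detect_gap([0.9, 0.7, 0.1, 0.0], [True, True, False, False]))  # 0.75
```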

Control‑of‑Error Index (CEI)

Define CEI as a normalized summary of four measurable components (higher is better):

  1. Deferral quality: PR/ROC of the boundary flag predicting top‑q high‑regret decisions (or regret > τ).

  2. Calibration: monotone trend of mean absolute regret across uncertainty buckets (reliability‑curve score).

  3. Stability: regret reduction (or bounded change) around trust‑radius shrink events (pre/post windows).

  4. Compliance: 1 − violation rate.
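One way to read this definition is as a weighted mean of pre‑normalized component scores. A minimal sketch, assuming each component has already been mapped to [0, 1]; the equal‑weight default is an assumption here, not the official CEI formula:

```python
import numpy as np

def cei(deferral_auc, calibration_score, stability_score, violation_rate, weights=None):
    """Combine the four components into a single [0, 1] index (higher is better).
    Each component is assumed pre-normalized to [0, 1]."""
    components = np.array([
        deferral_auc,          # 1. PR/ROC of the boundary flag
        calibration_score,     # 2. reliability-curve monotonicity
        stability_score,       # 3. regret behavior around trust-radius shrinks
        1.0 - violation_rate,  # 4. compliance
    ], dtype=float)
    w = np.full(4, 0.25) if weights is None else np.asarray(weights, dtype=float)
    return float(np.dot(w, components) / w.sum())

print(cei(0.8, 0.7, 0.6, 0.0))  # equal weights -> 0.775
```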

Usage

  • Compute from existing evaluation CSVs and certificate dumps; no judge model required.

  • Report CEI alongside regret and win‑rate at fixed WTP as evidence of instantaneous, judge‑free feedback quality.

  • Helper script:

python tools\analysis\cei_report.py ^
  --input data\rb_clean\eval_results\<latest-compitum-csv>.csv ^
  --out-json reports\cei_report.json ^
  --out-md reports\cei_report.md

Options: --topq 0.1 for top‑quantile high‑regret labeling (default) or --tau <value> for an absolute threshold.

Notes

  • Shadow prices are approximate local diagnostics computed via finite‑difference feasibility probes; they are reported only and do not influence selection.

  • The coherence prior is KDE log‑density in whitened space; its influence is bounded by clipping and a small weight β_s.
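The bounded KDE prior described above can be sketched in plain NumPy, assuming per‑dimension whitening and a fixed isotropic bandwidth (both assumptions; the actual estimator lives in the codebase):

```python
import numpy as np

def coherence_prior(x, reference, beta_s=0.1, clip=5.0, bandwidth=0.5):
    """Bounded KDE log-density prior (sketch): whiten per dimension, evaluate a
    Gaussian KDE log-density at each query, clip it, scale by a small weight."""
    mu = reference.mean(axis=0)
    sigma = reference.std(axis=0) + 1e-12
    r = (reference - mu) / sigma           # whitened reference points, shape (n, d)
    z = (x - mu) / sigma                   # whitened query points, shape (m, d)
    n, d = r.shape
    # Squared distances from each query to each reference point.
    sq = ((z[:, None, :] - r[None, :, :]) ** 2).sum(-1) / (2 * bandwidth ** 2)
    lognorm = np.log(n) + d * np.log(bandwidth * np.sqrt(2 * np.pi))
    logp = np.logaddexp.reduce(-sq, axis=1) - lognorm
    # Clipping plus the small weight beta_s bounds the prior's influence.
    return beta_s * np.clip(logp, -clip, clip)
```

By construction the output lies in [−β_s · clip, β_s · clip], which is the "bounded influence" property the note refers to.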