Learning Perspective (cs.LG)¶
This note presents Compitum in standard ML terms: a constrained contextual routing problem with scalarized utility, calibrated uncertainty, a learned SPD geometry, and bounded online updates.
Related: cs.CL · cs.SY · stat.ML · SRMF ⇄ Lyapunov · Peer Review Protocol · Certificate Schema
Problem Setup¶
Context x in R^D and pragmatic features x_B in R^4.
Model set M. Each model m in M has base cost and a center mu_m in feature space.
Utility at fixed willingness-to-pay lambda: U(x, m) = quality - lambda*(latency + cost) - beta_d*dist_M(x, mu_m) + beta_s*log p_M(x).
Constraints: A x_B <= b and capabilities; enforced before selection.
Code anchors: src/compitum/energy.py:33 (utility), src/compitum/metric.py:23,39 (SPD metric and distance), src/compitum/coherence.py:41 (KDE prior), src/compitum/constraints.py:36 (feasibility).
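A minimal sketch of the scalarized utility at fixed lambda (hypothetical names and placeholder beta defaults; the repository implementation is at src/compitum/energy.py:33):

```python
def utility(quality: float, latency: float, cost: float,
            dist_m: float, log_density: float,
            lam: float = 1.0, beta_d: float = 0.1, beta_s: float = 0.05) -> float:
    """Scalarized utility U(x, m) at fixed willingness-to-pay lam.

    quality, latency, cost : predicted components for model m on context x
    dist_m                 : Mahalanobis distance dist_M(x, mu_m)
    log_density            : clipped KDE log-density log p_M(x)
    """
    return quality - lam * (latency + cost) - beta_d * dist_m + beta_s * log_density
```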
Decision Rule¶
Feasibility-first: filter m by capabilities and A x_B <= b.
Selection: choose argmax_m U(x, m) among feasible.
Certificate: emit utility components, feasibility and approximate local shadow prices (finite-difference diagnostics), boundary diagnostics (gap/entropy/uncertainty), and trust-region state; see src/compitum/router.py:25,80.
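A minimal sketch of the feasibility-first rule, assuming each candidate carries its own capability set and (A, b) constraints (hypothetical data layout; the actual router logic lives in src/compitum/router.py). Capabilities and A x_B <= b act as hard filters; utility only ranks the survivors:

```python
import numpy as np

def route(x_B, candidates, required_caps, utility_fn):
    """Feasibility-first selection: drop infeasible models, then take the utility argmax.

    candidates   : list of dicts with 'name', 'caps' (set), 'A' (matrix), 'b' (vector)
    required_caps: capabilities the request needs
    utility_fn   : callable mapping a model name to its scalar U(x, m)
    """
    feasible = [
        m for m in candidates
        if required_caps <= m["caps"] and np.all(m["A"] @ x_B <= m["b"])
    ]
    if not feasible:
        return None  # nothing satisfies the hard constraints; caller decides how to defer
    return max(feasible, key=lambda m: utility_fn(m["name"]))
```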
Learning Components¶
Geometry: low-rank SPD Mahalanobis metric M = L L^T + delta*I; PD guaranteed by construction. Defensive Cholesky and delta adjustment; src/compitum/metric.py:23,39.
Online adaptation: surrogate gradient in L with trust-region step cap (EMA + integral); src/compitum/metric.py:106, src/compitum/control.py:15. We say “Lyapunov-inspired”; no formal proof claimed.
Predictors and calibration: component regressors with isotonic calibration and quantile bounds; src/compitum/predictors.py.
Coherence prior: KDE log-density in whitened coordinates under M; clipped to bound influence; src/compitum/coherence.py:41. beta_s sweeps show robustness.
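A minimal sketch of the geometry and coherence pieces above, assuming a plain Gaussian KDE with fixed bandwidth (class and function names are hypothetical; the repository versions are in src/compitum/metric.py and src/compitum/coherence.py):

```python
import numpy as np

class LowRankSPDMetric:
    """Mahalanobis metric M = L @ L.T + delta * I; SPD by construction."""

    def __init__(self, dim: int, rank: int, delta: float = 1e-3, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.L = 0.1 * rng.standard_normal((dim, rank))
        self.delta = delta

    @property
    def M(self) -> np.ndarray:
        return self.L @ self.L.T + self.delta * np.eye(self.L.shape[0])

    def dist(self, x: np.ndarray, mu: np.ndarray) -> float:
        diff = x - mu
        return float(np.sqrt(diff @ self.M @ diff))


def coherence_log_density(x, support, metric, bandwidth=1.0, clip=(-10.0, 10.0)):
    """Clipped Gaussian-KDE log-density in coordinates whitened by the metric.

    Whitening uses the Cholesky factor C of M (M = C @ C.T), so Euclidean
    distances in the whitened space equal Mahalanobis distances under M.
    Clipping bounds the prior's influence on the utility.
    """
    C = np.linalg.cholesky(metric.M)
    z = C.T @ x              # whitened query
    Z = support @ C          # whitened support points (one per row)
    d = z.shape[0]
    sq = np.sum((Z - z) ** 2, axis=1) / (2.0 * bandwidth ** 2)
    log_kernels = -sq - 0.5 * d * np.log(2.0 * np.pi * bandwidth ** 2)
    log_p = np.log(np.mean(np.exp(log_kernels)) + 1e-300)
    return float(np.clip(log_p, *clip))
```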
Relation to ML Literature¶
Constrained contextual routing: feasibility-first selection resembles constrained contextual bandits/knapsacks, with scalarized objectives at fixed lambda.
Selective classification/deferral: boundary ambiguity (gap/entropy/uncertainty) aligns with abstention strategies to improve risk-sensitive utility.
Metric learning: Mahalanobis (ITML/LMNN-style) geometry with low-rank structure and online updates.
Calibration: isotonic regression and reliability curves for uncertainty evaluation.
Density priors: KDE as an auxiliary regularizer; bounded effect via clipping and small beta_s.
Evaluation Protocol¶
Fixed-lambda slices: lambda in {0.1, 1.0} (headline) and a sensitivity grid for frontier plots.
Pairing: compute deltas per evaluation unit against best baseline; aggregate via panel means; bootstrap CIs.
Metrics: regret, win-rate, compliance rate; calibration (reliability curves, Spearman rho(uncertainty, |regret|)); deferral quality (boundary AUROC/AP); stability (rho(shrink in trust-radius, future regret decrease)).
Helpers: tools/analysis/cei_report.py, tools/analysis/reliability_curve.py, tools/analysis/control_kpis.py. Fixed-lambda summary (paired bootstrap):
python tools\analysis\lg_summary.py ^
--input data\rb_clean\eval_results\<latest-compitum-csv>.csv ^
--lambdas 0.1 1.0 ^
--bootstrap 1000 ^
--out-json reports\lg_summary.json ^
--out-md reports\lg_summary.md
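A minimal sketch of the pairing-and-bootstrap aggregation described above (hypothetical helper; the shipped report comes from tools\analysis\lg_summary.py):

```python
import numpy as np

def paired_bootstrap_ci(deltas, n_boot=1000, alpha=0.05, seed=0):
    """CI for the mean per-unit delta (Compitum minus best baseline) via the bootstrap.

    deltas : 1-D array of paired differences, one per evaluation unit
    Returns (mean, lower, upper) at the (1 - alpha) level.
    """
    rng = np.random.default_rng(seed)
    deltas = np.asarray(deltas, dtype=float)
    boot_means = np.array([
        rng.choice(deltas, size=deltas.size, replace=True).mean()
        for _ in range(n_boot)
    ])
    lo, hi = np.quantile(boot_means, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(deltas.mean()), float(lo), float(hi)
```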
Robustness and Limits¶
Robustness: modest beta_s and rank sweeps do not flip headline conclusions; coherence is bounded by clipping; update stride affects throughput, not correctness.
Limits: shadow prices are approximate local diagnostics (report-only; not KKT duals). Trust-region is Lyapunov-inspired without formal proof.
Reproducibility¶
Determinism: Hypothesis derandomized in CI; demo seeds fixed.
Snapshot env with pip freeze and system info; attach CEI, reliability curve, control KPIs, and fixed-lambda summaries.
Instantaneous Feedback vs. Delayed Reward (Significance)¶
Mechanistic surrogates at decision time
At step t, Compitum observes judge-free, endogenous signals S_t = {feasibility, gap, entropy, uncertainty, trust-radius} and applies a bounded update. No temporal credit assignment is required; there is no reward delay.
In contrast, bandit/RL settings often observe a delayed, noisy reward R_{t+tau}, requiring credit assignment across tau and increasing estimator variance.
Variance and latency intuition
Let theta be parameters of the geometry (L). Updates use grad_theta U(x_t, m; theta_t) with a capped step (trust-region). Using S_t as full-information surrogates reduces update latency and empirically reduces variance relative to delayed bandit feedback.
We do not claim new bounds; we make an empirical hypothesis: near-zero-latency internal signals lead to faster regret reduction under bounded steps than delayed judge signals.
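A minimal sketch of a bounded update driven only by decision-time signals (hypothetical; the repository's controller at src/compitum/control.py combines an EMA and an integral term, whereas this sketch uses a simple shrink/expand rule for illustration):

```python
import numpy as np

def capped_step(L: np.ndarray, grad_L: np.ndarray, radius: float) -> np.ndarray:
    """Apply one surrogate-gradient step to the metric factor L, capped in Frobenius norm.

    The cap guarantees that no single decision-time update can move the
    geometry farther than the current trust radius.
    """
    norm = np.linalg.norm(grad_L)
    step = grad_L if norm <= radius else grad_L * (radius / norm)
    return L + step

def adjust_radius(radius: float, improved: bool,
                  shrink: float = 0.7, grow: float = 1.1,
                  r_min: float = 1e-4, r_max: float = 1.0) -> float:
    """Shrink after a non-improving step, expand slowly otherwise, and keep the radius bounded."""
    return float(np.clip(radius * (grow if improved else shrink), r_min, r_max))
```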
Empirical proxies (what to report)
CEI: deferral quality (boundary vs. high-regret), calibration (rho(uncertainty, |regret|)), stability (rho(shrink, future improvement)), compliance (~100%).
Control KPIs: shrink/expand events and the correlation between shrink events and future regret decrease.
Reliability curve: monotone increase of |regret| with uncertainty bins.
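A minimal sketch of two of these proxies, assuming per-decision arrays of uncertainty and regret (scipy's spearmanr computes the rank correlation; the helper name is hypothetical):

```python
import numpy as np
from scipy.stats import spearmanr

def calibration_proxies(uncertainty, regret, n_bins=10):
    """Spearman rho(uncertainty, |regret|) plus a binned reliability curve.

    Returns (rho, bin_centers, mean_abs_regret_per_bin); a calibrated router
    should show rho > 0 and roughly monotone bin means.
    """
    uncertainty = np.asarray(uncertainty, dtype=float)
    abs_regret = np.abs(np.asarray(regret, dtype=float))
    rho, _ = spearmanr(uncertainty, abs_regret)

    edges = np.quantile(uncertainty, np.linspace(0.0, 1.0, n_bins + 1))
    bins = np.clip(np.digitize(uncertainty, edges[1:-1]), 0, n_bins - 1)
    centers, means = [], []
    for b in range(n_bins):
        mask = bins == b
        if mask.any():  # skip empty bins caused by ties
            centers.append(uncertainty[mask].mean())
            means.append(abs_regret[mask].mean())
    return float(rho), np.array(centers), np.array(means)
```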
Free-energy view (engineering)
The update loop minimizes a scalarized “free-energy” objective at decision time; equivalently, it maximizes quality - lambda*cost - beta_d*distance + beta_s*log-density. Instantaneous feedback reduces friction in that optimization by eliminating reward delay; the trust-region caps steps to maintain stability.
Claim boundary (honest positioning)
Same data-generation process as RL, different estimator: we use full-information, judge-free surrogates available at decision time. We do not introduce a judge model or delayed reward; conclusions are empirical and robust across lambda, beta_s, and rank sweeps.
Geometry & Coherence Evidence (0.1.1)¶
Geometry
SPD eigenvalue bounds, triangle inequality, ray monotonicity, multi‑step update descent.
Tests:
tests/metric/test_metric_spd_bounds.py, tests/invariants/test_invariants_metric_triangle.py, tests/invariants/test_invariants_metric_ray.py, tests/invariants/test_invariants_metric_update.py
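A minimal spot-check of the same invariants, reusing the LowRankSPDMetric sketch above (the shipped tests are property-based and more thorough):

```python
import numpy as np

def check_metric_invariants(dim=6, rank=2, n_triples=50, tol=1e-8):
    """Spot-check SPD eigenvalue bounds and the triangle inequality for dist_M."""
    metric = LowRankSPDMetric(dim=dim, rank=rank)
    rng = np.random.default_rng(1)

    # SPD bound: M = L L^T + delta I, so every eigenvalue is at least delta.
    assert np.linalg.eigvalsh(metric.M).min() >= metric.delta - tol

    # Triangle inequality for the induced Mahalanobis distance.
    for a, b, c in rng.standard_normal((n_triples, 3, dim)):
        assert metric.dist(a, c) <= metric.dist(a, b) + metric.dist(b, c) + tol
```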
Coherence
Monotone outward on isotropic clouds; symmetry (±v); inward score direction (finite diff); mixture discrimination.
Tests:
tests/invariants/test_invariants_coherence.py, tests/invariants/test_invariants_coherence_symmetry.py, tests/invariants/test_invariants_coherence_score_dir.py, tests/coherence/test_coherence_mixture_discrimination.py
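A minimal spot-check of the symmetry and outward-monotonicity invariants, reusing the LowRankSPDMetric and coherence_log_density sketches above on an exactly mirrored isotropic cloud (an illustration of the invariants, not the shipped tests):

```python
import numpy as np

def check_coherence_invariants(dim=6, rank=2, tol=1e-9):
    """Spot-check ±v symmetry and monotone decay of log-density along an outward ray."""
    metric = LowRankSPDMetric(dim=dim, rank=rank)
    rng = np.random.default_rng(2)
    half = rng.standard_normal((500, dim))
    support = np.vstack([half, -half])  # exactly symmetric isotropic cloud

    # Symmetry: with a mirrored support, log p(v) == log p(-v) up to float error.
    v = rng.standard_normal(dim)
    assert abs(coherence_log_density(v, support, metric)
               - coherence_log_density(-v, support, metric)) < tol

    # Monotone outward: log-density should fall as the query moves away from the center.
    u = v / np.linalg.norm(v)
    lps = [coherence_log_density(r * u, support, metric) for r in (0.0, 1.0, 2.0, 4.0)]
    assert all(a >= b for a, b in zip(lps, lps[1:]))
```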