Learning Perspective (cs.LG)

This note presents Compitum in standard ML terms: a constrained contextual routing problem with scalarized utility, calibrated uncertainty, a learned SPD geometry, and bounded online updates.

Problem Setup

  • Context x in R^D and pragmatic features x_B in R^4.

  • Model set M. Each model m in M has a base cost and a center mu_m in feature space.

  • Utility at fixed willingness-to-pay lambda: U(x, m) = quality - lambda*(latency + cost) - beta_d*dist_M(x, mu_m) + beta_s*log p_M(x).

  • Constraints: A x_B <= b and capabilities; enforced before selection.

Code anchors: src/compitum/energy.py:33 (utility), src/compitum/metric.py:23,39 (SPD metric and distance), src/compitum/coherence.py:41 (KDE prior), src/compitum/constraints.py:36 (feasibility).
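
For concreteness, a minimal sketch of the scalarized utility; argument names and defaults are illustrative assumptions, not the signature at src/compitum/energy.py:33.

def utility(quality, latency, cost, dist, log_density,
            lam=1.0, beta_d=0.5, beta_s=0.1):
    # quality, latency, cost: calibrated component predictions for model m
    # dist: dist_M(x, mu_m) under the learned SPD metric
    # log_density: clipped KDE log-density log p_M(x) (coherence prior)
    return quality - lam * (latency + cost) - beta_d * dist + beta_s * log_density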

Decision Rule

  • Feasibility-first: filter m by capabilities and A x_B <= b.

  • Selection: choose argmax_m U(x, m) among feasible models (a selection sketch follows this list).

  • Certificate: emit utility components, feasibility, approximate local shadow prices (finite-difference diagnostics), boundary diagnostics (gap/entropy/uncertainty), and trust-region state; see src/compitum/router.py:25,80.
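
A minimal sketch of the feasibility-first rule; model attributes, constraint handling, and certificate fields are simplified stand-ins for the router in src/compitum/router.py.

import numpy as np

def route(x, x_B, models, A, b, required_caps, utility_fn):
    # Feasibility first: pragmatic constraints are a hard gate, not a penalty.
    if not np.all(A @ x_B <= b):
        raise RuntimeError("pragmatic constraints violated; no selection made")
    feasible = [m for m in models if required_caps <= m.capabilities]
    if not feasible:
        raise RuntimeError("no model satisfies the required capabilities")
    # Selection: argmax of scalarized utility among feasible models.
    scored = sorted(((utility_fn(x, m), m) for m in feasible),
                    key=lambda t: t[0], reverse=True)
    u_best, m_best = scored[0]
    # Certificate: report-only diagnostics attached to the decision.
    gap = u_best - scored[1][0] if len(scored) > 1 else float("inf")
    return m_best, {"utility": u_best, "gap": gap, "n_feasible": len(feasible)}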

Learning Components

  • Geometry: low-rank SPD Mahalanobis metric M = L L^T + delta*I; positive definiteness is guaranteed by construction, with a defensive Cholesky and delta adjustment as numerical guards; src/compitum/metric.py:23,39 (sketched after this list).

  • Online adaptation: surrogate gradient steps on L with a trust-region step cap (EMA plus integral terms); src/compitum/metric.py:106, src/compitum/control.py:15. We say “Lyapunov-inspired”; no formal proof is claimed.

  • Predictors and calibration: component regressors with isotonic calibration and quantile bounds; src/compitum/predictors.py.

  • Coherence prior: KDE log-density in whitened coordinates under M, clipped to bound its influence; src/compitum/coherence.py:41. Sweeps over beta_s show the headline results are robust.
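
A minimal sketch of the geometry and the clipped coherence prior, assuming a Gaussian KDE with fixed bandwidth and an unnormalized log-density; the actual code lives in src/compitum/metric.py and src/compitum/coherence.py.

import numpy as np

def spd_metric(L, delta=1e-3):
    # M = L L^T + delta*I is symmetric positive definite by construction.
    return L @ L.T + delta * np.eye(L.shape[0])

def mahalanobis(x, mu, M):
    # dist_M(x, mu) = sqrt((x - mu)^T M (x - mu)), via a defensive Cholesky.
    d = x - mu
    try:
        C = np.linalg.cholesky(M)
    except np.linalg.LinAlgError:
        # Defensive delta adjustment: bump the diagonal until M factors.
        C = np.linalg.cholesky(M + 1e-6 * np.eye(M.shape[0]))
    y = C.T @ d                         # whitened coordinates
    return float(np.sqrt(y @ y))

def coherence_log_density(x, anchors, M, bandwidth=1.0, clip=(-10.0, 0.0)):
    # Gaussian-KDE log-density (up to an additive constant) in M-whitened
    # coordinates; clipping bounds the prior's influence on the utility.
    C = np.linalg.cholesky(M)
    sq = np.array([0.5 * np.sum((C.T @ (x - a)) ** 2) / bandwidth**2
                   for a in anchors])
    m = -np.min(sq)                     # stable log-mean-exp
    log_p = m + np.log(np.mean(np.exp(-sq - m)))
    return float(np.clip(log_p, *clip))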

Relation to ML Literature

  • Constrained contextual routing: feasibility-first selection resembles constrained contextual bandits and bandits with knapsacks, with scalarized objectives at a fixed lambda.

  • Selective classification/deferral: boundary ambiguity (gap/entropy/uncertainty) aligns with abstention strategies to improve risk-sensitive utility.

  • Metric learning: Mahalanobis (ITML/LMNN-style) geometry with low-rank structure and online updates.

  • Calibration: isotonic regression and reliability curves for uncertainty evaluation (a calibration sketch follows this list).

  • Density priors: KDE as an auxiliary regularizer; bounded effect via clipping and small beta_s.
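
A minimal isotonic-calibration sketch using scikit-learn on toy data; it illustrates the technique, not the repo's src/compitum/predictors.py.

import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
raw = rng.uniform(0.0, 1.0, 500)                            # uncalibrated scores
hit = (rng.uniform(0.0, 1.0, 500) < raw**2).astype(float)   # miscalibrated outcomes

iso = IsotonicRegression(out_of_bounds="clip")   # monotone calibration map
calibrated = iso.fit(raw, hit).predict(raw)      # calibrated estimates in [0, 1]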

Evaluation Protocol

  • Fixed-lambda slices: lambda in {0.1, 1.0} (headline) and a sensitivity grid for frontier plots.

  • Pairing: compute deltas per evaluation unit against the best baseline; aggregate via panel means with bootstrap CIs (sketched below).

  • Metrics: regret, win-rate, compliance rate; calibration (reliability curves, Spearman rho(uncertainty, |regret|)); deferral quality (boundary AUROC/AP); stability (rho(shrink in trust-radius, future regret decrease)).
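
A minimal paired-bootstrap sketch over per-unit deltas; the resampling scheme and defaults are assumptions, and tools/analysis/lg_summary.py remains the helper of record.

import numpy as np

def paired_bootstrap_ci(deltas, n_boot=1000, alpha=0.05, seed=0):
    # deltas: per-evaluation-unit differences vs. the best baseline.
    rng = np.random.default_rng(seed)
    d = np.asarray(deltas, dtype=float)
    means = np.array([d[rng.integers(0, len(d), len(d))].mean()
                      for _ in range(n_boot)])
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return d.mean(), (lo, hi)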

Helpers: tools/analysis/cei_report.py, tools/analysis/reliability_curve.py, tools/analysis/control_kpis.py. Fixed-lambda summary (paired bootstrap):

python tools\analysis\lg_summary.py ^
  --input data\rb_clean\eval_results\<latest-compitum-csv>.csv ^
  --lambdas 0.1 1.0 ^
  --bootstrap 1000 ^
  --out-json reports\lg_summary.json ^
  --out-md reports\lg_summary.md

Robustness and Limits

  • Robustness: modest beta_s and rank sweeps do not flip the headline conclusions; the coherence term is bounded by clipping; the update stride affects throughput, not correctness.

  • Limits: shadow prices are approximate local diagnostics (report-only; not KKT duals). The trust-region controller is Lyapunov-inspired, without a formal proof.

Reproducibility

  • Determinism: Hypothesis derandomized in CI; demo seeds fixed.

  • Environment: snapshot with pip freeze and system info; attach the CEI report, reliability curve, control KPIs, and fixed-lambda summaries.

Instantaneous Feedback vs. Delayed Reward (Significance)

  • Mechanistic surrogates at decision time

    • At step t, Compitum observes judge-free, endogenous signals S_t = {feasibility, gap, entropy, uncertainty, trust-radius} and applies a bounded update. No temporal credit assignment is required; there is no reward delay.

    • In contrast, bandit/RL settings often observe a delayed, noisy reward R_{t+tau}, requiring credit assignment across tau and increasing estimator variance.

  • Variance and latency intuition

    • Let theta denote the geometry parameters (the factor L). Updates use grad_theta U(x_t, m; theta_t) with a trust-region-capped step. Using S_t as full-information surrogates reduces update latency and empirically reduces variance relative to delayed bandit feedback.

    • We do not claim new bounds; we state an empirical hypothesis: near-zero-latency internal signals yield faster regret reduction under bounded steps than delayed judge signals (a capped-update sketch follows).
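
A minimal sketch of a bounded update; the grow/shrink radius rule below is a simplification of the EMA-plus-integral controller in src/compitum/control.py.

import numpy as np

def capped_step(L, grad, radius):
    # Surrogate gradient step on the metric factor L, capped in norm.
    norm = np.linalg.norm(grad)
    step = grad if norm <= radius else grad * (radius / norm)
    return L - step

def update_radius(radius, improved, grow=1.1, shrink=0.5,
                  r_min=1e-4, r_max=1.0):
    # Expand on observed improvement, shrink otherwise; bounded both ways.
    return float(np.clip(radius * (grow if improved else shrink), r_min, r_max))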

  • Empirical proxies (what to report)

    • CEI: deferral quality (boundary vs. high-regret), calibration (rho(uncertainty, |regret|)), stability (rho(shrink, future improvement)), compliance (~100%).

    • Control KPIs: shrink/expand events and the correlation between shrink events and future regret decrease.

    • Reliability curve: |regret| should increase monotonically across uncertainty bins (a binning sketch follows).
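
A minimal binning sketch for the reliability check; quantile bins are an assumption, and tools/analysis/reliability_curve.py is the actual helper.

import numpy as np
from scipy.stats import spearmanr

def reliability_bins(uncertainty, abs_regret, n_bins=10):
    # Mean |regret| per uncertainty quantile bin; should rise monotonically.
    u, r = np.asarray(uncertainty), np.asarray(abs_regret)
    edges = np.quantile(u, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.digitize(u, edges[1:-1]), 0, n_bins - 1)
    per_bin = np.array([r[idx == k].mean() for k in range(n_bins)])
    rho, _ = spearmanr(u, r)            # rank-calibration summary
    return per_bin, rho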

  • Free-energy view (engineering)

    • The update loop minimizes a scalarized “free-energy” objective at decision time (quality - lambda*cost - beta_d*distance + beta_s*log-density). Instantaneous feedback reduces friction in that optimization by eliminating reward delay; the trust-region caps steps to maintain stability.

  • Claim boundary (honest positioning)

    • Same data-generation process as RL, different estimator: we use full-information, judge-free surrogates available at decision time. We do not introduce a judge model or delayed reward; conclusions are empirical and robust across lambda, beta_s, and rank sweeps.

Geometry & Coherence Evidence (0.1.1)

  • Geometry

    • SPD eigenvalue bounds, triangle inequality, ray monotonicity, multi-step update descent (a minimal property test for the triangle inequality is sketched at the end of this section).

    • Tests: tests/metric/test_metric_spd_bounds.py, tests/invariants/test_invariants_metric_triangle.py, tests/invariants/test_invariants_metric_ray.py, tests/invariants/test_invariants_metric_update.py

  • Coherence

    • Monotone outward decay of log-density on isotropic clouds; symmetry under ±v; inward score direction (verified by finite differences); mixture discrimination.

    • Tests: tests/invariants/test_invariants_coherence.py, tests/invariants/test_invariants_coherence_symmetry.py, tests/invariants/test_invariants_coherence_score_dir.py, tests/coherence/test_coherence_mixture_discrimination.py
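
A minimal Hypothesis property test in the spirit of the triangle-inequality invariant; dimension, rank, and tolerance are assumptions rather than the suite's settings.

import numpy as np
from hypothesis import given, strategies as st
from hypothesis.extra.numpy import arrays

D, R = 4, 2
finite = st.floats(-5.0, 5.0)
vec = arrays(np.float64, (D,), elements=finite)
factor = arrays(np.float64, (D, R), elements=finite)

@given(factor, vec, vec, vec)
def test_triangle_inequality(L, x, y, z):
    M = L @ L.T + 1e-3 * np.eye(D)      # SPD by construction
    dist = lambda a, b: np.sqrt((a - b) @ M @ (a - b))
    assert dist(x, z) <= dist(x, y) + dist(y, z) + 1e-8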