Meet Tess

Instantaneous, judge-free feedback for LLM routing.

Inspired by Montessori’s Control of Error and geometric ML: feasibility-first selection, SPD metric learning, and a Lyapunov-inspired trust-region. Claims below link to verifiable reports.

Regret (fixed WTP)
Win Rate (panel)
Test Coverage (CI)
Cost Delta (wins)

How Tess Learns

Geometry meets pedagogy. Montessori meets machine learning.

Riemannian Geometry

Prompts and models live in a learned curved space (SPD Mahalanobis metric). Distance reflects semantic fit, not raw position.
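As a minimal sketch of the idea (NumPy, with a toy 3-dimensional embedding; the real learned metric and embedding are Tess internals), distance under an SPD matrix M:

```python
import numpy as np

def mahalanobis(x: np.ndarray, y: np.ndarray, M: np.ndarray) -> float:
    """Distance in a learned metric: d(x, y) = sqrt((x - y)^T M (x - y)).

    M must be symmetric positive definite (SPD) so that d is a true metric:
    non-negative, symmetric, and satisfying the triangle inequality.
    """
    diff = x - y
    return float(np.sqrt(diff @ M @ diff))

# A toy SPD metric: A^T A + eps*I is always SPD.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
M = A.T @ A + 1e-3 * np.eye(3)

x, y = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
d = mahalanobis(x, y, M)
```

Because M is SPD, "closeness" is a genuine geometric notion in the learned space, which is what lets distance stand in for semantic fit.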

Utility (free-energy) optimization

We balance quality, latency, cost, distance, and evidence in a single scalarized utility. See protocol.
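A hypothetical scalarization, with made-up weights and sign conventions (the protocol defines the real ones):

```python
def utility(quality: float, latency_s: float, cost_usd: float,
            distance: float, evidence: float,
            w=(1.0, 0.2, 0.5, 0.3, 0.1)) -> float:
    """Free-energy-style scalar utility: reward quality and evidence,
    penalize latency, cost, and metric distance. Weights are illustrative."""
    wq, wl, wc, wd, we = w
    return (wq * quality - wl * latency_s - wc * cost_usd
            - wd * distance + we * evidence)

# Routing then reduces to an argmax over candidate models.
candidates = {
    "model_a": utility(0.9, 1.2, 0.004, 0.8, 0.7),
    "model_b": utility(0.8, 0.4, 0.001, 0.3, 0.5),
}
best = max(candidates, key=candidates.get)
```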

Control of Error

Immediate, judge-free signals (feasibility, ambiguity, uncertainty) drive bounded updates. See Control of Error.

Boundary Detection

High entropy, a low utility gap, or high uncertainty → flag for deferral. See decision curves in cs.CL.
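As a sketch of the deferral rule (thresholds and signal names are illustrative placeholders, not Tess's calibrated values):

```python
import math

def should_defer(probs, utilities, uncertainty,
                 h_max=0.9, gap_min=0.05, u_max=0.3):
    """Flag a request for deferral when routing looks ambiguous:
    high entropy over candidates, a small gap between the top two
    utilities, or high predictive uncertainty."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    top = sorted(utilities, reverse=True)
    gap = top[0] - top[1]
    return entropy > h_max or gap < gap_min or uncertainty > u_max

# Near-uniform probabilities and a tiny utility gap -> defer.
assert should_defer([0.34, 0.33, 0.33], [0.51, 0.50, 0.20], 0.1)
# Confident, well-separated case -> route normally.
assert not should_defer([0.9, 0.05, 0.05], [0.9, 0.4, 0.1], 0.05)
```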

Constraint Satisfaction

Feasibility-first selection enforces region/policy/capability limits. Approximate shadow prices are diagnostic. See constraints.
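A minimal sketch of feasibility-first selection (field names like `region` and `policy_tier` are hypothetical, chosen only to illustrate the pattern):

```python
def feasible(model, req):
    """Hard constraints come first: region, policy, and capability limits."""
    return (model["region"] in req["allowed_regions"]
            and model["policy_tier"] >= req["min_policy_tier"]
            and req["capability"] in model["capabilities"])

def select(models, req, score):
    """Score only the feasible set; infeasible models are never ranked."""
    pool = [m for m in models if feasible(m, req)]
    if not pool:
        return None  # nothing feasible -> defer upstream
    return max(pool, key=score)

models = [
    {"name": "a", "region": "eu", "policy_tier": 2,
     "capabilities": {"code"}, "quality": 0.90},
    {"name": "b", "region": "us", "policy_tier": 1,
     "capabilities": {"code"}, "quality": 0.95},
]
req = {"allowed_regions": {"eu"}, "min_policy_tier": 2, "capability": "code"}
chosen = select(models, req, score=lambda m: m["quality"])  # "b" scores higher but is infeasible
```

The ordering matters: a higher-utility model outside the allowed region is never even a candidate, which is what makes the shadow prices diagnostic rather than part of the objective.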

Adaptive Learning

Trust-region (Lyapunov-inspired) caps step size for online metric updates. Stability indicators in cs.SY.
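A toy version of the capped update (NumPy; the radius, learning rate, and SPD projection here are illustrative stand-ins for the actual controller):

```python
import numpy as np

def trust_region_step(M, grad, lr=0.1, radius=0.05):
    """Apply one online metric update, capping the Frobenius norm of the
    step so a single noisy gradient cannot destabilize the metric, then
    project back to SPD by flooring eigenvalues."""
    step = lr * grad
    norm = np.linalg.norm(step)
    if norm > radius:
        step *= radius / norm
    M_new = 0.5 * ((M - step) + (M - step).T)  # symmetrize
    w, V = np.linalg.eigh(M_new)
    return V @ np.diag(np.maximum(w, 1e-6)) @ V.T

M = np.eye(3)
M1 = trust_region_step(M, grad=100.0 * np.ones((3, 3)))  # deliberately huge gradient
```

However large the incoming gradient, the metric moves at most `radius` per step and stays SPD, which is the stability property the Lyapunov analysis leans on.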

Built for Science

Rigorous. Reproducible. Open.

Tests & QA

Hypothesis with a derandomized profile; invariant tests; mutation testing; a 100% coverage target. See CI and determinism.

One-Shot Reproducible

Run scripts/run_peer_review.bat to regenerate tables, CEI, KPIs, and a consolidated HTML report with a SHA-256 manifest.
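The manifest idea in miniature (stdlib only; the real script's file list and report layout are project-specific):

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_manifest(paths):
    """Map each file name to its SHA-256 digest so a regenerated report
    can be byte-compared against a published run."""
    return {Path(p).name: hashlib.sha256(Path(p).read_bytes()).hexdigest()
            for p in paths}

# Toy check in a scratch directory: identical bytes -> identical digests.
with tempfile.TemporaryDirectory() as d:
    a, b = Path(d, "run1.html"), Path(d, "run2.html")
    a.write_bytes(b"report")
    b.write_bytes(b"report")
    m = sha256_manifest([a, b])
assert m["run1.html"] == m["run2.html"]
```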

Peer-Review Ready

Docs for cs.LG, cs.CL, cs.SY, stat.ML; CEI; decision curves; control KPIs; fixed-WTP summaries. See documentation.

For Reviewers

Direct links to audience-specific notes.

cs.LG

Problem setup, decision rule, evaluation protocol, significance of instantaneous feedback.

cs.CL

Per-task routing behavior, decision curves, deferral quality, calibration.

cs.SY

Closed loop, trust-region controller, stability indicators, control KPIs.

stat.ML

Paired bootstrap, calibration, KDE/shrinkage, robustness notes.

SRMF as Lyapunov (Compitum)

Instantaneous feedback, bounded updates, falsifiable claims.

Statement

In Compitum, the Self-Regulating Mapping Function (SRMF) plays the role of a Lyapunov functional for the discrete update map: metric updates decrease a surrogate energy via line search; the controller’s integral decays under zero drift; stride separation isolates timescales.

See the brief: SRMF as a Lyapunov Functional.
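As a toy illustration of the line-search descent mechanism (Armijo backtracking on a stand-in quadratic energy; the function `V`, its gradient, and all constants are illustrative, not the actual SRMF):

```python
import numpy as np

def backtracking_descent(V, grad, theta, beta=0.5, t0=1.0, c=1e-4):
    """One update with Armijo backtracking: shrink the step until the
    surrogate energy V strictly decreases, so V acts like a discrete
    Lyapunov function along the iterates."""
    g = grad(theta)
    t = t0
    while V(theta - t * g) > V(theta) - c * t * (g @ g):
        t *= beta
    return theta - t * g

# Stand-in energy: an anisotropic quadratic, minimized at the origin.
V = lambda th: float(th[0] ** 2 + 10.0 * th[1] ** 2)
grad = lambda th: np.array([2.0 * th[0], 20.0 * th[1]])

theta = np.array([3.0, -4.0])
energies = [V(theta)]
for _ in range(20):
    theta = backtracking_descent(V, grad, theta)
    energies.append(V(theta))
```

The Armijo condition guarantees each accepted step strictly decreases V whenever the gradient is nonzero, which is the discrete-time analogue of Lyapunov descent the brief formalizes.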

Falsifiability (Tests)

Learning descent, zero-error decay, two-timescale isolation, and routing-level distance descent are tested under tests/invariants/.

Core Science 0.1.1

Geometry • Stability • Coherence • Constraints • Determinism • Pedagogy

  • Geometry: SPD bounds, triangle inequality, ray monotonicity, update descent
  • Stability: Lyapunov decay/saturation/recovery; dV proxy sequences; combined boundedness
  • Coherence: monotone outward, symmetry (±v), inward score direction, mixture discrimination
  • Constraints: feasibility monotone; duals: slack ≈ 0, boundary = 0; monotone/scale sanity
  • Determinism: batch/repeated determinism; paraphrase flip budget + explainability
  • Pedagogy: practice raises evidence/utility (βs > 0); prepared environment fixes constraints