Operations Runbook

Overview

  • Compitum emits one structured certificate per decision (CLI: --trace; API: cert.to_json()). The certificate fields are designed to map directly onto logs and metrics for SRE/ops use.

Logging (Structured)

  • For each decision, log the following certificate fields as JSON:

    • model, utility, utility_components.quality, utility_components.cost

    • constraints.feasible, constraints.shadow_prices

    • boundary_analysis.gap, boundary_analysis.entropy, boundary_analysis.sigma

    • drift_status.trust_radius, drift_status.ema

  • Example: see examples/cert_to_logging.py.
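
As an illustration of the field list above, here is a minimal sketch that flattens a certificate into one JSON log line. It assumes cert.to_json() returns a JSON string whose nesting mirrors the dotted field names; that layout, the logger name, and the helper names (get_path, cert_to_log_record, log_decision) are assumptions for this sketch, not part of the Compitum API. The maintained example remains examples/cert_to_logging.py.

```python
import json
import logging

# Dotted paths copied from the field list above. The nested-dict layout
# they imply is an assumption about the certificate JSON.
LOG_FIELDS = [
    "model",
    "utility",
    "utility_components.quality",
    "utility_components.cost",
    "constraints.feasible",
    "constraints.shadow_prices",
    "boundary_analysis.gap",
    "boundary_analysis.entropy",
    "boundary_analysis.sigma",
    "drift_status.trust_radius",
    "drift_status.ema",
]


def get_path(cert, dotted):
    """Walk a dotted path through nested dicts; return None if a key is missing."""
    node = cert
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def cert_to_log_record(cert_json):
    """Parse certificate JSON and keep only the fields listed in LOG_FIELDS."""
    cert = json.loads(cert_json)
    return {path: get_path(cert, path) for path in LOG_FIELDS}


def log_decision(cert_json, logger=None):
    """Emit one JSON log line per decision and return the flattened record."""
    logger = logger or logging.getLogger("compitum.decisions")  # hypothetical logger name
    record = cert_to_log_record(cert_json)
    logger.info(json.dumps(record, sort_keys=True))
    return record
```

Missing fields are logged as null rather than dropped, so downstream parsers always see the same set of keys.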

Metrics (Suggested)

  • Gauges/histograms:

    • Utility (U), quality, cost

    • Gap, entropy, sigma

    • Trust radius, EMA

  • Counters:

    • Feasible/infeasible decisions

    • Deferrals (if policy triggers on ambiguity)

  • Derived:

    • “At frontier” rate (gap ~ 0)

    • Active-constraint count (nonzero shadow prices)
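
The two derived metrics above could be computed roughly as follows. This is a sketch under stated assumptions: the tolerance eps that decides when a gap counts as "~ 0" and the tolerance tol for a "nonzero" shadow price are hypothetical values, and shadow prices are assumed to arrive either as a list of numbers or as a name-to-price mapping.

```python
def at_frontier_rate(gaps, eps=1e-3):
    """Fraction of decisions whose gap is within eps of zero.

    eps is a hypothetical tolerance; tune it to the scale of your gap values.
    """
    if not gaps:
        return 0.0
    return sum(1 for g in gaps if abs(g) <= eps) / len(gaps)


def active_constraint_count(shadow_prices, tol=1e-9):
    """Count constraints whose shadow price is nonzero (beyond tol).

    Accepts a list of prices or a dict mapping constraint name to price;
    both shapes are assumptions about the certificate payload.
    """
    if isinstance(shadow_prices, dict):
        values = shadow_prices.values()
    else:
        values = shadow_prices
    return sum(1 for p in values if abs(p) > tol)
```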

Alerts (Initial Thresholds)

  • Constraint compliance (share of feasible decisions) below 99.9% over a 5–15 min window

  • Prolonged high ambiguity:

    • Gap < 0.02 and entropy > 0.8 for more than 1% of decisions in a 15 min window

  • Drift tightness:

    • Trust radius persistently low (e.g., < 0.2) for more than N decisions, with N chosen per deployment
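
The initial thresholds above can be sketched as a single check over a window of recent decisions. The Decision record, the alert names, and the default n_low=50 are hypothetical. The sketch also reads "persistently low" as an unbroken run of low trust radius at the end of the window, which is one possible interpretation rather than a documented rule.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """Hypothetical per-decision record extracted from certificate logs."""
    feasible: bool
    gap: float
    entropy: float
    trust_radius: float


def evaluate_alerts(window, n_low=50):
    """Return the names of alerts that fire for one window of decisions."""
    alerts = []
    if not window:
        return alerts
    total = len(window)

    # Constraint compliance below 99.9%.
    compliance = sum(1 for d in window if d.feasible) / total
    if compliance < 0.999:
        alerts.append("constraint_compliance")

    # Prolonged high ambiguity: gap < 0.02 and entropy > 0.8 for > 1% of decisions.
    ambiguous = sum(1 for d in window if d.gap < 0.02 and d.entropy > 0.8)
    if ambiguous / total > 0.01:
        alerts.append("high_ambiguity")

    # Drift tightness: count the unbroken run of low trust radius at the window's end.
    run = 0
    for d in reversed(window):
        if d.trust_radius < 0.2:
            run += 1
        else:
            break
    if run > n_low:
        alerts.append("drift_tightness")

    return alerts
```

Calling evaluate_alerts on the decisions from the last 15 minutes at each evaluation tick covers all three rules with one pass over the window.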

Dashboards (Minimal)

  • Efficiency: U, quality, cost (p50/p90)

  • Ambiguity: gap, entropy (p50/p90), at-frontier rate

  • Constraints: feasible rate, active constraint count, top shadow prices

  • Drift: trust radius and EMA trend
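
For the p50/p90 panels above, most metrics backends compute percentiles natively. Where they must be derived from raw logged values instead, a minimal sketch with the Python standard library looks like this. The helper name p50_p90 is hypothetical, and the "inclusive" interpolation method is a choice made for this sketch; a backend may use a different percentile definition.

```python
import statistics


def p50_p90(values):
    """Return (p50, p90) for a list of at least two numeric values."""
    # quantiles(n=10) returns the nine decile cut points:
    # index 4 is the 50th percentile, index 8 is the 90th.
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    return deciles[4], deciles[8]
```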

Run Procedures

  • Standard run:

    • Use fixed configs (defaults, constraints)

    • Log certificate JSON for each decision

    • Export metrics from logs via your pipeline (e.g., ELK/OTel)

  • Rollback:

    • Revert to previous frozen config (tagged release)

    • Reduce update stride or tighten trust radius if instability appears

Knobs (Tuning)

  • lambda (WTP): cost sensitivity

  • Metric params: D, rank, delta (stability)

  • Boundary thresholds: gap_threshold, entropy_threshold, sigma_threshold

  • Update cadence: update_stride

SRE Tests (Smoke)

  • Route a fixed prompt set and assert:

    • No infeasible certificates

    • U, gap, entropy within expected bands

    • Logs parse as valid JSON, and the metrics exporter sees all expected fields
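
A smoke check along these lines could be sketched as below. It assumes each log line is a flat JSON object keyed by the dotted field names from the Logging section. The function name smoke_check and the band values in the usage example are hypothetical, so substitute the expected bands for your own prompt set.

```python
import json


def smoke_check(log_lines, bands):
    """Return a list of failure messages; an empty list means the check passed.

    bands maps a field name to an inclusive (low, high) expected range.
    """
    failures = []
    for i, line in enumerate(log_lines):
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            failures.append(f"line {i}: invalid JSON")
            continue
        # No infeasible certificates are allowed in the fixed prompt set.
        if rec.get("constraints.feasible") is not True:
            failures.append(f"line {i}: infeasible or missing feasibility flag")
        # Each banded field must be present and inside its expected range.
        for field, (low, high) in bands.items():
            value = rec.get(field)
            if value is None:
                failures.append(f"line {i}: missing field {field}")
            elif not low <= value <= high:
                failures.append(f"line {i}: {field}={value} outside [{low}, {high}]")
    return failures


# Usage with hypothetical bands:
example_bands = {
    "utility": (0.0, 1.0),
    "boundary_analysis.gap": (0.0, 1.0),
    "boundary_analysis.entropy": (0.0, 1.0),
}
```

Returning every failure message, rather than stopping at the first, makes a failed smoke run easier to triage.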

References