Validation & Diagnostics

Status: In Progress

This section provides empirical evidence that the system behaves according to the specifications established in the Methodology. Validation focuses on reliability and correctness, not on predictive performance or outcome optimization.

Relationship to Methodology: The Methodology & Scope section defines what the system is designed to do and the assumptions under which it operates. This Validation section provides empirical evidence that the system behaves according to those specifications. Methodology establishes the contract; Validation verifies adherence to that contract.

02. Data Integrity Verification

Property Validated

Input data is complete, correctly formatted, and free from corruption. Missing values are identified and handled according to documented imputation rules.

Why It Matters

All downstream computations depend on data integrity. Corrupted or missing data can propagate errors through the entire system. Data QA ensures that the foundation upon which all analysis rests is sound.

Verification Method

  • Null/undefined field counts across all data sources
  • Schema validation against expected field types
  • Coverage disclosure (champions, players, games)
  • Imputation verification for handled missing values

Module: Data QA — Validates input data integrity before any model computations occur.
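
As a rough illustration, the sketch below counts null/undefined fields and flags type mismatches against an expected schema. The record shape, field names, and schema are illustrative assumptions, not the actual Data QA implementation.

// Sketch of a data QA pass. Field names and the expected schema are illustrative assumptions.
type FieldType = "string" | "number" | "boolean";

// Hypothetical expected schema for one data source.
const expectedSchema: Record<string, FieldType> = {
  playerId: "string",
  championId: "string",
  gamesPlayed: "number",
};

interface QaReport {
  nullCounts: Record<string, number>;   // missing or undefined values per field
  typeViolations: string[];             // "field: expected X, got Y" messages
}

function runDataQa(records: Record<string, unknown>[]): QaReport {
  const nullCounts: Record<string, number> = {};
  const typeViolations: string[] = [];
  for (const record of records) {
    for (const [field, expectedType] of Object.entries(expectedSchema)) {
      const value = record[field];
      if (value === null || value === undefined) {
        nullCounts[field] = (nullCounts[field] ?? 0) + 1;   // count missing values
      } else if (typeof value !== expectedType) {
        typeViolations.push(field + ": expected " + expectedType + ", got " + typeof value);
      }
    }
  }
  return { nullCounts, typeViolations };
}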

03. Model Calibration Assessment

Property Validated

Probability outputs are well-calibrated: when the model assigns 70% probability to an outcome, that outcome should occur approximately 70% of the time in held-out data.

Why It Matters

Calibration ensures that probability outputs can be interpreted at face value. Poorly calibrated models produce probabilities that do not correspond to actual frequencies, undermining their utility for decision support.

Verification Method

  • Temporal split: train on earlier patches, test on later patches
  • Expected Calibration Error (ECE) computation
  • Log loss and Brier score on held-out data
  • Prior strength sensitivity analysis

Module: M1 (Role Posterior) — Validates Bayesian role probability calibration using temporal holdout evaluation.
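
As a rough illustration, the sketch below computes the Brier score and Expected Calibration Error over held-out predictions. The data shape and the choice of 10 equal-width bins are assumptions for illustration, not the module's actual configuration.

// Sketch: calibration metrics over held-out (predicted probability, actual outcome) pairs.
interface Prediction {
  p: number;        // predicted probability of the outcome
  outcome: 0 | 1;   // 1 if the outcome occurred, 0 otherwise
}

// Brier score: mean squared error between predicted probability and observed outcome.
function brierScore(preds: Prediction[]): number {
  return preds.reduce((sum, { p, outcome }) => sum + (p - outcome) ** 2, 0) / preds.length;
}

// Expected Calibration Error with equal-width bins (bin count is an assumption).
function expectedCalibrationError(preds: Prediction[], bins = 10): number {
  const binned: Prediction[][] = Array.from({ length: bins }, () => []);
  for (const pred of preds) {
    const index = Math.min(Math.floor(pred.p * bins), bins - 1);   // p === 1 falls in the last bin
    binned[index].push(pred);
  }
  // Weighted average of |mean confidence - observed frequency| across non-empty bins.
  let ece = 0;
  for (const bin of binned) {
    if (bin.length === 0) continue;
    const avgConfidence = bin.reduce((s, { p }) => s + p, 0) / bin.length;
    const observedRate = bin.reduce((s, { outcome }) => s + outcome, 0) / bin.length;
    ece += (bin.length / preds.length) * Math.abs(avgConfidence - observedRate);
  }
  return ece;
}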

04. Temporal Stability Analysis

Property Validated

System outputs remain stable over time. Patterns identified in earlier data persist in later data at acceptable rates, indicating that the system captures durable signals rather than transient noise.

Why It Matters

Temporal stability distinguishes signal from noise. If player pools or context adjustments change dramatically between time periods, the system may be fitting to noise rather than capturing meaningful patterns.

Verification Method

  • Recall@K: fraction of top-K items in test period that appeared in train period
  • Bootstrap stability: variance of outputs under resampling
  • Context filter stability across patch boundaries

Modules: M2 (Context Filter), M4 (Player Pool) — Validate that context adjustments and player pools remain stable over time.
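
As a rough illustration, the sketch below computes the Recall@K check described above: the fraction of the test period's top-K items that also appeared in the train period. Item identifiers and the example values are hypothetical.

// Sketch: Recall@K stability between two time periods.
function recallAtK(trainItems: string[], testTopK: string[]): number {
  if (testTopK.length === 0) return 0;
  const seenInTrain = new Set(trainItems);
  const hits = testTopK.filter((item) => seenInTrain.has(item)).length;
  return hits / testTopK.length;
}

// Hypothetical example: 2 of the 3 top test-period champions also appeared in the train period.
// recallAtK(["Ahri", "Orianna", "Syndra"], ["Ahri", "Syndra", "Azir"]) === 2 / 3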

05. Conservatism Enforcement

Property Validated

The system applies appropriate conservatism: small samples do not produce strong signals, low-sample players cannot generate STRONG evidence, and fallback to baseline occurs when data is insufficient.

Why It Matters

Conservatism prevents over-assertion from limited data. Without proper gating, the system could surface misleading signals based on statistical noise, undermining trust and decision quality.

Verification Method

  • Low-sample gating: verify that players with <10 games produce no STRONG evidence
  • Fallback correctness: verify that small-sample contexts return the global baseline
  • Monotonicity: verify that higher raw lift produces higher scores (not inverted)
  • Conservatism gap: measure the gap between obs and its conservative lower bound obsLower across sample sizes

Modules: M2 (Context Filter), M3 (Threat Signals), M4 (Player Pool) — Validate fallback behavior, monotonicity, and low-sample gating.
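
As a rough illustration, the sketch below asserts the low-sample gate and the monotonicity property. The evidence shape, the 10-game threshold constant, and the assumption that the compared samples are otherwise comparable are illustrative, not the modules' actual implementation.

// Sketch: conservatism checks. Data shapes and the threshold constant are illustrative assumptions.
interface Evidence {
  games: number;
  strength: "WEAK" | "MODERATE" | "STRONG";
  rawLift: number;   // unadjusted lift for the context
  score: number;     // conservative score after gating/shrinkage
}

const MIN_GAMES_FOR_STRONG = 10;   // hypothetical constant encoding the low-sample gating rule

// Low-sample gating: no entry below the game threshold may carry STRONG evidence.
function checkLowSampleGating(evidence: Evidence[]): boolean {
  return evidence.every((e) => e.games >= MIN_GAMES_FOR_STRONG || e.strength !== "STRONG");
}

// Monotonicity: among comparable samples, higher raw lift must not produce a lower score.
function checkMonotonicity(evidence: Evidence[]): boolean {
  const sorted = [...evidence].sort((a, b) => a.rawLift - b.rawLift);
  return sorted.every((e, i) => i === 0 || e.score >= sorted[i - 1].score);
}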

06. Action Safety Verification

Property Validated

The draft state machine operates correctly: phase transitions follow valid sequences, actions are only permitted in appropriate phases, and evidence attribution is deterministic (same inputs always produce same outputs).

Why It Matters

Action safety ensures the system cannot enter invalid states or produce inconsistent outputs. Determinism is essential for debugging, auditing, and building trust in system behavior.

Verification Method

  • State machine tests: verify valid phase transitions
  • Action safety: verify actions are rejected in invalid phases
  • Determinism: verify identical inputs produce identical outputs
  • Boundary tests: verify edge cases are handled correctly

Modules: M5 (Evidence Trace), M6 (Draft Decision) — Validate determinism, boundary correctness, and state machine safety.
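
As a rough illustration, the sketch below shows a valid-transition check and a determinism check. The phase names, transition table, and function names are illustrative assumptions, not the actual draft state machine.

// Sketch: action-safety checks. Phases and transitions here are hypothetical.
type Phase = "BAN_1" | "PICK_1" | "BAN_2" | "PICK_2" | "COMPLETE";

const validTransitions: Record<Phase, Phase[]> = {
  BAN_1: ["PICK_1"],
  PICK_1: ["BAN_2"],
  BAN_2: ["PICK_2"],
  PICK_2: ["COMPLETE"],
  COMPLETE: [],
};

// State machine test: a transition is allowed only if listed for the current phase.
function isValidTransition(from: Phase, to: Phase): boolean {
  return validTransitions[from].includes(to);
}

// Determinism test: identical inputs, run twice, must serialize to identical outputs.
function checkDeterminism<I, O>(compute: (input: I) => O, input: I): boolean {
  return JSON.stringify(compute(input)) === JSON.stringify(compute(input));
}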

07. Validation Scope Limitations

The validation suite verifies system reliability, not predictive performance. The following are explicitly outside the scope of validation:

  • Win Rate Prediction: No claims are made about draft quality or game outcomes.
  • Meta Forecasting: All analysis is historical; no predictions of future meta states.
  • UI Rendering: Validation covers data and logic layers only.
  • Causal Relationships: All outputs are correlational, not causal.

CLI

Running the Validation Suite

# Run all validations
npm run validate:all

# Run individual modules
npm run validate:data-qa
npm run validate:m1
npm run validate:m2
npm run validate:m3
npm run validate:m4
npm run validate:m5
npm run validate:m6

Reports are generated in app/docs/validation/