HYPERLEEZUS
System Transparency

How It Works

Hyper Leezus runs a fully automated ML pipeline — ingesting real odds and game results every two hours, retraining models nightly, and blending statistical predictions with live market consensus to surface genuine edges.

The Pipeline

Step 1
Every 2 hours

Data Collection

  • Live pre-game odds from 20+ sportsbooks via The-Odds-API
  • Completed game scores stored with matching game UUIDs
  • Consensus win probability averaged across all bookmakers
  • ESPN historical scores backfilled for rolling team stats
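
The consensus probability can be sketched as follows. This is a minimal illustration, not the production code: it assumes American moneyline odds as input and removes the vig by normalizing each book's two raw probabilities to sum to 1, which is one common method. The function names are hypothetical.

```python
def american_to_implied(odds: int) -> float:
    """Convert American moneyline odds to a raw implied probability (vig included)."""
    if odds < 0:
        return -odds / (-odds + 100)
    return 100 / (odds + 100)


def consensus_home_prob(books: list[tuple[int, int]]) -> float:
    """Average the vig-free home win probability across bookmakers.

    Each entry is (home_odds, away_odds) from one book. Normalizing the two
    raw probabilities so they sum to 1 is an assumed vig-removal method.
    """
    fair_probs = []
    for home_odds, away_odds in books:
        p_home = american_to_implied(home_odds)
        p_away = american_to_implied(away_odds)
        fair_probs.append(p_home / (p_home + p_away))
    return sum(fair_probs) / len(fair_probs)
```

For example, `consensus_home_prob([(-150, 130), (-145, 125)])` averages two books that both favor the home team and returns a value between 0.5 and 0.6.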

Step 2
Daily at 4 AM

Model Training

  • 10-game rolling team averages, computed only from games before the target game, to prevent data leakage
  • XGBoost classifier outputs win probability per league
  • Gradient Boosting regressors predict home & away scores
  • 5 leagues trained independently: NBA, NHL, MLB, NFL, NCAAB
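
A leak-free rolling average might look like the sketch below. The key point is ordering: each game's feature is computed before that game's own result enters the window. Averaging over fewer than 10 games early in a season, and returning `None` when there is no history, are assumptions of this sketch. The function name is hypothetical.

```python
from collections import deque
from typing import Optional


def prior_rolling_avg(values: list[float], window: int = 10) -> list[Optional[float]]:
    """For each game i, average up to `window` values from games before i.

    The current game's value is appended only after its feature is computed,
    so the feature never sees the outcome it is meant to predict.
    """
    recent: deque = deque(maxlen=window)  # oldest game drops out automatically
    features: list[Optional[float]] = []
    for value in values:
        features.append(sum(recent) / len(recent) if recent else None)
        recent.append(value)
    return features
```

With point margins `[4, -2, 6]` and `window=2`, the features are `[None, 4.0, 1.0]`: the third game sees only the first two results.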

Step 3
Real-time

Prediction & Edge

  • ML output (60% weight) blended with market consensus probability (40% weight)
  • Moneyline edge flagged when blended probability exceeds the market-implied probability by more than 3.5 percentage points
  • O/U edge detected when projected total deviates more than 4% from the league season average
  • Confidence derived from how far probability sits from 50%
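
The confidence mapping is not spelled out here. One plausible reading, shown purely as an assumption, is that confidence equals the probability of the favored side: 0.5 plus the distance from 50%. The real mapping may be scaled differently.

```python
def confidence(prob: float) -> float:
    """Map a win probability to a confidence value in [0.5, 1.0].

    Assumption: confidence is the favored side's probability, i.e. 0.5 plus
    the distance from 50%. This is a guess at the mapping, not the actual one.
    """
    return 0.5 + abs(prob - 0.5)
```

Under this reading, a home win probability of 38% and one of 62% both carry the same confidence, since each is 12 points from even.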

Live Model Status

Updated after each training run
League | Status | Training Samples | Accuracy | Log Loss | Calibration (ECE) | Est. ROI | Last Trained

Accuracy ≥ 55% together with a positive estimated ROI suggests the model is beating the market. Calibration (ECE, expected calibration error) measures how closely predicted probabilities match observed outcome frequencies; lower is better.
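
For reference, ECE can be computed roughly as follows. This is a generic sketch using equal-width bins over the predicted home win probability; the bin count of 10 is an assumption, and the pipeline's exact variant is not stated.

```python
def expected_calibration_error(probs, outcomes, n_bins=10):
    """Sample-weighted mean gap between predicted probability and hit rate per bin.

    probs: predicted home win probabilities in [0, 1]
    outcomes: 1 if the home team won, else 0
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_prob = sum(p for p, _ in bucket) / len(bucket)
        hit_rate = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_prob - hit_rate)
    return ece
```

A model that predicts 80% for five games and wins four of them is perfectly calibrated on that bin, giving an ECE of approximately 0.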

Training Features

13 features across 4 categories feed each league's model. All performance features use rolling 10-game averages computed from games before the target game — no data leakage.

Performance
Power Rating Diff

Rolling avg point differential (home minus away), last 10 games

Offensive Rating Diff

Points per possession differential across recent games

Defensive Rating Diff

Points allowed per possession differential across recent games

Pace Differential

Possessions per minute difference — predicts total scoring

Situational
Rest Days Diff

Days of rest between games, home minus away

Travel Fatigue

Miles traveled divided by rest days — penalizes cross-country back-to-backs

Injury Impact Diff

Summed injury impact scores (out=1.0, questionable=0.45) per team
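
The travel fatigue and injury features can be sketched as below. The two status weights come from the description above. Everything else is an assumption of this sketch: the function names, the 0.0 weight for any other status, the floor of one rest day to avoid division by zero, and the home-minus-away direction of the injury differential.

```python
# The two weights come from the text; treating any other status as 0.0
# is an assumption made for this sketch.
STATUS_WEIGHT = {"out": 1.0, "questionable": 0.45}


def travel_fatigue(miles: float, rest_days: int) -> float:
    """Miles traveled divided by rest days.

    Assumption: rest days are floored at 1 so a zero-rest back-to-back
    does not divide by zero; the real pipeline may handle this differently.
    """
    return miles / max(rest_days, 1)


def injury_impact(statuses: list[str]) -> float:
    """Sum the impact scores for one team's injury report."""
    return sum(STATUS_WEIGHT.get(status.lower(), 0.0) for status in statuses)


def injury_impact_diff(home: list[str], away: list[str]) -> float:
    """Home minus away, matching the other differential features (assumed)."""
    return injury_impact(home) - injury_impact(away)
```

A 2,400-mile trip on one day of rest scores 2400.0, while the same trip on three days of rest scores 800.0.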

Market Signals
Market Implied Prob

Consensus home win probability averaged across all bookmakers (vig-adjusted)

Line Movement

Spread change from open — sharp movement signals informed action

Public Betting %

Percentage of public bets on home team — contrarian signal

Sharp Money %

Percentage of sharp (high-limit) bets on home — strongest market signal

Environmental
Sentiment Diff

Reddit post sentiment score for home team minus away team

Weather Severity

Composite of wind speed, precipitation, and temperature deviation (outdoor sports only)

The 60 / 40 Blend

Why not use the ML model at 100%?

60% ML
40% Market

Sportsbook lines aggregate information from thousands of sharp bettors and professional syndicates. A model trained on weeks of data cannot systematically beat that signal — but it can add value on top of it.

By blending, the system inherits the market's information advantage while letting the model contribute where market odds are slow to adjust: rest differentials, travel schedules, and recent form.
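
The blend itself is a weighted average. A minimal sketch, with hypothetical names:

```python
ML_WEIGHT = 0.60      # weight on the model's win probability
MARKET_WEIGHT = 0.40  # weight on the market consensus probability


def blend(ml_prob: float, market_prob: float) -> float:
    """Weighted average of model and market home win probabilities."""
    return ML_WEIGHT * ml_prob + MARKET_WEIGHT * market_prob
```

If the model says 60% and the market says 50%, the blended probability is 56%. Because the blend moves only 60% of the way from the market toward the model, the blended probability can exceed the market by 3.5 points only when the raw model output exceeds it by about 5.8 points (3.5 / 0.6).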

Edge Detection

How picks are surfaced

Moneyline Edge
blended_prob − market_implied_prob > 3.5%

The model sees the home team as meaningfully more likely to win than the market implies.
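
As a sketch, the moneyline check reduces to a single comparison. Probabilities are assumed to be expressed as fractions, so 3.5% becomes 0.035. The function name is hypothetical.

```python
MONEYLINE_THRESHOLD = 0.035  # 3.5 percentage points, from the rule above


def moneyline_edge(blended_prob: float, market_implied_prob: float) -> tuple[float, bool]:
    """Return the edge size and whether it clears the threshold."""
    edge = blended_prob - market_implied_prob
    return edge, edge > MONEYLINE_THRESHOLD
```

A blended probability of 56% against a market-implied 50% is flagged; 52% against 50% is not.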

Spread Edge
|spread_diff| > 1.5 pts when confidence > 62%

Predicted margin deviates from the line and the model is confident.
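
A sketch of the spread rule follows. It assumes `spread_diff` is the predicted home margin minus the home margin implied by the line (so a home -5.5 line implies a margin of 5.5), and that confidence is a fraction between 0 and 1. Both conventions, and the function name, are assumptions.

```python
SPREAD_THRESHOLD_PTS = 1.5   # minimum gap between prediction and line
CONFIDENCE_THRESHOLD = 0.62  # minimum model confidence


def spread_edge(predicted_margin: float, line_margin: float, confidence: float) -> bool:
    """Flag a spread edge when the margin gap and confidence both clear their thresholds.

    predicted_margin: model's predicted home margin of victory
    line_margin: home margin implied by the spread (assumed sign convention)
    """
    spread_diff = predicted_margin - line_margin
    return abs(spread_diff) > SPREAD_THRESHOLD_PTS and confidence > CONFIDENCE_THRESHOLD
```

A predicted margin of 7.5 against a line of 5.5 at 65% confidence is flagged. The same gap at 60% confidence is not.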

O/U Edge
|projected_total − league_avg| / league_avg > 4%

Projected total deviates significantly from the league season average — mean reversion signal.
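
The O/U rule can be sketched the same way. The projected total is assumed here to be the sum of the predicted home and away scores. The function name is hypothetical.

```python
OU_THRESHOLD = 0.04  # 4% relative deviation, from the rule above


def over_under_edge(projected_total: float, league_avg_total: float) -> bool:
    """Flag an O/U edge when the projected total is more than 4% off the league average."""
    deviation = abs(projected_total - league_avg_total) / league_avg_total
    return deviation > OU_THRESHOLD
```

Against a league average of 225 points, a projected total of 235 (about 4.4% higher) is flagged, while 228 (about 1.3% higher) is not.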