Version 6.1 Update
Building an Independent ELO System with Dynamic Home Advantage Adjustment
This update documents a significant overhaul of our ELO-based betting model. We’ve removed the dependency on external APIs, rebuilt our historical data from scratch, and introduced a mathematically grounded home advantage adjustment. The result is a more accurate, self-sufficient system that properly accounts for venue when calculating fair odds.
Key changes:
- Removed ClubELO API dependency (was causing data corruption)
- Created independent ELO calculator with margin-of-victory enhancement
- Rebuilt 5 years of historical ELO data from 1,965 matches
- Introduced dynamic home/away probability adjustment (+11% home / -11% away)
- Added rolling calculation so adjustment factors update with new data
The Problem We Discovered
While preparing Matchweek 17’s analysis, I noticed something deeply wrong with our data. Liverpool’s elo_change_last_10 was showing -72, which seemed plausible given their recent form. But Spurs showed +309 over 10 games. That’s not a form swing - that’s a data corruption signal.
Digging deeper revealed the source: our ClubELO API integration was failing intermittently, and when it failed, it was doing so silently. The API would sometimes return ratings around 2000, sometimes around 1800, and occasionally default to 1500 when completely unavailable.
The Damage
I ran an analysis across our entire elo_history.json and found:
| Issue | Count |
|---|---|
| Swings > 50 points (single day) | 659 |
| Swings > 100 points | 412 |
| Swings > 200 points | 89 |
| Suspicious 1500 values | 9 |
| Largest single swing | +266 |
For context, a legitimate ELO swing from a single match maxes out around 30-35 points (for an extreme upset with a large margin of victory). We had 89 instances of swings exceeding 200 points. The data was unusable.
Here’s what Arsenal’s ELO looked like on consecutive days:
2024-03-15: 2037
2024-03-16: 1809
2024-03-17: 2027
2024-03-18: 1808
The ClubELO API was returning data from two different rating systems on alternating days. Our historical probabilities were being calculated on garbage.
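The swing counts in the table above can be reproduced with a short scan over each team's rating series. The sketch below is illustrative only: it assumes a hypothetical layout for elo_history.json (team name mapped to a list of date/elo records), which the post does not specify, and `count_swings` is a name introduced here.

```python
from typing import Dict, List

# Hypothetical history layout (an assumption, not the real file format):
# {team: [{"date": "YYYY-MM-DD", "elo": number}, ...]}
History = Dict[str, List[dict]]


def count_swings(history: History, thresholds=(50, 100, 200)) -> Dict[int, int]:
    """Count consecutive-record rating changes whose absolute size exceeds each threshold."""
    counts = {t: 0 for t in thresholds}
    for records in history.values():
        ordered = sorted(records, key=lambda r: r["date"])
        for prev, curr in zip(ordered, ordered[1:]):
            swing = abs(curr["elo"] - prev["elo"])
            for t in thresholds:
                if swing > t:
                    counts[t] += 1
    return counts


# The Arsenal sequence quoted above: three consecutive swings of more than 200 points.
sample = {
    "Arsenal": [
        {"date": "2024-03-15", "elo": 2037},
        {"date": "2024-03-16", "elo": 1809},
        {"date": "2024-03-17", "elo": 2027},
        {"date": "2024-03-18", "elo": 1808},
    ]
}
print(count_swings(sample))  # {50: 3, 100: 3, 200: 3}
```

A check like this could run after every data refresh, so that a silent upstream failure shows up as a non-zero count instead of corrupting the probabilities.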
The Solution: Independence
Rather than find a more reliable API, I decided to make the model fully independent. We have match data going back to 2020. We have the ELO formula. Why rely on external sources at all?
The ELO Formula
The standard ELO update formula is:
New Rating = Old Rating + K × (Actual - Expected)
Where:
- K is the sensitivity factor (we use K=20)
- Actual is the match result (1 for win, 0.5 for draw, 0 for loss)
- Expected is the pre-match probability based on rating difference
The expected score is calculated as:
Expected = 1 / (1 + 10^((Opponent Rating - Your Rating) / 400))
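As a quick worked example of the two formulas, here is a minimal sketch with K=20. The ratings are illustrative (not taken from the dataset), and the function names are introduced here for the example.

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    """Pre-match expectation (win = 1, draw = 0.5, loss = 0) from the rating gap."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def updated_rating(rating: float, opponent_rating: float, actual: float, k: float = 20.0) -> float:
    """Standard ELO update: old rating plus K times (actual - expected)."""
    return rating + k * (actual - expected_score(rating, opponent_rating))


# Illustrative ratings: a 1500 side beats a 1600 side.
underdog_expected = expected_score(1500, 1600)
underdog_new = updated_rating(1500, 1600, actual=1.0)
favourite_new = updated_rating(1600, 1500, actual=0.0)

print(f"{underdog_expected:.3f}")  # 0.360
print(f"{underdog_new:.1f}")       # 1512.8
print(f"{favourite_new:.1f}")      # 1587.2
```

The update is zero-sum: the 12.8 points the underdog gains are exactly what the favourite loses.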
Adding Margin of Victory
Standard ELO treats all wins equally. A 1-0 scrappy win counts the same as a 5-0 demolition. This loses information. We enhanced the formula with a margin-of-victory (MOV) multiplier:
import math

def calculate_mov_multiplier(goal_diff: int, elo_diff: int) -> float:
    """
    Margin of victory multiplier.

    goal_diff and elo_diff are measured from the same side, so
    opposite signs mean the lower-rated team won.
    - Draws: multiplier = 1.0
    - 1-goal margin: multiplier ≈ 1.05
    - 2-3 goal margins: multiplier ≈ 1.25-1.39
    - 4+ goal margins: multiplier ≈ 1.50 and up
    - Upsets get additional boost
    """
    if goal_diff == 0:
        return 1.0
    # Log scaling: each extra goal adds less than the one before
    base = math.log(abs(goal_diff) + 1)
    # Upset bonus: the winner was rated more than 50 points lower
    if (goal_diff > 0 and elo_diff < -50) or (goal_diff < 0 and elo_diff > 50):
        upset_factor = 1.0 + abs(elo_diff) / 500
        base *= upset_factor
    return 0.7 + base * 0.5
This means (evaluating the function as written):
- A 1-0 win: ~1.05× multiplier (close to standard ELO)
- A 2-0 win: ~1.25× multiplier (noticeably more)
- A 5-0 thrashing: ~1.60× multiplier (significantly more)
- A 5-0 upset by an underdog: higher still, e.g. ~1.95× when the winner is rated 200 points lower
Home Advantage in the Rating System
When calculating ELO updates, we also account for home advantage. The home team’s rating is temporarily boosted by 100 points when calculating expected score. This means:
- If teams are equal on paper, the home team’s expected score is ~0.64 (a win counts as 1, a draw as 0.5)
- Beating a team at their home is worth more ELO than beating them away
- Losing at home costs more than losing away
Rebuilding History
With the formula defined, I rebuilt our entire ELO history from scratch using matches_data.json - our source of truth containing 1,965 Premier League matches from January 2020 to November 2025.
The Process
from collections import defaultdict

# Start all teams at 1500
team_elos = defaultdict(lambda: 1500)

# Process matches chronologically
for match in sorted(matches, key=lambda x: x['date']):
    home_team = match['home_team']
    away_team = match['away_team']
    home_goals = match['home_goals']
    away_goals = match['away_goals']

    # Calculate ELO change with MOV
    home_change, away_change = calculate_elo_change(
        home_elo=team_elos[home_team],
        away_elo=team_elos[away_team],
        home_goals=home_goals,
        away_goals=away_goals,
        k_factor=20,
        home_advantage=100
    )

    # Update ratings
    team_elos[home_team] += home_change
    team_elos[away_team] += away_change

    # Record in history
    record_elo_history(home_team, match['date'], team_elos[home_team])
    record_elo_history(away_team, match['date'], team_elos[away_team])
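The loop calls calculate_elo_change, whose body the post doesn't show. The following is a minimal sketch of what it might look like, combining the update rule, the MOV multiplier, and the 100-point home boost described earlier. Two things are assumptions on my part: that the update is zero-sum (the away change is the negative of the home change), and that the MOV multiplier receives the raw rating gap without the home boost.

```python
import math
from typing import Tuple


def mov_multiplier(goal_diff: int, elo_diff: float) -> float:
    """MOV multiplier as given earlier in the post (both arguments are home minus away)."""
    if goal_diff == 0:
        return 1.0
    base = math.log(abs(goal_diff) + 1)
    if (goal_diff > 0 and elo_diff < -50) or (goal_diff < 0 and elo_diff > 50):
        base *= 1.0 + abs(elo_diff) / 500
    return 0.7 + base * 0.5


def calculate_elo_change(home_elo: float, away_elo: float,
                         home_goals: int, away_goals: int,
                         k_factor: float = 20, home_advantage: float = 100) -> Tuple[float, float]:
    """Return (home_change, away_change) for one match."""
    # The home boost only affects the expectation, never the stored rating.
    boosted_home = home_elo + home_advantage
    expected_home = 1.0 / (1.0 + 10 ** ((away_elo - boosted_home) / 400.0))

    if home_goals > away_goals:
        actual_home = 1.0
    elif home_goals == away_goals:
        actual_home = 0.5
    else:
        actual_home = 0.0

    # Assumption: the MOV multiplier sees the raw rating gap, without the home boost.
    multiplier = mov_multiplier(home_goals - away_goals, home_elo - away_elo)
    home_change = k_factor * multiplier * (actual_home - expected_home)
    # Assumption: zero-sum update, so the away side gets the mirror image.
    return home_change, -home_change


# Two equally rated teams: a home win earns less than an away win.
home_win = calculate_elo_change(1500, 1500, 1, 0)
away_win = calculate_elo_change(1500, 1500, 0, 1)
print(round(home_win[0], 2), round(away_win[1], 2))  # 7.53 13.4
```

With equal ratings, a 1-0 home win moves about 7.5 points while a 1-0 away win moves about 13.4, which is the asymmetry described in the home advantage section.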
Scaling to Match Expected Values
After processing all matches, our top team (Arsenal) had an ELO of 1753. Historical Premier League ELO systems typically have top teams around 2000-2050. To maintain consistency with expectations, I applied a +284 offset to all ratings.
This is purely cosmetic - the relative differences between teams remain identical. Arsenal at 1753 vs Liverpool at 1673 has the same predictive power as Arsenal at 2037 vs Liverpool at 1957.
Validation
Before and after comparison:
| Metric | Corrupted Data | Repaired Data |
|---|---|---|
| Swings > 50 pts | 659 | 0 |
| Swings > 100 pts | 412 | 0 |
| Largest swing | +266 | +30 |
| Arsenal trajectory | 1809↔2037 yo-yo | Smooth 1793→2037 |
The largest legitimate single-match swing in the repaired data is +30 points, which came from Spurs beating Man City 4-0 - exactly the kind of upset where you’d expect a big rating change.
The Hidden Problem: Venue-Blind Probabilities
With clean historical data, I moved to the next issue. Our elo_bands.json file contains probabilities like:
{
    "band": 5,
    "range": "201-250",
    "stronger_win_pct": 0.6368,
    "draw_pct": 0.1883,
    "weaker_win_pct": 0.1749
}
But stronger_win_pct doesn’t distinguish between the stronger team playing at home versus away. It’s an average of both scenarios.
This is a problem. When we calculated fair odds for Arsenal (2037) vs Wolves (1668), a 369-point gap that falls in Band 8, we were using 71.43% for Arsenal regardless of venue. But Arsenal at the Emirates is very different from Arsenal at Molineux.
Quantifying the Venue Effect
I analysed all 1,949 matches in our dataset, splitting by whether the ELO-stronger team was home or away:
| Scenario | Sample Size | Win Rate |
|---|---|---|
| Stronger team HOME | 970 | 60.4% |
| Stronger team AWAY | 979 | 48.6% |
| Combined | 1,949 | 54.5% |
The stronger team wins 60.4% at home but only 48.6% away. That’s a 12 percentage point swing that our model was ignoring.
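The 1.11 / 0.89 multipliers used in this update follow directly from these three rates: each venue-specific win rate is divided by the combined rate. A small arithmetic check using only the figures in the table above:

```python
# Figures from the venue table above.
home_n, home_rate = 970, 0.604
away_n, away_rate = 979, 0.486

# Sample-weighted combined win rate for the stronger team.
combined = (home_n * home_rate + away_n * away_rate) / (home_n + away_n)

# Venue multipliers: venue-specific rate relative to the venue-blind rate.
home_mult = home_rate / combined
away_mult = away_rate / combined

print(round(combined, 3), round(home_mult, 2), round(away_mult, 2))  # 0.545 1.11 0.89
```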
Band-by-Band Analysis
| Band | Stronger HOME Win% | Stronger AWAY Win% | Home Mult | Away Mult |
|---|---|---|---|---|
| 1 | 43.8% | 32.8% | 1.15 | 0.86 |
| 2 | 56.8% | 44.6% | 1.12 | 0.88 |
| 3 | 62.7% | 46.6% | 1.15 | 0.86 |
| 4 | 67.2% | 60.0% | 1.05 | 0.94 |
| 5 | 72.7% | 57.4% | 1.11 | 0.88 |
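The pattern is consistent: when the stronger team plays at home, its win probability is roughly 5-15% higher (in relative terms) than the venue-blind average, and lower by a similar proportion when it plays away.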
The Adjustment Formula
Rather than rebuild elo_bands.json with separate home/away columns, I implemented a mathematical adjustment:
HOME_ADVANTAGE_MULTIPLIER = 1.11
AWAY_DISADVANTAGE_MULTIPLIER = 0.89

def adjust_probability_for_venue(base_prob, is_stronger_team_home, market):
    """
    Adjust base probability based on venue.
    """
    if market == 'stronger_win':
        if is_stronger_team_home:
            adjusted = base_prob * HOME_ADVANTAGE_MULTIPLIER
        else:
            adjusted = base_prob * AWAY_DISADVANTAGE_MULTIPLIER
    elif market == 'weaker_win':
        # Inverse: weaker team benefits when playing at home
        if is_stronger_team_home:
            adjusted = base_prob * AWAY_DISADVANTAGE_MULTIPLIER
        else:
            adjusted = base_prob * HOME_ADVANTAGE_MULTIPLIER
    elif market == 'draw':
        # Draws slightly more common when stronger team is away
        if is_stronger_team_home:
            adjusted = base_prob * 0.95
        else:
            adjusted = base_prob * 1.05
    else:
        # Without this branch, an unknown market would leave `adjusted` undefined
        raise ValueError(f"Unknown market: {market!r}")
    # Clamp so that a later 1 / probability can never divide by zero
    return min(0.99, max(0.01, adjusted))
After adjustment, probabilities are normalized to sum to 100%.
Real Example: Chelsea vs Everton (Matchweek 17)
Without venue adjustment:
- Band 2 stronger_win_pct = 45.84%
- Fair odds for Chelsea: 2.18
With venue adjustment (Chelsea HOME):
- Adjusted probability: 45.84% × 1.11 = 50.9%
- After normalization: 50.5%
- Fair odds: 1.98
If this were at Goodison (Chelsea AWAY):
- Adjusted probability: 45.84% × 0.89 = 40.8%
- After normalization: 41.1%
- Fair odds: 2.43
The venue changes fair odds from 1.98 to 2.43: Chelsea’s implied win probability is 50.5% at home versus 41.1% away, a gap of about 9 percentage points (23% in relative terms). This matters enormously for identifying value.
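The Chelsea numbers can be reproduced end to end with a few lines. This is a sketch under stated assumptions: the Band 2 weaker-win figure (27.79%) is the one quoted in the Matchweek 17 section, the draw probability is taken as the remainder (my assumption, since the Band 2 draw figure isn't listed), and the 0.01-0.99 clamp is omitted because it doesn't bind here. `venue_adjusted` is a helper name introduced for this example.

```python
HOME_MULT, AWAY_MULT = 1.11, 0.89
DRAW_WHEN_STRONGER_HOME, DRAW_WHEN_STRONGER_AWAY = 0.95, 1.05


def venue_adjusted(stronger: float, draw: float, weaker: float, stronger_is_home: bool):
    """Apply the venue multipliers, then renormalize so the three outcomes sum to 1."""
    if stronger_is_home:
        raw = (stronger * HOME_MULT, draw * DRAW_WHEN_STRONGER_HOME, weaker * AWAY_MULT)
    else:
        raw = (stronger * AWAY_MULT, draw * DRAW_WHEN_STRONGER_AWAY, weaker * HOME_MULT)
    total = sum(raw)
    return tuple(p / total for p in raw)


stronger, weaker = 0.4584, 0.2779   # Band 2 figures quoted in the post
draw = 1.0 - stronger - weaker      # assumption: draw is the remainder

home = venue_adjusted(stronger, draw, weaker, stronger_is_home=True)
away = venue_adjusted(stronger, draw, weaker, stronger_is_home=False)

print(round(home[0], 3), round(1 / home[0], 2))  # 0.505 1.98
print(round(away[0], 3), round(1 / away[0], 2))  # 0.411 2.43
```

The renormalization step is why 45.84% × 1.11 = 50.9% ends up as 50.5%: after all three outcomes are scaled, they sum to slightly more than 1 and are divided back down.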
Dynamic Home Advantage Calculation
The 1.11/0.89 multipliers were calculated from historical data, but home advantage isn’t static. It can evolve due to:
- Empty stadiums (COVID era)
- VAR implementation changing referee behaviour
- Tactical evolution (more teams set up to counter-attack away)
- Specific season effects
To keep the model current, I’ve added a function that recalculates these multipliers from the match database:
from datetime import datetime
from typing import Any, Dict, List

def calculate_home_advantage_multipliers(matches_data: List[Dict]) -> Dict[str, Any]:
    """
    Calculate home advantage multipliers from match data.

    Returns:
        {
            'home_multiplier': 1.11,
            'away_multiplier': 0.89,
            'home_win_rate': 0.604,
            'away_win_rate': 0.486,
            'sample_size': 1949,
            'last_updated': '2025-12-14'
        }
    """
    stronger_home = {'wins': 0, 'total': 0}
    stronger_away = {'wins': 0, 'total': 0}

    for match in matches_data:
        home_elo = match.get('home_elo', 1500)
        away_elo = match.get('away_elo', 1500)

        # Skip corrupted entries (a missing rating also falls back to 1500)
        if home_elo == 1500 or away_elo == 1500:
            continue

        home_goals = match['home_goals']
        away_goals = match['away_goals']

        if home_elo >= away_elo:
            # Stronger team is home
            stronger_home['total'] += 1
            if home_goals > away_goals:
                stronger_home['wins'] += 1
        else:
            # Stronger team is away
            stronger_away['total'] += 1
            if away_goals > home_goals:
                stronger_away['wins'] += 1

    # Guard against empty splits before dividing
    total_wins = stronger_home['wins'] + stronger_away['wins']
    if stronger_home['total'] == 0 or stronger_away['total'] == 0 or total_wins == 0:
        raise ValueError("Not enough matches to calculate multipliers")

    # Calculate win rates
    home_win_rate = stronger_home['wins'] / stronger_home['total']
    away_win_rate = stronger_away['wins'] / stronger_away['total']
    combined_rate = total_wins / (stronger_home['total'] + stronger_away['total'])

    # Calculate multipliers
    home_mult = home_win_rate / combined_rate
    away_mult = away_win_rate / combined_rate

    return {
        'home_multiplier': round(home_mult, 3),
        'away_multiplier': round(away_mult, 3),
        'home_win_rate': round(home_win_rate, 4),
        'away_win_rate': round(away_win_rate, 4),
        'sample_size': stronger_home['total'] + stronger_away['total'],
        'last_updated': datetime.now().strftime('%Y-%m-%d')
    }
Each week when new matches are added, this function recalculates the multipliers. The values will drift slightly as new data comes in, ensuring the model stays calibrated.
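If the goal is to track drift (for example, the COVID-era effect fading out), one possible variant is to restrict the calculation to a recent window rather than the whole database. The sketch below is not the function used in the model: the 730-day window, the function name, and the 'YYYY-MM-DD' date format are all assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Dict, List


def rolling_multipliers(matches: List[Dict], as_of: date, window_days: int = 730) -> Dict[str, float]:
    """Venue multipliers using only matches played in the window ending at as_of."""
    cutoff = as_of - timedelta(days=window_days)
    wins = {"home": 0, "away": 0}
    totals = {"home": 0, "away": 0}
    for m in matches:
        played = date.fromisoformat(m["date"])  # assumes 'YYYY-MM-DD' strings
        if not (cutoff <= played <= as_of):
            continue
        if m["home_elo"] >= m["away_elo"]:
            side, stronger_won = "home", m["home_goals"] > m["away_goals"]
        else:
            side, stronger_won = "away", m["away_goals"] > m["home_goals"]
        totals[side] += 1
        wins[side] += int(stronger_won)

    total_wins = wins["home"] + wins["away"]
    if totals["home"] == 0 or totals["away"] == 0 or total_wins == 0:
        raise ValueError("not enough matches in the window to estimate multipliers")

    combined = total_wins / (totals["home"] + totals["away"])
    return {
        "home_multiplier": (wins["home"] / totals["home"]) / combined,
        "away_multiplier": (wins["away"] / totals["away"]) / combined,
    }


# Made-up fixtures purely to exercise the function.
demo = [
    {"date": "2021-01-10", "home_elo": 1900, "away_elo": 1700, "home_goals": 0, "away_goals": 1},  # outside window
    {"date": "2025-03-01", "home_elo": 1900, "away_elo": 1700, "home_goals": 2, "away_goals": 0},
    {"date": "2025-04-05", "home_elo": 1850, "away_elo": 1800, "home_goals": 1, "away_goals": 0},
    {"date": "2025-05-10", "home_elo": 1700, "away_elo": 1900, "home_goals": 0, "away_goals": 3},
    {"date": "2025-08-16", "home_elo": 1750, "away_elo": 1800, "home_goals": 1, "away_goals": 1},
]
result = rolling_multipliers(demo, as_of=date(2025, 12, 14))
print(result)
```

A shorter window reacts faster to change but is noisier. With roughly 380 matches per season, a two-year window leaves only a few hundred matches in each venue split.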
Putting It Together: Fair Odds Calculation
The complete fair odds calculation now works as follows:
def calculate_fair_odds(home_team_elo, away_team_elo, elo_bands):
    """
    Calculate fair odds with venue adjustment.
    """
    # Step 1: Calculate ELO difference and determine band
    elo_diff = abs(home_team_elo - away_team_elo)
    band_num = min(int(elo_diff // 50) + 1, 10)
    band_data = get_band_data(band_num, elo_bands)

    # Step 2: Determine if stronger team is home or away
    is_stronger_home = home_team_elo >= away_team_elo

    # Step 3: Get base probabilities from band
    stronger_win_base = band_data['stronger_win_pct']
    draw_base = band_data['draw_pct']
    weaker_win_base = band_data['weaker_win_pct']

    # Step 4: Apply venue adjustment
    stronger_win_adj = adjust_probability_for_venue(
        stronger_win_base, is_stronger_home, 'stronger_win'
    )
    weaker_win_adj = adjust_probability_for_venue(
        weaker_win_base, is_stronger_home, 'weaker_win'
    )
    draw_adj = adjust_probability_for_venue(
        draw_base, is_stronger_home, 'draw'
    )

    # Step 5: Normalize to sum to 1.0
    total = stronger_win_adj + draw_adj + weaker_win_adj
    stronger_win_adj /= total
    draw_adj /= total
    weaker_win_adj /= total

    # Step 6: Map to home/away perspective
    if is_stronger_home:
        home_win_prob = stronger_win_adj
        away_win_prob = weaker_win_adj
    else:
        home_win_prob = weaker_win_adj
        away_win_prob = stronger_win_adj

    # Step 7: Calculate fair odds
    return {
        'home_win': {
            'probability': home_win_prob,
            'fair_odds': round(1 / home_win_prob, 2)
        },
        'draw': {
            'probability': draw_adj,
            'fair_odds': round(1 / draw_adj, 2)
        },
        'away_win': {
            'probability': away_win_prob,
            'fair_odds': round(1 / away_win_prob, 2)
        }
    }
Impact on Matchweek 17
Let’s see how this affects our analysis for the upcoming fixtures:
Arsenal vs Wolves
Old method (no venue adjustment):
- Band 8: stronger_win_pct = 71.43%
- Arsenal fair odds: 1.40
New method (with venue adjustment):
- Arsenal HOME: 71.43% × 1.11 = 79.3% → normalized 75.1%
- Arsenal fair odds: 1.33
The bookmakers have Arsenal at 1.14. Our old model said that was negative EV (fair odds 1.40 vs book 1.14 = -18.6%). Our new model says it’s still negative EV, though less so (fair odds 1.33 vs book 1.14 = -14.3%). Either way, avoid - but the new model is more accurate.
Chelsea vs Everton
Old method:
- Band 2: stronger_win_pct = 45.84%
- Chelsea fair odds: 2.18
New method:
- Chelsea HOME: 50.5%
- Chelsea fair odds: 1.98
Bookmakers have Chelsea at 1.61. Old model: -26.2% EV. New model: -18.7% EV. Still avoid, but the magnitude of the mistake is different.
For Everton to win (weaker team AWAY):
- Old model: 27.79% → fair odds 3.60
- New model: 24.6% → fair odds 4.07
- Bookmaker: 5.32
Old EV calculation: (0.2779 × 5.32) - 1 = +47.8%
New EV calculation: (0.246 × 5.32) - 1 = +30.9%
Still strongly positive, still a recommended bet, but the edge is more accurately measured.
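The EV figures in this section all come from the same one-line formula: probability times decimal odds, minus one. A minimal sketch (the function name is introduced here) that reproduces the Chelsea and Everton numbers:

```python
def expected_value(probability: float, decimal_odds: float) -> float:
    """EV per unit staked: win (odds - 1) with the given probability, lose 1 otherwise."""
    return probability * decimal_odds - 1.0


# Everton to win at 5.32
everton_old = expected_value(0.2779, 5.32)
everton_new = expected_value(0.246, 5.32)

# Chelsea to win at 1.61
chelsea_old = expected_value(0.4584, 1.61)
chelsea_new = expected_value(0.505, 1.61)

print(f"{everton_old:+.1%} {everton_new:+.1%}")  # +47.8% +30.9%
print(f"{chelsea_old:+.1%} {chelsea_new:+.1%}")  # -26.2% -18.7%
```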
Real Examples: How the Adjustment Changes Our Bets
I placed six bets before implementing the venue adjustment. Let’s see how each one looks under the new model. Spoiler: five of them were on away teams.
Example 1: Villa @ West Ham — Still Valid, But Tighter
The bet: Aston Villa to Win @ 1.99
| Metric | Old Model | New Model | Change |
|---|---|---|---|
| Villa probability | 58.87% | 54.3% | -4.6% |
| Fair odds | 1.70 | 1.84 | +0.14 |
| EV | +17.1% | +8.1% | -9.0% |
Villa are the ELO-stronger team (1923 vs 1768, Band 4), but they’re playing away. The old model ignored this. The new model applies the 0.89 multiplier, reducing their win probability from 58.87% to 54.3%.
At 1.99 odds, the bet is still +EV (+8.1%), but the edge is roughly half what we thought. This is important for staking — Half-Kelly on +17.1% edge is very different from Half-Kelly on +8.1% edge.
Verdict: Still a valid bet, but stake should be smaller than originally calculated.
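To make the staking difference concrete, here is a sketch that assumes the standard Kelly criterion for a single bet at decimal odds, f = (p × odds - 1) / (odds - 1), halved for Half-Kelly. The post doesn't spell out its staking formula, so treat this as an illustration, not the model's actual staking code.

```python
def kelly_fraction(probability: float, decimal_odds: float) -> float:
    """Full-Kelly fraction of bankroll; zero when the bet has no positive edge."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    edge = probability * decimal_odds - 1.0
    return max(0.0, edge / (decimal_odds - 1.0))


def half_kelly(probability: float, decimal_odds: float) -> float:
    """Half-Kelly: stake half of the full-Kelly fraction."""
    return 0.5 * kelly_fraction(probability, decimal_odds)


# Villa to win at 1.99 under each model's probability.
old_stake = half_kelly(0.5887, 1.99)
new_stake = half_kelly(0.543, 1.99)

print(f"{old_stake:.1%} {new_stake:.1%}")  # 8.7% 4.1%
```

Under these assumptions the suggested stake drops from about 8.7% of bankroll to about 4.1%, roughly in line with the halving of the edge.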
Example 2: Everton @ Chelsea — Large Margins Survive
The bet: Everton to Win @ 5.32
| Metric | Old Model | New Model | Change |
|---|---|---|---|
| Everton probability | 27.79% | 24.6% | -3.2% |
| Fair odds | 3.60 | 4.07 | +0.47 |
| EV | +47.8% | +31.0% | -16.8% |
Everton are the weaker team (1824 vs 1904, Band 2) playing away — double disadvantage. Their probability drops from 27.79% to 24.6%.
But look at the bookmaker odds: 5.32. That’s still massively overpriced compared to our new fair odds of 4.07. The EV drops from +47.8% to +31.0%, but +31% edge is still enormous.
Verdict: Large mispricings survive the adjustment. When the market is offering nearly 50% edge, even a 17-point reduction leaves plenty of value.
Example 3: Brentford vs Leeds — Home Boost
The bet: Brentford to Win @ 1.95
| Metric | Old Model | New Model | Change |
|---|---|---|---|
| Brentford probability | 58.87% | 63.1% | +4.2% |
| Fair odds | 1.70 | 1.58 | -0.12 |
| EV | +14.8% | +23.4% | +8.6% |
This is the only home team bet in my portfolio, and look what happens: the EV increases by 8.6 percentage points.
Brentford are stronger (1844 vs 1693, Band 4) AND playing at home. The 1.11 multiplier boosts their probability from 58.87% to 63.1%. At 1.95 odds, the edge jumps from +14.8% to +23.4%.
Verdict: Home team bets were being undervalued by the old model. The adjustment reveals even better value than we thought.
The Trap Zone: Where Marginal Bets Become Losers
The most dangerous effect of the venue adjustment isn’t on the big edges — it’s on the marginal ones.
Consider this scenario. You’re looking at a Band 2 fixture where the stronger team is away:
| Calculation | Old Model | New Model |
|---|---|---|
| Base probability | 45.84% | 45.84% |
| Venue adjustment | None | × 0.89 |
| Adjusted probability | 45.84% | 41.1% |
| Fair odds | 2.18 | 2.43 |
The Trap Zone is bookmaker odds between 2.18 and 2.43.
If the bookmaker offers 2.30:
- Old model says: Fair odds 2.18 vs book 2.30 = +5.5% EV ✅ (bet!)
- New model says: Fair odds 2.43 vs book 2.30 = -5.4% EV ❌ (avoid!)
The old model would flag this as a value bet. The new model correctly identifies it as a trap.
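The trap-zone test is easy to automate: a bet is a trap when the old probability gives positive EV at the bookmaker's price but the venue-adjusted probability does not. A minimal sketch using the Band 2 away figures from the table (the function name and labels are introduced here):

```python
def classify_bet(old_prob: float, new_prob: float, book_odds: float) -> str:
    """Label a price as 'value', 'trap', or 'avoid' by comparing EV under both models."""
    old_ev = old_prob * book_odds - 1.0
    new_ev = new_prob * book_odds - 1.0
    if new_ev > 0:
        return "value"   # still +EV after venue adjustment
    if old_ev > 0:
        return "trap"    # looked +EV before adjustment, is not after
    return "avoid"       # negative under both models


OLD_PROB, NEW_PROB = 0.4584, 0.411  # Band 2, stronger team away

# The zone runs between the two fair-odds values.
zone = (round(1 / OLD_PROB, 2), round(1 / NEW_PROB, 2))
print(zone)  # (2.18, 2.43)

for odds in (2.10, 2.30, 2.77):
    print(odds, classify_bet(OLD_PROB, NEW_PROB, odds))
# 2.1 avoid
# 2.3 trap
# 2.77 value
```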
Real Example: Forest vs Spurs
Our actual bet was Spurs @ 2.77, which shows:
- Old EV: +27.1%
- New EV: +14.0%
Still comfortably positive. But imagine if Spurs were priced at 2.30 instead of 2.77. The old model would have said “take it” while the new model would have said “avoid.” That’s the trap zone in action.
The Safety Margin
To maintain a genuine +5% edge after venue adjustment, away favorites now need approximately +17-18% edge in the old model. The conversion is multiplicative: new edge = (1 + old edge) × (adjusted probability / base probability) - 1, shown here with the Band 2 ratio of 41.1% / 45.84% ≈ 0.90. This provides a safety margin:
| Old Model Edge | New Model Edge (Away) | Verdict |
|---|---|---|
| +5% | -6% | ❌ Trap |
| +10% | -1% | ❌ Trap |
| +15% | +3% | ⚠️ Marginal |
| +18% | +6% | ✅ Valid |
| +25% | +12% | ✅ Strong |
The Asymmetry Problem
Here’s the uncomfortable truth from my Matchweek 17 portfolio:
| Bet | Venue | Old EV | New EV | Change |
|---|---|---|---|---|
| Villa @ West Ham | Away | +17.1% | +8.1% | -9.0% |
| Spurs @ Forest | Away | +27.1% | +14.0% | -13.1% |
| Everton @ Chelsea | Away | +47.8% | +31.0% | -16.8% |
| Bournemouth @ Man Utd | Away | +40.1% | +26.5% | -13.5% |
| Brighton @ Liverpool | Away | +24.5% | +12.8% | -11.7% |
| Brentford vs Leeds | Home | +14.8% | +23.4% | +8.6% |
Five away bets decreased in EV. One home bet increased.
This isn’t a coincidence — my old model was systematically overvaluing away teams because it ignored the venue penalty. The market wasn’t as wrong as I thought; I was just using the wrong probabilities.
The silver lining: All six bets remain +EV after adjustment. But had any of the away bets been in the trap zone (old EV below roughly +10%), I would have been making -EV bets while thinking I had an edge. That’s how you lose a bankroll.
Files Updated
| File | Change |
|---|---|
| elo_calculator.py | New module with independent ELO calculation, MOV multiplier, venue adjustment, and dynamic multiplier calculation |
| elo_history.json | Rebuilt from scratch - 5 years of clean data |
| current_elo.json | Updated with scaled ratings |
| integration_guide.py | Instructions for updating main script |
Philosophy: Why This Matters
Sports betting is a game of marginal edges. Bookmakers have teams of quantitative analysts, access to private data, and sophisticated models. Our edge comes from discipline, transparency, and continuous improvement.
The venue adjustment we’ve implemented isn’t novel - bookmakers certainly account for home advantage. But by quantifying it precisely from our own data, we can:
- Verify our assumptions rather than guessing
- Track drift over time as home advantage evolves
- Explain our methodology transparently
- Identify when we’re wrong by comparing predictions to outcomes
The corrupted ClubELO data could have cost us significantly. A model showing Spurs at +309 ELO change would have produced nonsensical fair odds. By building our own system, we control the entire pipeline from raw match data to betting recommendations.
What’s Next
With clean historical data and accurate venue adjustment, the model is in its strongest state yet. Upcoming improvements on the roadmap:
- Rest day adjustment: Teams playing after extended rest vs fixture congestion
- Key player impact: Adjusting probabilities when star players are injured
- Referee tendencies: Integrating ref stats for cards/goals markets
- Weather data: Some teams perform differently in certain conditions
For now, the fundamentals are solid. Trust the process.
Model performance is tracked publicly. All probabilities are calculated using 1,965 Premier League matches (2020-2025). Past results do not guarantee future performance. Gamble responsibly.