Historical Odds Data for Backtesting

Timestamped odds snapshots from 50 sportsbooks across 22 sports. Reconstruct what prices were available at any point in time. Validate your model before risking capital.

Sports archived: 22
Books per snapshot: 50
Odds rows collected: 380K+

Historical data available on Business plan ($99/mo)

Sample — /historical/odds
{
  "sport_key": "basketball_nba",
  "event": "Knicks vs Heat",
  "commence_time": "2026-04-28T23:30:00Z",
  "bookmakers": [
    {
      "key": "pinnacle",
      "markets": [{
        "key": "h2h",
        "outcomes": [
          {"name": "Knicks", "price": -135},
          {"name": "Heat", "price": 122}
        ]
      }]
    },
    {
      "key": "draftkings",
      "markets": [{
        "key": "h2h",
        "outcomes": [
          {"name": "Knicks", "price": -140},
          {"name": "Heat", "price": 120}
        ]
      }]
    }
  ],
  "snapshot_time": "2026-04-28T20:15:00Z"
}

Why Backtesting Against Real Odds Matters

Problem: Opening Lines Lie

Opening lines can move 2–5 points before game time as sharp money arrives. Backtesting against openers overstates your edge because those prices were often gone before your model would have triggered.

Problem: Single-Book Overfitting

Testing against one book's history overfits to that book's pricing patterns. A model that "works" on DraftKings data may fail entirely on the broader market. Multi-book data exposes this immediately.

Solution: Timestamped Snapshots

TheOddsAPI archives odds with timestamps from 50 books. Simulate execution at the price that was actually available when your model signaled. Realistic backtests, realistic expectations.

Closing Line Value (CLV)

Closing line value is the single strongest predictor of long-term profitability: if you consistently beat the closing line, you have an edge.

Example: NBA — Knicks vs Heat

Your Bet (3hrs pre-game)

Knicks -130

Implied: 56.5%

Closing Line

Knicks -145

Implied: 59.2%

Your CLV

+2.7%

You beat the close

You bet Knicks -130 three hours before game time, and the line closed at -145. The market moved toward your side, which suggests your bet was underpriced: the closing implied probability is 2.7 percentage points higher than the one you paid for. Track this over 1,000+ bets. Consistently positive CLV indicates an edge; consistently negative CLV means your model is behind the market.
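The arithmetic behind the example can be sketched in a few lines of Python. The helper names `implied_prob` and `clv` are illustrative, not part of any API; the conversion is the standard American-odds formula, and the result still includes the bookmaker's vig.

```python
def implied_prob(american_odds: int) -> float:
    """Convert American odds to the implied probability (vig included)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def clv(bet_odds: int, closing_odds: int) -> float:
    """CLV in probability points: closing implied prob minus bet implied prob."""
    return implied_prob(closing_odds) - implied_prob(bet_odds)


bet, close = -130, -145
print(f"Bet implied:   {implied_prob(bet):.1%}")    # 56.5%
print(f"Close implied: {implied_prob(close):.1%}")  # 59.2%
print(f"CLV:           {clv(bet, close):+.1%}")     # +2.7%
```

A positive result means the closing price implies a higher win probability than the price you took.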

What's in the Archive

Multi-bookmaker historical odds. Not a single-source feed.

Dimension           Coverage
Sports              22 (NBA, MLB, NHL, NFL, EPL, La Liga, Bundesliga, Serie A, Champions League, and 13 more)
Markets             h2h (moneyline), spreads, totals
Sportsbooks         50 per snapshot (DraftKings, FanDuel, Pinnacle, Betfair, BetMGM, and 45 more)
Data start          April 16, 2026 (expanding daily)
Snapshot frequency  Timestamped with each odds refresh
Credit cost         Zero additional (included in Business tier)

Integration

Pull historical snapshots and calculate CLV in a few lines.

Python — Backtest CLV
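import requests

# Placeholder for your own bet records: event_id -> American odds you bet at
MY_BETS = {}


def implied_prob(american_odds):
    """Convert American odds to implied probability (vig included)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def get_my_bet(event_id):
    """Look up the price you bet at; returns None if you have no bet on record."""
    return MY_BETS.get(event_id)


# Pull historical odds for NBA
response = requests.get(
    "https://api.theoddsapi.com/v4/historical/odds",
    headers={"x-api-key": "YOUR_KEY"},
    params={
        "sport_key": "basketball_nba",
        "date": "2026-04-28",
        "bookmakers": "pinnacle,draftkings,fanduel",
    },
    timeout=30,
)
response.raise_for_status()  # fail loudly on HTTP errors

events = response.json()["data"]

# Calculate CLV for each event
for event in events:
    # Compare your bet price vs closing snapshot
    my_bet_price = get_my_bet(event["event_id"])  # your records
    if my_bet_price is None:
        continue  # no bet recorded for this event
    # Takes the first listed book/market/outcome; match these to your actual bet
    closing_price = event["bookmakers"][0]["markets"][0]["outcomes"][0]["price"]

    # Positive CLV = you beat the market
    clv = implied_prob(closing_price) - implied_prob(my_bet_price)
    print(f"{event['event']}: CLV {clv:+.1%}")
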
JavaScript — Fetch Historical Odds
const res = await fetch(
  "https://api.theoddsapi.com/v4/historical/odds?sport_key=basketball_nba&date=2026-04-28",
  { headers: { "x-api-key": "YOUR_KEY" } }
);

const { data: events } = await res.json();

// Reconstruct what odds were available at any timestamp
events.forEach(event => {
  const pinnacle = event.bookmakers.find(b => b.key === "pinnacle");
  const dk = event.bookmakers.find(b => b.key === "draftkings");
  console.log(`${event.event}: Pinnacle ${pinnacle?.markets[0].outcomes[0].price} vs DK ${dk?.markets[0].outcomes[0].price}`);
});

Backtesting Methodology

A rigorous backtest answers one question: would your model have made money if executed at prices that were actually available?

1. Define Your Signal

ML model, consensus fade, sharp-follows, or any system that generates a bet decision with a timestamp. The timestamp is critical — it determines which historical snapshot to compare against.

2. Pull Odds at Decision Time

Query /historical/odds for the snapshot closest to when your model would have triggered. This gives you the actual price available — not an opening or closing approximation.
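One way to pick the right snapshot is sketched below. It assumes you already hold a list of snapshot dicts for one event, each carrying a `snapshot_time` field formatted like the sample response. The function name `snapshot_at` is hypothetical. As a design choice, it returns the latest snapshot at or before the decision time rather than the nearest one in either direction, so the backtest never uses a price from the future.

```python
from bisect import bisect_right
from datetime import datetime, timezone
from typing import Optional


def parse_ts(ts: str) -> datetime:
    """Parse an ISO-8601 UTC timestamp such as '2026-04-28T20:15:00Z'."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def snapshot_at(snapshots: list, decision_time: str) -> Optional[dict]:
    """Return the latest snapshot taken at or before decision_time, or None."""
    ordered = sorted(snapshots, key=lambda s: parse_ts(s["snapshot_time"]))
    times = [parse_ts(s["snapshot_time"]) for s in ordered]
    idx = bisect_right(times, parse_ts(decision_time))
    return ordered[idx - 1] if idx else None


# Illustrative snapshots (timestamps only; real ones also carry bookmakers)
snaps = [
    {"snapshot_time": "2026-04-28T20:30:00Z"},
    {"snapshot_time": "2026-04-28T20:00:00Z"},
    {"snapshot_time": "2026-04-28T20:15:00Z"},
]
chosen = snapshot_at(snaps, "2026-04-28T20:20:00Z")
print(chosen["snapshot_time"])  # 2026-04-28T20:15:00Z
```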

3. Simulate Execution

For each signal, record the best available price across 50 books at that timestamp. Apply your staking strategy (flat, Kelly, fractional Kelly). Account for vig — don't backtest against mid-market prices you can't actually get.
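A minimal sketch of this step, assuming a snapshot shaped like the sample response above and a model that outputs a win probability. The function names `best_price` and `kelly_fraction` are illustrative. The 60% win probability is a made-up input, not a claim about the game.

```python
from typing import Optional, Tuple


def to_decimal(american: int) -> float:
    """Convert American odds to decimal odds (total return per unit staked)."""
    return 1 + american / 100 if american > 0 else 1 + 100 / -american


def best_price(snapshot: dict, outcome_name: str,
               market_key: str = "h2h") -> Optional[Tuple[str, int]]:
    """Return (bookmaker key, American price) with the highest payout."""
    best = None
    for book in snapshot["bookmakers"]:
        for market in book["markets"]:
            if market["key"] != market_key:
                continue
            for outcome in market["outcomes"]:
                if outcome["name"] != outcome_name:
                    continue
                if best is None or to_decimal(outcome["price"]) > to_decimal(best[1]):
                    best = (book["key"], outcome["price"])
    return best


def kelly_fraction(win_prob: float, american: int, fraction: float = 1.0) -> float:
    """Kelly stake as a share of bankroll; fraction=0.5 gives half Kelly."""
    b = to_decimal(american) - 1  # net profit per unit staked
    full = (b * win_prob - (1 - win_prob)) / b
    return max(0.0, full) * fraction  # never stake on a negative edge


snapshot = {"bookmakers": [
    {"key": "pinnacle", "markets": [{"key": "h2h", "outcomes": [
        {"name": "Knicks", "price": -135}, {"name": "Heat", "price": 122}]}]},
    {"key": "draftkings", "markets": [{"key": "h2h", "outcomes": [
        {"name": "Knicks", "price": -140}, {"name": "Heat", "price": 120}]}]},
]}

book, price = best_price(snapshot, "Knicks")
stake = kelly_fraction(0.60, price, fraction=0.5)  # 0.60 is a hypothetical model output
print(f"Best Knicks price: {price} at {book}, half-Kelly stake {stake:.1%}")
```

Because the price comes from a specific book's offered line, the vig is already baked into the simulated stake and payout.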

4. Measure CLV & ROI

Compare your execution price against the closing line. Calculate CLV across all bets. Positive CLV over 500+ bets = confirmed edge. Track ROI, drawdown, and Sharpe ratio to determine if it's capital-deployable.
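The three portfolio metrics can be sketched as follows, assuming you have a per-bet list of profits and stakes in the same units. The Sharpe ratio here is a simple per-bet version (mean return over sample standard deviation, not annualized), which is one of several reasonable definitions. The sample numbers are invented for illustration.

```python
from statistics import mean, stdev


def roi(profits: list, stakes: list) -> float:
    """Total profit divided by total amount staked."""
    return sum(profits) / sum(stakes)


def max_drawdown(profits: list) -> float:
    """Largest peak-to-trough drop in cumulative profit."""
    cumulative = peak = worst = 0.0
    for p in profits:
        cumulative += p
        peak = max(peak, cumulative)
        worst = max(worst, peak - cumulative)
    return worst


def per_bet_sharpe(profits: list, stakes: list) -> float:
    """Mean per-bet return divided by its sample standard deviation."""
    returns = [p / s for p, s in zip(profits, stakes)]
    return mean(returns) / stdev(returns)


# Hypothetical results: five flat 1-unit bets at even money
stakes = [1.0, 1.0, 1.0, 1.0, 1.0]
profits = [1.0, -1.0, -1.0, 1.0, 1.0]

print(f"ROI: {roi(profits, stakes):.1%}")
print(f"Max drawdown: {max_drawdown(profits):.1f} units")
print(f"Per-bet Sharpe: {per_bet_sharpe(profits, stakes):.3f}")
```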

5. Out-of-Sample Validation

Split your data. Train on one period, test on another. If CLV disappears out-of-sample, your model overfit the training data. This is where most "winning" systems fail.
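A chronological split is one simple way to do this. The sketch below assumes each bet record has a `placed_at` timestamp and a precomputed `clv` value; both field names are hypothetical. Splitting by time rather than at random keeps later information out of the training period.

```python
def chronological_split(bets: list, train_fraction: float = 0.7):
    """Split bets into (train, test) by time: earliest bets go to train."""
    # ISO-8601 UTC strings in one fixed format sort correctly as plain text
    ordered = sorted(bets, key=lambda b: b["placed_at"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def mean_clv(bets: list) -> float:
    """Average CLV across a set of bets."""
    return sum(b["clv"] for b in bets) / len(bets)


# Hypothetical bet log
bets = [
    {"placed_at": "2026-04-20T18:00:00Z", "clv": 0.021},
    {"placed_at": "2026-04-17T18:00:00Z", "clv": 0.030},
    {"placed_at": "2026-04-25T18:00:00Z", "clv": -0.004},
    {"placed_at": "2026-04-28T18:00:00Z", "clv": 0.002},
]

train, test = chronological_split(bets, train_fraction=0.5)
print(f"In-sample CLV:     {mean_clv(train):+.2%}")
print(f"Out-of-sample CLV: {mean_clv(test):+.2%}")
```

A large gap between the two averages is the warning sign described above.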

Common Backtesting Pitfalls

Survivorship Bias

Only testing on events that completed normally. Cancelled games, postponements, and voided markets get excluded — inflating your results. Include all events in your sample.

Ignoring Execution Lag

Your model signals at T=0, but you execute at T=30s. In that window, the line may have moved. Use the snapshot closest to your realistic execution time, not the signal time.
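To see how much lag costs, you can price the same signal at both the signal time and the execution time. This sketch assumes a per-book price history as a list of `(timestamp, American price)` pairs sorted ascending. The history, the 30-second lag, and the function names are all illustrative.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%dT%H:%M:%SZ"


def parse_ts(ts: str) -> datetime:
    """Parse a UTC timestamp in the same format as snapshot_time."""
    return datetime.strptime(ts, FMT).replace(tzinfo=timezone.utc)


def price_at(price_history: list, when: datetime):
    """Last price recorded at or before `when`; history must be sorted ascending."""
    current = None
    for ts, price in price_history:
        if parse_ts(ts) > when:
            break
        current = price
    return current


# Hypothetical price history for one outcome at one book
history = [
    ("2026-04-28T20:15:00Z", -130),
    ("2026-04-28T20:15:20Z", -134),
    ("2026-04-28T20:16:00Z", -138),
]

signal_time = parse_ts("2026-04-28T20:15:05Z")
execution_time = signal_time + timedelta(seconds=30)  # assumed execution lag

print("Price at signal:   ", price_at(history, signal_time))     # -130
print("Price at execution:", price_at(history, execution_time))  # -134
```

The backtest should record the second price, since that is the one you could realistically have hit.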

Vig-Free Backtesting

Testing against mid-market or vig-removed prices that you can't actually bet. Always backtest against the real offered price from specific books. Multi-book data lets you find the best available line — that's your realistic execution price.
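The gap between offered and vig-free prices is easy to quantify. The sketch below uses the Pinnacle prices from the sample response and removes the vig by proportional normalization, which is one common method among several. The function names are illustrative.

```python
def implied_prob(american_odds: int) -> float:
    """Convert American odds to the implied probability (vig included)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def remove_vig(prices: dict):
    """Return (vig-free probabilities, overround) via proportional scaling."""
    raw = {name: implied_prob(price) for name, price in prices.items()}
    total = sum(raw.values())  # exceeds 1.0 by the book's margin
    fair = {name: p / total for name, p in raw.items()}
    return fair, total - 1.0


offered = {"Knicks": -135, "Heat": 122}
fair, overround = remove_vig(offered)

print(f"Overround: {overround:.2%}")
for name, p in fair.items():
    print(f"{name}: offered implies {implied_prob(offered[name]):.1%}, vig-free {p:.1%}")
```

The vig-free probabilities are useful as a fair-value estimate, but the break-even rate for a real bet is the higher, offered figure.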

Data Snooping

Running 50 model variations and picking the one that worked best. Each parameter you tune increases overfitting risk. Out-of-sample testing on held-out data is mandatory, not optional.

What Operators Build with Historical Data

CLV Tracking Systems

Compare every bet placed against the closing line across all books. Automated CLV calculation tells you if your edge is real or if you're just getting lucky in small samples.

Model Validation Pipelines

Before deploying a new model live, backtest against 50-book historical data. If it shows +CLV against Pinnacle closing lines over 500+ events, deploy. If not, iterate without burning capital.

Arbitrage Strategy Replay

Reconstruct historical cross-book disagreements to measure how often arbs appeared, how long they lasted, and what profit range was realistic. Size your arb operation based on empirical data, not assumptions.
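A basic arb check for one snapshot is sketched below. It assumes you have already collected the best available American price for each outcome across books. An arb exists when the inverse decimal odds sum to less than 1. The +140 price is a hypothetical value chosen to produce an arb; the function names are illustrative.

```python
def to_decimal(american: int) -> float:
    """Convert American odds to decimal odds (total return per unit staked)."""
    return 1 + american / 100 if american > 0 else 1 + 100 / -american


def arb_margin(best_prices: dict) -> float:
    """Guaranteed profit share if positive; no arb if zero or negative."""
    return 1.0 - sum(1 / to_decimal(price) for price in best_prices.values())


def arb_stakes(best_prices: dict, bankroll: float) -> dict:
    """Split bankroll so every outcome returns the same payout."""
    inverse = {name: 1 / to_decimal(price) for name, price in best_prices.items()}
    total = sum(inverse.values())
    return {name: bankroll * inv / total for name, inv in inverse.items()}


no_arb = {"Knicks": -135, "Heat": 122}  # best prices from the sample response
arb = {"Knicks": -135, "Heat": 140}     # hypothetical cross-book disagreement

print(f"Sample snapshot margin: {arb_margin(no_arb):+.2%}")
print(f"Hypothetical margin:    {arb_margin(arb):+.2%}")
print(arb_stakes(arb, bankroll=1000.0))
```

Running this over every archived snapshot, and recording how long a positive margin persists for each event, gives the frequency and duration figures described above.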

Market Efficiency Research

Study how quickly soft books converge to Pinnacle's line. Measure the half-life of edges by sport, market, and time-to-kickoff. Build timing models that tell you when to bet for maximum CLV.

Start Backtesting

Historical odds are available on the Business plan ($99/mo). Zero additional credit cost. 22 sports. 50 books per snapshot.