// 01
The client and the question
A US-based marketing entrepreneur — someone we'd worked with before — came to us with a question rather than a spec. Sports betting is legal in much of the US, and he wanted to know: could you build a model that predicts tennis matches accurately enough to bet on profitably? He wanted to invest a little, find out if it was even possible, and only then decide whether to commit real money to the idea or the build.
That framing shaped everything. This wasn't "build me a platform." It was "tell me if this is worth building at all" — and that's a very different, and in many ways harder, kind of engagement.
// 02
The challenge
Predicting tennis is deceptively hard. The obvious approach — always pick the higher-ranked player — gets you only about 65–66% accuracy, and here's the catch: that's a losing strategy. A rank-only approach produces negative betting ROI, because ranking ignores everything that actually decides close matches. The world #1 isn't necessarily the favorite on clay; a #3 who owns the surface, or has the head-to-head, or is in better form, can be the smarter bet. Beating the naive baseline by even a few points is the entire game — in betting, the margin between ~65% and ~70% is the margin between losing money and making it.
So the real challenge wasn't "predict matches." It was: find the handful of factors that genuinely move the needle, separate them from the noise, and build a model that beats the baseline by enough to matter — cheaply enough that the client could decide whether to go further.
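To make the "losing versus winning" margin concrete, here is a minimal sketch of flat-stake betting arithmetic. It assumes every bet is placed at the same decimal odds. The price of 1.45 is an illustrative assumption, not a figure from the project, and real prices vary match by match. The function names are hypothetical.

```python
def flat_stake_roi(win_rate: float, decimal_odds: float) -> float:
    """Expected return per unit staked when every bet pays the same decimal odds."""
    return win_rate * decimal_odds - 1.0


def break_even_win_rate(decimal_odds: float) -> float:
    """Win rate at which the expected ROI is exactly zero."""
    return 1.0 / decimal_odds


# ASSUMPTION: 1.45 is an illustrative average price for backing the predicted
# winner. It is not taken from the project.
ASSUMED_ODDS = 1.45

for label, win_rate in [("rank-only baseline", 0.655), ("~70% model", 0.70)]:
    roi = flat_stake_roi(win_rate, ASSUMED_ODDS)
    print(f"{label}: win rate {win_rate:.1%}, expected ROI {roi:+.1%}")

print(f"break-even win rate at odds {ASSUMED_ODDS}: "
      f"{break_even_win_rate(ASSUMED_ODDS):.1%}")
```

Under that assumed price the break-even win rate sits near 69%, so a win rate in the mid-60s loses money while one around 70% does not. At different prices the threshold moves, which is why accuracy alone never settles the question.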
// 03
What we did
Step 1 — A proof of concept, in Excel. Before writing a line of production code, we de-risked the idea the cheap way. We pulled several years of historical match records into a spreadsheet and started experimenting — running algorithms, testing which factors actually predicted outcomes versus which were hype. We brought in a statistical consultant (from EY) to pressure-test our approach. The goal was a single yes/no: is there a real, learnable signal here? The PoC answered it — we reached 55%+ accuracy on historical data, enough to prove the concept had legs. Only then did the client commit to building.
Step 2 — Refine the model, not just the data. Moving from PoC to production, we didn't just throw more years at it (though we did pull every ATP/WTA match record going back to 1981). We substantially refined the model itself so that it captured more predictive factors: player rankings, head-to-head history, surface, recent form, and fatigue, each weighted by what the data showed actually mattered. Most of the data came from a match-data API provider, but their records only went back seven years, so we scraped the official ATP and WTA archives for everything older.
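One way to picture "factors weighted by what the data showed" is a logistic score over feature differences between the two players. This is a sketch under stated assumptions, not the project's model. The write-up does not say which model form was used, and the field names, scalings, and weights below are hypothetical placeholders.

```python
import math
from dataclasses import dataclass


@dataclass
class MatchFeatures:
    """Differences between player A and player B; positive values favor A.

    All fields and their scalings are hypothetical.
    """
    rank_diff: float      # (B's ranking - A's ranking) / 100
    h2h_diff: float       # A's head-to-head win share minus 0.5
    surface_diff: float   # A's win rate on this surface minus B's
    form_diff: float      # A's recent win rate minus B's
    fatigue_diff: float   # B's recent hours on court minus A's


# ASSUMPTION: placeholder weights for illustration only. In practice they
# would be fitted to historical match data.
WEIGHTS = {
    "rank_diff": 0.8,
    "h2h_diff": 1.0,
    "surface_diff": 2.0,
    "form_diff": 1.5,
    "fatigue_diff": 0.1,
}


def win_probability(features: MatchFeatures) -> float:
    """Logistic score: estimated probability that player A wins."""
    z = sum(WEIGHTS[name] * getattr(features, name) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


# Player A is ranked slightly worse but has the edge on surface,
# head-to-head, and recent form.
example = MatchFeatures(
    rank_diff=-0.02, h2h_diff=0.2, surface_diff=0.15, form_diff=0.1, fatigue_diff=0.0
)
p = win_probability(example)
print(f"P(A wins) = {p:.2f}")
```

With these made-up numbers the lower-ranked player comes out as the favorite, which is the kind of case a rank-only pick gets wrong.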
Step 3 — A deliberately simple platform. The product itself was intentionally lean — a landing page plus password-protected prediction pages. Each day it listed the upcoming ATP/WTA matches for the next seven days, with our predicted winner and a confidence level for each. Built on Node and React, batch-predicted. No bloat — the value was in the model, and the platform's job was just to deliver it clearly.
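The daily listing can be sketched as a filter over a batch of model outputs. The example is in Python for brevity, although the platform itself ran on Node and React. The `Prediction` type, the function name, the reading of "next seven days" as today plus the six days after it, and the sample rows are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Prediction:
    match_date: date
    player_a: str
    player_b: str
    prob_a_wins: float  # model output for player A


def upcoming_listing(predictions, today, horizon_days=7):
    """Return (date, matchup, predicted winner, confidence) rows inside the window."""
    cutoff = today + timedelta(days=horizon_days)
    rows = []
    for p in sorted(predictions, key=lambda p: p.match_date):
        if not (today <= p.match_date < cutoff):
            continue
        # Pick whichever side the model favors; confidence is that side's probability.
        if p.prob_a_wins >= 0.5:
            winner, confidence = p.player_a, p.prob_a_wins
        else:
            winner, confidence = p.player_b, 1.0 - p.prob_a_wins
        rows.append((p.match_date, f"{p.player_a} vs {p.player_b}", winner, confidence))
    return rows


# Hypothetical batch output with placeholder names and dates.
today = date(2024, 1, 1)
batch = [
    Prediction(date(2024, 1, 2), "Player A", "Player B", 0.71),
    Prediction(date(2024, 1, 5), "Player C", "Player D", 0.38),
    Prediction(date(2024, 1, 9), "Player E", "Player F", 0.55),  # outside the window
]
listing = upcoming_listing(batch, today)
for match_date, matchup, winner, confidence in listing:
    print(f"{match_date}  {matchup}: {winner} ({confidence:.0%})")
```

Precomputing rows like these once a day keeps the web layer to a simple read, which fits the "value is in the model" design described above.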
// 04
My role
I was the delivery manager, leading a small team of two — a developer and a QA. The engagement was as much about judgment as code: knowing to validate cheaply in Excel before building, knowing when to bring in statistical expertise, and knowing that the simplest possible platform was the right call once the model was the real asset.
// 05
The outcome
- 69.8% accuracy, measured over six months of live matches: not backtested on historical data, but validated forward against real, upcoming games as they happened.
- Beating the ~65–66% naive baseline by the ~4–5 point margin that matters, crossing the line from a losing rank-only strategy to one accurate enough for the client's betting use case.
- A validated idea, built the right way: proven cheap before it was built expensive.
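The difference between backtesting and forward validation comes down to when each prediction was recorded. A minimal sketch of scoring a forward log follows. The record layout and function name are hypothetical and the sample entries are invented. Only picks stored before the match started count toward accuracy.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class LoggedPrediction:
    predicted_at: datetime   # when the prediction was stored
    match_start: datetime    # scheduled start of the match
    predicted_winner: str
    actual_winner: str


def forward_accuracy(log):
    """Share of correct picks among predictions recorded before the match started."""
    eligible = [p for p in log if p.predicted_at < p.match_start]
    if not eligible:
        raise ValueError("no predictions were logged ahead of their matches")
    correct = sum(p.predicted_winner == p.actual_winner for p in eligible)
    return correct / len(eligible)


# Invented sample log with placeholder names. The last entry was recorded
# after its match began, so it is excluded from the score.
sample_log = [
    LoggedPrediction(datetime(2024, 1, 1, 8), datetime(2024, 1, 2, 12), "Player A", "Player A"),
    LoggedPrediction(datetime(2024, 1, 1, 8), datetime(2024, 1, 3, 12), "Player C", "Player D"),
    LoggedPrediction(datetime(2024, 1, 1, 8), datetime(2024, 1, 4, 12), "Player E", "Player E"),
    LoggedPrediction(datetime(2024, 1, 5, 13), datetime(2024, 1, 5, 12), "Player G", "Player G"),
]
accuracy = forward_accuracy(sample_log)
print(f"forward accuracy: {accuracy:.1%}")
```

Filtering on the timestamp is what stops results known after the fact from leaking into the score.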
The platform ran for about two years. Ultimately the client wound it down — not because the model didn't work, but because the broader business didn't hit the financial returns he was after. The prediction engine did exactly what it was asked to do.