20 April 2026

What makes a football statistic actually remarkable?

Not all statistics are created equal. Here's how we think about separating the genuinely surprising from the statistically mundane — and why it's harder than it looks.

"Arsenal have won their last five home games" is a statistic. So is "in 847 Premier League matches where the home side led at half-time on a Tuesday in February, the lead was held 79.3% of the time". Both are true. Only one is interesting.

The difference is prior probability: how surprising is this outcome, given what we'd expect? A strong team winning at home is only mildly surprising. A highly specific conditional pattern holding for years is genuinely strange.

The base rate problem

Most football statistics suffer from cherry-picking. You can always find a sequence that looks impressive if you're allowed to choose the conditions after seeing the data. "Liverpool have scored in every match where it rained and the referee was from Lancashire" is not a meaningful finding — it's a coincidence dressed up in a suit.
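A rough calculation shows why choosing conditions after the fact is so productive. The sketch below assumes each candidate condition independently has the same small chance of showing an impressive-looking streak purely by luck. Both the function name and the numbers are illustrative, not taken from our data.

```python
def chance_of_some_fluke(p_single: float, n_conditions: int) -> float:
    """Probability that at least one of n_conditions shows a streak by luck alone.

    Assumes (for illustration only) that conditions are independent and each
    has the same probability p_single of producing the streak by chance.
    """
    return 1.0 - (1.0 - p_single) ** n_conditions


# Hypothetical numbers: a 1-in-1000 fluke, searched across more and more conditions.
for n in (1, 100, 1_000, 10_000):
    chance = chance_of_some_fluke(0.001, n)
    print(f"{n:>6} conditions tried -> {chance:.1%} chance of at least one fluke")
```

Under these assumptions, a pattern that is a 1-in-1000 fluke for any single condition turns up somewhere about 63% of the time once a thousand conditions have been tried.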

Good statistical analysis fixes a prior estimate of how likely the pattern is before looking at the data. The length of the streak can then be judged against that baseline.
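As a minimal sketch of that idea: estimate the base rate from historical matches fixed in advance, then ask how likely a streak of a given length would be by chance. The function names and the sample data are hypothetical, and treating matches as independent is a simplifying assumption rather than a description of our model.

```python
def base_rate(outcomes: list[bool]) -> float:
    """Fraction of historical matches in which the outcome occurred."""
    if not outcomes:
        raise ValueError("need at least one historical match")
    return sum(outcomes) / len(outcomes)


def streak_probability(rate: float, length: int) -> float:
    """Chance of `length` consecutive occurrences, assuming independent matches."""
    return rate ** length


# Hypothetical history: the outcome occurred in 6 of 10 earlier matches.
history = [True, True, False, True, False, True, True, False, True, False]
p = base_rate(history)
print(f"base rate: {p:.2f}")
print(f"chance of a 5-match streak: {streak_probability(p, 5):.4f}")
```

The lower that chance probability, the more the streak stands out against its baseline.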

What we look for

Our scoring model rewards two things:

Rarity of the outcome. A team that doesn't lose is more remarkable than a team that scores. A team that scores three or more is more remarkable than a team that scores one. Each outcome has an estimated base rate drawn from historical data, and rarer outcomes score higher.

Independence of conditions. A streak that combines two independently unusual things — a specific player, a specific venue condition — is worth more than one that combines two things that almost always go together. We prune combinations where one condition subsumes the other.
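One way to picture the pruning step, as a sketch rather than our actual implementation: represent each condition by the set of matches it covers, and drop any pair where one set contains the other, since the combination then adds nothing over the narrower condition alone. The function name, condition names, and match IDs below are made up.

```python
from itertools import combinations


def prune_subsumed(conditions: dict[str, set[int]]) -> list[tuple[str, str]]:
    """Return condition pairs where neither condition's match set contains the other's."""
    kept = []
    for (name_a, a), (name_b, b) in combinations(conditions.items(), 2):
        # a <= b means every match satisfying A also satisfies B,
        # so "A and B" is just A: the pair is redundant.
        if a <= b or b <= a:
            continue
        kept.append((name_a, name_b))
    return kept


# Hypothetical conditions mapped to the IDs of matches that satisfy them.
conditions = {
    "home": {1, 2, 3, 4, 5, 6},
    "home_evening": {2, 4, 6},  # every evening home match is also a home match
    "player_x_starts": {1, 2, 7, 8},
}
print(prune_subsumed(conditions))
```

Here the pair "home" and "home_evening" is dropped, while each of them can still be combined with "player_x_starts".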

Why length isn't everything

A streak of 100 matches in a trivially easy condition (like "didn't lose at home as a title-winning team") is less impressive than a streak of 20 matches in a rare condition. Pure length rankings produce boring results. Score-weighted rankings surface the genuinely strange.
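The contrast can be made concrete with a toy score. The sketch below uses surprisal, the negative log of the chance probability of the streak, as a stand-in for our scoring model. The base rates are invented for illustration, and matches are assumed independent.

```python
import math


def surprisal_bits(rate: float, length: int) -> float:
    """Bits of surprise for `length` consecutive occurrences at the given base rate."""
    return -length * math.log2(rate)


# Hypothetical streaks: (label, assumed per-match base rate, streak length).
streaks = [
    ("easy condition, long run", 0.95, 100),
    ("rare condition, short run", 0.50, 20),
]

by_length = sorted(streaks, key=lambda s: s[2], reverse=True)
by_score = sorted(streaks, key=lambda s: surprisal_bits(s[1], s[2]), reverse=True)

print("longest first:        ", [label for label, _, _ in by_length])
print("most improbable first:", [label for label, _, _ in by_score])
```

With these made-up rates, the 20-match streak carries 20 bits of surprise against roughly 7 for the 100-match one, so the two rankings disagree about which comes first.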

This is the philosophy behind the Streaks page — not the longest runs, but the most improbable ones.