Wednesday, 26 July 2017

How are Lawrenson and Merson beating the market?

For the last six seasons, ex-Liverpool player and regular BBC pundit Mark (‘Lawro’) Lawrenson has been attempting to predict the outcome of every EPL match. Over on Sky Sports, ex-Arsenal player Paul Merson has been doing the same thing. Their predictions are published on the BBC Sport and Sky Sports websites the week before each match. But how good are they?

Last season I performed a little experiment to assess the performance of the pundits’ predictions against a clear baseline: the betting market. I placed a £1 bet on each predicted outcome – home win, away win or draw – made by Merson and Lawro, selecting the best odds offered for that outcome by one of four bookmakers[1]. Over the course of the season that amounted to 757 bets: 379 on Lawro’s predictions and 378 on Merson’s.

I did surprisingly well out of my little experiment. By the end of the season, Lawro was £63 in profit and Merson £52. This amounts to a return on investment of 17% and 14% respectively – significantly above the 9% I would have made had I just invested in the FTSE100 share index over the same period.
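For anyone wanting to reproduce the bookkeeping, here is a minimal sketch of how the profit on each £1 bet and the overall return on investment can be computed. The function names, the 'H'/'D'/'A' outcome codes and the bookmaker odds in the example are illustrative assumptions, not the code or data used for this post; only the season totals (£63 from 379 bets, £52 from 378 bets) come from the text above.

```python
# Illustrative sketch only: names, outcome codes and odds are assumptions.

def best_odds(odds_by_bookmaker, outcome):
    """Highest decimal odds offered on `outcome` ('H', 'D' or 'A') by any bookmaker."""
    return max(prices[outcome] for prices in odds_by_bookmaker.values())


def bet_profit(predicted, actual, decimal_odds, stake=1.0):
    """Profit on one bet: winnings excluding the returned stake, or minus the stake."""
    if predicted == actual:
        return stake * (decimal_odds - 1.0)
    return -stake


def return_on_investment(total_profit, n_bets, stake=1.0):
    """Total profit divided by the total amount staked."""
    return total_profit / (stake * n_bets)


# Made-up odds for a single match, purely for illustration.
match_odds = {
    "BookA": {"H": 1.80, "D": 3.50, "A": 4.50},
    "BookB": {"H": 1.85, "D": 3.40, "A": 4.33},
}
odds = best_odds(match_odds, "D")      # best draw price on offer: 3.5
print(bet_profit("D", "D", odds))      # 2.5 profit if the draw prediction is right
print(bet_profit("D", "H", odds))      # -1.0 if it is wrong

# Season totals quoted in the post.
print(round(100 * return_on_investment(63, 379)))  # 17 (Lawro)
print(round(100 * return_on_investment(52, 378)))  # 14 (Merson)
```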

It was not a one-off either. Figure 1 demonstrates that if you retrospectively run the same experiment using all their publicly available predictions – back to the 2011/12 season for Lawro and the 2014/15 season for Merson – you find that both pundits have consistently made a positive return: 8% per match on average. (An equivalent plot for Merson can be found here.) As the null distribution in Figure 1 indicates[2], it is unlikely that Lawrenson could have performed this well purely by chance: the probability of him having done so is about 1/500. In investment parlance, the pundits have achieved an annualised return-per-unit-risk (Sharpe Ratio) of 1.2, which is really good.

Figure 1: The cumulative profit and loss (P&L) generated from betting £1 on each of Mark Lawrenson’s 2240 match predictions over the last 6 EPL seasons. The right-hand bar shows the distribution of the P&L obtained if you were to bet randomly over the same period: there is only a 1/500 chance that you would exceed Lawro’s profit.
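The shuffling test behind the null distribution (described in footnote [2]) can be sketched roughly as follows. This is a simplified illustration under my own assumptions about the data layout (one tuple of predictions, results and per-match odds per season, with outcomes coded 'H', 'D', 'A'); the toy season is invented and the function names are hypothetical.

```python
import random

# Illustrative sketch only: data layout, names and toy data are assumptions.

def pnl(predictions, results, odds, stake=1.0):
    """Total profit from staking `stake` on every predicted outcome."""
    total = 0.0
    for predicted, actual, match_odds in zip(predictions, results, odds):
        if predicted == actual:
            total += stake * (match_odds[predicted] - 1.0)
        else:
            total -= stake
    return total


def null_distribution(seasons, n_trials=10_000, seed=0):
    """P&L of randomly reassigned forecasts.

    `seasons` is a list of (predictions, results, odds) tuples. Forecasts are
    shuffled between matches within each season, so the number of home win,
    away win and draw predictions per season is unchanged.
    """
    rng = random.Random(seed)
    null = []
    for _ in range(n_trials):
        total = 0.0
        for predictions, results, odds in seasons:
            shuffled = list(predictions)
            rng.shuffle(shuffled)
            total += pnl(shuffled, results, odds)
        null.append(total)
    return null


def chance_of_matching(observed, null):
    """Fraction of random trials whose P&L is at least the observed P&L."""
    return sum(value >= observed for value in null) / len(null)


# Invented four-match "season", purely to show the calling pattern.
toy_season = (
    ["H", "H", "D", "A"],                      # predictions
    ["H", "D", "D", "A"],                      # actual results
    [{"H": 1.8, "D": 3.5, "A": 4.5}] * 4,      # decimal odds per match
)
observed = pnl(*toy_season)
null = null_distribution([toy_season], n_trials=1000)
print(chance_of_matching(observed, null))
```

With real data, the list passed to `null_distribution` would hold one tuple per season, and the printed fraction plays the role of the roughly 1/500 figure quoted above.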


To my knowledge, neither Lawro nor Merson has claimed to use any kind of system for making predictions. Instead they rely on instinct and their vast experience in English football. Each week, they provide a few sentences explaining their predictions, summarizing recent form, injuries, suspensions and the importance of the match to each team. So while they do not have a systematic, data-driven approach to predicting football, they may intuitively incorporate some of the intangible factors that statistical models do not, or perhaps cannot, account for.

In the remainder of this blog I’m going to take a closer look at the pundits’ forecasting success, and identify how they have been able to beat the market. In a subsequent blog, I’m going to look at how we might improve on their predictions.

A more detailed look at the pundits’ forecasts


Table 1 shows the rate at which Lawro and Merson predict home wins, away wins and draws compared to the frequency at which each outcome has occurred in practice. For example, in Mark Lawrenson’s 2240 match forecasts he has predicted a home win in 56%, an away win in 19% and a draw in the remaining 25%. Over the same period, these outcomes have actually occurred at a rate of 45%, 30% and 25%, respectively. So Lawrenson overestimates the frequency of home wins and underestimates the frequency of away wins, but predicts draws at the correct rate. Pretty much the same applies to Merson.

Table 1: Proportion of home win, away win and draw predictions, and the rate they occur in practice.

The final column of Table 1 shows the bookies’ favoured outcome, that is, the outcome with the lowest odds. It’s striking that the bookmakers never rate a draw as the most likely outcome: in 70% of matches they favour a home win and in the remaining 30% an away win.

This isn’t actually all that surprising: draws are difficult to predict. A delicate compromise between a home win and an away win, they are inherently unstable results – a single goal for either team breaks the deadlock. In my experience, statistical forecasting models (which I assume are largely what drives bookmakers’ odds) have a lot of difficulty in predicting a draw as the most likely outcome; they nearly always assign a higher probability to one team outscoring the other than they do to a tie[3]. This can be seen explicitly in the odds offered by the bookmakers on draws: they are rarely below 3.0 or above 5.0, corresponding to a narrow range of implied probability of just 20% to 33%[4].
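The conversion from decimal odds to implied probability (footnote [4]) is simple enough to sketch. The version below removes the over-round by rescaling the three probabilities so they sum to one; that proportional rescaling is one common choice and an assumption on my part, not necessarily how any bookmaker or model does it. The match odds are made up for illustration.

```python
# Illustrative sketch only: the proportional over-round correction is an assumption.

def implied_probabilities(match_odds):
    """Return (raw, fair, overround) for a dict of decimal odds.

    raw       -- 1 / odds for each outcome (sums to more than 1)
    fair      -- raw probabilities rescaled to sum to exactly 1
    overround -- the sum of the raw probabilities
    """
    raw = {outcome: 1.0 / price for outcome, price in match_odds.items()}
    overround = sum(raw.values())
    fair = {outcome: p / overround for outcome, p in raw.items()}
    return raw, fair, overround


# The typical range of draw odds quoted above.
for price in (3.0, 5.0):
    print(price, round(100 / price))   # prints "3.0 33" then "5.0 20"

# Made-up odds for one match.
raw, fair, overround = implied_probabilities({"H": 1.80, "D": 3.50, "A": 4.50})
print(round(overround, 3))                       # a little above 1
print({k: round(v, 3) for k, v in fair.items()})
```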

Table 2 shows the pundits’ success rate: the proportion of their predictions that were correct. Both Lawrenson and Merson have a success rate of 52%: they predict the correct outcome in just over half of the matches. Breaking this down, we see that just under 60% of their home win predictions and around 55% of their away win predictions are correct, but only 34% of their draw predictions are right. However, while that may seem low, it does not imply a lack of skill. Only a quarter of all matches end in a draw, but the pundits are correct about a third of the time when they predict one, so they are definitely doing better than they would by predicting draws at random.

Table 2: Proportion of correct predictions for pundits and bookmakers.

By this metric the bookmakers outperform the pundits. Their favoured outcome is realized in 54% of matches, a higher success rate than the pundits’ 52%. However, the bookmakers never favour a draw and so are not penalized by the low success rate that arises from trying to predict them. Furthermore, when you look solely at either home win or away win predictions, both pundits have a higher success rate.
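The success rates in Table 2 are straightforward tallies. Here is a minimal sketch of how they could be computed, along with the bookmakers’ favoured outcome (the one with the lowest odds). The function names and the six-match toy data are assumptions for illustration only.

```python
from collections import Counter

# Illustrative sketch only: names and toy data are assumptions.

def success_rates(predictions, results):
    """Overall hit rate, and hit rate split by the outcome that was predicted."""
    made = Counter(predictions)
    correct = Counter(p for p, r in zip(predictions, results) if p == r)
    overall = sum(correct.values()) / len(predictions)
    by_outcome = {outcome: correct[outcome] / made[outcome] for outcome in made}
    return overall, by_outcome


def bookmaker_favourite(match_odds):
    """The outcome with the lowest decimal odds."""
    return min(match_odds, key=match_odds.get)


# Invented data: six predictions and the corresponding results.
predictions = ["H", "H", "D", "A", "D", "H"]
results     = ["H", "A", "D", "A", "H", "H"]
overall, by_outcome = success_rates(predictions, results)
print(round(overall, 2))   # 0.67
print(by_outcome["D"])     # 0.5

print(bookmaker_favourite({"H": 1.80, "D": 3.50, "A": 4.50}))  # H
```

Applying `bookmaker_favourite` to every match and passing the result to `success_rates` gives the bookmaker column on the same footing as the pundits.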

I’m not suggesting that the bookmakers’ odds on draws are wrong. Indeed it’s straightforward to show that their odds are quite well calibrated, as this plot demonstrates. It’s just that they seem unable (or unwilling?) to identify when a draw might be the most likely outcome (for instance, when both teams would settle for a draw). Is this then where the pundits’ edge lies?

So how do Lawro & Merson make their money?


Table 3 shows their average return per game, quoted as the percentage profit on their initial £1 bet in each match (i.e., not including the return of their initial bet when they are correct). 

Both pundits have made an average profit of 8%. Look more closely and you see where they are making their profits. When Mark Lawrenson predicts a home win he is correct 58% of the time. However, his average return on these predictions is only 1%. So when he bets £1 on a home win he might expect – based on past performance – to make a profit of a penny. For his away win predictions (for which he has a 57% success rate) he could expect to make a 10 pence profit. But when he predicts a draw, he would expect a 20 pence profit – despite only winning his bet 33% of the time! The numbers are a little different for Merson, but the story is pretty much the same: the pundits’ most profitable predictions are draws.

Table 3: Pundits’ betting returns and the average odds they received for their correct predictions.

This is, of course, driven by the different odds offered on each outcome. The lower part of Table 3 shows the average (decimal) odds that were offered on the pundits’ winning predictions, i.e. those in which they won their bet.

The average odds offered on their correct home and away predictions were 1.8 and 1.9 respectively, which corresponds to an average profit of £0.80 and £0.90 (based on a £1 bet). So, on average, they gain less when they win an individual bet than they forfeit when they lose one (the £1 stake); fortunately, as Table 3 shows, they win sufficiently often to make a profit[5].

For their correct draw predictions, they receive average odds of 3.5 – this is a profit of £2.50, which is nearly triple the return on their correct home and away win bets! The markets are underestimating the likelihood of a draw in those games, allowing the pundits to make a decent profit. In fact, the majority of the pundits’ profits are generated by the edge they have over the market in predicting draws.
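The link between odds, success rate and profit is worth spelling out. If a bet at decimal odds o wins with probability p, the expected profit per £1 staked is p × (o − 1) − (1 − p) = p × o − 1, so the break-even success rate is 1/o. The sketch below applies this to the average winning odds quoted above. Treating those averages as if they applied to every bet is a simplifying assumption (odds on losing predictions will differ), so the numbers are indicative only; the function names are hypothetical.

```python
# Illustrative sketch only: uses average winning odds as a stand-in for all bets.

def expected_profit(success_rate, decimal_odds, stake=1.0):
    """Expected profit per bet: stake * (p * odds - 1)."""
    return stake * (success_rate * decimal_odds - 1.0)


def break_even_rate(decimal_odds):
    """Success rate at which the expected profit is exactly zero."""
    return 1.0 / decimal_odds


for label, price in (("home", 1.8), ("away", 1.9), ("draw", 3.5)):
    print(label, round(100 * break_even_rate(price)))
# prints: home 56, away 53, draw 29
```

The home and away figures match the thresholds in footnote [5]. For draws, the break-even rate of about 29% sits below the pundits’ success rate of roughly a third, which is where the profit comes from.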

We can also see this if we look at each pundit's top-20 most profitable predictions. All but one (95%) of Merson’s top-20 are draws, and 14 out of 20 (70%) of Lawrenson’s are draws. Around two-thirds of these matches also involved a top-4 team, typically a smaller team gaining a draw against one of the big fish (such as QPR vs Man City in 2014/15, or Sunderland vs Arsenal in 2015/16).

Summary


The take-home message from this blog is that, despite their success rate of only 1 in 3 in predicting them, draws are the most profitable predictions made by both pundits. This is because they are better at predicting draws than the bookmakers. The odds offered by the market rarely differ substantially from the basic rate at which draws occur – about 1 in every 4 matches. I suspect that significant improvements could be made to statistical models for predicting the outcome of football matches by identifying the information the pundits are homing in on when they predict a draw. 

So far I’ve treated Lawro and Merson separately. What happens if we construct a consensus forecast, betting on only those matches for which their predictions agree? Is their combined prediction power better than their individual efforts? It turns out the answer is yes, but that’s the topic for my next blog... 


Note: This post is an analysis of past pundit predictions and past performance is not indicative of future results. Do not bet with the expectation of making a profit: you may lose significantly more than you win.

--------

[1] Bet365, BetVictor, Ladbrokes and William Hill
[2] To generate the null distribution, I reran the full 6-year experiment 10,000 times, randomly assigning a prediction of home win, away win or draw for every game. The rates of home wins, away wins and draws in each season were fixed to be the same as in Lawrenson’s forecasts; I basically just shuffled his forecasts around between matches.
[3] Especially if the model assumes independent Poisson distributions for each team’s goals. Bivariate distributions with non-zero correlation may rectify this.
[4] The inferred percentage probability of an outcome, given the odds, is just 100/(decimal odds). You also need to correct for the over-round: the edge the bookmakers give themselves, which results in the inferred outcome probabilities summing to more than 100%.
[5] If the pundits’ success rate for their home (away) win predictions dropped below 56% (53%) they would make a loss on them.