Wednesday, November 5, 2008

Evaluating a Betting System's Performance (Part One)

It's been quite a while since I last wrote a blog entry that was not a system update, so here is (hopefully) some food for thought.

Surfing the Internet, visiting various betting websites and forums (or should I say fora), I've stumbled across countless betting systems and strategies. These systems consist not only of the tipping part but also of different, and sometimes quite elaborate, staking plans. An obvious question which arises is "How does one compare these systems?"


I have always been a firm believer that forecasting / prediction models, irrespective of the way their predictions are formulated, should be monitored, evaluated and adjusted; in other words they should be judged. Only by judging a forecasting tool can one hope to improve it. What I am not exactly sure about, and please if you have any views do share them, is the way this should be done. You see, everybody looks at the yield, and of course this is a measure of how good the system is with this particular set of results. But it's also important to acknowledge that part of a system's performance is due to luck. In other words, if it were somehow possible to replay all matches in a season, with the same parameters as at the time of each match, what would the system's result be, given that not all matches would end up with the same outcome? Is there a way to use statistical theory to account for this uncertainty and create a more robust measure than yield itself? The million dollar question, I guess.

Internet research on this issue quite often leads to the notion of χ²-testing. The idea behind this test is to compare, with some sort of statistical confidence, the number of a system's actual (or observed) winners to the corresponding expected number. For example, 15 bets at even odds would yield an expected number of 7.5 winners, but the actual number of winning bets might be 9. Is this sufficient evidence to suggest that the system is performing better than what would be expected by sheer chance? What if the actual winners were 10? 12? All 15?
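To make the arithmetic concrete, here is a rough Python sketch of the expected number of winners and the resulting chi-square statistic. It rests on assumptions of mine rather than anything definitive: odds are in decimal format, the win probability of each bet is taken as 1 / odds (ignoring the bookmaker's margin), and winners and losers are pooled into two categories, which is only a crude simplification when the bets are at different odds. The function names are made up for illustration.

```python
def expected_winners(decimal_odds):
    """Expected number of winners, assuming each bet wins with
    probability 1 / decimal odds (bookmaker margin ignored)."""
    return sum(1.0 / odds for odds in decimal_odds)


def chi_square_statistic(observed_winners, decimal_odds):
    """Chi-square statistic over two pooled categories (winners, losers):
    the sum of (observed - expected)^2 / expected for each category."""
    n_bets = len(decimal_odds)
    exp_win = expected_winners(decimal_odds)
    exp_lose = n_bets - exp_win
    obs_lose = n_bets - observed_winners
    return ((observed_winners - exp_win) ** 2 / exp_win
            + (obs_lose - exp_lose) ** 2 / exp_lose)


# 15 bets at even odds (decimal odds of 2.0), as in the example above.
even_money_bets = [2.0] * 15
print("Expected winners:", expected_winners(even_money_bets))
for winners in (9, 10, 12, 15):
    stat = chi_square_statistic(winners, even_money_bets)
    print(winners, "winners -> chi-square statistic", round(stat, 2))
```

The further the observed count moves from 7.5, the larger the statistic becomes, which is what the test then turns into a probability.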

Well, the χ²-test provides a way to measure this by looking at the probability of achieving the observed result (i.e. the number of actual winners) or a better result by chance. If, for example, there is a 30% probability of hitting 9 or more winners out of 15 simply by chance (given the odds at which the bets were taken), then the system is not necessarily performing well. The smaller this figure becomes, however, the more it suggests that the system is hitting winners not by chance but by exploiting value in mis-priced matches. Anyone who has taken an introductory statistics course will tell you that the threshold statisticians conventionally use when testing hypotheses is 5%, so in effect, according to this test, one would be happy if the χ²-test returned a probability below 5%.
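For the even-odds example this tail probability can be computed directly. The sketch below is my own illustration and uses the exact binomial distribution rather than the chi-square approximation, assuming every bet has the same win probability (0.5 at even odds, margin ignored); the function name is hypothetical.

```python
from math import comb


def prob_at_least(winners, n_bets, p_win):
    """Probability of 'winners' or more successes in n_bets independent
    bets that each win with probability p_win (exact binomial tail)."""
    return sum(
        comb(n_bets, k) * p_win ** k * (1 - p_win) ** (n_bets - k)
        for k in range(winners, n_bets + 1)
    )


# 15 bets at even odds: how likely is each winner count by chance alone?
for winners in (9, 10, 11, 12, 15):
    p = prob_at_least(winners, 15, 0.5)
    verdict = "below 5%" if p < 0.05 else "not below 5%"
    print(f"{winners}+ winners out of 15: {p:.4f} ({verdict})")
```

Under these assumptions, 9 or more winners out of 15 happens by chance roughly 30% of the time, and it takes 12 or more winners before the probability drops below the 5% mark.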

My personal opinion is that the χ²-test is not necessarily a good indicator of whether the system is performing well. All it does is check whether the number of winners is significantly higher than what one would expect by sheer chance, but it does not consider the odds at which these winners were picked. What if a system's winners came at very low odds, whereas the system underperformed in the high range of odds? This could lead to a case of achieving a high (as compared to the expected) level of winners, yet the yield could be very low or even negative! Some consolation, I hear you say ... Of course the opposite could also be true: a few longshot winners could be enough to lead to a positive yield, yet the number of correct picks could be below the expected number. [For those of you who doubt that such a case is plausible, feel free to check the relevant discussion on the Punters' Paradise forum.]
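A small made-up example shows how the two measures can disagree. The figures below are invented purely for illustration (they are not results from any real system), and the sketch assumes level stakes of one unit, decimal odds, and an expected win probability of 1 / odds.

```python
def summarise(bets):
    """bets is a list of (decimal_odds, won) pairs at level 1-unit stakes.
    Returns (actual winners, expected winners, yield), where yield is
    total profit divided by total amount staked."""
    winners = sum(1 for _, won in bets if won)
    expected = sum(1.0 / odds for odds, _ in bets)
    profit = sum((odds - 1.0) if won else -1.0 for odds, won in bets)
    return winners, expected, profit / len(bets)


# Hypothetical record: ten short-priced bets that all win, and ten
# longshots of which only one wins.
bets = [(1.25, True)] * 10 + [(5.0, True)] + [(5.0, False)] * 9

winners, expected, system_yield = summarise(bets)
print(f"Actual winners:   {winners}")
print(f"Expected winners: {expected:.1f}")
print(f"Yield:            {system_yield:.1%}")
```

Here the system lands 11 winners against an expectation of about 10, so the winner count looks healthy, yet the yield is -12.5% because the winners came mostly at short odds.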

The search must carry on. In Part Two of this post (to be posted some time in the near future), I'll look into different ways of assessing whether one has a profitable system and of evaluating its performance. Until then take care!
