Paper title
Predicting Football Match Outcomes with eXplainable Machine Learning and the Kelly Index
Paper authors
Abstract
In this work, a machine learning approach is developed for predicting the outcomes of football matches. The novelty of this research lies in using the Kelly Index to first classify matches into categories, each denoting a different level of predictive difficulty. Classification models using a wide suite of algorithms were developed for each category of matches in order to determine the efficacy of the approach. In conjunction with this, a set of previously unexplored features was engineered, including Elo-based variables. The dataset originated from Premier League match data covering the 2019-2021 seasons. The findings indicate that decomposing the predictive problem into sub-tasks was effective and produced results competitive with prior works, while the ensemble-based methods were the most effective. The paper also devised an investment strategy and evaluated its effectiveness by benchmarking against bookmaker odds. An approach was developed that minimises risk by combining the Kelly Index with predefined confidence thresholds of the predictive models. The experiments found that the proposed strategy can return a profit when following a conservative approach that focuses primarily on easy-to-predict matches where the predictive models display a high confidence level.
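To make two ideas from the abstract concrete, the following is a minimal sketch and not the authors' implementation. It shows the standard Elo expected-score and rating-update formulas, which an Elo-based feature could plausibly build on, and a conservative bet filter that only accepts matches in an easy category where the model's confidence clears a threshold. The K-factor of 20, the category label "easy", the default threshold of 0.7, and all function names are assumptions chosen for illustration; the abstract does not specify them.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_home: float, rating_away: float,
               result_home: float, k: float = 20.0) -> tuple[float, float]:
    """Return updated (home, away) ratings after one match.

    result_home is 1.0 for a home win, 0.5 for a draw, 0.0 for a home loss.
    k is the K-factor; 20.0 is an assumed value, not taken from the paper.
    """
    delta = k * (result_home - expected_score(rating_home, rating_away))
    # The update is zero-sum: what one side gains, the other loses.
    return rating_home + delta, rating_away - delta


def should_bet(category: str, confidence: float,
               threshold: float = 0.7) -> bool:
    """Hypothetical conservative filter: bet only on matches labelled
    'easy' where the model's predicted probability meets the threshold."""
    return category == "easy" and confidence >= threshold


if __name__ == "__main__":
    print(update_elo(1500.0, 1500.0, 1.0))
    print(should_bet("easy", 0.82))
    print(should_bet("hard", 0.95))
```

In this sketch the difficulty category is passed in as a plain string; in the paper it is derived from the Kelly Index, whose exact definition is not given in the abstract and is therefore not reproduced here.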