Paper Title
Assessing Machine Learning Algorithms for Near-Real Time Bus Ridership Prediction During Extreme Weather
Paper Authors
Paper Abstract
Given an increasingly volatile climate, the relationship between weather and transit ridership has drawn growing interest. However, the challenges stemming from spatio-temporal dependency and non-stationarity have not been fully addressed in modelling and predicting transit ridership under the influence of weather conditions, especially with traditional statistical approaches. Drawing on three months of smart card data from Brisbane, Australia, this research adopts and assesses a suite of machine-learning algorithms, namely random forest, eXtreme Gradient Boosting (XGBoost) and Tweedie XGBoost, to model and predict near real-time bus ridership in relation to sudden changes in weather conditions. The study confirms that there is a significant level of spatio-temporal variability in the weather-ridership relationship, which produces equally dynamic patterns of prediction errors. Further comparison of model performance suggests that Tweedie XGBoost outperforms the other two machine-learning algorithms in generating overall more accurate predictions across space and time. Future research may advance the current study by drawing on larger data sets and applying more advanced machine-learning and deep-learning approaches to provide stronger evidence for the real-time operation of transit systems.