Automatic Clustering for Unsupervised Risk Diagnosis of Vehicle Driving for Smart Road

Paper Title

Automatic Clustering for Unsupervised Risk Diagnosis of Vehicle Driving for Smart Road

Authors

Shi, Xiupeng; Wong, Yiik Diew; Chai, Chen; Li, Michael Zhi-Feng; Chen, Tianyi; Zeng, Zeng

Abstract

Early risk diagnosis and driving anomaly detection from vehicle streams are of great benefit to a range of advanced solutions for Smart Road and crash prevention, although there are intrinsic challenges, especially the lack of ground truth and the definition of multiple risk exposures. This study proposes a domain-specific automatic clustering method (termed Autocluster) to self-learn the optimal models for unsupervised risk assessment. It integrates the key steps of risk clustering into an auto-optimisable pipeline, including feature selection, algorithm selection, and hyperparameter auto-tuning. Firstly, based on surrogate conflict measures, indicator-guided feature extraction is conducted to construct temporal-spatial and kinematical risk features. We then develop an elimination-based model reliance importance (EMRI) method to select the useful features without supervision. Secondly, we propose the balanced Silhouette Index (bSI) to evaluate the internal quality of imbalanced clustering. A loss function is designed that considers clustering performance in terms of internal quality, inter-cluster variation, and model stability. Thirdly, based on Bayesian optimisation, algorithm selection and hyperparameter auto-tuning are self-learned to generate the best clustering partitions. Various algorithms are comprehensively investigated. Herein, NGSIM vehicle trajectory data is used for test-bedding. Findings show that Autocluster is reliable and promising for diagnosing multiple distinct risk exposures inherent to generalised driving behaviour. We also delve into aspects of risk clustering such as algorithm heterogeneity, Silhouette analysis, and hierarchical clustering flows. Autocluster also serves as a method for unsupervised multi-risk data labelling and indicator threshold calibration. Furthermore, Autocluster is useful for tackling the challenges of imbalanced clustering without ground truth or a priori knowledge.
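The abstract names a balanced Silhouette Index (bSI) for imbalanced clustering but does not give its formula. As a rough illustration of why balancing matters, the sketch below compares the ordinary mean silhouette with a cluster-balanced variant that averages the per-cluster mean silhouettes, so a small risk cluster counts as much as a large one. This macro-averaged form is an assumption made for illustration and may differ from the paper's bSI; the function names `silhouette_samples`, `mean_silhouette`, and `cluster_balanced_silhouette` and the toy data are hypothetical.

```python
from collections import defaultdict
from math import dist


def silhouette_samples(points, labels):
    """Per-point silhouette s = (b - a) / max(a, b) with Euclidean distance."""
    clusters = defaultdict(list)
    for idx, lab in enumerate(labels):
        clusters[lab].append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def mean_silhouette(points, labels):
    """Ordinary silhouette index: every point has equal weight."""
    s = silhouette_samples(points, labels)
    return sum(s) / len(s)


def cluster_balanced_silhouette(points, labels):
    """ASSUMED stand-in for a balanced index: every cluster has equal weight."""
    per_cluster = defaultdict(list)
    for score, lab in zip(silhouette_samples(points, labels), labels):
        per_cluster[lab].append(score)
    cluster_means = [sum(v) / len(v) for v in per_cluster.values()]
    return sum(cluster_means) / len(cluster_means)


# Toy imbalanced data: a tight majority cluster and a loose minority cluster.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0, 0.5), (10, 10), (10, 14)]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

plain = mean_silhouette(points, labels)
balanced = cluster_balanced_silhouette(points, labels)
print(f"mean silhouette:             {plain:.3f}")
print(f"cluster-balanced silhouette: {balanced:.3f}")
```

On this toy data the plain mean is dominated by the well-separated majority cluster, while the cluster-balanced score is pulled down by the looser minority cluster. That is the kind of sensitivity an internal quality measure for imbalanced risk clustering would need.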
