Paper Title

A novel evaluation methodology for supervised Feature Ranking algorithms

Paper Authors

Overschie, Jeroen G. S.

Paper Abstract

Both in the domain of Feature Selection and in that of Interpretable AI, there exists a desire to 'rank' features based on their importance. Such feature importance rankings can then be used to either (1) reduce the dataset size or (2) interpret the Machine Learning model. In the literature, however, such Feature Rankers are not evaluated in a systematic, consistent way. Papers differ in how they argue which feature importance ranker works best. This paper fills this gap by proposing a new evaluation methodology. On synthetic datasets, feature importance scores are known beforehand, allowing a more systematic evaluation. To facilitate large-scale experimentation using the new methodology, a benchmarking framework called fseval was built in Python. The framework allows experiments to run in parallel, distributed over machines on HPC systems. Through an integration with the online platform Weights and Biases, charts can be explored interactively on a live dashboard. The software is released as open source and published as a package on PyPI. The research concludes by exploring one such large-scale experiment to find the strengths and weaknesses of the participating algorithms on many fronts.
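
The core idea of the methodology is that a synthetic dataset exposes which features are truly relevant, so a ranker's output can be scored against that ground truth. The snippet below is a minimal sketch of this idea using scikit-learn only; it is illustrative and not fseval's actual API, and the choice of Random Forest impurity importances as the ranker and ROC-AUC as the ranking metric are assumptions of this sketch, not necessarily the paper's protocol.

```python
# Minimal sketch: evaluate a feature ranker against known ground truth
# on a synthetic dataset. With shuffle=False, make_classification places
# the informative features in the first columns, so we know exactly which
# features a good ranker should score highly.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

n_features, n_informative = 20, 5
X, y = make_classification(
    n_samples=1000,
    n_features=n_features,
    n_informative=n_informative,
    n_redundant=0,
    shuffle=False,  # informative features occupy columns 0..n_informative-1
    random_state=0,
)

# Ground-truth relevance: 1 for informative features, 0 for noise features.
relevant = np.zeros(n_features, dtype=int)
relevant[:n_informative] = 1

# The feature ranker under evaluation; here, Random Forest impurity
# importances. Any ranker that yields one score per feature would do.
scores = RandomForestClassifier(random_state=0).fit(X, y).feature_importances_

# ROC-AUC of the importance scores against the ground-truth relevance:
# 1.0 means every informative feature is ranked above every noise feature.
print("ranking ROC-AUC:", roc_auc_score(relevant, scores))
```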
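To illustrate the dashboard integration mentioned in the abstract, the sketch below logs an evaluation result through the public Python client of Weights and Biases. The project name and metric key are placeholders, and this is not fseval's actual integration code; the framework itself is installable from PyPI with `pip install fseval`.

```python
# Hypothetical sketch of publishing an evaluation metric to a live
# Weights and Biases dashboard; names are placeholders, not fseval's own.
import wandb

run = wandb.init(
    project="feature-ranking-benchmark",  # placeholder project name
    config={"ranker": "random_forest"},   # experiment metadata
)
run.log({"ranking_roc_auc": 0.98})        # metric shows up on the dashboard
run.finish()
```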
