Paper Title
Improving Mutual Information based Feature Selection by Boosting Unique Relevance
Paper Authors
Paper Abstract
Mutual Information (MI) based feature selection (MIBFS) uses MI to evaluate each feature and eventually shortlists a relevant feature subset, in order to address issues associated with high-dimensional datasets. Despite the effectiveness of MI in feature selection, we notice that many state-of-the-art algorithms disregard the so-called unique relevance (UR) of features and arrive at a suboptimal selected feature subset that contains a non-negligible number of redundant features. We point out that the heart of the problem is that all these MIBFS algorithms follow the criterion of Maximize Relevance with Minimum Redundancy (MRwMR), which does not explicitly target UR. This motivates us to augment the existing criterion with the objective of boosting unique relevance (BUR), leading to a new criterion called MRwMR-BUR. Depending on the task being addressed, MRwMR-BUR has two variants, termed MRwMR-BUR-KSG and MRwMR-BUR-CLF, which estimate UR differently. MRwMR-BUR-KSG estimates UR via a nearest-neighbor-based approach called the KSG estimator and is designed for three major tasks: (i) classification performance, (ii) feature interpretability, and (iii) classifier generalization. MRwMR-BUR-CLF estimates UR via a classifier-based approach. It adapts UR to different classifiers, further improving the competitiveness of MRwMR-BUR for classification-performance-oriented tasks. The performance of both MRwMR-BUR-KSG and MRwMR-BUR-CLF is validated via experiments using six public datasets and three popular classifiers. Specifically, compared to MRwMR, the proposed MRwMR-BUR-KSG improves test accuracy by 2%-3% with 25%-30% fewer features selected, without increasing algorithm complexity. MRwMR-BUR-CLF further improves classification performance by 3.8%-5.5% (relative to MRwMR), and it also outperforms three popular classifier-dependent feature selection methods.
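To make the criterion concrete, below is a minimal Python sketch of a greedy MRwMR-style selection loop with an added UR boost, loosely in the spirit of MRwMR-BUR-CLF. It is not the authors' implementation: the additive score relevance - redundancy + lam * UR, the weight lam, the leave-one-out accuracy-drop proxy for UR, and the function names mrwmr_bur / classifier_based_ur are all illustrative assumptions. The MI terms use scikit-learn's mutual_info_classif / mutual_info_regression, whose kNN estimators belong to the KSG family.

```python
# Illustrative sketch only; score combination and `lam` are assumptions,
# not the paper's algorithm.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def classifier_based_ur(X, y, clf, cv=3):
    """Proxy for unique relevance: drop in cross-validated accuracy
    when a single feature is left out of the full feature set."""
    base = cross_val_score(clf, X, y, cv=cv).mean()
    ur = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        X_minus = np.delete(X, j, axis=1)
        ur[j] = base - cross_val_score(clf, X_minus, y, cv=cv).mean()
    return np.clip(ur, 0.0, None)  # a negative drop carries no unique signal


def mrwmr_bur(X, y, clf, k, lam=0.5, seed=0):
    """Greedy selection scoring: relevance - redundancy + lam * UR."""
    relevance = mutual_info_classif(X, y, random_state=seed)  # I(f; Y), kNN estimator
    ur = classifier_based_ur(X, y, clf)
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in remaining:
            # Redundancy: average pairwise MI with already-selected features.
            red = np.mean([
                mutual_info_regression(X[:, [s]], X[:, j], random_state=seed)[0]
                for s in selected
            ]) if selected else 0.0
            score = relevance[j] - red + lam * ur[j]
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
        remaining.remove(best_j)
    return selected


if __name__ == "__main__":
    X, y = load_breast_cancer(return_X_y=True)
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    print(mrwmr_bur(X, y, clf, k=5))
```

In the paper's terminology, UR is the label information contributed only by a given feature; the leave-one-out accuracy drop above is merely one classifier-dependent proxy for it (the MRwMR-BUR-CLF flavor), whereas MRwMR-BUR-KSG estimates the underlying MI quantities directly with the KSG estimator.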