Paper title
Multiple Inputs Neural Networks for Medicare Fraud Detection
Authors
Abstract
Medicare fraud causes considerable losses for governments and insurance companies and leads to higher premiums for clients. It costs around 13 billion euros in Europe and between 21 billion and 71 billion US dollars per year in the United States. This study aims to predict Medicare fraud using classifiers based on artificial neural networks. The main difficulty in applying machine learning techniques to fraud detection, or more generally to anomaly detection, is that the data sets are highly imbalanced. To detect Medicare fraud, we propose a multiple-input deep neural network classifier with a Long Short-Term Memory (LSTM) autoencoder component. This architecture makes it possible to take many sources of data into account without mixing them, and it makes the classification task easier for the final model. The latent features extracted from the LSTM autoencoder have strong discriminating power and separate the providers into homogeneous clusters. We use the data sets from the Centers for Medicare & Medicaid Services (CMS) of the US federal government. The CMS provides publicly available data that brings together all of the cost price requests sent by American hospitals to Medicare companies. Our results show that although baseline artificial neural networks perform well, they are outperformed by our multiple-input neural networks. We show that using an LSTM autoencoder to embed provider behavior gives better results and makes the classifiers more robust to class imbalance.
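The abstract gives no layer sizes or input definitions, so the following is only a minimal sketch of how a multiple-input classifier with an LSTM autoencoder branch might be wired. It assumes two hypothetical inputs per provider: a sequence of claim features (fed to the LSTM autoencoder) and a vector of static provider features (fed to a dense branch). All names, dimensions, and the choice of NumPy are assumptions made for illustration. The weights are random and untrained, so only the data flow is shown, not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(input_dim, hidden_dim):
    """Randomly initialised LSTM weights (untrained, illustration only)."""
    return {
        "W": rng.normal(0.0, 0.1, (input_dim + hidden_dim, 4 * hidden_dim)),
        "b": np.zeros(4 * hidden_dim),
        "hidden_dim": hidden_dim,
    }

def lstm_forward(params, seq):
    """Run an LSTM over seq of shape (batch, time, features).

    Returns every hidden state (batch, time, hidden) and the last one (batch, hidden).
    """
    batch, steps, _ = seq.shape
    hidden_dim = params["hidden_dim"]
    h = np.zeros((batch, hidden_dim))
    c = np.zeros((batch, hidden_dim))
    states = []
    for t in range(steps):
        z = np.concatenate([seq[:, t, :], h], axis=1) @ params["W"] + params["b"]
        i, f, o, g = np.split(z, 4, axis=1)  # input, forget, output gates; candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        states.append(h)
    return np.stack(states, axis=1), h

def forward(seq, static, p):
    """Forward pass of the hypothetical two-input model."""
    # Branch 1: the LSTM encoder compresses the claim sequence into a latent vector.
    _, latent = lstm_forward(p["enc"], seq)
    # Autoencoder decoder: repeat the latent vector at each step and reconstruct.
    repeated = np.repeat(latent[:, None, :], seq.shape[1], axis=1)
    dec_states, _ = lstm_forward(p["dec"], repeated)
    recon = dec_states @ p["W_out"] + p["b_out"]
    # Branch 2: the static features go through their own dense layer (ReLU),
    # so the two data sources are not mixed before the merge.
    static_hidden = np.maximum(0.0, static @ p["W_static"] + p["b_static"])
    # Merge: concatenate both branches and score with a sigmoid output unit.
    merged = np.concatenate([latent, static_hidden], axis=1)
    fraud_prob = sigmoid(merged @ p["W_clf"] + p["b_clf"])
    return recon, latent, fraud_prob

# Hypothetical sizes: 5 claim features, 8 latent units, 6 static features.
n_features, latent_dim, static_dim, dense_dim = 5, 8, 6, 16
params = {
    "enc": init_lstm(n_features, latent_dim),
    "dec": init_lstm(latent_dim, latent_dim),
    "W_out": rng.normal(0.0, 0.1, (latent_dim, n_features)),
    "b_out": np.zeros(n_features),
    "W_static": rng.normal(0.0, 0.1, (static_dim, dense_dim)),
    "b_static": np.zeros(dense_dim),
    "W_clf": rng.normal(0.0, 0.1, (latent_dim + dense_dim, 1)),
    "b_clf": np.zeros(1),
}

seq = rng.normal(size=(4, 12, n_features))   # 4 providers, 12 time steps each
static = rng.normal(size=(4, static_dim))    # 4 providers, static features
recon, latent, fraud_prob = forward(seq, static, params)
print(recon.shape, latent.shape, fraud_prob.shape)
```

In a trained version of such a sketch, the reconstruction output would typically be fitted with a reconstruction loss and the fraud score with a classification loss. Whether the authors train the two parts jointly or separately is not stated in the abstract.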