Paper title
Progressive Cleaning and Mining of Uncertain Smart Water Meter Data
Paper authors
Abstract
Several municipalities have recently installed wireless 'smart' water meters that enable functionalities such as demand response, leak alerts, identification of characteristic demand patterns, and detailed consumption analysis. To achieve these benefits, the meter data needs to be error-free, which is not always the case in practice because some 'dirtiness' or 'uncertainty' in the data is mostly unavoidable. This paper investigates practical solutions for mining uncertain data to obtain reliable results and evaluates the impact of dirty data on filters. This evaluation should eventually yield valuable information that can be used for informed decision making on water planning strategies. We perform a systematic study of the errors present in a large-scale smart water meter deployment, which helps to better understand the nature of the errors. Identifying the customers who contribute to a load peak is used as the main filter. The filter outputs are then combined with domain expert knowledge to evaluate their accuracy and validity and to look for potential errors. After discovering each error, we analyze its traces in the data and track it back to its source, which eventually leads to either removing the error or handling it appropriately. This procedure is applied progressively to ensure that all detectable errors are discovered and characterized in the data model. We evaluate the performance of the proposed approach using smart water meter consumption data obtained from the City of Abbotsford, British Columbia, Canada. We present results for both unprocessed and cleaned data and analyze in detail the sensitivity of the selected filter to the errors.