Title
PAIRS AutoGeo: an Automated Machine Learning Framework for Massive Geospatial Data
Authors
Abstract
PAIRS AutoGeo, an automated machine learning framework for geospatial data, is introduced on the IBM PAIRS Geoscope big data and analytics platform. The framework simplifies the development of industrial machine learning solutions that leverage geospatial data, reducing the required user input to a single text file of labeled GPS coordinates. PAIRS AutoGeo automatically gathers the required data at the location coordinates, assembles the training data, performs quality checks, and trains multiple machine learning models for subsequent deployment. The framework is validated on a realistic industrial use case, tree species classification. Open-source tree species data are used to train a random forest classifier and a modified ResNet model for 10-way tree species classification based on aerial imagery, yielding accuracies of $59.8\%$ and $81.4\%$, respectively. This use case exemplifies how PAIRS AutoGeo enables users to leverage machine learning without extensive geospatial expertise.
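To make the user-input step concrete, the following is a minimal sketch of reading a text file of labeled GPS coordinates and running a basic quality check before any training data are assembled. The abstract does not specify the file format or the checks performed, so everything here is an assumption made for illustration: the `latitude,longitude,label` layout, the function names `parse_labeled_points` and `quality_check`, and the specific checks (coordinate range and duplicate removal). It does not use or represent the PAIRS AutoGeo API.

```python
import csv
import io
from collections import Counter


def parse_labeled_points(text):
    """Parse lines of 'latitude,longitude,label' (an assumed format)."""
    points = []
    for row in csv.reader(io.StringIO(text)):
        if not row or row[0].lstrip().startswith("#"):
            continue  # skip blank lines and comment lines
        lat, lon, label = float(row[0]), float(row[1]), row[2].strip()
        points.append((lat, lon, label))
    return points


def quality_check(points):
    """Drop out-of-range coordinates and repeated locations.

    Returns the kept points and a per-label count, which helps spot
    classes with too few samples. These checks are illustrative only.
    """
    seen = set()
    kept = []
    for lat, lon, label in points:
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            continue  # not a valid latitude/longitude pair
        if (lat, lon) in seen:
            continue  # same location listed more than once
        seen.add((lat, lon))
        kept.append((lat, lon, label))
    return kept, Counter(label for _, _, label in kept)


# Hypothetical input: four rows, one duplicate and one invalid latitude.
SAMPLE = """# lat,lon,species
40.7829,-73.9654,oak
40.7831,-73.9660,maple
40.7829,-73.9654,oak
95.0,10.0,pine
"""

kept, counts = quality_check(parse_labeled_points(SAMPLE))
print(len(kept), dict(counts))
```

In a pipeline of the kind the abstract describes, the surviving points would then be used to look up imagery at each coordinate and assemble the training set; that retrieval step is platform-specific and is deliberately left out of this sketch.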