Paper title
Data-Centric Green AI: An Exploratory Empirical Study
Paper authors
Paper abstract
With the growing availability of large-scale datasets and the spread of affordable storage and computational capabilities, the energy consumed by AI is becoming an increasing concern. To address this issue, recent studies have focused on demonstrating how AI energy efficiency can be improved by tuning the model training strategy. Nevertheless, how modifications applied to datasets affect the energy consumption of AI is still an open question. To fill this gap, in this exploratory study we evaluate whether data-centric approaches can be used to improve AI energy efficiency. To this end, we conduct an empirical experiment that considers 6 different AI algorithms, a dataset comprising 5,574 data points, and two dataset modifications (number of data points and number of features). Our results show evidence that, by exclusively modifying the dataset, energy consumption can be drastically reduced (up to 92.16%), often at the cost of a negligible or even absent accuracy decline. As additional introductory results, we demonstrate that, by exclusively changing the algorithm used, energy savings of up to two orders of magnitude can be achieved. In conclusion, this exploratory investigation empirically demonstrates the importance of applying data-centric techniques to improve AI energy efficiency. Our results call for a research agenda that focuses on data-centric techniques, to further enable and democratize Green AI.
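The two dataset modifications named in the abstract (fewer data points, fewer features) and the relative saving figure can be illustrated with a minimal sketch. It is not the authors' experimental code: the function names, the random row sampling, and the choice of which features to keep are all assumptions made for illustration, and the energy values passed to `energy_saving_percent` would have to come from an external measurement tool that is not shown here.

```python
import random


def subsample_rows(rows, labels, fraction, seed=0):
    """Keep a random fraction of the data points, together with their labels.

    Assumption: uniform random sampling; the study may select points differently.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    keep = max(1, round(len(rows) * fraction))
    # Sort the sampled indices so the original row order is preserved.
    idx = sorted(rng.sample(range(len(rows)), keep))
    return [rows[i] for i in idx], [labels[i] for i in idx]


def keep_features(rows, feature_indices):
    """Keep only the selected feature columns of every row.

    Assumption: the caller decides which columns to keep (hypothetical choice).
    """
    return [[row[j] for j in feature_indices] for row in rows]


def energy_saving_percent(baseline_joules, modified_joules):
    """Relative energy reduction of a modified run versus the baseline run."""
    if baseline_joules <= 0:
        raise ValueError("baseline energy must be positive")
    return 100.0 * (baseline_joules - modified_joules) / baseline_joules


if __name__ == "__main__":
    # Toy data, made up for illustration only.
    rows = [[i, i * 2, i * 3] for i in range(10)]
    labels = [i % 2 for i in range(10)]
    small_rows, small_labels = subsample_rows(rows, labels, fraction=0.5)
    narrow_rows = keep_features(small_rows, [0, 2])
    print(len(small_rows), len(narrow_rows[0]))
```

In an actual experiment, each modified dataset would be used to train the same algorithm while energy is measured, and `energy_saving_percent` would then compare the measured values against the run on the unmodified dataset.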