Paper Title

CoCoPIE XGen: A Full-Stack AI-Oriented Optimizing Framework

Authors

Li, Xiaofeng; Ren, Bin; Shen, Xipeng; Wang, Yanzhi

Abstract

There is a growing demand for shifting the delivery of AI capability from data centers on the cloud to edge or end devices, exemplified by the fast-emerging real-time AI-based apps running on smartphones, AR/VR devices, autonomous vehicles, and various IoT devices. The shift has, however, been seriously hampered by the large and growing gap between DNN computing demands and the computing power available on edge or end devices. This article presents the design of XGen, an optimizing framework for DNNs designed to bridge that gap. XGen takes cross-cutting co-design as its first-order consideration. Its full-stack AI-oriented optimizations consist of a number of innovative optimizations at every layer of the DNN software stack, all designed in a cooperative manner. This technology enables XGen to optimize various DNNs, including those with extreme depth (e.g., BERT, GPT, and other transformers), and to generate code that runs several times faster than code from existing DNN frameworks while delivering the same level of accuracy.
