Title
At-the-edge Data Processing for Low Latency High Throughput Machine Learning Algorithms
Authors
Abstract
High-throughput, low-latency data processing is essential for systems that require live decision making, control, and machine-learning-optimized data reduction. We focus on two distinct use cases for in-flight streaming data processing: a) X-ray pulse reconstruction at SLAC's LCLS-II Free-Electron Laser and b) control diagnostics at the DIII-D tokamak fusion reactor. Both cases demand high-throughput, low-latency control feedback and motivate our focus on machine learning at the edge, where data processing and machine learning algorithms can be implemented in field-programmable gate array (FPGA) based hardware immediately after the diagnostic sensors. We present our recent work on a data preprocessing chain that requires fast featurization for information encoding. We discuss several options for such algorithms, with a primary focus on our discrete cosine and sine transform-based approach adapted for streaming data. These algorithms are aimed primarily at implementation in FPGAs, favoring linear algebra operations, which also aligns with recent advances in inference accelerators for the computational edge.
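To make the featurization idea concrete, the following is a minimal NumPy sketch of DCT-based feature extraction on a streaming window, expressed as a single matrix-vector multiply so that it maps naturally onto FPGA linear-algebra primitives. This is an illustrative prototype under stated assumptions, not the authors' implementation; the function names, window length, and pulse shape are hypothetical.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    # Orthonormal DCT-II basis as an n x n matrix. Precomputing the basis
    # turns featurization into one matrix-vector multiply, the operation
    # FPGAs and edge inference accelerators handle efficiently.
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row scaled for orthonormality
    return m

def featurize(window: np.ndarray, basis: np.ndarray, n_keep: int) -> np.ndarray:
    # Project one streaming window onto the DCT basis and keep the leading
    # low-frequency coefficients as a compact feature vector (data reduction).
    return (basis @ window)[:n_keep]

# Illustrative example: a synthetic 64-sample pulse reduced to 8 features.
n = 64
basis = dct_matrix(n)
t = np.arange(n)
window = np.exp(-0.5 * ((t - 30) / 4.0) ** 2)  # Gaussian-like pulse (stand-in)
features = featurize(window, basis, n_keep=8)
```

Because the basis is orthonormal, keeping the leading coefficients is a least-squares-optimal truncation for smooth signals, and the same precomputed matrix serves every incoming window in the stream.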