Experimentally realized in situ backpropagation for deep learning in nanophotonic neural networks

Paper title

Experimentally realized in situ backpropagation for deep learning in nanophotonic neural networks

Authors

Sunil Pai, Zhanghao Sun, Tyler W. Hughes, Taewon Park, Ben Bartlett, Ian A. D. Williamson, Momchil Minkov, Maziyar Milanizadeh, Nathnael Abebe, Francesco Morichetti, Andrea Melloni, Shanhui Fan, Olav Solgaard, David A. B. Miller

Abstract

Neural networks are widely deployed models across many scientific disciplines and commercial endeavors ranging from edge computing and sensing to large-scale signal processing in data centers. The most efficient and well-entrenched method to train such networks is backpropagation, or reverse-mode automatic differentiation. To counter an exponentially increasing energy budget in the artificial intelligence sector, there has been recent interest in analog implementations of neural networks, specifically nanophotonic neural networks for which no analog backpropagation demonstration exists. We design mass-manufacturable silicon photonic neural networks that alternately cascade our custom designed "photonic mesh" accelerator with digitally implemented nonlinearities. These reconfigurable photonic meshes program computationally intensive arbitrary matrix multiplication by setting physical voltages that tune the interference of optically encoded input data propagating through integrated Mach-Zehnder interferometer networks. Here, using our packaged photonic chip, we demonstrate in situ backpropagation for the first time to solve classification tasks and evaluate a new protocol to keep the entire gradient measurement and update of physical device voltages in the analog domain, improving on past theoretical proposals. Our method is made possible by introducing three changes to typical photonic meshes: (1) measurements at optical "grating tap" monitors, (2) bidirectional optical signal propagation automated by fiber switch, and (3) universal generation and readout of optical amplitude and phase. After training, our classification achieves accuracies similar to digital equivalents even in presence of systematic error. Our findings suggest a new training paradigm for photonics-accelerated artificial intelligence based entirely on a physical analog of the popular backpropagation technique.
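
The abstract describes photonic meshes that perform matrix multiplication by tuning interference in networks of Mach-Zehnder interferometers. As a minimal sketch of that building block, the following models a single ideal, lossless 2x2 interferometer as two 50:50 couplers with two phase shifters. The function names, the placement of the phase shifters on the top waveguide, and the coupler convention are all assumptions chosen for illustration; they are not taken from the paper, and real devices deviate from this ideal model.

```python
import cmath
import math


def matmul2(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def phase_shifter(angle):
    """Phase shift on the top waveguide only (assumed convention)."""
    return [[cmath.exp(1j * angle), 0], [0, 1]]


def beamsplitter():
    """Ideal lossless 50:50 directional coupler."""
    s = 1 / math.sqrt(2)
    return [[s, 1j * s], [1j * s, s]]


def mzi(theta, phi):
    """Transfer matrix B * P(theta) * B * P(phi) of one interferometer.

    phi sets the relative input phase; theta sets the split ratio.
    """
    b = beamsplitter()
    inner = matmul2(b, phase_shifter(phi))
    return matmul2(b, matmul2(phase_shifter(theta), inner))


def apply(t, x):
    """Propagate a two-mode complex field vector x through matrix t."""
    return [t[0][0] * x[0] + t[0][1] * x[1],
            t[1][0] * x[0] + t[1][1] * x[1]]


def power(x):
    """Total optical power carried by the field vector."""
    return sum(abs(v) ** 2 for v in x)


if __name__ == "__main__":
    field_in = [0.6, 0.8j]
    field_out = apply(mzi(0.7, 1.3), field_in)
    # A lossless interferometer is unitary, so total power is conserved.
    print(power(field_in), power(field_out))
```

Under these assumed conventions, `theta = 0` routes all light from the top input to the bottom output, and `theta = math.pi` keeps it in the top waveguide. A mesh composes many such 2x2 blocks acting on neighboring waveguide pairs, so that the set of phase settings programs a larger matrix.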
