Paper Title

Graph-based Neural Modules to Inspect Attention-based Architectures: A Position Paper

Paper Authors

Carvalho, Breno W., d'Avila Garcez, Artur, Lamb, Luis C.

Abstract

Encoder-decoder architectures are prominent building blocks of state-of-the-art solutions for tasks across multiple fields where deep learning (DL) or foundation models play a key role. Although there is a growing community working on the provision of interpretation for DL models, as well as considerable work in the neuro-symbolic community seeking to integrate symbolic representations and DL, many open questions remain around the need for better tools for visualization of the inner workings of DL architectures. In particular, encoder-decoder models offer an exciting opportunity for visualization and editing by humans of the knowledge implicitly represented in model weights. In this work, we explore ways to create an abstraction for segments of the network as a two-way graph-based representation. Changes to this graph structure should be reflected directly in the underlying tensor representations. Such a two-way graph representation enables new neuro-symbolic systems by leveraging the pattern recognition capabilities of the encoder-decoder along with symbolic reasoning carried out on the graphs. The approach is expected not only to produce new ways of interacting with DL models but also to improve performance as a result of the combination of learning and reasoning capabilities.
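The core idea of the abstract, a two-way mapping where symbolic edits on a graph view write directly back into the underlying tensor, can be illustrated with a minimal sketch. This is a hypothetical toy (not code from the paper): the class name, the attention matrix, and the thresholding rule are all illustrative assumptions.

```python
import numpy as np

class AttentionGraphView:
    """Hypothetical sketch of a two-way graph view over a tensor.

    Nodes are token positions; an edge i -> j exists when the
    attention weight attention[i, j] exceeds a threshold. The view
    holds a reference to the tensor (not a copy), so a symbolic edit
    on the graph is immediately visible in the tensor representation.
    """

    def __init__(self, attention: np.ndarray, threshold: float = 0.1):
        self.attention = attention  # shared reference, not a copy
        self.threshold = threshold

    def edges(self):
        """Yield (src, dst, weight) for every above-threshold entry."""
        for i, j in zip(*np.nonzero(self.attention > self.threshold)):
            yield int(i), int(j), float(self.attention[i, j])

    def set_edge(self, src: int, dst: int, weight: float):
        """Edit the graph; the change lands in the underlying tensor."""
        self.attention[src, dst] = weight

# Toy 3-token attention matrix (rows sum to 1).
attn = np.array([[0.80, 0.15, 0.05],
                 [0.05, 0.90, 0.05],
                 [0.30, 0.30, 0.40]])

view = AttentionGraphView(attn, threshold=0.2)
view.set_edge(2, 0, 0.0)  # symbolically prune one edge
print(attn[2, 0])         # the tensor reflects the edit
```

In a full system, the reverse direction (tensor updates during training refreshing the graph) and symbolic reasoning over the extracted edges would sit on top of this shared-reference pattern.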
