Paper Title

Analyzing Differentiable Fuzzy Logic Operators

Paper Authors

van Krieken, Emile, Acar, Erman, van Harmelen, Frank

Paper Abstract

The AI community is increasingly putting its attention towards combining symbolic and neural approaches, as it is often argued that the strengths and weaknesses of these approaches are complementary. One recent trend in the literature is weakly supervised learning techniques that employ operators from fuzzy logics. In particular, these use prior background knowledge described in such logics to help the training of a neural network from unlabeled and noisy data. By interpreting logical symbols using neural networks, this background knowledge can be added to regular loss functions, hence making reasoning a part of learning. We study, both formally and empirically, how a large collection of logical operators from the fuzzy logic literature behave in a differentiable learning setting. We find that many of these operators, including some of the most well-known, are highly unsuitable in this setting. A further finding concerns the treatment of implication in these fuzzy logics, and shows a strong imbalance between gradients driven by the antecedent and the consequent of the implication. Furthermore, we introduce a new family of fuzzy implications (called sigmoidal implications) to tackle this phenomenon. Finally, we empirically show that it is possible to use Differentiable Fuzzy Logics for semi-supervised learning, and compare how different operators behave in practice. We find that, to achieve the largest performance improvement over a supervised baseline, we have to resort to non-standard combinations of logical operators which perform well in learning, but no longer satisfy the usual logical laws.
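
As a concrete illustration of the setting, the following is a minimal sketch, not code from the paper: it composes two standard fuzzy operators, the product t-norm (conjunction) and the Reichenbach implication, into a differentiable loss term that rewards satisfaction of a background-knowledge rule. The predicates bird, small, and flies and their truth degrees are hypothetical placeholders for classifier outputs in [0, 1].

import torch

# Minimal sketch, assumed rather than taken from the paper: two standard fuzzy
# operators, both differentiable, used to turn a logical rule into a loss term.

def product_tnorm(a, b):
    # Product t-norm T_P(a, b) = a * b, a common fuzzy conjunction.
    return a * b

def reichenbach_implication(a, c):
    # Reichenbach implication I_RC(a, c) = 1 - a + a * c, a common fuzzy implication.
    return 1.0 - a + a * c

# Hypothetical rule: bird(x) AND small(x) -> flies(x). The three tensors stand in
# for classifier heads that output truth degrees in [0, 1] for a batch of inputs.
bird  = torch.tensor([0.9, 0.2, 0.7], requires_grad=True)
small = torch.tensor([0.8, 0.9, 0.4], requires_grad=True)
flies = torch.tensor([0.3, 0.8, 0.6], requires_grad=True)

# Truth degree of the rule per example; (1 - mean truth) can be added to a regular
# supervised loss so that gradient descent also pushes the network to satisfy the rule.
antecedent = product_tnorm(bird, small)
rule_truth = reichenbach_implication(antecedent, flies)
logic_loss = (1.0 - rule_truth).mean()
logic_loss.backward()

# For I_RC: dI/da = c - 1 <= 0 and dI/dc = a >= 0, so the rule can be made "more true"
# either by lowering the antecedent or by raising the consequent; the two gradient
# magnitudes generally differ, which is the kind of imbalance the paper analyzes.
print(bird.grad, small.grad, flies.grad)

Which fuzzy conjunction, implication, and aggregation operators are plugged into such a loss determines the gradients the network receives; comparing these choices, both formally and empirically, is exactly what the paper does.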
