Paper Title
Kan Extensions in Data Science and Machine Learning
Paper Authors
Paper Abstract
A common problem in data science is "use this function defined over this small set to generate predictions over that larger set." Extrapolation, interpolation, statistical inference and forecasting all reduce to this problem. The Kan extension is a powerful tool in category theory that generalizes this notion. In this work we explore several applications of Kan extensions to data science. We begin by deriving a simple classification algorithm as a Kan extension and experimenting with this algorithm on real data. Next, we use the Kan extension to derive a procedure for learning clustering algorithms from labels and explore the performance of this procedure on real data. We then investigate how Kan extensions can be used to learn a general mapping from datasets of labeled examples to functions and to approximate a complex function with a simpler one.