论文标题
深度学习驱动的自然语言文字有关SQL查询转换:调查
Deep Learning Driven Natural Languages Text to SQL Query Conversion: A Survey
论文作者
论文摘要
随着未来以数据为中心的决策,对数据库的无缝访问至关重要。关于创建有效的文本到SQL(Text2SQL)模型以访问数据库的数据有广泛的研究。使用自然语言是可以通过有效访问数据库(尤其是对于非技术用户)来弥合数据和结果之间差距的最佳接口之一。它将打开门,并在精通技术技能或不太熟练的查询语言的用户中引起极大的兴趣。即使提出或研究了许多基于深度学习的算法,在现实工作场景中使用自然语言解决数据查询问题仍然非常具有挑战性。原因是在不同的研究中使用不同的数据集,这是由于其局限性和假设所带来的。同时,我们确实缺乏对这些提议的模型及其对其训练的特定数据集的局限性的彻底理解。在本文中,我们试图介绍过去几年研究的24种神经网络模型的整体概述,包括涉及卷积神经网络,经常性神经网络,指针网络,指针网络,增强学习,生成模型等的架构。我们还广泛地用于训练Text2sql Text2sql技术的11个数据集的概述。我们还讨论了无缝数据查询的Text2SQL技术的未来应用可能性。
With the future striving toward data-centric decision-making, seamless access to databases is of utmost importance. There is extensive research on creating an efficient text-to-sql (TEXT2SQL) model to access data from the database. Using a Natural language is one of the best interfaces that can bridge the gap between the data and results by accessing the database efficiently, especially for non-technical users. It will open the doors and create tremendous interest among users who are well versed in technical skills or not very skilled in query languages. Even if numerous deep learning-based algorithms are proposed or studied, there still is very challenging to have a generic model to solve the data query issues using natural language in a real-work scenario. The reason is the use of different datasets in different studies, which comes with its limitations and assumptions. At the same time, we do lack a thorough understanding of these proposed models and their limitations with the specific dataset it is trained on. In this paper, we try to present a holistic overview of 24 recent neural network models studied in the last couple of years, including their architectures involving convolutional neural networks, recurrent neural networks, pointer networks, reinforcement learning, generative models, etc. We also give an overview of the 11 datasets that are widely used to train the models for TEXT2SQL technologies. We also discuss the future application possibilities of TEXT2SQL technologies for seamless data queries.