Paper Title
Automatic Identification of Motivation for Code-Switching in Speech Transcripts
Paper Authors
Abstract
Code-switching, or switching between languages, occurs for many reasons and has important linguistic, sociological, and cultural implications. Multilingual speakers code-switch for a variety of purposes, such as expressing emotions, borrowing terms, making jokes, and introducing a new topic. The reason for code-switching may be quite useful for analysis, but is not readily apparent. To remedy this situation, we annotate a new dataset of motivations for code-switching in Spanish-English. We build the first system (to our knowledge) to automatically identify a wide range of motivations for which speakers code-switch in everyday speech, achieving an accuracy of 75% across all motivations. Additionally, we show that the system can be adapted to new language pairs, achieving 66% accuracy on a new language pair (Hindi-English), demonstrating the cross-lingual applicability of our annotation scheme.