Normalization of Different Swedish Dialects Spoken in Finland

Paper title

Normalization of Different Swedish Dialects Spoken in Finland

Authors

Mika Hämäläinen, Niko Partanen, Khalid Alnajjar

Abstract

Our study presents a dialect normalization method for Finland Swedish dialects covering six regions. We tested five different models, and the best model reduced the word error rate from 76.45 to 28.58. Contrary to results reported in earlier research on Finnish dialects, we found that training the model on one word at a time gave the best results. We believe this is due to the size of the training data available for the model. Our models are accessible as a Python package. The study provides important information about how well these methods adapt to different contexts, and it gives important baselines for further study.
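The word error rate (WER) quoted in the abstract is conventionally the word-level edit distance between the system output and the reference, divided by the reference length and expressed as a percentage. The sketch below is only an illustration of that standard definition, not the authors' evaluation code; the function names `word_error_rate` and `relative_reduction` are hypothetical, and whitespace tokenization is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER in percent, using word-level Levenshtein distance.

    Assumption: words are separated by whitespace. This is a generic
    definition of WER, not the evaluation script used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original error that was removed."""
    return (before - after) / before


if __name__ == "__main__":
    # Placeholder tokens, not real dialect data.
    print(word_error_rate("a b c d", "a x c d"))
    # Figures taken from the abstract: 76.45 before, 28.58 after.
    print(round(relative_reduction(76.45, 28.58), 3))
```

Read this way, the drop from 76.45 to 28.58 corresponds to removing a little over 60% of the word-level errors present in the unnormalized dialectal text.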
