Paper title
Segmenting Numerical Substitution Ciphers
Paper authors
Paper abstract
Deciphering historical substitution ciphers is a challenging problem. Previously studied problems include detecting the cipher type, detecting the plaintext language, and acquiring the substitution key for segmented ciphers. However, attacking unsegmented, space-free ciphers remains difficult. Segmentation (i.e., finding the substitution units) is the first step towards cracking these ciphers. In this work, we propose the first automatic methods to segment these ciphers, using Byte Pair Encoding (BPE) and unigram language models. Our methods achieve an average segmentation error of 2% on 100 randomly generated monoalphabetic ciphers and 27% on 3 real homophonic ciphers. We also propose a method for solving non-deterministic ciphers with existing keys using a lattice and a pretrained language model. Our method leads to the full solution of the IA cipher, a real historical cipher that had not been fully solved until this work.
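To illustrate the general idea of BPE-based segmentation mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It treats an unsegmented cipher as a string of digits and repeatedly merges the most frequent adjacent pair of tokens, so that the surviving tokens act as candidate substitution units. The function name `bpe_segment`, the `num_merges` parameter, the tie-breaking rule, and the toy input are all assumptions made for this example. The abstract does not state how the number of merges is chosen, and the unigram language model variant is not shown here.

```python
from collections import Counter


def bpe_segment(digits: str, num_merges: int) -> list[str]:
    """Greedy BPE over a digit string (hypothetical helper, illustration only)."""
    tokens = list(digits)  # start from single digits
    for _ in range(num_merges):
        # Count every adjacent token pair in the current segmentation.
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken by the pair itself
        # so the result is deterministic (an assumption of this sketch).
        (left, right), count = max(pairs.items(), key=lambda kv: (kv[1], kv[0]))
        if count < 2:
            break  # nothing repeats, so further merges are uninformative
        # Rewrite the sequence, merging each left-to-right occurrence.
        merged = []
        i = 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == left and tokens[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens


if __name__ == "__main__":
    # Toy cipher built from two two-digit units, written without spaces.
    toy_cipher = "12" * 5 + "34" * 5
    print(bpe_segment(toy_cipher, num_merges=2))
```

On this toy input, two merges are enough to recover the two-digit units. Real ciphers mix unit lengths and frequencies, which is presumably why the abstract reports a nonzero segmentation error.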