Paper Title
EarSpy: Spying Caller Speech and Identity through Tiny Vibrations of Smartphone Ear Speakers
Authors
Abstract
Eavesdropping on a user's smartphone is a well-known threat to the user's safety and privacy. Existing studies show that loudspeaker reverberation can inject speech into motion sensor readings, leading to speech eavesdropping. However, more devastating attacks on ear speakers, which produce much smaller-scale vibrations, were believed to be impossible with zero-permission motion sensors. In this work, we revisit this important line of research. We explore a recent trend among smartphone manufacturers of including extra or more powerful speakers in place of small ear speakers, and demonstrate the feasibility of using motion sensors to capture such tiny speech vibrations. We investigate the impact of these new ear speakers on built-in motion sensors and examine the potential to extract private speech information from the minute vibrations. Our system, EarSpy, detects word regions, extracts time- and frequency-domain features, and generates a spectrogram for each word region. We train and test on the extracted data using classical machine learning algorithms and convolutional neural networks. We achieve up to 98.66% accuracy in gender detection, 92.6% accuracy in speaker detection, and 56.42% accuracy in digit detection (more than 5X the 10% accuracy of random selection). Our results reveal the potential threat of eavesdropping on phone conversations from ear speakers using motion sensors.
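The abstract names two processing steps, word-region detection and per-region spectrogram generation, without giving their details. The following is a minimal sketch of how such steps could look, not EarSpy's actual implementation. The short-time energy threshold, the 500 Hz sampling rate, the window sizes, the synthetic test signal, and the function names `detect_word_regions` and `spectrogram` are all assumptions made for illustration.

```python
import numpy as np

def detect_word_regions(signal, fs, frame_ms=20, threshold_ratio=0.2):
    """Return (start, end) sample indices of regions whose short-time
    energy exceeds threshold_ratio * max energy (assumed heuristic)."""
    frame = max(1, int(fs * frame_ms / 1000))
    n_frames = len(signal) // frame
    frames = signal[:n_frames * frame].reshape(n_frames, frame)
    energy = (frames ** 2).mean(axis=1)
    active = energy > threshold_ratio * energy.max()

    regions, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i                      # region begins
        elif not is_active and start is not None:
            regions.append((start * frame, i * frame))
            start = None                   # region ends
    if start is not None:                  # region runs to the end
        regions.append((start * frame, n_frames * frame))
    return regions

def spectrogram(segment, fs, win=64, hop=16):
    """Magnitude STFT with a Hann window.
    Returns (freqs, times, S) with S of shape (win // 2 + 1, n_windows)."""
    window = np.hanning(win)
    n = 1 + (len(segment) - win) // hop
    frames = np.stack([segment[i * hop:i * hop + win] * window for i in range(n)])
    S = np.abs(np.fft.rfft(frames, axis=1)).T
    freqs = np.fft.rfftfreq(win, d=1.0 / fs)
    times = (np.arange(n) * hop + win / 2) / fs
    return freqs, times, S

# Synthetic stand-in for one accelerometer axis (assumed fs = 500 Hz):
# low-level noise with a 100 Hz "word" burst between 0.5 s and 1.0 s.
fs = 500
t = np.arange(2 * fs) / fs
rng = np.random.default_rng(0)
signal = 0.01 * rng.standard_normal(t.size)
signal[250:500] += np.sin(2 * np.pi * 100 * t[250:500])

regions = detect_word_regions(signal, fs)
start, end = regions[0]
freqs, times, S = spectrogram(signal[start:end], fs)
peak_hz = freqs[S.mean(axis=1).argmax()]   # dominant frequency of the region
print(regions, S.shape, peak_hz)
```

In a pipeline like the one the abstract describes, each spectrogram `S` would then be passed to a classifier; the classifier itself is outside the scope of this sketch.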