Machine Learning Interpretability Meets TLS Fingerprinting

Paper Title

Machine Learning Interpretability Meets TLS Fingerprinting

Authors

Mahdi Jafari Siavoshani, Amir Hossein Khajepour, Amirmohammad Ziaei, Amir Ali Gatmiri, Ali Taheri

Abstract

Protecting users' privacy over the Internet is of great importance; however, it is becoming increasingly hard to maintain as network protocols and components grow more complex. Therefore, investigating and understanding how data leaks from information transmission platforms and protocols can lead us to a more secure environment. In this paper, we propose a framework to systematically find the most vulnerable information fields in a network protocol. To this end, focusing on the Transport Layer Security (TLS) protocol, we perform different machine-learning-based fingerprinting attacks on data collected from more than 70 domains (websites) to understand how and where this information leakage occurs in the TLS protocol. Then, by employing interpretation techniques developed in the machine learning community and applying our framework, we find the most vulnerable information fields in the TLS protocol. Our findings demonstrate that the TLS handshake (which is mainly unencrypted), the TLS record length appearing in the TLS application data header, and the initialization vector (IV) field are, in that order, among the most critical sources of leakage in this protocol.
