Paper Title

Smart-Corpus: an Organized Repository of Ethereum Smart Contracts Source Code and Metrics

Authors

Pierro, Giuseppe Antonio; Tonelli, Roberto; Marchesi, Michele

Abstract

Many empirical software engineering studies show that there is a great need for repositories where source code is acquired, filtered and classified. Over the last few years, Ethereum block explorer services have become a popular means of exploring and searching Ethereum blockchain data, such as transactions, addresses, tokens, smart contracts' source code, prices and other activity taking place on the Ethereum blockchain. Despite the availability of such services, retrieving specific information useful to empirical software engineering studies, such as the study of smart contracts' software metrics, may require many sub-tasks: searching for specific transactions in a block, parsing files in HTML format, and filtering the smart contracts to remove duplicated code or unused smart contracts. In this paper we address this problem by creating Smart Corpus, a corpus of smart contracts in an organized, reasoned and up-to-date repository from which Solidity source code and other metadata about Ethereum smart contracts can be retrieved easily and systematically. We present the design of Smart Corpus and its initial implementation, and we show how the data set of smart contracts' source code in a variety of programming languages can be queried and processed to obtain useful information on smart contracts and their software metrics. Smart Corpus aims to provide a smart contract repository in which smart contract data (source code, ABI and bytecode) are freely and immediately available and are also classified according to the main software metrics identified in the scientific literature. The smart contracts' source code has been validated by EtherScan, and each contract comes with its own associated software metrics, as computed by the freely available software PASO. Moreover, Smart Corpus can be easily extended as the number of new smart contracts increases day by day.
