Paper Title
SciTS: A Benchmark for Time-Series Databases in Scientific Experiments and Industrial Internet of Things
Paper Authors
Paper Abstract
Time-series data is increasingly used in the Industrial Internet of Things (IIoT) and in large-scale scientific experiments. Managing time-series data requires a storage engine that can keep up with its constantly growing volume while providing acceptable query latency. While traditional ACID databases favor consistency over performance, many time-series databases with novel storage engines have been developed to provide better ingestion performance and lower query latency. To understand how the unique design of a time-series database affects its performance, we design SciTS, a highly extensible and parameterizable benchmark for time-series data. The benchmark studies the data ingestion capabilities of time-series databases, especially as they grow larger in size. It also studies the latencies of 5 practical queries from the scientific experiments use case. We use SciTS to evaluate the performance of 4 databases with 4 distinct storage engines: ClickHouse, InfluxDB, TimescaleDB, and PostgreSQL.