Paper Title

Multi-tenant Pub/Sub Processing for Real-time Data Streams

Authors

Álvaro Villalba, David Carrera

Abstract

Devices and sensors generate streams of data across a diversity of locations and protocols. That data usually reaches a central platform that is used to store and process the streams. Processing can be done in real time, with transformations and enrichment happening on the fly, but it can also happen after the data has been stored and organized in repositories. In the former case, stream processing technologies are required to operate on the data; in the latter, batch analytics and queries are commonly used. This paper introduces a runtime that dynamically constructs data stream processing topologies from user-supplied code. These dynamic topologies are built on the fly using a data subscription model defined by the applications that consume the data. Each user-defined processing unit is called a Service Object. Every Service Object consumes input data streams and may produce output streams that others can consume. The subscription-based programming model enables multiple users to deploy their own data-processing services. The runtime dynamically forwards data and executes Service Objects from different users. Data streams can originate in real-world devices, or they can be the outputs of Service Objects.
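The subscription model described in the abstract can be illustrated with a minimal in-memory sketch. This is not the paper's runtime or API: the names `Runtime`, `ServiceObject`, `deploy`, and `publish`, the stream names, and the tenant labels are all hypothetical, chosen only to show how a topology can emerge from subscriptions when one Service Object's output stream becomes another's input.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ServiceObject:
    """Hypothetical user-defined processing unit (not the paper's API)."""
    tenant: str
    inputs: list                      # names of the streams it subscribes to
    handler: Callable[[Any], Any]     # user-supplied code
    output: Optional[str] = None      # name of the stream it produces, if any


class Runtime:
    """Toy runtime: the topology is implied by the subscriptions alone."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)  # stream name -> Service Objects

    def deploy(self, so: ServiceObject) -> None:
        # Deploying a Service Object registers its subscriptions.
        for stream in so.inputs:
            self._subscribers[stream].append(so)

    def publish(self, stream: str, value: Any) -> None:
        # Forward the value to every subscriber; outputs are re-published
        # so downstream Service Objects (possibly other tenants') receive them.
        pending = deque([(stream, value)])
        while pending:
            name, item = pending.popleft()
            for so in self._subscribers.get(name, ()):
                result = so.handler(item)
                if so.output is not None and result is not None:
                    pending.append((so.output, result))


runtime = Runtime()
alerts = []

# Tenant A converts Celsius readings from a device stream to Fahrenheit.
runtime.deploy(ServiceObject(
    tenant="tenant-a",
    inputs=["device/temp_c"],
    handler=lambda c: c * 9 / 5 + 32,
    output="tenant-a/temp_f",
))


def record_alert(temp_f):
    # Tenant B only records readings above an assumed threshold of 100 F.
    if temp_f > 100:
        alerts.append(temp_f)


# Tenant B subscribes to tenant A's output stream, not to the device.
runtime.deploy(ServiceObject(
    tenant="tenant-b",
    inputs=["tenant-a/temp_f"],
    handler=record_alert,
))

runtime.publish("device/temp_c", 40)  # forwarded through both Service Objects
runtime.publish("device/temp_c", 20)  # converted, but below the threshold
print(alerts)
```

In this sketch neither tenant wires the pipeline explicitly. The two-stage topology exists only because tenant B subscribed to a stream that tenant A's Service Object produces, which mirrors the abstract's point that data streams may originate in devices or in other Service Objects.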
