Queryable Workflows: Extending Dataflow Streaming with Dynamic Request/Reply Communication

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Stream processing systems have been widely adopted in applications such as recommendation systems, anomaly detection, and system monitoring due to their real-time capabilities. Improving observability in stream processing systems can further expand their application scenarios, including the implementation of stateful serverless applications. Stateful serverless applications are an emerging model in serverless computing that focuses on addressing the challenges of state management, enabling developers to build distributed applications in a simpler way. One possible implementation of stateful serverless applications is based on stream processing engines. However, the current approaches for observability in stream processing engines suffer from issues such as efficiency, consistency, and functionality, resulting in limited practical use cases. To address these challenges, we propose Queryable Workflow, an extension to stream processing engines. This extension allows users to access or modify the state within stream processing engines with transactional semantics using a SQL interface, enabling use cases such as ad-hoc querying, serializable updates, or even stateful serverless applications. We implemented our system on stream processing engines such as Portals and Apache Flink, and evaluated their performance. The result showed that our system has achieved 4.33x throughput improvement and 30% latency reduction compared to a baseline implemented with Apache Flink and Apache Kafka. With hand-crafted optimizations, our system achieved to process over 29,000 queries per second with a 99th percentile latency of 8.58 ms under a single-threaded runtime. Our proposed system provides a viable option for implementing stateful serverless applications that require transactional guarantees, while also expanding the potential application scenarios for stream processing engines.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)