Scalability Benchmarking of Stream Processing Engines with Apache Beam.

Bensien, Jan Robert (2021) Scalability Benchmarking of Stream Processing Engines with Apache Beam. Open Access (Bachelor thesis), Kiel University, Kiel, 69 pp.

bsc_jan-robert-bensien_thesis.pdf - Published Version

Abstract

The rapid increase in data driven by modern developments such as Industry 4.0, the Internet of Things, artificial intelligence, and the analysis of user interaction on websites has led to a plethora of new technologies that address the challenges associated with the four Vs of Big Data. One approach to these challenges, stream processing, has attracted particular interest because it can handle high loads while providing real-time results. As a result, many different stream processing frameworks have emerged, each with its own design decisions.
In this thesis, we benchmark the horizontal scalability of two stream processing frameworks. To achieve this, we apply the benchmarking method Theodolite to four previously identified use cases. We focus on Apache Beam as the programming API, with Apache Samza and Apache Flink as stream processing backends. Additionally, we extend the Theodolite framework to support the execution of Beam pipelines with Apache Flink and Apache Samza as stream processing engines. We then take the necessary steps to execute our use cases in a private cloud environment orchestrated by Kubernetes. We found that Flink as the processing backend is advantageous in three use cases, while Samza yields better results in one use case.

Document Type: Thesis (Bachelor thesis)
Thesis Advisor: Hasselbring, Wilhelm and Henning, Sören
Keywords: Stream Processing, Benchmark, Scalability, Theodolite
Research affiliation: Kiel University > Software Engineering
Date Deposited: 20 Apr 2021 14:34
Last Modified: 20 Apr 2021 14:34
URI: https://oceanrep.geomar.de/id/eprint/52342
