Stream Processing Systems Benchmark: StreamBench

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.advisor De Francisci Morales, Gianmarco
dc.contributor.author Wang, Yangjun
dc.date.accessioned 2016-06-17T12:45:48Z
dc.date.available 2016-06-17T12:45:48Z
dc.date.issued 2016-06-13
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/20991
dc.description.abstract Batch processing technologies (such as MapReduce, Hive, and Pig) have matured and are widely used in industry. These systems successfully solved the problem of processing large volumes of data. However, the data must first be collected and stored in a database or file system, which is very time-consuming, and the batch analysis jobs themselves then take time to finish before any results are available. Many use cases, by contrast, need analysis results from an unbounded sequence of data within seconds or less. To satisfy the increasing demand for processing such streaming data, several stream processing systems have been implemented and widely adopted, such as Apache Storm, Apache Spark, IBM InfoSphere Streams, and Apache Flink. They all support online stream processing, high scalability, and task monitoring. How to evaluate stream processing systems before choosing one for production development, however, remains an open question. In this thesis, we introduce StreamBench, a benchmark framework to facilitate performance comparisons of stream processing systems. A common API component and a core set of workloads are defined. We implement the common API and run benchmarks for three widely used open source stream processing systems: Apache Storm, Flink, and Spark Streaming. A key feature of the StreamBench framework is that it is extensible: it supports easy definition of new workloads and makes it easy to benchmark new stream processing systems. en
dc.format.extent 59
dc.format.mimetype application/pdf en
dc.language.iso en en
dc.title Stream Processing Systems Benchmark: StreamBench en
dc.type G2 Pro gradu, diplomityö fi
dc.contributor.school Perustieteiden korkeakoulu fi
dc.subject.keyword big data en
dc.subject.keyword stream processing en
dc.subject.keyword benchmark en
dc.subject.keyword distributed system en
dc.identifier.urn URN:NBN:fi:aalto-201606172599
dc.programme.major Foundations of Advanced Computing en
dc.programme.mcode SCI3014 fi
dc.type.ontasot Master's thesis en
dc.type.ontasot Diplomityö fi
dc.contributor.supervisor Gionis, Aristides
dc.programme Master’s Programme in Foundations of Advanced Computing (FAdCo) fi

