Systematic Approach to Multi-layer Parallelisation of Time-based Stream Aggregation under Ingest Constraints in the Cloud

TRAN, Bao-Duy

2014

Abstract

With its real-time capabilities, stream processing is popular for applications such as anomaly detection for residential gateways and analytics for business intelligence. As in other areas of computing, there has been an inevitable trend to shift stream processing to the cloud, thanks to virtualisation technologies and the ubiquity of the Web. The recently launched Amazon Kinesis is among the cloud-based stream buffer services that bridge the gap between off-cloud sources and cloud-based processing engines. Yet such services are subject to commercial or physical constraints on data ingest rate, calling for the parallelisation and chaining of processing nodes in a multi-layer topology. In this work, we studied the multi-layer parallelisation of time-based stream aggregation, a commonplace component in stream processing applications, under the impact of ingest rate constraints in the cloud. In particular, comprehensive analyses of the rate transfer properties of processing nodes at various aggregation layers were conducted by considering the stream sources (e.g. residential gateways) and their information flow. This led to our proposal of systematic approaches to determining a parallelisation topology that avoids ingest rate saturation while minimising operational costs and deployment complexity. By applying these approaches, system over-provisioning and trial-and-error design can be eliminated. Our analyses were empirically verified through various simulations. Prototyping in the real Kinesis environment was also conducted to back up our analytical results and proposed topology determination approaches. It is noteworthy that, although the work was motivated by and prototyped with Amazon Kinesis, it remains generic in nature and its applicability extends beyond the specific scenario of Kinesis.

Details

Title Systematic Approach to Multi-layer Parallelisation of Time-based Stream Aggregation under Ingest Constraints in the Cloud

Author(s) TRAN, Bao-Duy

Advisor(s) LE MERRER, Erwan

Date 2014

Keywords stream processing; cloud computing; Amazon Kinesis; stream buffer; ingest rate constraint; multi-layer parallelisation; time-based stream aggregation; rate transfer analysis; systematic approach; topology determination; provisioning

Laboratories DCL

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > DCL - Distributed Computing Laboratory
Work outside EPFL
Student projects

Work type Master's Thesis

Record creation date 2015-12-28
