Infoscience

research article

Big-Data Streaming Applications Scheduling Based on Staged Multi-Armed Bandits

Kanoun, Karim • Tekin, Cem • Atienza, David • Van Der Schaar, Mihaela
2016
IEEE Transactions on Computers

Several techniques have recently been proposed to adapt Big-Data streaming applications to existing many-core platforms. Among them, online reinforcement learning methods learn how to adapt, at run-time, the throughput and resources allocated to the various streaming tasks depending on dynamically changing data stream characteristics and the desired application performance (e.g., accuracy). However, most state-of-the-art techniques consider only a single input stream in their application model and assume that the system knows the amount of resources to allocate to each task to achieve a desired performance. To address these limitations, in this paper we propose a new systematic and efficient methodology and associated algorithms for online learning and energy-efficient scheduling of Big-Data streaming applications with multiple streams on many-core systems with resource constraints. We formalize the problem of multi-stream scheduling as a staged decision problem in which the performance obtained for various resource allocations is unknown. The proposed scheduling methodology uses a novel class of online adaptive learning techniques, which we refer to as staged multi-armed bandits (S-MAB). Our scheduler learns online which processing method to assign to each stream and how to allocate its resources over time in order to maximize performance at run-time, without access to any offline information. Applied to a face detection streaming application, the proposed scheduler achieves performance similar to that of an optimal semi-online solution that has full knowledge of the input stream: the differences in throughput, observed quality, resource usage, and energy efficiency are less than 1, 0.3, 0.2, and 4 percent, respectively.
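
To make the bandit framing in the abstract more concrete, the following is a minimal sketch of the standard UCB1 multi-armed bandit. It is not the S-MAB algorithm from the paper, whose details are not given on this page. Everything in it is an illustrative assumption: the function names `ucb1` and `bernoulli`, the three arms, and their simulated success rates. Each arm stands in for a hypothetical processing method whose quality is unknown in advance and must be learned online.

```python
import math
import random


def ucb1(reward_fns, rounds, seed=0):
    """Run standard UCB1 and return how often each arm was pulled.

    reward_fns: list of callables; each takes a random.Random instance
                and returns a reward in [0, 1] (hypothetical interface).
    """
    rng = random.Random(seed)
    n_arms = len(reward_fns)
    counts = [0] * n_arms    # number of pulls per arm
    totals = [0.0] * n_arms  # summed reward per arm

    for t in range(1, rounds + 1):
        if t <= n_arms:
            # Pull every arm once so each has an empirical mean.
            arm = t - 1
        else:
            # Pick the arm with the highest mean plus exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = reward_fns[arm](rng)
        counts[arm] += 1
        totals[arm] += reward
    return counts


def bernoulli(p):
    """Return a reward function that pays 1.0 with probability p."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


# Assumed success rates for three hypothetical processing methods.
ARMS = [bernoulli(0.2), bernoulli(0.5), bernoulli(0.8)]
counts = ucb1(ARMS, rounds=2000)
print(counts)  # pull counts per arm; the third arm should dominate
```

This sketch makes a single decision per round. According to the abstract, S-MAB instead addresses a staged decision problem, choosing a processing method for each stream and then a resource allocation, which this example does not attempt to model.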

Type
research article
DOI
10.1109/Tc.2016.2550454
Web of Science ID

WOS:000388498600007

Author(s)
Kanoun, Karim  
Tekin, Cem
Atienza, David  
Van Der Schaar, Mihaela
Date Issued

2016

Publisher

Institute of Electrical and Electronics Engineers

Published in
IEEE Transactions on Computers
Volume

65

Issue

12

Start page

3591

End page

3605

Subjects

Scheduling • machine learning • many-core platforms • data mining • big-data • multiple streams processing • concept drift

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
ESL  
Available on Infoscience
January 24, 2017
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/133591

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved