Infoscience

research article

Adaptive partitioning and indexing for in situ query processing

Olma, Matthaios  
•
Karpathiotakis, Manos  
•
Alagiannis, Ioannis  
January 1, 2020
VLDB Journal

The constant flux of data and queries alike has been pushing the boundaries of data analysis systems. The increasing size of raw data files has made data loading an expensive operation that delays the data-to-insight time. To alleviate the loading cost, in situ query processing systems operate directly over raw data and offer instant access to data. At the same time, analytical workloads comprise an increasing number of queries. Typically, each query focuses on a constantly shifting, yet small, range. As a result, minimizing workload latency requires bringing the benefits of indexing to in situ query processing. In this paper, we present an online partitioning and indexing scheme, along with a partitioning and indexing tuner, tailored for in situ querying engines. The proposed system design improves query execution time by taking user query patterns into account to (i) partition raw data files logically and (ii) build lightweight indexes specific to each partition. We build an in situ query engine called Slalom to showcase the impact of our design. Slalom employs adaptive partitioning and builds non-obtrusive indexes in different partitions on the fly, based on lightweight monitoring of query access patterns. As a result of its lightweight nature, Slalom achieves efficient query processing over raw data with minimal memory consumption. Our experiments with both microbenchmarks and real-life workloads show that Slalom outperforms state-of-the-art in situ engines and achieves query response times comparable to those of a fully indexed DBMS, offering lower cumulative query execution times for query workloads of increasing size and with unpredictable access patterns.
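
The idea summarized in the abstract (logical partitions over a raw file, with lightweight indexes built per partition only once queries keep touching it) can be illustrated with a toy sketch. This is not Slalom's implementation: the names `Partition`, `RawFileEngine`, `partition_size`, and `index_after` are hypothetical, the partitions are fixed-size row ranges rather than adaptively chosen ones, and the "raw file" is simply a list of comma-separated text lines held in memory.

```python
import bisect
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Partition:
    start: int                      # first row of the partition (inclusive)
    end: int                        # last row of the partition (exclusive)
    hits: int = 0                   # number of queries that touched this partition
    lo: Optional[float] = None      # min value, learned during the first scan
    hi: Optional[float] = None      # max value, learned during the first scan
    index: Optional[List[Tuple[float, int]]] = None  # sorted (value, row) pairs


class RawFileEngine:
    """Toy in situ engine: parses raw text lines at query time (assumed design)."""

    def __init__(self, lines, column, partition_size=4, index_after=2):
        self.lines = lines              # raw comma-separated lines, never bulk-loaded
        self.column = column            # position of the queried numeric field
        self.index_after = index_after  # hits needed before a partition is indexed
        # Simplification: fixed-size logical partitions instead of adaptive ones.
        self.partitions = [
            Partition(s, min(s + partition_size, len(lines)))
            for s in range(0, len(lines), partition_size)
        ]

    def _value(self, row):
        # Parse a single field straight from the raw line.
        return float(self.lines[row].split(",")[self.column])

    def range_query(self, lo, hi):
        """Return the row numbers whose value lies in [lo, hi]."""
        rows = []
        for p in self.partitions:
            # Skip partitions whose known min/max cannot overlap the range.
            if p.lo is not None and (p.hi < lo or p.lo > hi):
                continue
            p.hits += 1
            if p.index is None:
                # No index yet: scan the raw lines of this partition.
                pairs = [(self._value(r), r) for r in range(p.start, p.end)]
                p.lo = min(v for v, _ in pairs)
                p.hi = max(v for v, _ in pairs)
                if p.hits >= self.index_after:
                    p.index = sorted(pairs)  # partition is hot: index it now
                rows += [r for v, r in pairs if lo <= v <= hi]
            else:
                # Indexed partition: binary search instead of scanning.
                first = bisect.bisect_left(p.index, (lo,))
                last = bisect.bisect_right(p.index, (hi, float("inf")))
                rows += [r for _, r in p.index[first:last]]
        return sorted(rows)


values = [5, 1, 9, 3, 20, 22, 21, 25, 7, 8]
lines = [f"{i},{v}" for i, v in enumerate(values)]
engine = RawFileEngine(lines, column=1)
for _ in range(3):
    print(engine.range_query(3, 8))  # prints [0, 3, 8, 9] each time
```

In this run, the first query scans every partition and records its min/max. The second query skips the middle partition (values 20 to 25) and indexes the two partitions it touches again. The third query answers from those indexes. The never-revisited partition is left unindexed, which is the memory-saving behaviour the abstract attributes to monitoring access patterns.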

Type
research article
DOI
10.1007/s00778-019-00580-x
Web of Science ID

WOS:000512106800021

Author(s)
Olma, Matthaios  
Karpathiotakis, Manos  
Alagiannis, Ioannis  
Athanassoulis, Manos  
Ailamaki, Anastasia  
Date Issued

2020-01-01

Publisher

Springer

Published in
VLDB Journal
Volume

29

Issue

1

Start page

569

End page

591

Subjects

  • Computer Science, Hardware & Architecture
  • Computer Science, Information Systems
  • Computer Science
  • online tuning
  • adaptive indexing
  • logical partitioning
  • cracking
  • design

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
DIAS  
Available on Infoscience
March 3, 2020
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/166769

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved