conference paper

Sharing Data and Work Across Concurrent Analytical Queries

Psaroudakis, Iraklis; Athanassoulis, Manos; Ailamaki, Anastasia
2013
Proceedings of the 39th International Conference on Very Large Data Bases
39th International Conference on Very Large Data Bases

Today's data deluge enables organizations to collect massive data, and analyze it with an ever-increasing number of concurrent queries. Traditional data warehouses (DW) face a challenging problem in executing this task, due to their query-centric model: each query is optimized and executed independently. This model results in high contention for resources. Thus, modern DW depart from the query-centric model to execution models involving sharing of common data and work. Our goal is to show when and how a DW should employ sharing. We evaluate experimentally two sharing methodologies, based on their original prototype systems, that exploit work sharing opportunities among concurrent queries at run-time: Simultaneous Pipelining (SP), which shares intermediate results of common sub-plans, and Global Query Plans (GQP), which build and evaluate a single query plan with shared operators. First, after a short review of sharing methodologies, we show that SP and GQP are orthogonal techniques. SP can be applied to shared operators of a GQP, reducing response times by 20%-48% in workloads with numerous common sub-plans. Second, we corroborate previous results on the negative impact of SP on performance for cases of low concurrency. We attribute this behavior to a bottleneck caused by the push-based communication model of SP. We show that pull-based communication for SP eliminates the overhead of sharing altogether for low concurrency, and scales better on multi-core machines than push-based SP, further reducing response times by 82%-86% for high concurrency. Third, we perform an experimental analysis of SP, GQP and their combination, and show when each one is beneficial. We identify a trade-off between low and high concurrency. In the former case, traditional query-centric operators with SP perform better, while in the latter case, GQP with shared operators enhanced by SP give the best results.
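The difference between query-centric execution, push-based sharing, and pull-based sharing of a common sub-plan can be illustrated with a toy sketch. This is not the authors' implementation or either prototype system: the function names (`common_subplan`, `query_centric`, `shared_push`, `shared_pull`), the in-memory table, and the `stats` counters are all assumptions made for illustration, and the sketch is single-threaded, so it only counts work rather than measuring contention.

```python
from collections import deque


def common_subplan(table, predicate, stats):
    """Hypothetical common sub-plan: scan `table`, keep rows matching `predicate`."""
    stats["subplan_runs"] += 1
    return [row for row in table if predicate(row)]


def query_centric(table, predicate, consumers):
    """Each query evaluates the common sub-plan independently."""
    stats = {"subplan_runs": 0, "row_copies": 0}
    results = [consume(common_subplan(table, predicate, stats)) for consume in consumers]
    return results, stats


def shared_push(table, predicate, consumers):
    """Evaluate the sub-plan once; the producer pushes every row to every consumer."""
    stats = {"subplan_runs": 0, "row_copies": 0}
    buffers = [deque() for _ in consumers]
    for row in common_subplan(table, predicate, stats):
        for buffer in buffers:  # the single producer does work per consumer
            buffer.append(row)
            stats["row_copies"] += 1
    results = [consume(buffer) for consume, buffer in zip(consumers, buffers)]
    return results, stats


def shared_pull(table, predicate, consumers):
    """Evaluate the sub-plan once; consumers read from one shared result list."""
    stats = {"subplan_runs": 0, "row_copies": 0}
    shared_rows = common_subplan(table, predicate, stats)
    results = [consume(shared_rows) for consume in consumers]
    return results, stats


# Illustrative data: three concurrent "queries" that share the same filter.
table = [{"id": i, "amount": i * 10} for i in range(1, 7)]
predicate = lambda row: row["amount"] >= 30
consumers = [
    lambda rows: sum(r["amount"] for r in rows),  # total
    lambda rows: len(rows),                       # count
    lambda rows: max(r["amount"] for r in rows),  # maximum
]

for strategy in (query_centric, shared_push, shared_pull):
    results, stats = strategy(table, predicate, consumers)
    print(strategy.__name__, results, stats)
```

In this sketch all three strategies return the same answers. The query-centric version runs the sub-plan once per query, the push-based version runs it once but makes the producer copy each row once per consumer, and the pull-based version runs it once with no per-consumer copying by the producer. That per-consumer work on the producer side is one plausible reading of the push-based bottleneck the abstract mentions.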

Type: conference paper
DOI: 10.14778/2536360.2536364
Author(s): Psaroudakis, Iraklis; Athanassoulis, Manos; Ailamaki, Anastasia
Date Issued: 2013
Published in: Proceedings of the 39th International Conference on Very Large Data Bases
Note: SYSTEMS PUBLICATION_SHORE_MT
Editorial or Peer reviewed: REVIEWED
Written at: EPFL
EPFL units: DIAS
Event name: 39th International Conference on Very Large Data Bases
Available on Infoscience: May 1, 2013
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/91909