000186322 001__ 186322
000186322 005__ 20190316235629.0
000186322 0247_ $$2doi$$a10.14778/2536360.2536364
000186322 037__ $$aCONF
000186322 245__ $$aSharing Data and Work Across Concurrent Analytical Queries
000186322 269__ $$a2013
000186322 260__ $$c2013
000186322 336__ $$aConference Papers
000186322 500__ $$aSYSTEMS PUBLICATION_SHORE_MT
000186322 520__ $$aToday's data deluge enables organizations to collect massive data, and analyze it with an ever-increasing number of concurrent queries. Traditional data warehouses (DW) face a challenging problem in executing this task, due to their query-centric model: each query is optimized and executed independently. This model results in high contention for resources. Thus, modern DW depart from the query-centric model to execution models involving sharing of common data and work. Our goal is to show when and how a DW should employ sharing. We evaluate experimentally two sharing methodologies, based on their original prototype systems, that exploit work sharing opportunities among concurrent queries at run-time: Simultaneous Pipelining (SP), which shares intermediate results of common sub-plans, and Global Query Plans (GQP), which build and evaluate a single query plan with shared operators. First, after a short review of sharing methodologies, we show that SP and GQP are orthogonal techniques. SP can be applied to shared operators of a GQP, reducing response times by 20%-48% in workloads with numerous common sub-plans. Second, we corroborate previous results on the negative impact of SP on performance for cases of low concurrency. We attribute this behavior to a bottleneck caused by the push-based communication model of SP. We show that pull-based communication for SP eliminates the overhead of sharing altogether for low concurrency, and scales better on multi-core machines than push-based SP, further reducing response times by 82%-86% for high concurrency. Third, we perform an experimental analysis of SP, GQP and their combination, and show when each one is beneficial. We identify a trade-off between low and high concurrency. In the former case, traditional query-centric operators with SP perform better, while in the latter case, GQP with shared operators enhanced by SP give the best results.
000186322 700__ $$0245684$$aPsaroudakis, Iraklis$$g200442
000186322 700__ $$0243529$$aAthanassoulis, Manos$$g188175
000186322 700__ $$0243527$$aAilamaki, Anastasia$$g177957
000186322 7112_ $$a39th International Conference on Very Large Data Bases
000186322 773__ $$tProceedings of the 39th International Conference on Very Large Data Bases
000186322 8564_ $$s227228$$uhttps://infoscience.epfl.ch/record/186322/files/p519-psaroudakis.pdf$$yPublisher's version$$zPublisher's version
000186322 8564_ $$s4167506$$uhttps://infoscience.epfl.ch/record/186322/files/paper.pdf$$yCorrection of legend of Figure 6c$$zCorrection of legend of Figure 6c
000186322 909C0 $$0252224$$pDIAS$$xU11836
000186322 909CO $$ooai:infoscience.tind.io:186322$$pconf$$pIC$$qGLOBAL_SET
000186322 917Z8 $$x188175
000186322 917Z8 $$x190851
000186322 917Z8 $$x188175
000186322 917Z8 $$x200442
000186322 937__ $$aEPFL-CONF-186322
000186322 973__ $$aEPFL$$rREVIEWED$$sPUBLISHED
000186322 980__ $$aCONF