Scaling up analytical queries with column-stores

As data analytics is used by an increasing number of applications, analytics engines must execute workloads with higher concurrency, i.e., a growing number of clients submitting queries. Data management systems designed for data analytics, a market dominated by column-stores, were however initially optimized for single-query execution, minimizing the response time of each query. Hence, they do not treat concurrency as a first-class citizen. In this paper, we experiment with one open-source and two commercial column-stores using the TPC-H and SSB benchmarks in a setup with an increasing number of concurrent clients submitting queries, focusing on whether the tested systems can scale up on a single-node instance. For in-memory workloads, the tested systems scale up to some degree; however, when the server is saturated, they fail to fully exploit the available parallelism. Further, we highlight the unpredictable response times under high concurrency.

Published in:
Proceedings of the 6th International Workshop on Testing Database Systems
Presented at:
6th International Workshop on Testing Database Systems, New York, NY, USA, June 24, 2013

Record created 2013-04-29, last modified 2018-03-17
