Infoscience

conference paper

Same Queries, Different Data: Can we Predict Query Performance?

Popescu, Adrian Daniel • Ercegovac, Vuk • Balmin, Andrey • et al.
2012
Proceedings of the 7th International Workshop on Self Managing Database Systems
7th International Workshop on Self Managing Database Systems

We consider MapReduce workloads that are produced by analytics applications. In contrast to ad hoc query workloads, analytics applications consist of fixed data flows that run over newly arriving data sets or over different portions of an existing data set. Examples of such workloads include document analysis and indexing, social media analytics, and ETL (Extract, Transform, Load). Motivated by these workloads, we propose a technique that predicts the runtime performance of a fixed set of queries running over varying input data sets. Our prediction technique splits each query into several segments, and each segment’s performance is estimated using machine learning models. These per-segment estimates are plugged into a global analytical model to predict the overall query runtime. Our approach uses minimal statistics about the input data sets (e.g., tuple size, cardinality), complemented with historical information about prior query executions (e.g., execution time). We analyze the accuracy of predictions for several segment granularities, both on standard analytical benchmarks such as TPC-DS [17] and on several real workloads. The prediction error is below 25% for 90% of predictions.
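The two-level structure described in the abstract (per-segment learned models feeding a global analytical model) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the segment names, the `SegmentRun` record, the choice of a one-feature linear regression on input bytes as the per-segment model, and the choice of a plain sum of segment times as the global model. The historical timings are invented.

```python
from dataclasses import dataclass


@dataclass
class SegmentRun:
    """One prior execution of a query segment (hypothetical record layout)."""
    cardinality: int    # number of input tuples
    tuple_size: float   # average tuple size in bytes
    seconds: float      # observed execution time


def fit_segment_model(runs):
    """Fit seconds ~ intercept + slope * input_bytes by ordinary least squares.

    Assumption: a linear model stands in for the paper's unspecified
    per-segment machine learning model.
    """
    xs = [r.cardinality * r.tuple_size for r in runs]
    ys = [r.seconds for r in runs]
    n = len(runs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return mean_y, 0.0  # no variation in input size: fall back to the mean
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return mean_y - slope * mean_x, slope


def predict_query_seconds(models, segment_inputs):
    """Global model (assumed): segments run back to back, so their times add."""
    total = 0.0
    for name, (cardinality, tuple_size) in segment_inputs.items():
        intercept, slope = models[name]
        total += intercept + slope * cardinality * tuple_size
    return total


def relative_error(predicted, actual):
    """Prediction error as a fraction of the observed runtime."""
    return abs(predicted - actual) / actual


# Invented history for two hypothetical segments of one fixed query.
history = {
    "scan_filter": [SegmentRun(1_000, 100.0, 3.0),
                    SegmentRun(2_000, 100.0, 5.0),
                    SegmentRun(4_000, 100.0, 9.0)],
    "aggregate": [SegmentRun(500, 40.0, 2.0),
                  SegmentRun(1_000, 40.0, 3.0),
                  SegmentRun(2_000, 40.0, 5.0)],
}
models = {name: fit_segment_model(runs) for name, runs in history.items()}

# Same query, different data: only cardinality and tuple size are needed.
new_inputs = {"scan_filter": (3_000, 100.0), "aggregate": (1_500, 40.0)}
predicted = predict_query_seconds(models, new_inputs)
print(f"predicted runtime: {predicted:.2f} s")
```

The sketch only needs the statistics the abstract names (tuple size, cardinality) plus past execution times. A real global model would also have to account for how segments overlap or run in parallel, which a plain sum ignores.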

Name: SMDB12_Popescu.pdf
Access type: openaccess
Size: 318.27 KB
Format: Adobe PDF
Checksum (MD5): 77cce142a6aff00fe2796cd63e304a3d

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved