conference paper not in proceedings

Boosting Efficiency of External Pipelines by Blurring Application Boundaries

Herlihy, Anna • Chrysogelos, Periklis • Ailamaki, Anastasia
2022
12th Annual Conference on Innovative Data Systems Research (CIDR ’22)

Modern application development addresses increasingly specialized problems using domain-specific utilities, such as Optical Character Recognition and standalone statistical tools. The diversity of tooling, combined with the ever-growing volume of data, requires data pipelines to be efficient and to support a variety of data processing tools within the same pipeline. Existing approaches, however, impose a tradeoff between modularity and performance. On the one hand, data processing systems are specialized for fast execution of complex queries, favoring efficiency at the cost of high development effort and required domain expertise. On the other hand, highly extensible systems opt for composability at the cost of inefficient execution, because they make minimal assumptions about input and output formats. This paper proposes Generalized OLAP (GOLAP), a new DBMS paradigm that places automatic extensibility of functionality as a first-class design goal. GOLAP ingests external utilities to achieve the functionality provided by external modular data pipelines while maintaining the performance of natively optimized DBMS functions. Through a combination of runtime inspection and static analysis, GOLAP detects inter-utility communication inefficiencies and parallelization opportunities beyond the limits of isolated utility optimizations. It then modifies the utilities to elide unnecessary inter-utility operations and parallelizes the pipeline to increase hardware utilization. To evaluate GOLAP, we build Caesar, a prototype that optimizes simple pipelines, showing up to 22x speedup while introducing a limited instrumentation period with a slowdown of less than 17%.
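
The kind of inter-utility inefficiency the abstract refers to can be illustrated with a toy example. The sketch below is not GOLAP's or Caesar's mechanism, which the abstract does not detail. It only assumes two hypothetical utilities (`extract` and `summarize`, names invented here) that would normally exchange data as serialized text, and contrasts that with a fused version in which the serialize/parse step at the boundary is elided.

```python
import json

# Hypothetical utilities; names and data are illustrative, not from the paper.
def extract(records):
    """Stage 1: keep only positive readings."""
    return [r for r in records if r["value"] > 0]

def summarize(records):
    """Stage 2: total the readings per sensor."""
    totals = {}
    for r in records:
        totals[r["sensor"]] = totals.get(r["sensor"], 0) + r["value"]
    return totals

def external_pipeline(records):
    """Isolated utilities: stage 1 serializes to text, stage 2 parses it back."""
    wire = json.dumps(extract(records))   # serialize at the utility boundary
    return summarize(json.loads(wire))    # deserialize at the utility boundary

def fused_pipeline(records):
    """Same stages with the boundary blurred: in-memory objects flow directly."""
    return summarize(extract(records))

records = [
    {"sensor": "a", "value": 3},
    {"sensor": "b", "value": -1},
    {"sensor": "a", "value": 4},
]

# Both pipelines compute the same result; only the boundary work differs.
assert external_pipeline(records) == fused_pipeline(records)
```

Both variants return the same totals; the fused one simply skips the round trip through text, which is the flavor of redundant work an optimizer could remove once it can see across utility boundaries.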

Name: p81-herlihy.pdf
Type: Publisher
Version: http://purl.org/coar/version/c_970fb48d4fbd8a85
Access type: openaccess
License Condition: CC BY
Size: 671.43 KB
Format: Adobe PDF
Checksum (MD5): 9b5cd1201158f61ac8cb1fd6726710d2


Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved