000202533 001__ 202533
000202533 005__ 20181203023635.0
000202533 0247_ $$2doi$$a10.1145/2584665
000202533 022__ $$a1539-9087
000202533 02470 $$2ISI$$a000341390100017
000202533 037__ $$aARTICLE
000202533 245__ $$aDelite: A Compiler Architecture for Performance-Oriented Embedded Domain-Specific Languages
000202533 260__ $$bAssoc Computing Machinery$$c2014$$aNew York
000202533 269__ $$a2014
000202533 300__ $$a25
000202533 336__ $$aJournal Articles
000202533 520__ $$aDeveloping high-performance software is a difficult task that requires the use of low-level, architecture-specific programming models (e.g., OpenMP for CMPs, CUDA for GPUs, MPI for clusters). It is typically not possible to write a single application that can run efficiently in different environments, leading to multiple versions and increased complexity. Domain-Specific Languages (DSLs) are a promising avenue to enable programmers to use high-level abstractions and still achieve good performance on a variety of hardware. This is possible because DSLs have higher-level semantics and restrictions than general-purpose languages, so DSL compilers can perform higher-level optimization and translation. However, the cost of developing performance-oriented DSLs is a substantial roadblock to their development and adoption. In this article, we present an overview of the Delite compiler framework and the DSLs that have been developed with it. Delite simplifies the process of DSL development by providing common components, like parallel patterns, optimizations, and code generators, that can be reused in DSL implementations. Delite DSLs are embedded in Scala, a general-purpose programming language, but use metaprogramming to construct an Intermediate Representation (IR) of user programs and compile to multiple languages (including C++, CUDA, and OpenCL). DSL programs are automatically parallelized and different parts of the application can run simultaneously on CPUs and GPUs. We present Delite DSLs for machine learning, data querying, graph analysis, and scientific computing and show that they all achieve performance competitive with or exceeding C++ code.
000202533 6531_ $$aLanguages
000202533 6531_ $$aPerformance
000202533 6531_ $$aDomain-specific languages
000202533 6531_ $$amultistage programming
000202533 6531_ $$alanguage virtualization
000202533 6531_ $$acode generation
000202533 700__ $$uStanford Univ, Stanford, CA 94305 USA$$aSujeeth, Arvind K.
000202533 700__ $$uStanford Univ, Stanford, CA 94305 USA$$aBrown, Kevin J.
000202533 700__ $$uStanford Univ, Stanford, CA 94305 USA$$aLee, Hyoukjoong
000202533 700__ $$0243345$$g185682$$uOracle Labs, Santa Clara, CA USA$$aRompf, Tiark
000202533 700__ $$uStanford Univ, Stanford, CA 94305 USA$$aChafi, Hassan
000202533 700__ $$0241835$$g126003$$aOdersky, Martin
000202533 700__ $$aOlukotun, Kunle$$uStanford Univ, Stanford, CA 94305 USA
000202533 773__ $$j13$$tACM Transactions on Embedded Computing Systems
000202533 909C0 $$xU10409$$0252187$$pLAMP
000202533 909CO $$pIC$$particle$$ooai:infoscience.tind.io:202533
000202533 917Z8 $$x166927
000202533 937__ $$aEPFL-ARTICLE-202533
000202533 973__ $$rREVIEWED$$sPUBLISHED$$aEPFL
000202533 980__ $$aARTICLE