Staged parser combinators for efficient data processing

Parsers are ubiquitous in computing, and many applications depend on their performance for decoding data efficiently. Parser combinators are an intuitive tool for writing parsers: tight integration with the host language enables grammar specifications to be interleaved with processing of parse results. Unfortunately, parser combinators are typically slow due to the high overhead of the host language abstraction mechanisms that enable composition. We present a technique for eliminating such overhead. We use staging, a form of runtime code generation, to dissociate input parsing from parser composition, and eliminate intermediate data structures and computations associated with parser composition at staging time. A key challenge is to maintain support for input dependent grammars, which have no clear stage distinction. Our approach applies to top-down recursive-descent parsers as well as bottom-up nondeterministic parsers with key applications in dynamic programming on sequences, where we auto-generate code for parallel hardware. We achieve performance comparable to specialized, hand-written parsers.

Published in:
Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications - OOPSLA '14, 637-653
Presented at:
Object Oriented Programming Systems Languages & Applications (OOPSLA), Portland, Oregon, USA, 20-24 October 2014
New York, New York, USA, ACM Press

 Record created 2014-11-10, last modified 2018-03-17

Publisher's version:
Download fulltext

Rate this document:

Rate this document:
(Not yet reviewed)