Formal Verification of Memoized Backtracking for JavaScript Regexes
In the context of regular expressions, many techniques have been developed to accelerate the matching process. Backtracking engines are widespread today as they support all extended regex features and exhibit better performance in most use cases. However, they are subject to ReDoS attacks due to their exponential worst-case complexity. To mitigate this risk, memoization is one solution. It involves recording the results of a recursive function to avoid redundant computations. Most JavaScript’s engines are based on a backtracking algorithm, which could thus benefit from memoization. Currently, no existing optimizations in this direction are present in the various implementations of JS engines.
This study formalizes a memoized backtracking algorithm for a subset of JS regexes. The initial implementation covers a simple backtracking algorithm. Then, a memoized version is presented with a subsequent proof of equivalence between the first algorithm and the memoized version. During the process, core properties are presented concerning the memoization table. Future directions include verifying equivalence with ECMAScript semantics, proving termination, and extending capabilities to encompass more JS regex features. This research aims to enhance understanding and practical application of regex optimization techniques in JS environments.
Formal Verification of Memoized Backtracking for JavaScript Regexes.pdf
Main Document
openaccess
N/A
1.66 MB
Adobe PDF
966d5431a7f5deccfb8064d346fbcc2d