Abstract

Adaptive data analysis is known to introduce bias in reported measurements. Russo and Zou [1] recently introduced an information-theoretic framework to study this problem. Herein, this framework is adopted and new dependence measures are introduced to bound the exploration bias. When the measurements have bounded L1- or L2-norms, or when the selection procedure is symmetric, the new bounds decouple the contribution of the selection procedure to the bias from the effects of the underlying distribution generating the data, thus enabling direct comparisons between different selection procedures.
