Non-regular Inference: Universal Inference and Discrete Profiling
Non-regular or irregular statistical problems are those that fail to satisfy the standard regularity conditions under which useful theoretical properties of inferential procedures can be proven. Such problems are prevalent: a classical example is the Gaussian mixture model, and more recently the advent of machine learning has introduced models that are both highly non-regular and black-box. In this thesis we make several contributions to non-regular statistical inference. Our framework is likelihood-based; Chapter 1 gives an overview of likelihood theory.
In Chapter 2, we study universal inference, a method proposed by~\citet{Wasserman16880} that constructs finite-sample level $\alpha$ tests under minimal regularity conditions. This generality comes at the price of power, and we identify three sources of the resulting loss in the normal case. We show that universal inference becomes catastrophically conservative as the number of nuisance parameters grows, and we propose a correction factor that mitigates this conservativeness while retaining finite-sample level $\alpha$ error control. We demonstrate the viability of the correction factor on the non-regular problem of testing the number of components in a two-component Gaussian mixture model. We also study the $K$-fold variant of universal inference and caution against splits that lead to degenerate statistics.
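As a minimal illustration of the split likelihood-ratio test underlying universal inference, the sketch below tests a normal mean with known unit variance: the MLE is computed on one half of the data and the likelihood ratio is evaluated on the other half, giving a test valid at any sample size. The function name and settings are illustrative, not those used in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def split_lrt_pvalue(x, theta0=0.0):
    """Split likelihood-ratio test of H0: theta = theta0 in a N(theta, 1) model.

    Universal inference sketch: estimate on one half of the data,
    evaluate the likelihood ratio on the other half.
    """
    n = len(x)
    x0, x1 = x[: n // 2], x[n // 2:]    # evaluation half, estimation half
    theta_hat = x1.mean()               # MLE of the mean from the estimation half
    # log T = log L(theta_hat; x0) - log L(theta0; x0) for the N(theta, 1) likelihood
    log_T = 0.5 * np.sum((x0 - theta0) ** 2) - 0.5 * np.sum((x0 - theta_hat) ** 2)
    # p-value min(1, 1/T); Markov's inequality gives finite-sample validity
    return min(1.0, np.exp(-log_T))

x = rng.normal(0.0, 1.0, size=200)      # data generated under H0
p = split_lrt_pvalue(x)                 # a valid p-value at any sample size
```

Because validity rests only on Markov's inequality applied to the split statistic, no regularity conditions on the model are needed; this is the source of both the generality and the conservativeness discussed above.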
In Chapter 3, we apply universal inference to construct model confidence sets with finite-sample coverage guarantees, which we dub universal model confidence sets (UMCS). We study the asymptotic properties of UMCS and establish their ability to include true and correct models and to exclude wrong ones. We examine the use of the quasi-reverse information projection (qRIPR) to mitigate the conservativeness of UMCS, and identify cases in which qRIPR preserves the e-value property of the universal inference statistics, the property central to their error control. We assess the ability of UMCS to pick out signal covariates on a high-dimensional gene example.
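The construction of a model confidence set from e-values can be sketched very simply: a model enters the set exactly when the level $\alpha$ universal test fails to reject it, i.e. when its e-value is below $1/\alpha$. The model names and e-values below are purely illustrative assumptions, not results from the thesis.

```python
def universal_model_confidence_set(e_values, alpha=0.05):
    # Keep every model whose e-value does not reach the rejection
    # threshold 1/alpha of the level-alpha universal test.
    return {m for m, e in e_values.items() if e < 1.0 / alpha}

# Illustrative (assumed) e-values for three candidate models.
e = {"M1": 0.8, "M2": 3.0, "M3": 250.0}
cs = universal_model_confidence_set(e)   # M3 is rejected: 250 >= 1/0.05 = 20
```

The finite-sample coverage guarantee is inherited directly from the e-value property of each universal inference statistic, which is why preserving that property under qRIPR matters.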
In Chapter 4 we study discrete profiling, an extension of profile likelihood in which discrete nuisance parameters index competing functional forms, thereby modelling functional-form uncertainty. Using asymptotic theory, we extend a bias phenomenon observed in mis-specified normal linear models to a general setting, and examine the ability of the discrete profiling algorithm to detect mis-specified and slightly mis-specified models asymptotically. We derive an expanded form of the discrete profile likelihood statistic and study its asymptotic properties under different cases of mis-specification. We corroborate our theory on the task of finding the Student's $t$ density that best models a normal density.
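The mechanics of profiling over a discrete nuisance parameter can be sketched on the Student's $t$ example: the degrees of freedom $\nu$ index the candidate functional forms, the continuous parameters are profiled out for each $\nu$, and the envelope over the discrete index is taken. The candidate grid, seed, and lack of a complexity penalty are simplifying assumptions of this sketch, not the algorithm as studied in the thesis.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import t as student_t

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=500)   # data drawn from a normal density

def neg_loglik(params, nu):
    """Negative Student's t log-likelihood in (mu, log sigma) for fixed df nu."""
    mu, log_sigma = params
    return -np.sum(student_t.logpdf(x, df=nu, loc=mu, scale=np.exp(log_sigma)))

# Discrete profiling sketch: nu indexes the candidate functional forms;
# profile out (mu, sigma) for each nu, then take the envelope over nu.
candidate_nus = [1, 2, 5, 10, 30, 100]    # assumed candidate set
profiles = {}
for nu in candidate_nus:
    res = minimize(neg_loglik, x0=[0.0, 0.0], args=(nu,))
    profiles[nu] = -res.fun               # profiled log-likelihood at this nu

best_nu = max(profiles, key=profiles.get)  # envelope maximiser over the discrete index
```

For normal data a heavy-tailed choice such as $\nu = 1$ or $\nu = 2$ fits poorly, so the envelope is maximised at a large $\nu$, mimicking the thesis's task of finding the Student's $t$ density closest to a normal one.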
We conclude in Chapter 5 and discuss directions for future work.
École Polytechnique Fédérale de Lausanne
Prof. Mats Julius Stensrud (president); Prof. Anthony Christopher Davison, Prof. Victor Panaretos (thesis directors); Dr Linda Mhalla, Prof. Alessandra Brazzale, Prof. Valérie Chavez (examiners)
2025
Lausanne
2025-07-09