Methodological Advances in Causal Inference: Experimentation, Identification and Estimation
Causal inference provides a powerful framework for reasoning and decision-making. However, much of its machinery hinges on assumptions, such as parallel trends, full observability, and known causal structure, that might fail in real-world applications. This thesis develops new causal methodologies that extend the boundaries of what is possible when those assumptions are violated, with contributions across identification theory, semiparametric estimation, algorithmic experiment design, and structure learning.
We begin by developing methods for causal inference in settings with panel and repeated cross-sectional data. Building on the difference-in-differences (DiD) framework, we formalize the identification strategy for the triple difference framework and introduce a class of robust and efficient semiparametric estimators compatible with machine learning-based nuisance function estimators. We then generalize the classical changes-in-changes model to accommodate the triple difference setting, enabling identification of potential outcome distributions, even in settings with high-dimensional outcome variables.
Next, we turn to the challenge of designing experiments for identifying a causal estimand of interest. Existing identification theory answers the question of whether the causal query is identifiable using the data at hand. When an effect is not identifiable with the available data, rather than stopping there, the natural next question becomes: What additional data or interventions would make it identifiable? We study the problem of designing the optimal (minimum-cost) interventions to make identification feasible. In parallel, we introduce a new framework for causal effect identification under uncertain causal graphs (such as those learned from data with varying confidence over edges), offering a principled way to reason about identifiability when structure is not known with certainty.
Finally, we address causal discovery in settings with unobserved confounding, selection bias, and nonlinear dependencies. First, we propose L-MARVEL, a recursive, constraint-based discovery algorithm that is both sound and complete and achieves the tightest known bounds on the number of required conditional independence tests. Then, we present a new transport-based discovery method using monotone triangular maps, which allows causal structures to be inferred from observational data without relying on strong functional form assumptions.
EPFL
Prof. Alexandre Massoud Alahi (president); Prof. Negar Kiyavash (thesis director); Prof. Mats Stensrud, Prof. Robin Evans, Prof. Qingyuan Zhao (examiners)
2025
Lausanne
2025-10-10
10886
310