Procrustes Metrics and Optimal Transport for Covariance Operators

Masarotto, Valentina

doi:10.5075/epfl-thesis-9418

2019

Abstract

Covariance operators play a fundamental role in functional data analysis, providing the canonical means to analyse functional variation via the celebrated Karhunen-Loève expansion. These operators may themselves be subject to variation, for instance in contexts where multiple functional populations are to be compared. Statistical techniques to analyse such variation are intimately linked with the choice of metric on the space of such operators, as well as with their intrinsic infinite-dimensionality. We show that the space of infinite-dimensional covariance operators, equipped with the Procrustes size-and-shape metric from shape theory, can be identified with the space of centred Gaussian processes equipped with the Wasserstein metric of optimal transportation. We then describe key geometrical and topological aspects of the space of covariance operators endowed with the Procrustes metric. Through the notion of multicoupling of Gaussian measures, we establish existence, uniqueness and stability for the Fréchet mean of covariance operators with respect to the Procrustes metric. Furthermore, we provide generative models that are canonical for this metric. We then turn to the problem of comparing several samples of stochastic processes with respect to their second-order structure, and we subsequently describe the main modes of variation in this second-order structure. These two tasks are carried out via an Analysis of Variance (ANOVA) and a Principal Component Analysis (PCA) of covariance operators, respectively. To perform the ANOVA, we introduce a novel approach based on optimal (multi)transport and identify each covariance with an optimal transport map. These maps are then contrasted with the identity with respect to a norm-induced distance. The resulting test statistic, calibrated by permutation, outperforms the state of the art in the functional case. If the null hypothesis postulating equality of the operators is rejected, a geometric interpretation of the transport maps allows us to construct a PCA on the tangent space, with the aim of understanding the sample variability. Finally, we provide a further example of the use of the optimal transport framework by applying it to the problem of clustering of operators. Two different clustering algorithms are presented, one of which is novel. The transportation ANOVA, PCA and clustering are validated both on simulated scenarios and on real datasets.
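
The identification between the Procrustes metric and the Wasserstein metric can be checked numerically in finite dimensions. The sketch below is illustrative only and is not taken from the thesis: it assumes covariance matrices in place of operators, and the function names (`sqrtm_psd`, `procrustes`, `wasserstein_gaussian`) are hypothetical. It computes the Procrustes distance by aligning matrix square roots with an orthogonal matrix, and the 2-Wasserstein distance between centred Gaussians from its well-known closed form, so that the two values can be compared.

```python
import numpy as np


def sqrtm_psd(S):
    """Symmetric square root of a positive semi-definite matrix."""
    w, V = np.linalg.eigh(S)
    w = np.clip(w, 0.0, None)  # guard against tiny negative eigenvalues from round-off
    return (V * np.sqrt(w)) @ V.T


def procrustes(A, B):
    """Procrustes size-and-shape distance: min over orthogonal U of
    ||A^{1/2} - B^{1/2} U||_F, solved by an SVD alignment."""
    rA, rB = sqrtm_psd(A), sqrtm_psd(B)
    W, _, Vt = np.linalg.svd(rB @ rA)
    U = W @ Vt  # orthogonal matrix that best aligns rB with rA
    return np.linalg.norm(rA - rB @ U, "fro")


def wasserstein_gaussian(A, B):
    """2-Wasserstein distance between N(0, A) and N(0, B), using
    W^2 = tr A + tr B - 2 tr (A^{1/2} B A^{1/2})^{1/2}."""
    rA = sqrtm_psd(A)
    cross = sqrtm_psd(rA @ B @ rA)
    d2 = np.trace(A) + np.trace(B) - 2.0 * np.trace(cross)
    return np.sqrt(max(d2, 0.0))  # clip round-off below zero


# Illustrative data: two random covariance matrices (an assumption, not thesis data).
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5))
Y = rng.standard_normal((5, 5))
A, B = X @ X.T, Y @ Y.T

print("Procrustes :", procrustes(A, B))
print("Wasserstein:", wasserstein_gaussian(A, B))
```

The two printed values should agree up to floating-point error. In infinite dimensions the same comparison requires trace-class operators and unitary alignments, which this matrix sketch does not address.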

Details

Title Procrustes Metrics and Optimal Transport for Covariance Operators

Author(s) Masarotto, Valentina

Advisor(s) Panaretos, Victor

Pagination 139

Date 2019

Publisher Lausanne, EPFL

Keywords Fréchet mean; functional data analysis; functional ANOVA; functional PCA; optimal transportation; phase variation; Procrustes distance; Wasserstein distance

Language English

DOI https://doi.org/10.5075/epfl-thesis-9418

Laboratories SMAT

Record creation date 2019-05-08
