On multivariate calibration with unlabeled data

Gujral, Paman; Amrhein, Michael; Ergon, Rolf; Wise, Barry; Bonvin, Dominique

doi:10.1002/cem.1389

2011

Abstract

In principal component regression (PCR) and partial least-squares regression (PLSR), the use of unlabeled data, in addition to labeled data, helps stabilize the latent subspaces in the calibration step, typically leading to a lower prediction error. A non-sequential approach based on optimal filtering (OF) has been proposed in the literature to use unlabeled data with PLSR. In this work, a sequential version of the OF-based PLSR and a PCA-based PLSR (PLSR applied to PCA-preprocessed data) are proposed. It is shown analytically that the sequential version of the OF-based PLSR is equivalent to PCA-based PLSR, which leads to a new interpretation of OF. Simulated and experimental data sets are used to point out the usefulness and pitfalls of using unlabeled data. Unlabeled data can replace labeled data to some extent, thereby leading to an economic benefit. However, in the presence of drift, the use of unlabeled data can result in an increase in prediction error compared to that obtained with a model based on labeled data alone.
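As a rough illustration of how unlabeled data can stabilize the latent subspace, the following minimal sketch implements the PCR case only: the PCA loadings are estimated from the pooled labeled and unlabeled measurements, and the regression is then fitted on the scores of the labeled samples alone. This is not the OF-based or PCA-based PLSR of the paper; the function names, the argument layout, and the choice of centering on the pooled mean are assumptions made for the example.

```python
import numpy as np

def fit_pcr_with_unlabeled(X_lab, y_lab, X_unlab, n_components):
    """PCR sketch: PCA subspace from labeled + unlabeled data (hypothetical helper)."""
    # Pool all available measurements; labels are not needed for this step.
    X_all = np.vstack([X_lab, X_unlab])
    x_mean = X_all.mean(axis=0)
    # Loadings are the leading right singular vectors of the centered pooled data.
    _, _, Vt = np.linalg.svd(X_all - x_mean, full_matrices=False)
    P = Vt[:n_components].T                      # shape (n_vars, n_components)
    # Scores of the labeled samples only, plus a column of ones for the intercept.
    T_lab = (X_lab - x_mean) @ P
    A = np.column_stack([T_lab, np.ones(len(T_lab))])
    coef, *_ = np.linalg.lstsq(A, y_lab, rcond=None)
    return x_mean, P, coef[:-1], coef[-1]

def predict_pcr(model, X_new):
    """Project new samples onto the stored loadings and apply the regression."""
    x_mean, P, q, intercept = model
    return ((X_new - x_mean) @ P) @ q + intercept
```

Calling `fit_pcr_with_unlabeled(X_lab, y_lab, X_unlab, n_components)` and then `predict_pcr(model, X_new)` gives predictions for new samples. If the unlabeled samples are affected by drift, pooling them shifts both the mean and the loadings, which is one way the increase in prediction error noted in the abstract can arise.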

Details

Title On multivariate calibration with unlabeled data

Author(s) Gujral, Paman ; Amrhein, Michael ; Ergon, Rolf ; Wise, Barry ; Bonvin, Dominique

Published in Journal of Chemometrics

Volume 25

Issue 8

Pages 456-465

Date 2011

Publisher Wiley-Blackwell

ISSN 0886-9383

Keywords

multivariate calibration; semi-supervised learning; unlabeled data; optimal filtering; drift

DOI https://doi.org/10.1002/cem.1389

Laboratories LA

Record Appears in Scientific production and competences > STI - School of Engineering > IGM - Institute of Mechanical Engineering > LA - Automatic Control Laboratory
Peer-reviewed publications
Work produced at EPFL
Journal Articles
Published

Record creation date 2010-12-24
