Infoscience

research article

LARGE-SCALE INFERENCE OF MULTIVARIATE REGRESSION FOR HEAVY-TAILED AND ASYMMETRIC DATA

Song, Youngseok  
•
Zhou, Wen
•
Zhou, Wen-Xin
July 1, 2023
Statistica Sinica

Large-scale multivariate regression is a fundamental statistical tool with a wide range of applications. This study considers the problem of simultaneously testing a large number of general linear hypotheses, encompassing covariate-effect analysis, analysis of variance, and model comparisons. The challenge that accompanies a large number of tests is the ubiquitous presence of heavy-tailed and/or highly skewed measurement noise, which is the main reason for the failure of conventional least squares-based methods. For large-scale multivariate regression, we develop a set of robust inference methods to explore data features such as heavy tailedness and skewness, which are not visible to least squares methods. The new testing procedure is based on the data-adaptive Huber regression and a new covariance estimator of regression estimates. Under mild conditions, we show that our methods produce consistent estimates of the false discovery proportion. Extensive numerical experiments and an empirical study on quantitative linguistics demonstrate the advantage of the proposed method over many state-of-the-art methods when the data are generated from heavy-tailed and/or skewed distributions.

Type
research article
DOI
10.5705/ss.202021.0003
Web of Science ID

WOS:001107595000020

Author(s)
Song, Youngseok  
Zhou, Wen
Zhou, Wen-Xin
Date Issued

2023-07-01

Publisher

Statistica Sinica

Published in
Statistica Sinica
Volume

33

Issue

3

Start page

1831

End page

1852

Subjects

  • Physical Sciences
  • General Linear Hypotheses
  • Heavy-Tailed And/Or Skewed Regression Errors
  • Huber Loss
  • Large-Scale Multiple Testing
  • Multivariate Regression
  • Quantitative Linguistics

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
SDS  
Funder    Grant Number
DOE       DE-SC0018344
NSF       DMS-1811376
NIH       R01GM144961

Available on Infoscience
February 23, 2024
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/205186
