Asymptotics of eigenstructure of sample correlation matrices for high-dimensional spiked models

David Morales-Jimenez, Iain M. Johnstone, Matthew R. McKay, Jeha Yang

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

Sample correlation matrices are widely used, but surprisingly little is known about their asymptotic spectral properties for high-dimensional data beyond the case of “null models”, for which the data is assumed to have independent coordinates. In the class of spiked models, we apply random matrix theory to derive asymptotic first-order and distributional results for both the leading eigenvalues and eigenvectors of sample correlation matrices, assuming a high-dimensional regime in which the ratio p/n of the number of variables p to the sample size n converges to a positive constant. While the first-order spectral properties of sample correlation matrices match those of sample covariance matrices, their asymptotic distributions can differ significantly. Indeed, the correlation-based fluctuations of both sample eigenvalues and eigenvectors are often remarkably smaller than those of their sample covariance counterparts.
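The first-order claim in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes a hypothetical population correlation matrix with one equicorrelated block of k variables (coefficient rho), Gaussian data, and the classical spiked-covariance limit spike * (1 + gamma / (spike - 1)) with gamma = p/n, which holds for a spike above 1 + sqrt(gamma). The function name `top_eigenvalues` and all parameter values are illustrative choices.

```python
import numpy as np


def top_eigenvalues(n=1000, p=500, k=10, rho=0.5, seed=0):
    """Return (top sample-covariance eigenvalue, top sample-correlation
    eigenvalue, classical first-order limit) for one simulated data set."""
    rng = np.random.default_rng(seed)

    # Assumed population correlation matrix: the first k variables are
    # equicorrelated with coefficient rho, the other p - k are independent.
    # Its largest eigenvalue (the "spike") is 1 + (k - 1) * rho.
    sigma = np.eye(p)
    sigma[:k, :k] = (1 - rho) * np.eye(k) + rho * np.ones((k, k))
    spike = 1 + (k - 1) * rho

    # Draw n Gaussian observations with covariance sigma via a Cholesky factor.
    chol = np.linalg.cholesky(sigma)
    x = rng.standard_normal((n, p)) @ chol.T

    # eigvalsh returns eigenvalues in ascending order, so [-1] is the largest.
    top_cov = np.linalg.eigvalsh(np.cov(x, rowvar=False))[-1]
    top_corr = np.linalg.eigvalsh(np.corrcoef(x, rowvar=False))[-1]

    # Classical spiked-covariance limit (assumes spike > 1 + sqrt(gamma)).
    gamma = p / n
    limit = spike * (1 + gamma / (spike - 1))
    return top_cov, top_corr, limit


if __name__ == "__main__":
    cov_eig, corr_eig, limit = top_eigenvalues()
    print(f"covariance: {cov_eig:.3f}  correlation: {corr_eig:.3f}  limit: {limit:.3f}")
```

A single draw only speaks to the first-order behaviour. Repeating the call over many seeds and comparing the spread of the two leading eigenvalues would be one way to probe the fluctuation claim, again under the assumptions stated above.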
Language: English
Pages: 1-42
Number of pages: 42
Journal: Statistica Sinica
DOI: 10.5705/ss.202019.0052
Publication status: Accepted - 16 May 2019

Fingerprint

Correlation Matrix
High-dimensional
Eigenvalues and Eigenvectors
Spectral Properties
First-order
Sample Covariance Matrix
Model
Random Matrix Theory
High-dimensional Data
Asymptotic distribution
Asymptotic Properties
Null
Sample Size
Fluctuations
Converge

Cite this

@article{1a6ed91c4c2a4f868376fc031220e552,
title = "Asymptotics of eigenstructure of sample correlation matrices for high-dimensional spiked models",
abstract = "Sample correlation matrices are widely used, but surprisingly little is known about their asymptotic spectral properties for high-dimensional data beyond the case of “null models”, for which the data is assumed to have independent coordinates. In the class of spiked models, we apply random matrix theory to derive asymptotic first-order and distributional results for both the leading eigenvalues and eigenvectors of sample correlation matrices, assuming a high-dimensional regime in which the ratio p/n of the number of variables p to the sample size n converges to a positive constant. While the first-order spectral properties of sample correlation matrices match those of sample covariance matrices, their asymptotic distributions can differ significantly. Indeed, the correlation-based fluctuations of both sample eigenvalues and eigenvectors are often remarkably smaller than those of their sample covariance counterparts.",
author = "Morales-Jimenez, David and Johnstone, {Iain M.} and McKay, {Matthew R.} and Yang, Jeha",
year = "2019",
month = "5",
day = "16",
doi = "10.5705/ss.202019.0052",
language = "English",
pages = "1--42",
journal = "Statistica Sinica",
issn = "1017-0405",
publisher = "Institute of Statistical Science",

}

Asymptotics of eigenstructure of sample correlation matrices for high-dimensional spiked models. / Morales-Jimenez, David; Johnstone, Iain M.; McKay, Matthew R.; Yang, Jeha.

In: Statistica Sinica, 16.05.2019, p. 1-42.

TY  - JOUR
T1  - Asymptotics of eigenstructure of sample correlation matrices for high-dimensional spiked models
AU  - Morales-Jimenez, David
AU  - Johnstone, Iain M.
AU  - McKay, Matthew R.
AU  - Yang, Jeha
PY  - 2019/5/16
Y1  - 2019/5/16
AB  - Sample correlation matrices are widely used, but surprisingly little is known about their asymptotic spectral properties for high-dimensional data beyond the case of “null models”, for which the data is assumed to have independent coordinates. In the class of spiked models, we apply random matrix theory to derive asymptotic first-order and distributional results for both the leading eigenvalues and eigenvectors of sample correlation matrices, assuming a high-dimensional regime in which the ratio p/n of the number of variables p to the sample size n converges to a positive constant. While the first-order spectral properties of sample correlation matrices match those of sample covariance matrices, their asymptotic distributions can differ significantly. Indeed, the correlation-based fluctuations of both sample eigenvalues and eigenvectors are often remarkably smaller than those of their sample covariance counterparts.
U2  - 10.5705/ss.202019.0052
DO  - 10.5705/ss.202019.0052
M3  - Article
SP  - 1
EP  - 42
JO  - Statistica Sinica
T2  - Statistica Sinica
JF  - Statistica Sinica
SN  - 1017-0405
ER  -