Sample correlation matrices are widely used, but surprisingly little is known about their asymptotic spectral properties for high-dimensional data beyond the case of “null models”, in which the data are assumed to have independent coordinates. For the class of spiked models, we apply random matrix theory to derive asymptotic first-order and distributional results for both the leading eigenvalues and eigenvectors of sample correlation matrices, assuming a high-dimensional regime in which the ratio p/n of the number of variables p to the sample size n converges to a positive constant. While the first-order spectral properties of sample correlation matrices match those of sample covariance matrices, their asymptotic distributions can differ significantly. Indeed, the correlation-based fluctuations of both sample eigenvalues and eigenvectors are often remarkably smaller than those of their sample covariance counterparts.
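The contrast between first-order agreement and differing fluctuations can be explored numerically. The following is a minimal Monte Carlo sketch, not the paper's method: it assumes Gaussian data whose population correlation matrix has a small equicorrelated block (one spike) and is the identity elsewhere, and it compares the largest eigenvalue of the sample covariance matrix with that of the sample correlation matrix across repeated draws. The function name `simulate_top_eigenvalues` and the parameters `m` (block size) and `r` (within-block correlation) are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_top_eigenvalues(n=200, p=100, m=4, r=0.9, reps=200, seed=0):
    """Monte Carlo draws of the largest sample eigenvalue (illustrative only).

    Assumed model (not taken from the paper): Gaussian rows with unit
    variances, an m x m equicorrelated block with correlation r, and
    independent remaining coordinates. The population covariance and
    correlation matrices then coincide, with leading eigenvalue
    1 + (m - 1) * r.
    """
    rng = np.random.default_rng(seed)

    # Population correlation matrix: equicorrelated block, identity elsewhere.
    gamma = np.eye(p)
    gamma[:m, :m] = r + (1.0 - r) * np.eye(m)
    chol = np.linalg.cholesky(gamma)

    top_cov = np.empty(reps)
    top_corr = np.empty(reps)
    for i in range(reps):
        # n observations of a p-variate Gaussian with covariance gamma.
        x = rng.standard_normal((n, p)) @ chol.T
        sample_cov = np.cov(x, rowvar=False)
        sample_corr = np.corrcoef(x, rowvar=False)
        # eigvalsh returns eigenvalues in ascending order.
        top_cov[i] = np.linalg.eigvalsh(sample_cov)[-1]
        top_corr[i] = np.linalg.eigvalsh(sample_corr)[-1]
    return top_cov, top_corr


if __name__ == "__main__":
    top_cov, top_corr = simulate_top_eigenvalues()
    print("mean (covariance, correlation):", top_cov.mean(), top_corr.mean())
    print("std  (covariance, correlation):", top_cov.std(), top_corr.std())
```

Under these assumptions, comparing the two means gives a rough check of the first-order statement, and comparing the two standard deviations gives a rough check of the statement about fluctuations; the sketch does not cover eigenvectors or non-Gaussian data.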
Morales-Jimenez, D., Johnstone, I. M., McKay, M. R., & Yang, J. (Accepted/In press). Asymptotics of eigenstructure of sample correlation matrices for high-dimensional spiked models. Statistica Sinica, 1-42. https://doi.org/10.5705/ss.202019.0052