Nonparametric Estimation of Data Dimensionality Prior to Data Compression: the case of the Human Development Index

D. Canning, Declan French, M. Moore

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

In many applications in applied statistics, researchers reduce the complexity of a data set by combining a group of variables into a single measure using factor analysis or an index number. We argue that such compression loses information if the data actually has high dimensionality. We advocate the use of a non-parametric estimator commonly used in physics (the Takens estimator) to estimate the correlation dimension of the data prior to compression. The advantage of this approach over traditional linear data compression approaches is that the data does not have to be linearized. Applying our ideas to the United Nations Human Development Index, we find that the four variables used in its construction have dimension three, so the index loses information.
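
As an illustration of the kind of estimator the abstract refers to, the following is a minimal sketch of a Takens-type estimate of correlation dimension from pairwise Euclidean distances. It is not the authors' implementation: the function name `takens_estimator`, the cutoff parameter `r0`, and the synthetic four-variable data set are assumptions made for this example, and the article may handle scaling, cutoff choice, and inference differently.

```python
import numpy as np


def takens_estimator(X, r0):
    """Takens-type estimate of correlation dimension (illustrative sketch).

    X  : array of shape (n_points, n_variables)
    r0 : distance cutoff (an assumed tuning parameter); only pairs of
         points closer than r0 enter the estimate.
    """
    X = np.asarray(X, dtype=float)
    # All pairwise Euclidean distances (O(n^2) memory, fine for small n).
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Keep each unordered pair once (strict upper triangle).
    r = dists[np.triu_indices(len(X), k=1)]
    # Discard coincident points and pairs at or beyond the cutoff.
    r = r[(r > 0) & (r < r0)]
    if r.size == 0:
        raise ValueError("no pairs closer than r0; increase the cutoff")
    # Estimate: minus the reciprocal of the mean log scaled distance.
    return -1.0 / np.mean(np.log(r / r0))


# Synthetic data (not HDI data): four observed variables that are all
# linear functions of two latent factors u and v.
rng = np.random.default_rng(0)
uv = rng.uniform(size=(800, 2))
u, v = uv[:, 0], uv[:, 1]
X_demo = np.column_stack([u, v, u + v, u - v])
d_demo = takens_estimator(X_demo, r0=0.1 * np.sqrt(3))
print(f"estimated dimension: {d_demo:.2f}")
```

In this synthetic case the four columns lie on a two-dimensional plane, so an estimate close to two (slightly lower, because of boundary effects at a finite cutoff) would suggest that two numbers, not one, are needed to summarise the data without loss. The choice of `r0` matters in practice: too large a cutoff mixes in large-scale geometry, while too small a cutoff leaves too few pairs.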
Original language: English
Pages (from-to): 1853-1863
Journal: Journal of Applied Statistics
Volume: 40
Issue number: 9
Early online date: 16 May 2013
Publication status: Published - Sep 2013

Keywords

  • development
  • well-being
  • dimension
  • measure
  • indicator

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
