Nearest clusters based partial least squares discriminant analysis for the classification of spectral data

Weiran Song, Hui Wang, Paul Maguire, Omar Nibouche

Research output: Contribution to journal › Article › peer-review

36 Citations (Scopus)


Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate methods for spectral data analysis: it extracts latent variables and uses them to predict responses. In particular, it handles high-dimensional and collinear spectral data well. However, PLS-DA does not explicitly address data multimodality, i.e., a multimodal distribution of data within a class. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA), which addresses the multimodality and nonlinearity issues explicitly and improves the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide the samples into clusters and calculates the centre of every cluster. For a given query point, only the clusters whose centres are nearest to that point are used for PLS-DA. This provides a simple and effective tool for separating multimodal and nonlinear classes into clusters that are locally linear and unimodal. Experimental results on 17 datasets, comprising 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy in most cases.
Original language: English
Pages (from-to): 27-38
Number of pages: 12
Journal: Analytica Chimica Acta
Early online date: 06 Feb 2018
Publication status: Published - 07 Jun 2018
Externally published: Yes


  • Partial Least Squares
  • Clustering
  • Nonlinearity
  • Multimodality
  • Spectral pattern recognition

