Generating Data Augmented Spectroscopic Data For Performance Enhancement

Research output: Contribution to conference (poster, peer-reviewed)


The application of chemometrics in food science has revolutionized the field by enabling models that automate a broad range of tasks, such as food authenticity verification and food fraud detection. Creating effective, general models that can address the complexity of real-life problems requires a large number of varied training samples: the training dataset has to cover all possible types of sample and instrument variability. However, acquiring such a varied set of samples is time-consuming and costly, and collecting samples representative of real-world variation is not always possible, especially in some application fields. To address this problem, a novel framework for applying data augmentation techniques to spectroscopic data has been designed and implemented. It is a carefully designed pipeline of four complementary and independent blocks, which can be finely tuned depending on the variance desired to enhance the model's robustness: a) blending spectra, b) changing the baseline, c) shifting along the x axis, and d) adding random noise.
This novel data augmentation solution has been tested with the aim of obtaining a highly efficient, generalised classification model based on spectroscopic data. As a case study to demonstrate the influence of this pioneering approach in chemometrics, Fourier transform mid-infrared (FT-IR) spectroscopic data of eleven pure vegetable oils (106 admixtures) have been used for the rapid identification of vegetable oil species in mixtures of oils. The approach yielded a 10% improvement in classification, which is crucial in some food adulteration applications.
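The four blocks named above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how each block is realised, so everything here is an assumption made for illustration. Blending is modelled as a convex combination of two spectra from the same array, the baseline change as a random offset plus linear slope, the x-axis shift as an integer shift in data points with edge padding, and the noise as additive Gaussian noise. The function names (`shift_along_x`, `augment_spectrum`) and parameters (`baseline_scale`, `max_shift`, `noise_std`) are hypothetical.

```python
import numpy as np


def shift_along_x(y, shift):
    """Shift a 1-D spectrum by `shift` points, padding with the edge value.

    Assumes abs(shift) is smaller than the number of points.
    """
    if shift == 0:
        return y.copy()
    out = np.empty_like(y)
    if shift > 0:
        out[shift:] = y[:-shift]
        out[:shift] = y[0]
    else:
        out[:shift] = y[-shift:]
        out[shift:] = y[-1]
    return out


def augment_spectrum(spectra, rng, blend=True, baseline_scale=0.01,
                     max_shift=2, noise_std=0.005):
    """Return one synthetic spectrum from a 2-D array (n_spectra, n_points).

    Each block is independent; setting its parameter to zero (or
    blend=False) disables it. All parameter choices are illustrative.
    """
    n_spectra, n_points = spectra.shape

    # a) Blending: convex combination of two randomly chosen spectra.
    i, j = rng.integers(0, n_spectra, size=2)
    w = rng.uniform(0.0, 1.0) if blend else 1.0
    out = w * spectra[i] + (1.0 - w) * spectra[j]

    # b) Baseline change: random constant offset plus linear slope.
    x = np.linspace(-1.0, 1.0, n_points)
    offset, slope = rng.normal(0.0, baseline_scale, size=2)
    out = out + offset + slope * x

    # c) Shift along the x axis by a random integer number of points.
    shift = int(rng.integers(-max_shift, max_shift + 1))
    out = shift_along_x(out, shift)

    # d) Additive Gaussian noise.
    out = out + rng.normal(0.0, noise_std, size=n_points)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder data standing in for measured spectra of one class.
    spectra = rng.uniform(0.0, 1.0, size=(5, 200))
    synthetic = np.stack([augment_spectrum(spectra, rng) for _ in range(10)])
    print(synthetic.shape)
```

Keeping each block behind its own parameter mirrors the idea that the amount of added variance can be tuned per block; in practice the scales would need to be chosen relative to the instrument and sample variability actually expected.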

Original language: English
Publication status: Published - 20 Jun 2016
Event: Chemometrics in Analytical Chemistry - Barcelona, Spain
Duration: 06 Jun 2016 to 10 Jun 2016
Conference number: XVI


Conference: Chemometrics in Analytical Chemistry
Abbreviated title: CAC 2016