Synthesizing data using variational autoencoders for handling class imbalanced deep learning

Taimoor Shakeel Sheikh*, Adil Khan, Muhammad Fahim, Muhammad Ahmad

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper addresses the complex problem of learning from unbalanced datasets, on which traditional algorithms may perform poorly. Classification algorithms tend to favor the larger, less important classes in such problems. To handle the unbalanced data problem, we synthesize data using variational autoencoders (VAE) on raw training samples and then use various input sources (raw data alone, or a combination of raw and synthetic data) to train different models. We evaluate our method using multiple criteria on the SVHN dataset, which consists of complex images, and perform a comprehensive comparative analysis of popular CNN architectures on balanced and unbalanced data to determine which performs best under class imbalance. We found that data synthesis via VAE is reliable and robust, and that models trained with the synthesized data classify real data with higher accuracy than models trained on the original (unbalanced) data. Our results demonstrate the strength of using VAE to solve the class imbalance problem.
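The rebalancing step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how many synthetic samples are added per class, so this sketch assumes every class is topped up to the size of the largest class. The names `synthetic_counts`, `balance_with_generator`, and `generate` are hypothetical; `generate(label, n)` stands in for drawing `n` samples from a VAE trained on the raw samples of that class, and is not the authors' implementation.

```python
from collections import Counter


def synthetic_counts(labels):
    """Return, per class, how many synthetic samples are needed
    to match the size of the largest class (an assumed target)."""
    counts = Counter(labels)
    if not counts:
        return {}
    target = max(counts.values())
    return {label: target - n for label, n in counts.items()}


def balance_with_generator(samples, labels, generate):
    """Combine raw samples with synthetic ones so all classes are equal in size.

    `generate(label, n)` is a hypothetical callable that returns a list of
    n synthetic samples for `label`, e.g. decoded draws from a trained VAE.
    """
    out_samples = list(samples)
    out_labels = list(labels)
    for label, deficit in synthetic_counts(labels).items():
        if deficit > 0:
            out_samples.extend(generate(label, deficit))
            out_labels.extend([label] * deficit)
    return out_samples, out_labels


if __name__ == "__main__":
    # Toy data: class 0 has three samples, class 1 has one.
    raw_x = [[0.1], [0.2], [0.3], [0.9]]
    raw_y = [0, 0, 0, 1]

    # Placeholder generator; a real one would sample from a VAE decoder.
    def fake_vae(label, n):
        return [[float(label)] for _ in range(n)]

    x, y = balance_with_generator(raw_x, raw_y, fake_vae)
    print(Counter(y))
```

Passing the generator in as a callable keeps the balancing logic independent of the generative model, so the same routine can be reused when comparing raw-only and raw-plus-synthetic training sets.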

Original language: English
Title of host publication: Analysis of Images, Social Networks and Texts - 8th International Conference, AIST 2019, Revised Selected Papers: Proceedings
Editors: Wil M.P. van der Aalst, Vladimir Batagelj, Dmitry I. Ignatov, Valentina Kuskova, Sergei O. Kuznetsov, Irina A. Lomazova, Michael Khachay, Andrey Kutuzov, Natalia Loukachevitch, Amedeo Napoli, Panos M. Pardalos, Marcello Pelillo, Andrey V. Savchenko, Elena Tutubalina
Publisher: Springer
Pages: 270-281
Number of pages: 12
ISBN (Print): 9783030395742
Publication status: Published - 02 Feb 2020
Externally published: Yes
Event: 8th International Conference on Analysis of Images, Social Networks and Texts, AIST 2019 - Kazan, Russian Federation
Duration: 17 Jul 2019 to 19 Jul 2019

Publication series

Name: Communications in Computer and Information Science
Volume: 1086 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 8th International Conference on Analysis of Images, Social Networks and Texts, AIST 2019
Country/Territory: Russian Federation
City: Kazan
Period: 17/07/2019 to 19/07/2019

Bibliographical note

Publisher Copyright:
© Springer Nature Switzerland AG 2020.

Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.

Keywords

  • Convolutional Neural Network (CNN)
  • Imbalanced data
  • Variational autoencoder (VAE)

ASJC Scopus subject areas

  • Computer Science(all)
  • Mathematics(all)
