Enhanced Label Noise Filtering with Multiple Voting

Donghai Guan, Maqbool Hussain, Weiwei Yuan, Asad Masood Khattak, Muhammad Fahim, Wajahat Ali Khan*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

Label noise exists in many applications, and its presence can degrade learning performance. Researchers usually use filters to identify and eliminate mislabeled instances prior to training. The ensemble learning based filter (EnFilter) is the most widely used filter. According to the voting mechanism, EnFilter is mainly divided into two types: single-voting based (SVFilter) and multiple-voting based (MVFilter). In general, MVFilter is preferred because multiple-voting can address the intrinsic limitations of single-voting. However, the most important unsolved issue in MVFilter is how to determine the optimal decision point (ODP). Conceptually, the decision point is a threshold value that determines the noise detection performance. To maximize the performance of MVFilter, we propose a novel approach to compute the optimal decision point. Our approach is data-driven and cost-sensitive: it determines the ODP from the given noisy training dataset and a noise misrecognition cost matrix. The core idea of our approach is to estimate the probability distributions of mislabeled data, from which the expected cost of each possible decision point can be inferred. Experimental results on a set of benchmark datasets illustrate the utility of our proposed approach.
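The cost-based choice of a decision point described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say how the mislabeled data probability distributions are estimated, so the sketch simply takes them as inputs. The function names, the two vote-count distributions, the noise rate, and the two cost values are all hypothetical and chosen only for illustration. An instance is treated as noise when the number of voting rounds that flag it reaches the decision point.

```python
from typing import Dict, Sequence, Tuple


def expected_cost(
    threshold: int,
    p_votes_noisy: Sequence[float],  # assumed: P(k noise votes | mislabeled), k = 0..T
    p_votes_clean: Sequence[float],  # assumed: P(k noise votes | correctly labeled)
    noise_rate: float,               # assumed prior fraction of mislabeled instances
    cost_fp: float,                  # assumed cost of discarding a clean instance
    cost_fn: float,                  # assumed cost of keeping a mislabeled instance
) -> float:
    """Expected per-instance cost when instances with votes >= threshold are removed."""
    # Clean instance wrongly removed: it received at least `threshold` votes.
    fp_rate = sum(p_votes_clean[threshold:])
    # Mislabeled instance wrongly kept: it received fewer than `threshold` votes.
    fn_rate = sum(p_votes_noisy[:threshold])
    return (1.0 - noise_rate) * fp_rate * cost_fp + noise_rate * fn_rate * cost_fn


def optimal_decision_point(
    p_votes_noisy: Sequence[float],
    p_votes_clean: Sequence[float],
    noise_rate: float,
    cost_fp: float,
    cost_fn: float,
) -> Tuple[int, Dict[int, float]]:
    """Return the threshold in 1..T with the lowest expected cost, plus all costs."""
    n_rounds = len(p_votes_noisy) - 1  # T voting rounds give vote counts 0..T
    costs = {
        t: expected_cost(t, p_votes_noisy, p_votes_clean, noise_rate, cost_fp, cost_fn)
        for t in range(1, n_rounds + 1)
    }
    best = min(costs, key=costs.get)
    return best, costs


if __name__ == "__main__":
    # Made-up distributions for T = 4 voting rounds (illustrative values only).
    p_noisy = [0.05, 0.10, 0.15, 0.30, 0.40]
    p_clean = [0.60, 0.25, 0.10, 0.04, 0.01]

    # Equal misrecognition costs.
    best_equal, _ = optimal_decision_point(p_noisy, p_clean, 0.2, 1.0, 1.0)
    # Keeping a mislabeled instance is assumed to be five times as costly.
    best_skewed, _ = optimal_decision_point(p_noisy, p_clean, 0.2, 1.0, 5.0)
    print(best_equal, best_skewed)
```

With these made-up numbers, raising the cost of keeping a mislabeled instance moves the selected decision point from 3 to 2, which is the kind of cost sensitivity the abstract refers to. How closely this matches the paper's procedure depends on details that the abstract does not give.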

Original language: English
Article number: 5031
Journal: Applied Sciences (Switzerland)
Volume: 9
Issue number: 23
Publication status: Published - 21 Nov 2019
Externally published: Yes

Bibliographical note

Funding Information:
This research was supported by the Natural Science Foundation of China (Grant No. 61672284), the Natural Science Foundation of Jiangsu Province (Grant No. BK20171418), the China Postdoctoral Science Foundation (Grant No. 2016M591841), and the Jiangsu Planned Projects for Postdoctoral Research Funds (No. 1601225C). This research was also supported by the Defense Industrial Technology Development Program under Grant No. JCKY2016605B006. Furthermore, this research work was supported by the Zayed University Research Cluster Award # R18038. This research was also supported by the National Research Foundation (NRF) of Korea (NRF-2019R1G1A1011296).

Publisher Copyright:
© 2019 by the authors.

Copyright:
Copyright 2019 Elsevier B.V., All rights reserved.

Keywords

  • Cost minimization
  • Mislabeled data filter
  • Multiple-voting
  • Optimal decision point
  • Single-voting

ASJC Scopus subject areas

  • Materials Science(all)
  • Instrumentation
  • Engineering(all)
  • Process Chemistry and Technology
  • Computer Science Applications
  • Fluid Flow and Transfer Processes
