A multi-view deep neural network model for chemical-disease relation extraction from imbalanced datasets

Sayantan Mitra*, Sriparna Saha, Mohammed Hasanuzzaman

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

20 Citations (Scopus)

Abstract

Understanding chemical-disease relations (CDR) is a crucial task in various biomedical domains. Manually mining this information from the biomedical literature is costly and time-consuming. To address these issues, numerous studies have sought to design an efficient automatic tool. In this paper, we propose a multi-view deep neural network model for the CDR task. Typically, multiple representations (or views) of the dataset are not available for this task. We therefore train multiple conceptually different deep neural network models on the dataset to generate different abstract features, which are treated as different views. A novel loss function, 'Penalized LF', is defined to address the problem of imbalanced datasets. The proposed loss function is generic in nature. The model is designed as a combination of a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory (Bi-LSTM) network, along with a Multi-Layer Perceptron (MLP). To show the efficacy of the proposed model, we compare it with six baseline models and other state-of-the-art techniques on the 'chemicals-and-disease-DFE' dataset, a free-text dataset created by Li et al. from the BioCreative V Chemical Disease Relation dataset. Results show that the proposed model attains the highest F1-score for individual classes, demonstrating its effectiveness in handling the class imbalance problem in the dataset. To further demonstrate the efficacy of the proposed model, we present results on the BioCreative V dataset and two Protein-Protein Interaction Identification (PPI) datasets, viz., AiMed and BioInfer. All these results are also compared with state-of-the-art models.
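The abstract does not give the form of 'Penalized LF', so the following is only a minimal sketch of the general idea of penalizing errors on rare classes more heavily. It uses a standard inverse-frequency class-weighted cross-entropy as a stand-in, not the authors' loss. The function names `class_weights` and `weighted_cross_entropy` are hypothetical and chosen for illustration.

```python
import math


def class_weights(labels):
    """Inverse-frequency weights: rarer classes receive larger weights.

    Assumption for illustration only; this is NOT the paper's 'Penalized LF'.
    """
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalised by the total weight.

    probs[i][c] is the predicted probability of class c for example i.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)


# Toy imbalanced label set: three majority examples, one minority example.
labels = [0, 0, 0, 1]
weights = class_weights(labels)

# Same number of mistakes (one), but on different classes.
minority_wrong = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
majority_wrong = [[0.1, 0.9], [0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]

loss_minority = weighted_cross_entropy(minority_wrong, labels, weights)
loss_majority = weighted_cross_entropy(majority_wrong, labels, weights)
```

With these weights, a single mistake on the minority class yields a larger loss than a single mistake on the majority class, which is the behaviour a loss designed for imbalanced data is generally expected to have.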

Original language: English
Pages (from-to): 3315-3325
Number of pages: 11
Journal: IEEE Journal of Biomedical and Health Informatics
Volume: 24
Issue number: 11
Early online date: 30 Mar 2020
Publication status: Published - Nov 2020
Externally published: Yes

Bibliographical note

Funding Information:
Manuscript received August 20, 2019; revised January 30, 2020 and March 5, 2020; accepted March 23, 2020. Date of publication March 30, 2020; date of current version November 5, 2020. This work was supported by Young Faculty Research Fellowship (YFRF) Award. (Corresponding author: Sayantan Mitra.) Sayantan Mitra and Sriparna Saha are with the Department of Computer Science, Indian Institute of Technology Patna, Bihta 801103, India (e-mail: [email protected]; [email protected]).

Publisher Copyright:
© 2013 IEEE.

Keywords

  • chemical-disease relations
  • imbalanced class
  • Multi-view classification
  • relation extraction
  • text mining

ASJC Scopus subject areas

  • Biotechnology
  • Computer Science Applications
  • Electrical and Electronic Engineering
  • Health Information Management
