Centrality-Based Approach for Supervised Term Weighting

Niloofer Shanavas, Hui Wang, Zhiwei Lin, Glenn Hawe

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

The huge number of text documents has made the manual organisation of text data a tedious task. Automatic text classification makes large document collections easier to handle by organising them automatically into predefined classes. The effectiveness and efficiency of automatic text classification depend largely on how text documents are represented. A text document is usually viewed as a bag of terms (or words) and represented as a vector using the vector space model, in which terms are assumed to be unordered and independent and term frequencies (or weights) are used in the representation. Graphs are another text representation scheme; they capture the structure of terms in the text document, which is important for natural language. Terms weighted on the basis of a graph representation increase the performance of text classification. In this paper, we present a novel approach for graph-based supervised term weighting that considers information relevant to the classification task, using node centrality in the co-occurrence graphs built from the labelled training documents. Our experimental evaluation of the proposed term weighting scheme on four benchmark datasets shows that the scheme consistently outperforms state-of-the-art term weighting methods for text classification.
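The general idea in the abstract (co-occurrence graphs built from labelled training documents, with node centrality as a term-level signal) can be illustrated with a minimal Python sketch. This is not the paper's weighting scheme, which the abstract does not specify. The sliding-window graph construction, the choice of degree centrality, and the `class_contrast` score are all assumptions made for illustration, and every function name here is hypothetical.

```python
from collections import defaultdict


def cooccurrence_graph(docs, window=1):
    """Build an undirected co-occurrence graph as an adjacency map.

    Terms are nodes; an edge links two distinct terms when one appears
    within `window` positions after the other in the same document.
    The window size is an illustrative assumption.
    """
    adj = defaultdict(set)
    for tokens in docs:
        for i, term in enumerate(tokens):
            adj[term]  # touch the key so isolated terms are still nodes
            for other in tokens[i + 1 : i + 1 + window]:
                if other != term:
                    adj[term].add(other)
                    adj[other].add(term)
    return adj


def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    if n <= 1:
        return {term: 0.0 for term in adj}
    return {term: len(neighbours) / (n - 1) for term, neighbours in adj.items()}


def class_centralities(labelled_docs, window=1):
    """Compute term centralities in one co-occurrence graph per class label."""
    by_class = defaultdict(list)
    for tokens, label in labelled_docs:
        by_class[label].append(tokens)
    return {
        label: degree_centrality(cooccurrence_graph(docs, window))
        for label, docs in by_class.items()
    }


def class_contrast(term, label, centralities):
    """Hypothetical supervised score (an assumption, not the paper's formula):
    centrality of `term` in the graph of `label` minus its mean centrality
    in the graphs of the other classes. Absent terms count as 0."""
    own = centralities[label].get(term, 0.0)
    others = [c.get(term, 0.0) for lbl, c in centralities.items() if lbl != label]
    return own - (sum(others) / len(others) if others else 0.0)


if __name__ == "__main__":
    # Toy labelled training set: (tokenised document, class label).
    train = [
        (["graph", "based", "term", "weighting"], "ir"),
        (["term", "weighting", "for", "text"], "ir"),
        (["football", "match", "result"], "sport"),
        (["match", "result", "term"], "sport"),
    ]
    cent = class_centralities(train)
    for label in cent:
        for term in sorted(cent[label]):
            print(label, term, round(class_contrast(term, label, cent), 3))
```

In this toy setting, a term that is central in one class graph and absent from the others receives a higher contrast score than a term that is similarly central in every class. That is the kind of class-discriminating information a supervised weighting scheme aims to capture.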

Original language: English
Title of host publication: Proceedings - 16th IEEE International Conference on Data Mining Workshops, ICDMW 2016
Editors: Carlotta Domeniconi, Francesco Gullo, Francesco Bonchi, Josep Domingo-Ferrer, Ricardo Baeza-Yates, Zhi-Hua Zhou, Xindong Wu
Publisher: IEEE Computer Society
Pages: 1261-1268
Number of pages: 8
ISBN (Electronic): 9781509054725
Publication status: Published - 02 Feb 2017
Externally published: Yes
Event: 16th IEEE International Conference on Data Mining Workshops, ICDMW 2016 - Barcelona, Spain
Duration: 12 Dec 2016 to 15 Dec 2016

Publication series

Name: IEEE International Conference on Data Mining Workshops, ICDMW
Volume: 0
ISSN (Print): 2375-9232
ISSN (Electronic): 2375-9259

Conference

Conference: 16th IEEE International Conference on Data Mining Workshops, ICDMW 2016
Country/Territory: Spain
City: Barcelona
Period: 12/12/2016 to 15/12/2016

Bibliographical note

Publisher Copyright:
© 2016 IEEE.

Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.

Keywords

  • Automatic text classification
  • Graph-based text representation
  • Node centrality
  • Supervised term weighting

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
