Dating texts by multi-class classification with sliding time intervals

Gregory Toner, Xiwu Han

Research output: Contribution to conference › Paper › peer-review

Abstract

We propose a practical method for dating texts by classification with sliding time intervals (STI). The method builds on the strengths of multi-class text classification while drawing on temporal characteristics of the training corpus. We conducted extensive experiments on English and medieval Irish texts. The results show that STI dating significantly outperformed classifiers with fixed time intervals (FTI). The Naïve Bayes Multinomial (NBM) classifier with STI achieved state-of-the-art dating precision on DTE Subtask 2 while using only character and word n-gram features. Experiments on dating long documents, together with further analysis, also point to promising directions for text dating research and for other fields in the humanities.
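
The abstract does not say how the intervals are constructed, so the following is only a minimal sketch of one plausible reading of the FTI/STI contrast, not the authors' implementation. It assumes that fixed intervals partition the time span into non-overlapping classes, while sliding intervals are overlapping windows of the same width shifted by a smaller step. The function names, the half-open interval convention, and the width and step values are all hypothetical.

```python
def fixed_intervals(start, end, width):
    """Non-overlapping half-open intervals [lo, hi) covering start..end (assumed FTI reading)."""
    return [(lo, min(lo + width, end)) for lo in range(start, end, width)]


def sliding_intervals(start, end, width, step):
    """Overlapping half-open windows [lo, lo + width), shifted by `step` (assumed STI reading)."""
    return [(lo, lo + width) for lo in range(start, end - width + 1, step)]


def covering(year, intervals):
    """Return every interval (candidate class label) that contains `year`."""
    return [(lo, hi) for (lo, hi) in intervals if lo <= year < hi]


# Hypothetical span and granularity, chosen only for illustration.
fti = fixed_intervals(1700, 1800, 50)
sti = sliding_intervals(1700, 1800, 50, 25)

# Under FTI a year falls into exactly one class; under STI it can fall into several.
fti_labels = covering(1740, fti)
sti_labels = covering(1740, sti)
```

Under this reading, a text written near a fixed boundary is no longer forced into a single class whose range it barely belongs to, since a sliding window centred closer to its true date is also available as a label.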
Original language: English
Pages: 1
Number of pages: 6
Publication status: Published - 27 Feb 2018
Event: International Congress on Image and Signal Processing, BioMedical Engineering and Informatics - Shanghai, China
Duration: 14 Oct 2017 - 16 Oct 2017

Conference

Conference: International Congress on Image and Signal Processing, BioMedical Engineering and Informatics
Abbreviated title: CISP-BMEI 2017
Country: China
City: Shanghai
Period: 14/10/2017 - 16/10/2017

Keywords

  • Bayes methods
  • Naïve Bayes Multinomial
  • sliding time intervals
  • medieval Irish
  • text dating
  • machine learning
  • annals
