Investigating terminology translation in statistical and neural machine translation: a case study on English-to-Hindi and Hindi-to-English

Rejwanul Haque, Mohammed Hasanuzzaman, Andy Way

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

Terminology translation plays a critical role in domain-specific machine translation (MT). In this paper, we conduct a comparative qualitative evaluation on terminology translation in phrase-based statistical MT (PB-SMT) and neural MT (NMT) in two translation directions: English-to-Hindi and Hindi-to-English. For this, we select a test set from a legal domain corpus and create a gold standard for evaluating terminology translation in MT. We also propose an error typology taking the terminology translation errors into consideration. We evaluate the MT systems' performance on terminology translation, and demonstrate our findings, unraveling strengths, weaknesses, and similarities of PB-SMT and NMT in the area of term translation.

Original language: English
Title of host publication: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
Editors: Ruslan Mitkov, Galia Angelova, Ivelina Nikolova, Irina Temnikova
Publisher: Incoma Ltd
Pages: 437-446
Number of pages: 10
ISBN (Electronic): 9789544520557
Publication status: Published - 01 Sept 2019
Externally published: Yes
Event: 12th International Conference on Recent Advances in Natural Language Processing, RANLP 2019 - Varna, Bulgaria
Duration: 02 Sept 2019 to 04 Sept 2019

Publication series

Name: International Conference Recent Advances in Natural Language Processing, RANLP
Volume: 2019-September
ISSN (Print): 1313-8502
ISSN (Electronic): 2603-2813

Conference

Conference: 12th International Conference on Recent Advances in Natural Language Processing, RANLP 2019
Country/Territory: Bulgaria
City: Varna
Period: 02/09/2019 to 04/09/2019

Bibliographical note

Funding Information:
The ADAPT Centre for Digital Content Technology is funded under the Science Foundation Ireland (SFI) Research Centres Programme (Grant No. 13/RC/2106) and is co-funded under the European Regional Development Fund. This project has partially received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 713567, and the publication has emanated from research supported in part by a research grant from SFI under Grant Number 13/RC/2077.

Publisher Copyright:
© 2019 Association for Computational Linguistics (ACL). All rights reserved.

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Artificial Intelligence
  • Electrical and Electronic Engineering
