Abstract
Terminology translation plays a critical role in domain-specific machine translation (MT). In this paper, we conduct a comparative qualitative evaluation of terminology translation in phrase-based statistical MT (PB-SMT) and neural MT (NMT) in two translation directions: English-to-Hindi and Hindi-to-English. For this, we select a test set from a legal domain corpus and create a gold standard for evaluating terminology translation in MT. We also propose an error typology that takes terminology translation errors into account. We evaluate the MT systems' performance on terminology translation and present our findings, which reveal the strengths, weaknesses, and similarities of PB-SMT and NMT in term translation.
Original language | English |
---|---|
Title of host publication | Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019) |
Editors | Ruslan Mitkov, Galia Angelova, Ivelina Nikolova, Irina Temnikova |
Publisher | Incoma Ltd |
Pages | 437-446 |
Number of pages | 10 |
ISBN (Electronic) | 9789544520557 |
Publication status | Published - 01 Sept 2019 |
Externally published | Yes |
Event | 12th International Conference on Recent Advances in Natural Language Processing, RANLP 2019 - Varna, Bulgaria. Duration: 02 Sept 2019 → 04 Sept 2019 |
Publication series
Name | International Conference Recent Advances in Natural Language Processing, RANLP |
---|---|
Volume | 2019-September |
ISSN (Print) | 1313-8502 |
ISSN (Electronic) | 2603-2813 |
Conference
Conference | 12th International Conference on Recent Advances in Natural Language Processing, RANLP 2019 |
---|---|
Country/Territory | Bulgaria |
City | Varna |
Period | 02/09/2019 → 04/09/2019 |
Bibliographical note
Funding Information: The ADAPT Centre for Digital Content Technology is funded under the Science Foundation Ireland (SFI) Research Centres Programme (Grant No. 13/RC/2106) and is co-funded under the European Regional Development Fund. This project has partially received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 713567, and the publication has emanated from research supported in part by a research grant from SFI under Grant Number 13/RC/2077.
Publisher Copyright:
© 2019 Association for Computational Linguistics (ACL). All rights reserved.
ASJC Scopus subject areas
- Software
- Computer Science Applications
- Artificial Intelligence
- Electrical and Electronic Engineering