Abstract
Neural attention-based sequence-to-sequence (seq2seq) network models have achieved remarkable performance on NLP tasks such as image caption generation, paraphrase generation, and machine translation. The underlying framework for these models is usually a deep neural architecture comprising multi-layer encoder-decoder sub-networks. The performance of the decoding sub-network is greatly affected by how well it extracts the relevant source-side contextual information. Conventional approaches consider only the outputs of the last encoding layer when computing the source contexts via a neural attention mechanism. Because information flows both across the time-steps within each encoder layer and from layer to layer, there is no guarantee that the information required to build the source context is stored in the final encoding layer. These approaches also do not fully capture the structural composition of natural language. To address these limitations, this paper presents several new strategies for generating the contextual feature vector jointly across all the encoding layers. The proposed strategies consistently outperform the conventional approaches to neural attention computation on the task of paraphrase generation.
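The contrast drawn in the abstract, attending over the last encoding layer only versus attending over all encoding layers, can be illustrated with a minimal sketch. The abstract does not specify the paper's actual strategies, so everything below is an assumption made for illustration: plain dot-product scoring, the hypothetical helper names `attend`, `last_layer_context`, and `joint_context`, and a joint variant that simply pools the hidden states of every layer into one attention computation. It should not be read as the authors' method.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, states):
    """Dot-product attention (an assumed scoring function).

    Returns the context vector (attention-weighted sum of `states`)
    and the attention weights.
    """
    scores = [sum(q * h for q, h in zip(query, state)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, states))
        for i in range(dim)
    ]
    return context, weights


def last_layer_context(query, layers):
    """Conventional setup: attend over the final encoder layer only."""
    return attend(query, layers[-1])


def joint_context(query, layers):
    """Illustrative joint setup (hypothetical): pool the hidden states
    of every encoder layer and attend over all of them at once."""
    all_states = [state for layer in layers for state in layer]
    return attend(query, all_states)


# Toy encoder output: 2 layers, 3 time-steps, hidden size 2.
layers = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [0.2, 0.8], [0.9, 0.1]],
]
query = [1.0, 0.0]  # stands in for a decoder hidden state

ctx_last, w_last = last_layer_context(query, layers)
ctx_joint, w_joint = joint_context(query, layers)
```

In this sketch the joint variant yields one attention weight per (layer, time-step) pair, so lower-layer states can contribute directly to the context vector instead of reaching the decoder only through the final layer.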
Original language | English |
---|---|
Title of host publication | Natural Language Processing and Information Systems - 24th International Conference on Applications of Natural Language to Information Systems, NLDB 2019, Proceedings |
Editors | Elisabeth Métais, Farid Meziane, Sunil Vadera, Vijayan Sugumaran, Mohamad Saraee |
Publisher | Springer Verlag |
Pages | 92-104 |
Number of pages | 13 |
ISBN (Print) | 9783030232801 |
Publication status | Published - 2019 |
Event | 24th International Conference on Applications of Natural Language to Information Systems, NLDB 2019 - Salford, United Kingdom. Duration: 26 Jun 2019 → 28 Jun 2019 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11608 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 24th International Conference on Applications of Natural Language to Information Systems, NLDB 2019 |
---|---|
Country/Territory | United Kingdom |
City | Salford |
Period | 26/06/2019 → 28/06/2019 |
Bibliographical note
Publisher Copyright: © 2019, Springer Nature Switzerland AG.
Copyright:
Copyright 2019 Elsevier B.V., All rights reserved.
Keywords
- Multi-layer encoder-decoder
- Neural attention
- Source context
ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science