MixKMeans: Clustering Question-Answer Archives

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Community-driven Question Answering (CQA) systems crowdsource experiential information in the form of questions and answers and have accumulated valuable reusable knowledge. Clustering of QA datasets from CQA systems provides a means of organizing the content to ease tasks such as manual curation and tagging. In this paper, we present a clustering method that exploits the two-part question-answer structure in QA datasets to improve clustering quality. Our method, MixKMeans, composes question-space and answer-space similarities so that the space in which the match is higher is allowed to dominate. This construction is motivated by our observation that semantic similarity between question-answer pairs (QAs) can be localized in either space. We empirically evaluate our method on a variety of real-world labeled datasets. Our results indicate that our method significantly outperforms state-of-the-art clustering methods for the task of clustering question-answer archives.
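
The idea of letting the better-matching space dominate can be illustrated with a small sketch. The abstract does not give the actual composition formula, so the code below substitutes a generalized (power) mean with exponent p > 1, which leans toward the larger of its two inputs. The function names, the parameter p, and the use of cosine similarity over non-negative term vectors are all assumptions made for illustration, not the paper's definitions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length, non-negative vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def mixed_similarity(q_sim, a_sim, p=4.0):
    """Power mean of two similarities in [0, 1] (illustrative assumption).

    p = 1 gives the plain average; larger p pulls the result toward
    the higher of the two similarities.
    """
    return ((q_sim ** p + a_sim ** p) / 2.0) ** (1.0 / p)

def qa_similarity(qa1, qa2, p=4.0):
    """Similarity of two QA pairs, each given as (question_vec, answer_vec)."""
    q_sim = cosine(qa1[0], qa2[0])
    a_sim = cosine(qa1[1], qa2[1])
    return mixed_similarity(q_sim, a_sim, p)

# Two QAs whose questions match exactly but whose answers share no terms.
qa_x = ([1.0, 0.0], [1.0, 0.0])
qa_y = ([1.0, 0.0], [0.0, 1.0])
score = qa_similarity(qa_x, qa_y)
```

With p = 1 the two QAs above would score 0.5, the plain average. With a larger p the score moves toward the question-space similarity of 1.0, which is the kind of behavior the abstract describes.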
Original language: English
Title of host publication: Proceedings of the Conference on Empirical Methods in Natural Language Processing 2016
Publisher: Association for Computing Machinery (ACM)
Publication status: Published - 06 Nov 2016
Event: Conference on Empirical Methods in Natural Language Processing - Austin, Texas, United States
Duration: 02 Nov 2016 - 06 Nov 2016
http://www.emnlp2016.net/

Conference

Conference: Conference on Empirical Methods in Natural Language Processing
Abbreviated title: EMNLP 2016
Country: United States
City: Austin
Period: 02/11/2016 - 06/11/2016

Cite this

Padmanabhan, D. (2016). MixKMeans: Clustering Question-Answer Archives. In Proceedings of the Conference on Empirical Methods in Natural Language Processing 2016. Association for Computing Machinery (ACM).