Retrieving Similar Discussion Forum Threads: A Structure based Approach

Amit Singh, Deepak Padmanabhan, Dinesh Raghu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

15 Citations (Scopus)

Abstract

Online forums are becoming a popular way of finding useful
information on the web. Search over forums for existing discussion
threads has so far been limited to keyword-based search, owing
to the minimal effort it requires on the part of users. However,
it is often not possible to capture all the relevant context in a
complex query using a small number of keywords. Example-based
search, which retrieves similar discussion threads given
one exemplary thread, is an alternative approach that can help
the user provide richer context and vastly improve forum
search results. In this paper, we address the problem of
finding threads similar to a given thread. Towards this, we
propose a novel methodology to estimate similarity between
discussion threads. Our method exploits the thread structure
to decompose threads into a set of weighted overlapping
components. It then estimates pairwise thread similarities
by quantifying how well the information in the threads is
mutually contained within each other, using lexical similarities
between their underlying components. We compare our
proposed methods on real datasets against state-of-the-art
thread retrieval mechanisms and illustrate that our
techniques outperform the others by large margins on popular
retrieval evaluation measures such as NDCG, MAP, Precision@k
and MRR. In particular, consistent improvements of
up to 10% are observed on all evaluation measures.
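
The idea in the abstract (decompose each thread into weighted overlapping components, then score how well two threads contain each other) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the choice of components (the initial post alone, plus the initial post paired with each reply), the weights, the tokenizer, the use of cosine similarity over term counts as the lexical measure, and all function names.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase alphanumeric tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def components(posts, first_weight=2.0):
    """Split a non-empty thread (list of post strings, initial post first)
    into weighted, overlapping components. Hypothetical scheme: the initial
    post alone, plus the initial post merged with each reply."""
    first = Counter(tokens(posts[0]))
    comps = [(first_weight, first)]
    for reply in posts[1:]:
        comps.append((1.0, first + Counter(tokens(reply))))
    return comps


def containment(src, dst):
    """Weighted mean of each src component's best match among dst components."""
    total = sum(w for w, _ in src)
    covered = sum(w * max(cosine(c, d) for _, d in dst) for w, c in src)
    return covered / total


def thread_similarity(thread_a, thread_b):
    """Symmetric score: average of the two containment directions."""
    a, b = components(thread_a), components(thread_b)
    return 0.5 * (containment(a, b) + containment(b, a))


query = ["laptop battery drains fast", "try lowering screen brightness"]
related = ["battery drains quickly on my laptop", "replace the battery"]
unrelated = ["best pasta recipe", "use fresh tomatoes"]

# Rank candidate threads by their similarity to the query thread.
ranked = sorted([related, unrelated],
                key=lambda t: thread_similarity(query, t),
                reverse=True)
```

In this toy data the related thread shares several terms with the query thread while the unrelated one shares none, so the related thread ranks first. A real system would likely need better term weighting and a principled way to choose component weights, which this sketch does not attempt.
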

Language: English
Title of host publication: The 35th International ACM SIGIR conference on research and development in Information Retrieval, SIGIR '12, Portland, OR, USA, August 12-16, 2012
Pages: 135-144
Number of pages: 10
Publication status: Published - 2012
Event: SIGIR 2012 - Portland, Oregon, United States
Duration: 12 Aug 2012 to 16 Aug 2012

Conference

Conference: SIGIR 2012
Country: United States
City: Portland
Period: 12/08/2012 to 16/08/2012

Cite this

Singh, A., Padmanabhan, D., & Raghu, D. (2012). Retrieving Similar Discussion Forum Threads: A Structure based Approach. In The 35th International ACM SIGIR conference on research and development in Information Retrieval, SIGIR '12, Portland, OR, USA, August 12-16, 2012 (pp. 135-144).


