Retrieving Similar Discussion Forum Threads: A Structure based Approach

Amit Singh, Deepak Padmanabhan, Dinesh Raghu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

21 Citations (Scopus)


Online forums are becoming a popular way of finding useful
information on the web. Search over forums for existing discussion
threads has so far been limited to keyword-based search because
of the minimal effort it requires on the part of users. However,
it is often not possible to capture all the relevant context in a
complex query using a small number of keywords. Example-based
search, which retrieves similar discussion threads given
one exemplary thread, is an alternative approach that can help
the user provide richer context and vastly improve forum
search results. In this paper, we address the problem of
finding threads similar to a given thread. To this end, we
propose a novel methodology to estimate similarity between
discussion threads. Our method exploits the thread structure
to decompose threads into a set of weighted overlapping
components. It then estimates pairwise thread similarities
by quantifying how well the information in the threads is
mutually contained within each other using lexical similarities
between their underlying components. We compare our
proposed methods on real datasets against state-of-the-art
thread retrieval mechanisms and show that our
techniques outperform others by large margins on popular
retrieval evaluation measures such as NDCG, MAP, Precision@k
and MRR. In particular, consistent improvements of
up to 10% are observed on all evaluation measures.
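
The abstract does not spell out the decomposition or the scoring function, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a thread is a list of post strings with the opening post first, uses a hypothetical decomposition (the opening post alone, each opening-post-plus-reply pair, and the whole thread) with made-up weights, and substitutes plain bag-of-words cosine for the lexical similarity. All function names are illustrative.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would also stem and drop stop words.
    return text.lower().split()


def decompose(thread):
    """Split a thread into overlapping (text, weight) components.

    Hypothetical scheme and weights, chosen only for illustration:
    the opening post, each (opening post + reply) pair, and the full thread.
    """
    first, replies = thread[0], thread[1:]
    components = [(first, 1.0)]
    for reply in replies:
        components.append((first + " " + reply, 0.5))
    components.append((" ".join(thread), 0.5))
    return components


def cosine(a, b):
    # Bag-of-words cosine similarity, standing in for "lexical similarity".
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def containment(src, dst):
    # Weighted average, over components of src, of the best match found in dst.
    src_parts, dst_parts = decompose(src), decompose(dst)
    total_weight = sum(w for _, w in src_parts)
    score = sum(w * max(cosine(c, d) for d, _ in dst_parts) for c, w in src_parts)
    return score / total_weight


def thread_similarity(a, b):
    # Mutual containment: average the two directions so the score is symmetric.
    return 0.5 * (containment(a, b) + containment(b, a))


query = ["laptop will not boot after update", "try safe mode", "reinstall the driver"]
related = ["laptop fails to boot after driver update", "safe mode worked for me"]
unrelated = ["best pasta recipe", "use fresh basil"]

print(thread_similarity(query, related) > thread_similarity(query, unrelated))
```

Ranking candidate threads by `thread_similarity` against an exemplar thread gives an example-based retrieval loop; the component scheme, weights, and similarity function are the parts one would replace with the paper's actual definitions.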
Original language: English
Title of host publication: The 35th International ACM SIGIR conference on research and development in Information Retrieval, SIGIR '12, Portland, OR, USA, August 12-16, 2012
Number of pages: 10
Publication status: Published - 2012
Event: SIGIR 2012 - Portland, Oregon, United States
Duration: 12 Aug 2012 to 16 Aug 2012


Conference: SIGIR 2012
Country/Territory: United States

