Privacy-preserving similarity-based text retrieval

Hweehwa Pang*, Jialie Shen, Ramayya Krishnan

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

45 Citations (Scopus)


Users of online services are increasingly wary that their online activities could disclose confidential information about their business or personal affairs. It would be desirable for an online document service to perform text retrieval for users while protecting the privacy of their activities. In this article, we introduce a privacy-preserving, similarity-based text retrieval scheme that (a) prevents the server from accurately reconstructing the term composition of queries and documents, and (b) anonymizes the search results from unauthorized observers. At the same time, our scheme preserves the relevance ranking of the search server and enables accounting of the number of documents that each user opens. The effectiveness of the scheme is verified empirically with two real text corpora.

Original language: English
Article number: 4
Journal: ACM Transactions on Internet Technology
Issue number: 1
Publication status: Published - 01 Feb 2010
Externally published: Yes


Keywords

  • Privacy of search queries
  • Security in text retrieval
  • Singular value decomposition

ASJC Scopus subject areas

  • Computer Networks and Communications

