Interactive Exploration of Subspace Clusters for High Dimensional Data

Jesper Kristensen, Thai Son Mai, Ira Assent, Jon Jacobsen, Bay Vo, Anh Le

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

PreDeCon is a fundamental clustering algorithm for finding arbitrarily shaped clusters hidden in high-dimensional feature spaces, an important research topic with many potential applications. However, it suffers from very high runtime and a lack of interaction with users. Our algorithm, AnyPDC, addresses both problems by casting PreDeCon as an anytime algorithm: it quickly produces an approximate result and iteratively refines it until it reaches the result of PreDeCon. This scheme not only speeds up the algorithm significantly but also lets users interact with it during execution. Experiments on large real datasets show that AnyPDC obtains good approximate results very early, yielding an order-of-magnitude speedup over PreDeCon. More interestingly, while anytime techniques usually end up slower than batch ones, AnyPDC is faster than PreDeCon even when it is run to the end.
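
The anytime idea described in the abstract (return an approximate clustering early, then refine it until it matches the batch result) can be illustrated with a toy sketch. This is not AnyPDC or PreDeCon: the one-dimensional points, the plain `eps` neighbourhood test, and the names `anytime_clusters`, `batch_clusters`, and `chunk` are all assumptions made for illustration only.

```python
from itertools import combinations


def _find(parent, i):
    # Union-find root lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def _partition(parent):
    # Convert union-find state into a set of clusters (frozensets of indices).
    groups = {}
    for i in range(len(parent)):
        groups.setdefault(_find(parent, i), set()).add(i)
    return {frozenset(g) for g in groups.values()}


def batch_clusters(points, eps):
    """Batch reference: connected components of the eps-neighbourhood graph."""
    parent = list(range(len(points)))
    for i, j in combinations(range(len(points)), 2):
        if abs(points[i] - points[j]) <= eps:
            parent[_find(parent, i)] = _find(parent, j)
    return _partition(parent)


def anytime_clusters(points, eps, chunk=2):
    """Yield a progressively refined clustering after each chunk of points.

    Points not yet processed appear as singleton clusters, so every
    snapshot is a usable approximation; the last one equals the batch result.
    """
    parent = list(range(len(points)))
    for start in range(0, len(points), chunk):
        for i in range(start, min(start + chunk, len(points))):
            for j in range(i):  # compare with every earlier point
                if abs(points[i] - points[j]) <= eps:
                    parent[_find(parent, i)] = _find(parent, j)
        yield _partition(parent)


if __name__ == "__main__":
    pts = [0.0, 0.4, 5.0, 5.3, 0.8, 9.0]
    for step, snapshot in enumerate(anytime_clusters(pts, eps=0.5), start=1):
        # A user could inspect the snapshot here and stop the loop early.
        print(step, sorted(sorted(c) for c in snapshot))
```

Because the sketch is a generator, a caller can stop consuming snapshots at any point and keep the latest approximation, which is the kind of interaction the abstract refers to. Unlike this toy, the abstract reports that AnyPDC is also faster than the batch algorithm when run to the end.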
Original language: English
Title of host publication: International Conference on Database and Expert Systems Applications (DEXA)
Pages: 327-342
Publication status: Published - 2017
Externally published: Yes

Keywords

  • Subspace clustering
  • Interactive clustering
