Anytime OPTICS: An Efficient Approach for Hierarchical Density-Based Clustering

Thai Son Mai, Ira Assent, Anh Le

Research output: Chapter in Book/Report/Conference proceedingConference contribution

7 Citations (Scopus)

Abstract

OPTICS is a fundamental data clustering technique that has been widely applied in many fields. However, it suffers from performance degradation when faced with large datasets and expensive distance measures because of its quadratic complexity in terms of both time and distance function calls. In this paper, we introduce a novel anytime approach to tackle the above problems. The general idea is to use a sequence of lower-bounding (LB) distances of the true distance measure to produce multiple approximations of the true reachability plot of OPTICS. The algorithm quickly produces an approximation result using the first LB distance. It then continuously refines the results with subsequent LB distances and the results from the previous computations. At any time, users can suspend and resume the algorithm to examine the results, enabling them to stop the algorithm whenever they are satisfied with the obtained results, thereby saving computational cost. Our proposed algorithms, called Any-OPTICS and Any-OPTICS-XS, are built upon this anytime scheme and can be applied for many complex datasets. Our experiments show that Any-OPTICS obtains very good clustering results at early stages of execution, leading to orders of magnitudes speed up. Even when run to the final distance measure, the cumulative runtime of Any-OPTICS is faster than OPTICS and its extensions.
Original languageEnglish
Title of host publicationInternational Conference on Database Systems for Advanced Applications (DASFAA)
PublisherSpringer
Pages164-179
Volume9642
DOIs
Publication statusPublished - 25 Mar 2016
Externally publishedYes

Publication series

NameLecture Notes in Computer Science
PublisherSpringer
ISSN (Electronic)0302-9743

Fingerprint Dive into the research topics of 'Anytime OPTICS: An Efficient Approach for Hierarchical Density-Based Clustering'. Together they form a unique fingerprint.

Cite this