Abstract
PreDeCon is a fundamental clustering algorithm for finding arbitrarily shaped clusters hidden in high-dimensional feature spaces, an important research topic with many potential applications. However, it suffers from very high runtime and offers no way for users to interact with it. Our algorithm, called AnyPDC, introduces a novel approach to these problems by recasting PreDeCon as an anytime algorithm. It quickly produces an approximate result and iteratively refines it until it reaches the result of PreDeCon. This scheme not only significantly speeds up the algorithm but also lets users interact with it during execution. Experiments conducted on large real-world datasets show that AnyPDC obtains good approximate results very early, yielding an order-of-magnitude speedup over PreDeCon. More interestingly, while anytime techniques usually end up slower than their batch counterparts, AnyPDC is faster than PreDeCon even when it runs to completion.
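The anytime scheme described in the abstract (an early approximate result that is refined step by step until it matches the batch result) can be illustrated with a minimal Python sketch. This is not AnyPDC or PreDeCon: it assumes plain Euclidean distance and simple eps-connectivity, with no subspace preference weighting and no core-point test. The names `anytime_components`, `eps`, and `block` are hypothetical and chosen only for illustration.

```python
from math import dist


def anytime_components(points, eps, block=2):
    """Yield a cluster labelling after each block of range queries.

    Illustrative assumption: two points belong to the same cluster if
    they are linked by a chain of neighbours within distance eps.
    Each yielded labelling is an approximation; the last one equals the
    result of running all range queries in one batch.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for start in range(0, n, block):
        for i in range(start, min(start + block, n)):
            # Range query for point i (brute force for simplicity).
            for j in range(n):
                if dist(points[i], points[j]) <= eps:
                    ri, rj = find(i), find(j)
                    if ri != rj:
                        # Keep the smaller index as the root so that
                        # labels are stable across snapshots.
                        parent[max(ri, rj)] = min(ri, rj)
        # Intermediate result: the caller may inspect it and stop early.
        yield [find(i) for i in range(n)]


points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
for labels in anytime_components(points, eps=1.5):
    print(labels)
```

Because the function is a generator, a caller can stop consuming snapshots as soon as the approximation is good enough, which is the kind of interaction an anytime algorithm allows. The speedups reported in the abstract depend on techniques specific to AnyPDC that this sketch does not attempt to reproduce.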
Original language | English |
---|---|
Title of host publication | International Conference on Database and Expert Systems Applications (DEXA) |
Pages | 327-342 |
Publication status | Published - 2017 |
Externally published | Yes |
Keywords
- Subspace clustering
- Interactive clustering