Measuring similarity for multidimensional sequences

Hui Wang, Zhiwei Lin, Sally McClean, Jun Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Multidimensional sequences are common, and measuring their similarity is key to any analysis of such data. There is a wealth of similarity measures for sequences in the literature, but most of them are designed for a special type of sequence and later extended to more general types. These extensions are usually ad hoc, and the extended versions may lose the original conceptual interpretation of the measure. In this paper we consider the problem of how to measure similarity for the general type of multidimensional sequences effectively in a conceptually uniform way. We show that the subsequence concept behind longest common subsequence and all common subsequences can be extended from the temporal dimension to the spatial dimension, and we generalize the all common subsequences similarity to multidimensional sequences. The hard problem is how to compute the generalized similarity. We present a theorem that combines the temporal and spatial dimensions in a simple formula. This theorem suggests a dynamic programming algorithm to compute the generalized similarity. A preliminary experiment shows that this similarity produces competitive outcomes. However, this approach counts some subsequences multiple times when a sequence has repeated elements. We present a theorem that allows counting of distinct common subsequences.
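The multiple-counting issue mentioned in the abstract can be illustrated on ordinary one-dimensional sequences. The sketch below is not the paper's generalized algorithm or its theorems; it is a textbook dynamic program that counts common subsequences once per pair of matching embeddings, compared against a brute-force count of distinct common subsequences. The function names are hypothetical, and the brute-force check is exponential, so it is meant for short inputs only.

```python
from itertools import combinations


def count_common_embeddings(a, b):
    """Count common subsequences of a and b, once per pair of embeddings.

    The empty subsequence is included. With repeated elements the same
    subsequence is counted several times (one-dimensional illustration only).
    """
    m, n = len(a), len(b)
    # c[i][j] holds the count for prefixes a[:i] and b[:j];
    # an empty prefix shares only the empty subsequence, hence 1.
    c = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Inclusion-exclusion over "last element of a unused" and
            # "last element of b unused".
            c[i][j] = c[i - 1][j] + c[i][j - 1] - c[i - 1][j - 1]
            if a[i - 1] == b[j - 1]:
                # Both last elements used and matched to each other.
                c[i][j] += c[i - 1][j - 1]
    return c[m][n]


def distinct_subsequences(s):
    """Return the set of distinct subsequences of s as tuples (brute force)."""
    return {
        tuple(s[k] for k in idx)
        for r in range(len(s) + 1)
        for idx in combinations(range(len(s)), r)
    }


def count_distinct_common(a, b):
    """Count distinct common subsequences by set intersection (brute force)."""
    return len(distinct_subsequences(a) & distinct_subsequences(b))
```

For sequences without repeated elements, such as "abc" against itself, the two counts agree. For "aa" against "aa" the embedding count is larger than the distinct count, which is the kind of discrepancy the abstract's final theorem addresses.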
Original language: English
Title of host publication: Proceedings - 10th IEEE International Conference on Data Mining Workshops, ICDMW 2010
Pages: 281-287
Number of pages: 7
Publication status: Published - 2010

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM

Bibliographical note

10th IEEE International Conference on Data Mining Workshops, ICDMW 2010; Conference date: 14-12-2010 through 17-12-2010

Keywords

  • All common subsequences
  • Dynamic time warping
  • Multidimensional sequences
  • Similarity
  • The longest common subsequence
