The problem of detecting spatially-coherent groups of data that exhibit anomalous behavior has started to attract attention due to applications across areas such as epidemic analysis and weather forecasting. Earlier efforts from the data mining community have largely focused on ﬁnding outliers, individual data objects that display deviant behavior. Such point-based methods are not easy to extend to ﬁnd groups of data that exhibit anomalous behavior. Scan Statistics are methods from the statistics community that have considered the problem of identifying regions where data objects exhibit a behavior that is atypical of the general dataset. The spatial scan statistic and methods that build upon it mostly adopt the framework of deﬁning a character for regions (e.g., circular or elliptical) of objects and repeatedly sampling regions of such character followed by applying a statistical test for anomaly detection. In the past decade, there have been efforts from the statistics community to enhance efﬁciency of scan statstics as well as to enable discovery of arbitrarily shaped anomalous regions. On the other hand, the data mining community has started to look at determining anomalous regions that have behavior divergent from their neighborhood.In this chapter,we survey the space of techniques for detecting anomalous regions on spatial data from across the data mining and statistics communities while outlining connections to well-studied problems in clustering and image segmentation. We analyze the techniques systematically by categorizing them appropriately to provide a structured birds eye view of the work on anomalous region detection;we hope that this would encourage better cross-pollination of ideas across communities to help advance the frontier in anomaly detection.
|Title of host publication||Unsupervised Learning Algorithms|
|Editors||M. Emre Celebri, Kemal Aydin|
|Publisher||Springer International Publishing Switzerland|
|Publication status||Accepted - 03 Feb 2016|