## Abstract

Association rule mining is an indispensable tool for discovering insights from large databases and data warehouses. Because the data in a warehouse is multi-dimensional, it is often useful to mine rules over subsets of the data defined by selections over the dimensions. Such interactive rule mining over multi-dimensional query windows is difficult because rule mining is computationally expensive. Current methods that pre-compute frequent itemsets must still count some itemsets by revisiting the transaction database at query time, which is very expensive. We develop a method (RMW) that identifies the minimal set of itemsets to compute and store for each cell, so that rule mining over any query window can be performed without going back to the transaction database. We give formal proofs that the set of itemsets chosen by RMW is sufficient to answer any query, and we also prove that it is the optimal set to compute for one-dimensional queries. We demonstrate through an extensive empirical evaluation that RMW achieves extremely fast query response times compared to existing methods, with only moderate overhead in pre-computation and storage.
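
The problem with the existing pre-computation approach can be illustrated with a small sketch. It is not the RMW algorithm, whose details the abstract does not give. It only shows the baseline: frequent itemsets are stored per cell, and a query window spanning several cells needs counts that some cells never stored. The function names, the `max_size` cap, and the relative-support threshold are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: count} for itemsets whose relative support in
    this cell is at least min_support (brute-force counting, small data only)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(transactions)
    return {itemset: n for itemset, n in counts.items() if n >= threshold}


def missing_counts(cells, window, min_support, max_size=2):
    """List the (itemset, cell) pairs whose count was NOT stored.

    Any itemset frequent over the whole window must be frequent in at
    least one of its cells, so the union of per-cell frequent itemsets
    is a complete candidate set. Its exact window support, however,
    needs a count from every cell, including cells where it was
    infrequent and therefore never stored.
    """
    stored = {c: frequent_itemsets(cells[c], min_support, max_size) for c in window}
    candidates = set().union(*(stored[c].keys() for c in window))
    return sorted(
        (tuple(sorted(itemset)), c)
        for itemset in candidates
        for c in window
        if itemset not in stored[c]
    )


# Hypothetical one-dimensional example: each cell holds one month of transactions.
cells = {
    "jan": [["a", "b"], ["a", "b"], ["a"], ["c"]],
    "feb": [["a"], ["c"], ["c"], ["b", "c"]],
}
for itemset, cell in missing_counts(cells, ["jan", "feb"], min_support=0.5):
    print(f"count of {itemset} in cell {cell!r} must be recomputed from the database")
```

Every pair reported here corresponds to a pass over the raw transactions at query time in the baseline scheme. According to the abstract, RMW chooses what to store per cell so that this list is always empty.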

Original language | English |
---|---|
Title of host publication | Proceedings of the Eleventh SIAM International Conference on Data Mining, SDM 2011 |
Pages | 582-593 |
Number of pages | 12 |
Publication status | Published - 2011 |
Event | SDM 2011 - Arizona, Phoenix, United States; Duration: 28 Apr 2011 → 30 Apr 2011 |

### Conference

Conference | SDM 2011 |
---|---|
Country | United States |
City | Phoenix |
Period | 28/04/2011 → 30/04/2011 |