HIERARCHICAL MODEL-BASED CLUSTERING FOR RELATIONAL DATA

Jianzhong Chen, Mary Shapcott, Sally McClean, Kenny Adamson

2004

Abstract

Relational data mining deals with datasets containing multiple types of objects and relationships that are presented in relational formats, e.g. relational databases that have multiple tables. This paper proposes a propositional hierarchical model-based method for clustering relational data. We first define an object-relational star schema to model composite objects, and present a method of flattening composite objects into aggregate objects by introducing a new type of aggregate, the frequency aggregate, which records not only the observed values of an attribute but also their distribution. A hierarchical agglomerative clustering algorithm with a log-likelihood distance is then applied to produce a tentative clustering of the aggregated data. After stopping at a coarse estimate of the number of clusters, a mixture model-based method using the EM algorithm performs a further relocation clustering, in which the Bayesian Information Criterion is used to determine the optimal number of clusters. Finally, we evaluate our approach on a real-world dataset.
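The flattening step can be illustrated with a minimal sketch. The abstract does not give the formal definition of a frequency aggregate, so the code below is only one plausible reading: for each parent object, the values of a child attribute are collapsed into a relative-frequency distribution. The function name `frequency_aggregate`, the `purchases` table, and its column names are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


def frequency_aggregate(rows, key, attribute):
    """Flatten child rows into one value distribution per parent object.

    Assumption: a frequency aggregate is read here as the relative
    frequency of each observed attribute value within a parent object.
    """
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[key]][row[attribute]] += 1

    # Normalise the counts so each parent gets a distribution summing to 1.
    aggregates = {}
    for parent, counter in counts.items():
        total = sum(counter.values())
        aggregates[parent] = {value: n / total for value, n in counter.items()}
    return aggregates


# Hypothetical child table: several purchase rows per customer.
purchases = [
    {"customer_id": 1, "category": "book"},
    {"customer_id": 1, "category": "book"},
    {"customer_id": 1, "category": "music"},
    {"customer_id": 2, "category": "music"},
]

aggregates = frequency_aggregate(purchases, "customer_id", "category")
```

Each parent object then becomes a single row whose aggregated attribute carries both the set of observed values and how often each occurred, which is the kind of flat input a propositional clustering algorithm expects.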



Paper Citation


in Harvard Style

Chen J., Shapcott M., McClean S. and Adamson K. (2004). HIERARCHICAL MODEL-BASED CLUSTERING FOR RELATIONAL DATA. In Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 972-8865-00-7, pages 92-97. DOI: 10.5220/0002624300920097

in Bibtex Style

@conference{iceis04,
author={Jianzhong Chen and Mary Shapcott and Sally McClean and Kenny Adamson},
title={HIERARCHICAL MODEL-BASED CLUSTERING FOR RELATIONAL DATA},
booktitle={Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2004},
pages={92-97},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002624300920097},
isbn={972-8865-00-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - HIERARCHICAL MODEL-BASED CLUSTERING FOR RELATIONAL DATA
SN - 972-8865-00-7
AU - Chen J.
AU - Shapcott M.
AU - McClean S.
AU - Adamson K.
PY - 2004
SP - 92
EP - 97
DO - 10.5220/0002624300920097