A Study on the Relationship between Internal and External Validity Indices Applied to Partitioning and Density-based Clustering Algorithms

Caroline Tomasini, Eduardo N. Borges, Karina Machado, Leonardo Emmendorfer

2017

Abstract

Measuring the quality of data partitions is essential to the success of clustering applications. Many validity indices have been proposed in the literature, but choosing the appropriate index for evaluating the results of a particular clustering algorithm remains a challenge. Clustering results can be evaluated using indices based on external or internal criteria. An external criterion requires a previously defined partitioning of the data for comparison with the clustering results, while an internal criterion evaluates clustering results considering only the properties of the data. This paper proposes a method that helps the user select the most suitable internal cluster validity index for the results of partitioning and density-based clustering algorithms. We examine the relationships between internal and external indices, relating them through linear regression and regression model trees. Each algorithm was run on synthetic datasets generated for this purpose, using different configurations. Experimental results indicate that Silhouette and Gamma are the most suitable indices for evaluating both datasets with the compactness property and datasets with multiple densities.
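
The idea of relating an internal index to an external one through linear regression can be illustrated with a minimal sketch. It is not the authors' experimental pipeline. It assumes a toy dataset of two compact Gaussian blobs, uses the Rand index as a stand-in external index, and replaces real clustering runs with reference labels that have progressively more labels flipped. The helper names (`silhouette`, `rand_index`, `fit_line`) are hypothetical and written only for this illustration.

```python
import math
import random


def silhouette(points, labels):
    """Mean silhouette width over all points, using Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def rand_index(truth, labels):
    """Fraction of point pairs on which two partitions agree (external index)."""
    n = len(truth)
    agree = 0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            pairs += 1
            if (truth[i] == truth[j]) == (labels[i] == labels[j]):
                agree += 1
    return agree / pairs


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Synthetic data: two compact, well-separated 2-D blobs (an assumption of this sketch).
rng = random.Random(0)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)]
points += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(20)]
truth = [0] * 20 + [1] * 20

# Candidate partitions: the reference labels with progressively more labels flipped.
order = list(range(len(points)))
rng.shuffle(order)
internal, external = [], []
for flips in (0, 4, 8, 12, 16):
    labels = list(truth)
    for idx in order[:flips]:
        labels[idx] = 1 - labels[idx]
    internal.append(silhouette(points, labels))
    external.append(rand_index(truth, labels))

# Regress the external index on the internal one.
slope, intercept = fit_line(internal, external)
print(f"external ~ {slope:.3f} * internal + {intercept:.3f}")
```

A clearly positive slope on such data suggests that the internal index tracks the external one, which is the kind of evidence one might use to prefer that internal index when no reference partition is available.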


Paper Citation


in Harvard Style

Tomasini C., Borges E. N., Machado K. and Emmendorfer L. (2017). A Study on the Relationship between Internal and External Validity Indices Applied to Partitioning and Density-based Clustering Algorithms. In Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-247-9, pages 89-98. DOI: 10.5220/0006317000890098

in Bibtex Style

@conference{iceis17,
author={Caroline Tomasini and Eduardo N. Borges and Karina Machado and Leonardo Emmendorfer},
title={A Study on the Relationship between Internal and External Validity Indices Applied to Partitioning and Density-based Clustering Algorithms},
booktitle={Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2017},
pages={89-98},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006317000890098},
isbn={978-989-758-247-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - A Study on the Relationship between Internal and External Validity Indices Applied to Partitioning and Density-based Clustering Algorithms
SN - 978-989-758-247-9
AU - Tomasini C.
AU - Borges E. N.
AU - Machado K.
AU - Emmendorfer L.
PY - 2017
SP - 89
EP - 98
DO - 10.5220/0006317000890098