Incremental Subsequence Clustering Algorithm from Multiple Data Streams

Zaher Al Aghbari, Ayoub Al Hamadi, Thar Baker

2021

Abstract

Clustering subsequences of continuous data streams has a wide range of applications, such as stock market data, social data, and wireless sensor data. Due to the continuous nature of data streams, finding evolving clusters is a challenging task. This paper proposes ISsC, an incremental algorithm for clustering subsequences in multiple data streams. The ISsC algorithm employs a window buffer to collect and process the continuous data. Clusters found in previous windows are kept in a global List. The List of clusters is then updated incrementally with the clusters found in the current window, without the need to recompute the clusters from the entire historical streams. If the number of cluster members (i.e., subsequences) is above a certain threshold, the cluster is deemed a frequent subsequence. Old clusters are tracked through a decay parameter and removed from the global List once this parameter decays to a negative value. Extensive experiments are conducted on multiple data streams to show the feasibility of the ISsC algorithm.
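The window-by-window update described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the distance measure, the rule for assigning a subsequence to a cluster, or how the decay parameter is updated, so the Euclidean distance, the `radius` threshold, the running-mean centroid, the per-window decrement of `decay`, and all function and parameter names below are assumptions made for illustration only.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Cluster:
    centroid: list  # running mean of the member subsequences (assumed representation)
    count: int      # number of member subsequences absorbed so far
    decay: int      # remaining lifetime, measured in windows (assumed decay rule)


def subsequences(window, length):
    """Return every contiguous subsequence of `length` values from one window."""
    return [window[i:i + length] for i in range(len(window) - length + 1)]


def update_clusters(global_list, window_subseqs, radius, initial_decay):
    """Merge the subsequences of the current window into the global list, in place."""
    # Assumption: every existing cluster ages by one step per window.
    for cluster in global_list:
        cluster.decay -= 1

    for s in window_subseqs:
        nearest = min(global_list, key=lambda c: dist(c.centroid, s), default=None)
        if nearest is not None and dist(nearest.centroid, s) <= radius:
            # Absorb s: incremental mean update, no pass over historical data.
            nearest.count += 1
            nearest.centroid = [
                m + (x - m) / nearest.count for m, x in zip(nearest.centroid, s)
            ]
            nearest.decay = initial_decay  # assumption: a match refreshes the cluster
        else:
            global_list.append(Cluster(list(s), 1, initial_decay))

    # Remove clusters whose decay parameter has become negative.
    global_list[:] = [c for c in global_list if c.decay >= 0]
    return global_list


def frequent_subsequences(global_list, min_members):
    """Clusters whose member count is above the threshold."""
    return [c for c in global_list if c.count > min_members]
```

With multiple streams, a caller would pool the output of `subsequences` for the current window of each stream and pass the pooled list to a single `update_clusters` call, so that all streams share one global list.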



Paper Citation


in Harvard Style

Al Aghbari Z., Al Hamadi A. and Baker T. (2021). Incremental Subsequence Clustering Algorithm from Multiple Data Streams. In Proceedings of the 2nd International Conference on Big Data, Modelling and Machine Learning - Volume 1: BML, ISBN 978-989-758-559-3, pages 92-96. DOI: 10.5220/0010729000003101


in Bibtex Style

@conference{bml21,
author={Zaher Al Aghbari and Ayoub Al Hamadi and Thar Baker},
title={Incremental Subsequence Clustering Algorithm from Multiple Data Streams},
booktitle={Proceedings of the 2nd International Conference on Big Data, Modelling and Machine Learning - Volume 1: BML},
year={2021},
pages={92-96},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010729000003101},
isbn={978-989-758-559-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Conference on Big Data, Modelling and Machine Learning - Volume 1: BML
TI - Incremental Subsequence Clustering Algorithm from Multiple Data Streams
SN - 978-989-758-559-3
AU - Al Aghbari Z.
AU - Al Hamadi A.
AU - Baker T.
PY - 2021
SP - 92
EP - 96
DO - 10.5220/0010729000003101