A Statistical Decision Tree Algorithm for Data Stream Classification

Mirela Teixeira Cazzolato, Marcela Xavier Ribeiro, Cristiane Yaguinuma, Marilde Terezinha Prado Santos

2013

Abstract

A large amount of data is generated daily. Credit card transactions, monitoring networks, sensors and telecommunications are some of the many applications that generate large volumes of data in an automated way. Storage and knowledge extraction techniques for data streams differ from those used on traditional data. In the context of data stream classification, many incremental techniques have been proposed. In this paper we present an incremental decision tree algorithm called StARMiner Tree (ST), which is based on the Very Fast Decision Tree (VFDT) system. ST deals with numerical data and uses a statistics-based heuristic to decide when to split a node and to choose the best attribute for the test at that node. We applied ST to four datasets, two synthetic and two real-world, and compared its performance to that of VFDT. In all experiments ST achieved better accuracy, dealing well with noisy data and describing the data well from the earliest examples. However, in three of the four experiments ST created a larger tree. The results indicate that ST is a good classifier on both large and smaller datasets, maintaining good accuracy and execution time.
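
For context, VFDT, the baseline that ST builds on, is usually described as deciding when to split a leaf by means of the Hoeffding bound. The sketch below illustrates that baseline criterion only. It is not the statistical heuristic used by ST, which the abstract does not specify. The function names, the `tie_threshold` parameter and the example values are assumptions chosen for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of the mean of n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, value_range: float,
                 delta: float, n: int, tie_threshold: float = 0.05) -> bool:
    """Hoeffding-style split test (illustrative, not the ST heuristic).

    Split when the best attribute beats the runner-up by more than epsilon,
    or when epsilon has shrunk below tie_threshold (the two are effectively tied).
    """
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


if __name__ == "__main__":
    # Hypothetical gains observed at a leaf after 200 examples.
    print(hoeffding_bound(1.0, 1e-7, 200))
    print(should_split(0.30, 0.05, value_range=1.0, delta=1e-7, n=200))
    print(should_split(0.30, 0.25, value_range=1.0, delta=1e-7, n=200))
```

Because epsilon shrinks as more examples arrive, a leaf that cannot be split confidently after a few hundred examples may still be split later in the stream.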

Paper Citation


in Harvard Style

Teixeira Cazzolato M., Xavier Ribeiro M., Yaguinuma C. and Terezinha Prado Santos M. (2013). A Statistical Decision Tree Algorithm for Data Stream Classification. In Proceedings of the 15th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-8565-59-4, pages 217-223. DOI: 10.5220/0004447202170223

in Bibtex Style

@conference{iceis13,
author={Mirela Teixeira Cazzolato and Marcela Xavier Ribeiro and Cristiane Yaguinuma and Marilde Terezinha Prado Santos},
title={A Statistical Decision Tree Algorithm for Data Stream Classification},
booktitle={Proceedings of the 15th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2013},
pages={217-223},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004447202170223},
isbn={978-989-8565-59-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 15th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - A Statistical Decision Tree Algorithm for Data Stream Classification
SN - 978-989-8565-59-4
AU - Teixeira Cazzolato M.
AU - Xavier Ribeiro M.
AU - Yaguinuma C.
AU - Terezinha Prado Santos M.
PY - 2013
SP - 217
EP - 223
DO - 10.5220/0004447202170223