A FRAMEWORK FOR DATA CLEANING IN DATA WAREHOUSES

Taoxin Peng

2008

Abstract

Achieving high data quality in data warehouses is a persistent challenge, and data cleaning is crucial to meeting it. A range of methods and tools has been developed for this purpose. However, at least two questions remain to be answered: how can the efficiency of data cleaning be improved, and how can its degree of automation be increased? This paper addresses these two questions by presenting a novel framework for managing data cleaning in data warehouses, which focuses on the use of data quality dimensions and decouples the cleaning process into several sub-processes. An initial test run of the processes in the framework demonstrates that the approach is efficient and scalable for data cleaning in data warehouses.


Paper Citation


in Harvard Style

Peng T. (2008). A FRAMEWORK FOR DATA CLEANING IN DATA WAREHOUSES. In Proceedings of the Tenth International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-8111-36-4, pages 473-478. DOI: 10.5220/0001706004730478

in Bibtex Style

@conference{iceis08,
author={Taoxin Peng},
title={A FRAMEWORK FOR DATA CLEANING IN DATA WAREHOUSES},
booktitle={Proceedings of the Tenth International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2008},
pages={473-478},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001706004730478},
isbn={978-989-8111-36-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Tenth International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - A FRAMEWORK FOR DATA CLEANING IN DATA WAREHOUSES
SN - 978-989-8111-36-4
AU - Peng T.
PY - 2008
SP - 473
EP - 478
DO - 10.5220/0001706004730478