Modelspace - Cooperative Document Information Extraction in Flexible Hierarchies
Daniel Schuster, Daniel Esser, Klemens Muthmann, Alexander Schill
2015
Abstract
Business document indexing for ordered filing of documents is a crucial task for every company. Since this is tedious, error-prone work, automatic or at least semi-automatic approaches are highly valuable. One approach to semi-automated indexing of business documents uses self-learning information extraction methods based on user feedback. While these methods require no management of complex indexing rules, learning from user feedback requires each user to first provide a number of correct extractions before getting appropriate automatic results. To eliminate this cold-start problem, we propose a cooperative approach to document information extraction involving dynamic hierarchies of extraction services. We provide strategies for deciding when to contact another information extraction service within the hierarchy, methods for combining results from different sources, and aging and split strategies for reducing the size of cooperatively used indexes. An evaluation with a large number of real-world business documents shows the benefits of our approach.
Paper Citation
in Harvard Style
Schuster D., Esser D., Muthmann K. and Schill A. (2015). Modelspace - Cooperative Document Information Extraction in Flexible Hierarchies. In Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-096-3, pages 321-329. DOI: 10.5220/0005376403210329
in Bibtex Style
@conference{iceis15,
author={Daniel Schuster and Daniel Esser and Klemens Muthmann and Alexander Schill},
title={Modelspace - Cooperative Document Information Extraction in Flexible Hierarchies},
booktitle={Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2015},
pages={321-329},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005376403210329},
isbn={978-989-758-096-3},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Modelspace - Cooperative Document Information Extraction in Flexible Hierarchies
SN - 978-989-758-096-3
AU - Schuster D.
AU - Esser D.
AU - Muthmann K.
AU - Schill A.
PY - 2015
SP - 321
EP - 329
DO - 10.5220/0005376403210329