Modelspace - Cooperative Document Information Extraction in Flexible Hierarchies

Daniel Schuster, Daniel Esser, Klemens Muthmann, Alexander Schill

2015

Abstract

Business document indexing for ordered filing of documents is a crucial task for every company. Since this is tedious, error-prone work, automatic or at least semi-automatic approaches are highly valuable. One approach for semi-automated indexing of business documents uses self-learning information extraction methods based on user feedback. While these methods require no management of complex indexing rules, learning from user feedback requires each user to first provide a number of correct extractions before getting appropriate automatic results. To eliminate this cold-start problem, we propose a cooperative approach to document information extraction involving dynamic hierarchies of extraction services. We provide strategies for deciding when to contact another information extraction service within the hierarchy, methods to combine results from different sources, and aging and split strategies to reduce the size of cooperatively used indexes. An evaluation with a large number of real-world business documents shows the benefits of our approach.

Paper Citation


in Harvard Style

Schuster D., Esser D., Muthmann K. and Schill A. (2015). Modelspace - Cooperative Document Information Extraction in Flexible Hierarchies. In Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-096-3, pages 321-329. DOI: 10.5220/0005376403210329

in Bibtex Style

@conference{iceis15,
author={Daniel Schuster and Daniel Esser and Klemens Muthmann and Alexander Schill},
title={Modelspace - Cooperative Document Information Extraction in Flexible Hierarchies},
booktitle={Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2015},
pages={321-329},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005376403210329},
isbn={978-989-758-096-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Modelspace - Cooperative Document Information Extraction in Flexible Hierarchies
SN - 978-989-758-096-3
AU - Schuster D.
AU - Esser D.
AU - Muthmann K.
AU - Schill A.
PY - 2015
SP - 321
EP - 329
DO - 10.5220/0005376403210329