Assessing the Impact of Stemming Algorithms Applied to Judicial Jurisprudence - An Experimental Analysis

Robert A. N. de Oliveira, Methanias Colaço Júnior

2017

Abstract

Stemming algorithms are commonly used during the textual preprocessing phase to reduce data dimensionality. However, the efficacy of this reduction varies with the domain to which it is applied. Hence, this work is an experimental analysis of the dimensionality reduction obtained by stemming a real-world base of judicial jurisprudence formed by four subsets of documents. With such a document base, it is necessary to adopt techniques that increase the efficiency of storing and searching for such information; otherwise, there is a loss of both computing resources and access to justice, as stakeholders may not find the document they need to plead their rights. The results show that, depending on the algorithm and the collection, there may be a reduction of up to 52% of the terms in the documents. Furthermore, we found a strong correlation between the reduction percentage and the number of unique terms in the original document. Thus, among the stemming algorithms analyzed, RSLP was the most effective in terms of dimensionality reduction in the four collections studied, and it excelled when applied to judgments of the Appeals Court.
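
As an illustration of the measurement the abstract describes, the sketch below computes the percentage reduction in unique terms after stemming a text. It is a minimal sketch under stated assumptions, not the authors' experimental setup: `toy_stem` and its suffix list are hypothetical stand-ins for a real stemmer such as RSLP, and the tokenization rule and function names are likewise assumptions.

```python
import re

# Hypothetical toy suffix list for illustration only; this is NOT the RSLP rule set.
TOY_SUFFIXES = ("mente", "es", "s")


def toy_stem(token):
    """Strip the first matching suffix, keeping a stem of at least 3 characters."""
    for suffix in TOY_SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def unique_terms(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[^\W\d_]+", text.lower()))


def reduction_percentage(text, stem=toy_stem):
    """Percentage by which stemming shrinks the set of unique terms."""
    original = unique_terms(text)
    if not original:
        return 0.0
    stemmed = {stem(term) for term in original}
    return 100.0 * (len(original) - len(stemmed)) / len(original)


if __name__ == "__main__":
    sample = "recurso recursos processo processos"
    print(reduction_percentage(sample))
```

Passing a different callable as `stem` allows the same measurement to be repeated for each stemming algorithm and each collection under comparison.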

Paper Citation


in Harvard Style

N. de Oliveira R. and Colaço Júnior M. (2017). Assessing the Impact of Stemming Algorithms Applied to Judicial Jurisprudence - An Experimental Analysis. In Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-247-9, pages 99-105. DOI: 10.5220/0006317100990105

in Bibtex Style

@conference{iceis17,
author={Robert A. N. de Oliveira and Methanias Colaço Júnior},
title={Assessing the Impact of Stemming Algorithms Applied to Judicial Jurisprudence - An Experimental Analysis},
booktitle={Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2017},
pages={99-105},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006317100990105},
isbn={978-989-758-247-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Assessing the Impact of Stemming Algorithms Applied to Judicial Jurisprudence - An Experimental Analysis
SN - 978-989-758-247-9
AU - N. de Oliveira R.
AU - Colaço Júnior M.
PY - 2017
SP - 99
EP - 105
DO - 10.5220/0006317100990105