7 Discussion
This paper presents the first steps taken toward developing an IE system intended to extract
structured information from medical reports written in Portuguese. Our first evaluation
focused on NER. We described this evaluation, analyzed its results, and presented some
experiments that demonstrate the potential of this kind of technique. However,
it should be noted that the evaluation results are preliminary, and we expect them to improve
with further development.
The results presented and the progress so far provide convincing grounds for
believing that IE techniques will deliver effective ways of extracting information
from unstructured text sources, in particular in the medical domain.
Acknowledgments
The author would like to thank the Neurophysiology department of the HGSA, Porto,
for access to the anonymized database.