DeIDNER Model: A Neural Network Named Entity Recognition Model for Use in the De-identification of Clinical Notes

Mahanazuddin Syed, Kevin Sexton, Melody Greer, Shorabuddin Syed, Joseph VanScoy, Farhan Kawsar, Erica Olson, Karan Patel, Jake Erwin, Sudeepa Bhattacharyya, Meredith Zozus, Fred Prior

2022

Abstract

Clinical named entity recognition (NER) is an essential building block for many downstream natural language processing (NLP) applications such as information extraction and de-identification. Recently, deep learning (DL) methods that utilize word embeddings have become popular in clinical NLP tasks. However, there has been little work on evaluating and combining word embeddings trained on different domains. The goal of this study is to improve the performance of NER in clinical discharge summaries by developing a DL model that combines different embeddings and by investigating the combination of standard and contextual embeddings from the general and clinical domains. We developed: 1) a high-quality, human-annotated internal corpus of discharge summaries and 2) a NER model with an input embedding layer that combines different embeddings: standard word embeddings, context-based word embeddings, a character-level word embedding produced by a convolutional neural network (CNN), and external knowledge sources along with word features encoded as one-hot vectors. The embedding layer is followed by bidirectional long short-term memory (Bi-LSTM) and conditional random field (CRF) layers. The proposed model matches or exceeds state-of-the-art performance on two publicly available data sets and achieves an F1 score of 94.31% on the internal corpus. After incorporating mixed-domain clinically pre-trained contextual embeddings, the F1 score further improved to 95.36% on the internal corpus. This study demonstrates an efficient way of combining different embeddings that improves recognition performance, aiding the downstream de-identification of clinical notes.
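As a rough illustration of the combined input embedding layer described in the abstract, the sketch below concatenates several per-token representations into a single vector of the kind that would be passed to the Bi-LSTM and CRF layers. It is not the authors' implementation: the vocabulary, vector sizes, word-shape categories, gazetteer, and the function name `token_input` are all hypothetical, and the contextual and character-CNN vectors are supplied as placeholder inputs rather than computed by real models.

```python
from typing import Dict, List, Set

# Hypothetical toy resources; real systems would load pre-trained embeddings.
WORD_VECS: Dict[str, List[float]] = {
    "john": [0.1, 0.3],
    "admitted": [0.7, 0.2],
}
UNKNOWN_VEC: List[float] = [0.0, 0.0]
SHAPES: List[str] = ["lower", "title", "upper", "digit", "other"]
NAME_GAZETTEER: Set[str] = {"john"}  # stand-in for an external knowledge source


def word_shape(token: str) -> str:
    """Map a token to a coarse (assumed) word-shape category."""
    if token.isdigit():
        return "digit"
    if token.islower():
        return "lower"
    if token.istitle():
        return "title"
    if token.isupper():
        return "upper"
    return "other"


def one_hot(label: str, labels: List[str]) -> List[float]:
    """Encode a categorical word feature as a one-hot vector."""
    return [1.0 if label == candidate else 0.0 for candidate in labels]


def token_input(token: str, contextual: List[float], char_vec: List[float]) -> List[float]:
    """Concatenate all representations of one token into a single input vector."""
    standard = WORD_VECS.get(token.lower(), UNKNOWN_VEC)
    gazetteer_flag = [1.0 if token.lower() in NAME_GAZETTEER else 0.0]
    shape = one_hot(word_shape(token), SHAPES)
    return standard + contextual + char_vec + gazetteer_flag + shape


# Placeholder contextual (3-d) and character-CNN (1-d) vectors for the token "John".
vector = token_input("John", contextual=[0.5, 0.5, 0.5], char_vec=[0.9])
print(len(vector))
```

Plain concatenation keeps each embedding source intact and leaves it to the downstream layers to weight them, which is one common way to combine embeddings from different domains.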

Paper Citation


in Harvard Style

Syed M., Sexton K., Greer M., Syed S., VanScoy J., Kawsar F., Olson E., Patel K., Erwin J., Bhattacharyya S., Zozus M. and Prior F. (2022). DeIDNER Model: A Neural Network Named Entity Recognition Model for Use in the De-identification of Clinical Notes. In Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - Volume 5: HEALTHINF; ISBN 978-989-758-552-4, SciTePress, pages 640-647. DOI: 10.5220/0010884500003123


in Bibtex Style

@conference{healthinf22,
author={Mahanazuddin Syed and Kevin Sexton and Melody Greer and Shorabuddin Syed and Joseph VanScoy and Farhan Kawsar and Erica Olson and Karan Patel and Jake Erwin and Sudeepa Bhattacharyya and Meredith Zozus and Fred Prior},
title={DeIDNER Model: A Neural Network Named Entity Recognition Model for Use in the De-identification of Clinical Notes},
booktitle={Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - Volume 5: HEALTHINF},
year={2022},
pages={640-647},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010884500003123},
isbn={978-989-758-552-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - Volume 5: HEALTHINF
TI - DeIDNER Model: A Neural Network Named Entity Recognition Model for Use in the De-identification of Clinical Notes
SN - 978-989-758-552-4
AU - Syed M.
AU - Sexton K.
AU - Greer M.
AU - Syed S.
AU - VanScoy J.
AU - Kawsar F.
AU - Olson E.
AU - Patel K.
AU - Erwin J.
AU - Bhattacharyya S.
AU - Zozus M.
AU - Prior F.
PY - 2022
SP - 640
EP - 647
DO - 10.5220/0010884500003123
PB - SciTePress