Feature Selection for Sentiment Classification of COVID-19 Tweets: H-TFIDF Featuring BERT

Mehtab Alam Syed, Elena Arsevska, Mathieu Roche, Maguelonne Teisseire

2022

Abstract

In the first quarter of 2020, the World Health Organization (WHO) declared COVID-19 a global public health emergency. Users from all over the world shared their opinions about COVID-19 on social media platforms such as Twitter and Facebook. At the beginning of the pandemic, it became relevant to assess public opinion regarding COVID-19 using data available on social media. We used a recently proposed hierarchy-based measure for tweet analysis (H-TFIDF) for feature extraction in sentiment classification of tweets. We assessed how H-TFIDF, and the concatenation of H-TFIDF with bidirectional encoder representations from transformers (BH-TFIDF), perform compared with state-of-the-art bag-of-words (BOW) and term frequency-inverse document frequency (TF-IDF) features for sentiment classification of COVID-19 tweets. A uniform experimental setup with a training-test split (90% and 10%) was used to train the classifier. Moreover, evaluation was performed on a gold-standard, expert-labeled dataset to measure precision for each class of the binary classification.
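The evaluation setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain TF-IDF as a stand-in (the hierarchy-based weighting of H-TFIDF is not reproduced here), treats the BERT embedding as an arbitrary dense vector supplied by the caller, and all function names (`tfidf_vectors`, `concat_features`, `split_90_10`, `precision_per_class`) are hypothetical names chosen for illustration.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Plain TF-IDF (NOT H-TFIDF): one vector per document over a sorted vocabulary."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / doc_freq[t]) for t in vocab]
        )
    return vocab, vectors


def concat_features(sparse_vec, dense_vec):
    """Concatenate a term-weight vector with a dense embedding (assumed to come from BERT)."""
    return list(sparse_vec) + list(dense_vec)


def split_90_10(items, seed=0):
    """Shuffle reproducibly, then split into 90% training and 10% test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.9 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def precision_per_class(y_true, y_pred):
    """Precision for each class: correct predictions of c / all predictions of c."""
    result = {}
    for c in sorted(set(y_true) | set(y_pred)):
        predicted_c = [t for t, p in zip(y_true, y_pred) if p == c]
        hits = sum(1 for t in predicted_c if t == c)
        result[c] = hits / len(predicted_c) if predicted_c else 0.0
    return result
```

In this sketch a term that appears in every document receives a weight of zero, which is the usual behavior of unsmoothed TF-IDF; a real pipeline would more likely rely on an established library for both the term weighting and the embeddings.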


Paper Citation


in Harvard Style

Syed M., Arsevska E., Roche M. and Teisseire M. (2022). Feature Selection for Sentiment Classification of COVID-19 Tweets: H-TFIDF Featuring BERT. In Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - Volume 5: HEALTHINF; ISBN 978-989-758-552-4, SciTePress, pages 648-656. DOI: 10.5220/0010887800003123


in Bibtex Style

@conference{healthinf22,
author={Mehtab Alam Syed and Elena Arsevska and Mathieu Roche and Maguelonne Teisseire},
title={Feature Selection for Sentiment Classification of COVID-19 Tweets: H-TFIDF Featuring BERT},
booktitle={Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - Volume 5: HEALTHINF},
year={2022},
pages={648-656},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010887800003123},
isbn={978-989-758-552-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - Volume 5: HEALTHINF
TI - Feature Selection for Sentiment Classification of COVID-19 Tweets: H-TFIDF Featuring BERT
SN - 978-989-758-552-4
AU - Syed M.
AU - Arsevska E.
AU - Roche M.
AU - Teisseire M.
PY - 2022
SP - 648
EP - 656
DO - 10.5220/0010887800003123
PB - SciTePress