Covid-19 Vaccine Fake News Detector: Predicting whether news is true or false

Abstract

The emergence and spread of fake news during the Covid-19 pandemic inspired TechLabs Data Science Track students to develop an app to help people identify misleading information. For this purpose, two models were developed: a Machine Learning model and a Deep Learning model, which are available via a web-based app. The app can classify information on Covid-19 vaccines in the German language as true or false, allowing people to make better informed decisions based on verified facts. With the help of this app, misleading information can be identified, and the dissemination of rumors can be reduced.

Introduction

The Covid-19 pandemic was accompanied by numerous articles and news items in traditional print media as well as in electronic and social media. Information spreads rapidly and can come from both trusted and untrusted medical sources [1]. Inaccurate information can be divided into misinformation, which is spread without the intention to mislead, and disinformation, which is false information spread with malicious intent to deceive [2, 3]. Fake news is misleading and deceptive news written and published to harm an organization or an individual [4]. An estimated 30 to 35 percent of the news, videos, and photos disseminated on social media are fake [5]. This so-called “infodemic” has made reliable information harder to find and recognize, while rumors spread faster [3]. The large volume of fake news and misinformation contributed, among other things, to anti-vaccine protests and vaccine hesitancy [6]. The spread of inaccurate information about Covid-19 makes it difficult for the public to make informed decisions and thus threatens public health [7]. The aim of this work is to introduce a Machine Learning and a Deep Learning approach to predict whether a piece of text about the Covid-19 vaccine is true or false.

Method

A classification model with two classes (“true” or “false”) was trained to build the Fake News Detector App. We collected 500 true and 500 false statements about Covid-19 vaccines in German in a pandas DataFrame in Python. The criteria and sources for the true and false statements were defined in advance. The sentences in our database contain different kinds of punctuation, upper- and lowercase letters, German grammar, and specific medical terminology. Six preprocessing steps were performed to clean the data. All punctuation marks and superfluous spaces were removed, and all letters were converted to lowercase. We tokenized the sentences so that the model could process each word (“token”) instead of an entire sentence. We then removed stop words using the NLTK library and finally reduced each word to its base form (lemmatization) using the spaCy library.
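The preprocessing steps described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function name `preprocess`, the tiny `STOP_WORDS` set, and the pass-through `identity_lemmatizer` are assumptions made so the example runs without downloading any language resources. In the real pipeline, the stop word list would come from NLTK's German list and the lemmatizer from a spaCy German model.

```python
import re
import string

# Tiny stand-in stop word list (an assumption for this sketch).
# The project used NLTK's German list, e.g. nltk.corpus.stopwords.words("german").
STOP_WORDS = {"die", "der", "das", "ist", "und", "ein", "eine"}


def identity_lemmatizer(token: str) -> str:
    """Placeholder that returns the token unchanged.

    With spaCy, this step would instead return token.lemma_ from a German model.
    """
    return token


def preprocess(text, stop_words=STOP_WORDS, lemmatize=identity_lemmatizer):
    """Apply the six steps: lowercase, strip punctuation, normalize spaces,
    tokenize, remove stop words, lemmatize."""
    text = text.lower()
    # string.punctuation covers ASCII punctuation only; typographic quotes
    # would need to be added for real German text.
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\s+", " ", text).strip()
    tokens = text.split(" ")
    tokens = [t for t in tokens if t and t not in stop_words]
    return [lemmatize(t) for t in tokens]


print(preprocess("Die Impfung  ist sicher und wirksam!"))
```

Passing the stop word list and the lemmatizer as arguments keeps the sketch independent of any particular library, so the same function can be reused when a new sentence is entered for prediction.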

Results

Both models provide reliable results, with an accuracy of 84 percent for the Machine Learning model and 90 percent for the Deep Learning model (accuracy / F1-score). The results show that both Deep Learning and Machine Learning approaches can solve a natural language processing problem. The models were made accessible to external users via a web application built with Anvil. The application is directly linked to our models via Google Colab. The user can enter a sentence about Covid-19 vaccines into the input screen of the Anvil app. The system automatically applies all preprocessing steps to the new sentence and generates a prediction. The prediction appears as a pop-up stating whether the entered sentence is true or false.
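For readers unfamiliar with the two metrics mentioned above, the following sketch shows how accuracy and the F1-score can be computed for a two-class problem. The function name `accuracy_and_f1` and the five toy labels are purely illustrative assumptions; they are not the project's data or results. Treating “false” as the positive class is also an assumption made for this example.

```python
def accuracy_and_f1(y_true, y_pred, positive="false"):
    """Return (accuracy, F1) for binary labels, with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives

    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return accuracy, f1


# Toy labels for illustration only (not taken from the project).
y_true = ["true", "false", "false", "true", "false"]
y_pred = ["true", "false", "true", "true", "false"]

acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2f}, f1={f1:.2f}")
```

Accuracy counts all correct predictions, while the F1-score balances precision and recall for one class, which makes it a useful second check even when, as here, the two classes are equally represented.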

Outlook for future work

Although the accuracy of both models is satisfactory, we acknowledge that the database is not yet large enough for our Fake News Detector to generalize well. A first improvement would be to collect more data and train the models more extensively to make the results more robust. Another way to improve the models is to optimize their parameters through hyperparameter tuning. In doing so, we can further improve the models’ accuracy and thus their usefulness and reliability.

Literature

[1] Khan et al., 2022. Detecting COVID-19-Related Fake News Using Feature Extraction. Front. Public Health 9:788074. doi: 10.3389/fpubh.2021.788074
