ArFakeDetect: A Deep Learning Approach for Detecting Fabricated Arabic Tweets on COVID-19 Vaccines

El-Mageed S.M.A.

Aboutabl A.E.

Mohamed E.H.

Artificial Intelligence

Healthcare

Energy and Water

Circuit Theory and Applications

Software and Communications

Social media platforms have emerged as major sources of false information, particularly on health topics such as COVID-19 vaccines. The rampant dissemination of inaccurate content contributes significantly to vaccine hesitancy and undermines vaccination campaigns. This research addresses the pressing need for automated methods that distinguish factual from fabricated Arabic tweets about vaccines, with the aim of limiting the spread of misinformation on these platforms. The proposed approach uses deep learning, building on a pre-trained Arabic language model (AraBERT) to capture both semantic and sequential nuances in text. We train and evaluate on ArCovidVac, the largest manually annotated Arabic dataset focused on COVID-19 vaccination discourse. Because the labels in the dataset are imbalanced, we apply data augmentation techniques to ensure fair representation across all categories. The transformer-based model extracts robust and informative features from the text. Evaluation on a test set shows an accuracy of 95%, a 10% improvement over previous methods, which underscores the model's capacity to distinguish factual from fake tweets. We anticipate deploying this model for real-time analysis of vaccine-related Arabic tweets, providing timely insights to fact-checkers and public health authorities working to counter misinformation. © 2024 IEEE.
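The abstract does not say which augmentation technique is used to balance the labels. As a minimal illustration of the general idea only, the sketch below uses plain random oversampling as a stand-in: minority-class examples are duplicated at random until every class matches the largest one. The function name `oversample`, the label names, and the toy tweets are all hypothetical and are not taken from ArCovidVac or from the paper.

```python
import random
from collections import Counter


def oversample(texts, labels, seed=0):
    """Randomly duplicate minority-class examples until all classes are equal in size.

    This is a stand-in for the paper's unspecified augmentation step,
    not a reproduction of it.
    """
    rng = random.Random(seed)

    # Group the texts by their label.
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    out_texts, out_labels = [], []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy, made-up data: four "factual" examples and one "fabricated" example.
texts = ["tweet a", "tweet b", "tweet c", "tweet d", "tweet e"]
labels = ["factual", "factual", "factual", "factual", "fabricated"]

balanced_texts, balanced_labels = oversample(texts, labels)
counts = Counter(balanced_labels)
print(counts)
```

Oversampling of this kind should be applied to the training split only, so that duplicated examples cannot leak into the test set and inflate the reported accuracy.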