Societal Transformation: AI and Big Data Journal

Fake news detection with Urdu data set

Research Article 39
Volume 2, Issue 2, 2024
By Umer Bin Siddiq, Muhammad Bilawal Raheem
Keywords: fake news dataset, machine learning, decision tree, SVM, logistic regression

This paper investigates how fake news in the Urdu language can be identified in social media content. Social media usage keeps growing, and news is now ubiquitous, spreading daily across these platforms. Billions of users worldwide share all kinds of news, which makes it very difficult to tell genuine news from fake news. This raises serious concerns, because the original poster's ultimate objective may be malicious. To address the problem, this paper applies machine learning techniques to fake news detection. Several datasets were examined, the Fire2021 dataset from Kaggle was selected, and algorithms were trained on it to detect fake news automatically. The proposed approach comprises multiple phases. The first phase covers preprocessing of the dataset. The approach then attempts to validate the authenticity of news content, and for this purpose it employs several linguistic features. Several machine learning algorithms, namely Multinomial Naive Bayes, Support Vector Classifier (SVC), Bernoulli Naive Bayes (BNB), Logistic Regression (LR), Decision Tree (DT), Random Forest (RF) and AdaBoost, were applied to the fake news benchmark datasets. In addition, k-fold cross-validation was performed to obtain the best accuracy. The paper reports several performance metrics for these algorithms. AdaBoost is found to give the best results in terms of precision, recall, accuracy and F1-score.
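The evaluation described above, k-fold cross-validation scored with precision, recall, accuracy and F1, can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the contiguous (unshuffled) fold layout and the label strings "fake" and "real" are assumptions made for illustration, and the abstract does not state the value of k that was used.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds.

    Returns one (train_indices, test_indices) pair per fold.
    Assumption: no shuffling or stratification, for simplicity.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str = "fake"
) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for one positive class.

    Assumption: the label "fake" is a hypothetical name for the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

In such a setup, each classifier would be trained on the training indices of a fold and scored on the test indices with `binary_metrics`, and the per-fold scores would then be averaged to compare the algorithms.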
