Using Big Data and Genetic Algorithms to Predict the Behavior of Social Media Users through Logistic Regression

root root

Abstract

Classifying Instagram accounts as real or fake is essential for understanding and combating online scams and fraud. This research uses the large volume of data available from Instagram to distinguish real from fake accounts with a binary logistic regression model. Logistic regression, a prominent non-linear model, was employed; its parameters were estimated with the Maximum Likelihood Estimation (MLE) method, and the estimation was further enhanced by optimizing it with a genetic algorithm. The Mean Squared Error (MSE) criterion was used to compare the parameter estimation methods for logistic regression. The results indicated that the enhanced MLE method performed best, as its estimators had a lower MSE. In the practical application, fitting the model with the enhanced MLE method to data on real and fake pages showed that the number of followers, bio length, link availability, profile picture availability, average number of hashtags, and time gap between posts were the most influential factors in determining account authenticity. On the data used, the model correctly classified 80% of real accounts and 75% of fake accounts.
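
The approach summarized above, maximizing the logistic regression likelihood with a genetic algorithm and judging the estimates by MSE, can be illustrated with a minimal sketch. The abstract does not specify the genetic operators, population size, or mutation scale, so the choices below (elitist selection, uniform crossover, Gaussian mutation, and every parameter value) are assumptions for illustration only. The data are synthetic with two made-up features, not the Instagram data used in the study, and the function names are hypothetical.

```python
import math
import random


def log_likelihood(beta, X, y):
    """Bernoulli log-likelihood of a logistic model; beta[0] is the intercept."""
    total = 0.0
    for row, label in zip(X, y):
        z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
        # log(sigmoid(z)), written so that exp() never overflows
        log_p = -math.log1p(math.exp(-z)) if z >= 0 else z - math.log1p(math.exp(z))
        # log(1 - sigmoid(z)) equals log(sigmoid(z)) - z
        total += label * log_p + (1 - label) * (log_p - z)
    return total


def fit_ga(X, y, pop_size=40, generations=100, sigma=0.3, seed=0):
    """Search for the coefficients that maximize the log-likelihood (assumed GA settings)."""
    rng = random.Random(seed)
    n_params = len(X[0]) + 1
    pop = [[rng.uniform(-1, 1) for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by fitness and keep the top quarter unchanged (elitism)
        pop.sort(key=lambda b: log_likelihood(b, X, y), reverse=True)
        elite = pop[: pop_size // 4]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover followed by Gaussian mutation of every gene
            children.append([rng.choice(pair) + rng.gauss(0, sigma) for pair in zip(a, b)])
        pop = elite + children
    return max(pop, key=lambda b: log_likelihood(b, X, y))


# Synthetic data with known coefficients, so the estimate can be scored by MSE
data_rng = random.Random(42)
true_beta = [0.5, 2.0, -1.0]
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
y = [
    int(data_rng.random() < 1 / (1 + math.exp(-(true_beta[0] + 2.0 * r[0] - 1.0 * r[1]))))
    for r in X
]

beta_hat = fit_ga(X, y)
mse = sum((b - t) ** 2 for b, t in zip(beta_hat, true_beta)) / len(true_beta)
predicted = [int(beta_hat[0] + beta_hat[1] * r[0] + beta_hat[2] * r[1] > 0) for r in X]
accuracy = sum(p == label for p, label in zip(predicted, y)) / len(y)
print("estimated coefficients:", [round(b, 3) for b in beta_hat])
print("MSE against true coefficients:", round(mse, 4))
print("training accuracy:", round(accuracy, 3))
```

Because the logistic log-likelihood is concave, a gradient-based optimizer would normally reach the same maximum; the genetic search is shown here only because it is the optimizer named in the abstract. MSE against the true coefficients is computable in this sketch because the data are simulated from known values.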

How to Cite
root, root. (2025). Using Big Data and Genetic Algorithms to Predict the Behavior of Social Media Users through Logistic Regression. Warith Scientific Journal, 7(23), 292-303. https://doi.org/10.57026/wsj.v7i23.534