Fake Twitter Account Detection

The Problem

Bots are a familiar presence on Twitter. An estimated 20% to 29% of Twitter content in the US is generated by bots (Varanasi, 2022). Some of these bots are harmless, but others engage in fraudulent activity, broadly defined as wrongful or criminal deception intended to result in financial or personal gain. Examples include manipulating election votes (Metz, 2020) and cryptocurrency scams (Perez, 2022). These fraudulent bots need to be detected quickly and penalized accordingly before they cause further harm to users.

Our Solution

Twitter currently removes 1 million bot accounts per day (Sutcliffe, 2022). However, this is far from enough, as bots continue to plague the platform. Twitter itself acknowledges that detecting fraudulent bots is a highly complex and nuanced problem (Twitter, 2021). We therefore propose a data-driven approach that combines traditional machine learning models with neural networks to handle the uncertainty and complexity of fraudulent bot detection. For simplicity, the rest of this report refers to fraudulent bots simply as bots.

Introduction to the Code

Scrape Profile Pic.ipynb
- Scrapes the profile pictures of Twitter users
Data Cleaning.ipynb
- Cleans the data produced by Scrape Profile Pic.ipynb, e.g. removes invalid rows and columns
Get Faces.ipynb
- Reads each user's profile picture and detects whether it contains a face
Graph.ipynb
- Builds the reciprocity feature for each user from a graph structure
Feature Engineering.ipynb
- Engineers features from the files generated by Data Cleaning.ipynb, Get Faces.ipynb, and Graph.ipynb
All notebooks in /Traditional Models and /Neural Networks
- Each notebook trains a different ML/NN model and evaluates its performance
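
To give a feel for the graph-based reciprocity feature, here is a minimal sketch. This README does not define reciprocity, so the sketch assumes one common definition: the fraction of accounts a user follows that also follow the user back. The function name `reciprocity_scores` and the edge-list input format are hypothetical and are not taken from Graph.ipynb.

```python
from collections import defaultdict


def reciprocity_scores(follow_edges):
    """Return, for each user who follows at least one account,
    the fraction of those accounts that follow the user back.

    follow_edges: iterable of (follower, followee) pairs.
    This definition is an assumption; Graph.ipynb may use another.
    """
    following = defaultdict(set)
    for follower, followee in follow_edges:
        if follower != followee:  # ignore self-follows
            following[follower].add(followee)

    scores = {}
    for user, followees in following.items():
        # Count followees whose own "following" set contains this user.
        mutual = sum(1 for other in followees if user in following.get(other, ()))
        scores[user] = mutual / len(followees)
    return scores


# Toy follow graph: a and b follow each other, a also follows c, c follows d.
edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d")]
scores = reciprocity_scores(edges)
```

Under this assumed definition, an account that follows many users but is rarely followed back gets a score near 0, which could be one useful signal for a downstream classifier.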
