Welcome to TrustLink, a project developed during the Delhi Police Cyber Hackathon, aimed at enhancing online security by detecting and safeguarding against deceptive URLs. TrustLink leverages machine learning models, data analysis, and dynamic classification techniques to provide users with a reliable solution to identify and avoid malicious links, contributing to a safer online experience.
TrustLink utilizes a combination of static and dynamic analysis to examine URLs for potential threats, categorizing them into labels such as phishing, malware, benign, or defacement. The project incorporates diverse data sources, including curated host lists and a pre-trained text classification model, to offer a robust defense against deceptive URLs.
- Flask: Python-based web framework for developing the backend logic and the API of the TrustLink project.
- Transformers Library: Utilized for the ML model, providing a pre-trained text classification model for analyzing URLs.
- Python: Primary programming language for scripting and backend development.
- Streamlit: Utilized for the web application, allowing users to input URLs and receive classification results.
- Tampermonkey Script: Userscript that enables real-time threat detection directly in the browser.
- User Input: Users enter a URL into the TrustLink web application, or install the Tampermonkey script in their browser for automatic detection and blocking.
- Static Analysis: Comparison against pre-loaded data from various host lists to identify patterns associated with malicious behavior.
- Dynamic Analysis: If the URL is not found in the host lists, a pre-trained text classification model classifies it.
- Classification Results: The web application displays the classification results, including labels such as phishing, malware, benign, or defacement, along with corresponding scores.
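The workflow above can be sketched as a two-stage lookup. This is a minimal illustration, not the project's actual code: `HOST_LIST`, `extract_host`, `classify_url`, and `stub_model` are hypothetical names, the score of `1.0` for a host-list hit is an assumption, and the stub stands in for the pre-trained classifier. The stub returns the same list-of-dicts shape that a Transformers `pipeline("text-classification", ...)` call produces, so a real pipeline object could be passed in its place.

```python
from urllib.parse import urlparse

# Hypothetical host list; the real project loads curated lists from its data sources.
HOST_LIST = {
    "malicious.example": "malware",
    "phish.example": "phishing",
}


def extract_host(url):
    """Return the lowercase hostname of a URL, tolerating a missing scheme."""
    parsed = urlparse(url if "://" in url else "http://" + url)
    return (parsed.hostname or "").lower()


def classify_url(url, host_list, model):
    """Static analysis first; fall back to the model only when the host is unknown."""
    host = extract_host(url)
    if host in host_list:
        # Assumption: a host-list match is reported with full confidence.
        return {"label": host_list[host], "score": 1.0, "source": "static"}
    prediction = model(url)[0]  # first (and only) result for a single input
    return {
        "label": prediction["label"],
        "score": prediction["score"],
        "source": "dynamic",
    }


def stub_model(url):
    """Stand-in for the pre-trained classifier; always answers 'benign'."""
    return [{"label": "benign", "score": 0.5}]


if __name__ == "__main__":
    print(classify_url("http://phish.example/login", HOST_LIST, stub_model))
    print(classify_url("https://example.org", HOST_LIST, stub_model))
```

Passing the model in as a callable keeps the static check testable without downloading any weights.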
- Clone the repository.
- Install the required dependencies using `requirements.txt`.
- Run the Flask API (`flask_api.py`) to set up the backend logic for URL classification.
- Run the Streamlit app (`streamlit_app.py`) to input a URL and view the classification results.
- Optionally, install the Tampermonkey script in your browser to experience real-time threat detection.
For a detailed overview of the TrustLink project, including its objectives, workflow, technology stack, and future plans, please refer to the Delhi Police Cyber Hackathon Project Presentation (PDF).
TrustLink aims to expand its capabilities in the following areas:
- Protection against Typosquatting attacks.
- Protection against IDN Homograph attacks.
- Enriching the machine learning dataset with additional features, such as comprehensive Whois information and the age of the website.
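As an illustration of what the first two items could involve, here is one possible approach, not something TrustLink currently implements. Typosquatting candidates are flagged by a small edit distance to a known domain, and IDN hosts are flagged when a label is non-ASCII or Punycode-encoded. `KNOWN_DOMAINS`, the distance threshold of 1, and all function names are assumptions made for this sketch.

```python
# Hypothetical list of domains to protect; a real list would be much larger.
KNOWN_DOMAINS = ["google.com", "paypal.com"]


def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def looks_like_typosquat(host, known=KNOWN_DOMAINS, max_distance=1):
    """Return the known domain that `host` closely imitates, or None."""
    for domain in known:
        if host != domain and edit_distance(host, domain) <= max_distance:
            return domain
    return None


def has_idn_label(host):
    """True if any label is non-ASCII or carries the Punycode 'xn--' prefix."""
    return any(
        not label.isascii() or label.lower().startswith("xn--")
        for label in host.split(".")
    )


if __name__ == "__main__":
    print(looks_like_typosquat("paypa1.com"))
    print(has_idn_label("xn--pple-43d.com"))
```

An internationalized domain is not malicious in itself, so `has_idn_label` is better treated as a signal for closer inspection than as a verdict.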
We extend our gratitude to the Delhi Police Cyber Hackathon for providing a platform to develop and showcase TrustLink, as well as to all organizations and individuals contributing to the project's datasets and resources.
This project is licensed under the GNU General Public License v3.0.