Note: I'm not a machine learning expert. This is my first machine learning project. Pull requests for improvements are welcome.
Copyright 2024 by Edwin Zimmerman. MIT License.
87072 malware binaries from virussign.com
6022 good binaries scraped from Windows and Linux
Trained on the first 40 KB of each malware file.
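As a rough illustration of what "the first 40 KB" can mean in practice, here is a minimal sketch of reading a fixed-size prefix from a binary. The helper name read_head, the 40 * 1024 byte interpretation of "40kb", and the zero-padding of short files are all assumptions made for this example; the actual training script may handle these details differently.

```python
# Assumption: "40kb" is read here as 40 KiB (40,960 bytes).
HEAD_SIZE = 40 * 1024


def read_head(path, size=HEAD_SIZE):
    """Return exactly `size` bytes from the start of the file.

    Hypothetical helper: files shorter than `size` are padded with
    zero bytes so that every sample has the same length.
    """
    with open(path, "rb") as fh:
        data = fh.read(size)  # reads at most `size` bytes
    return data.ljust(size, b"\x00")  # pad on the right if the file was short
```

If numpy is installed, the result could then be turned into a model input with something like np.frombuffer(read_head(path), dtype=np.uint8), though that step is also just one possible choice.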
git clone https://github.com/9cb14c1ec0/MalwareVision
cd MalwareVision
python3 -m venv .venv
source .venv/bin/activate
pip install tensorflow numpy keras
python3 classify.py
To use an NVIDIA GPU, install the tensorflow[and-cuda] package with pip install tensorflow[and-cuda].
Otherwise, the model will run on the CPU.
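To check which device TensorFlow will actually use, something like the following sketch may help. It relies on tf.config.list_physical_devices, which is part of the TensorFlow 2 API; the describe_device function is a hypothetical helper written for this example and is not part of this repository.

```python
def describe_device():
    """Report whether TensorFlow can see a GPU (hypothetical helper)."""
    try:
        import tensorflow as tf
    except ImportError:
        return "TensorFlow is not installed"
    gpus = tf.config.list_physical_devices("GPU")
    if gpus:
        return f"TensorFlow sees {len(gpus)} GPU(s)"
    return "No GPU visible; TensorFlow will use the CPU"


print(describe_device())
```

If no GPU is reported after installing tensorflow[and-cuda], the NVIDIA driver is a likely place to look first.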
The training script crawls /usr/bin to collect a sample of Linux binaries.
For training purposes, exe and dll files were collected from a Windows VM.
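The crawl described above could look roughly like the sketch below. It is not the repository's code: the function name find_elf_files, the limit parameter, skipping symlinks, and filtering by the ELF magic bytes are all assumptions chosen for illustration. The real script may simply take every file it finds.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF executable


def find_elf_files(root="/usr/bin", limit=None):
    """Walk `root` and return paths of regular files that start with the ELF magic.

    Hypothetical helper: symlinks and unreadable files are skipped, and
    the walk stops early once `limit` files have been found.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    if fh.read(4) == ELF_MAGIC:
                        found.append(path)
            except OSError:
                continue  # e.g. permission denied
            if limit is not None and len(found) >= limit:
                return found
    return found
```

Checking the magic bytes keeps shell and Python scripts, which are common in /usr/bin, out of the benign sample. Whether that filtering is desirable depends on what the model is meant to distinguish.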