This repository contains the replication package and dataset for the paper "Explainable AI for Android Malware Detection: Towards Understanding Why the Models Perform So Well?", published at ISSRE 2022.
For more information, interested researchers can contact us by sending an email to [email protected]. The full dataset is available below.
In this work, we replicate three high-profile ML-based Android malware detection approaches.
- Drebin: pdf, reproduction code
- XMal: pdf, reproduction code
- Fan et al.: pdf, reproduction code for models, reproduction code for LIME
The data folder contains the metadata of Android apps from AndroZoo. AndroZoo is an online collection of Android applications gathered from several sources, including the official Google Play app market. To learn how to download the apps themselves, please visit the AndroZoo API Documentation.
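As a rough illustration, the sketch below builds an AndroZoo download request from a sha256 taken from the metadata. It assumes the download endpoint and the `apikey`/`sha256` query parameters described in the AndroZoo API Documentation at the time of writing; the helper names `build_download_url` and `download_apk` are hypothetical and not part of this repository. An API key has to be requested from the AndroZoo maintainers.

```python
import os
from urllib.parse import urlencode
from urllib.request import urlretrieve

# Assumed endpoint; verify against the AndroZoo API Documentation.
ANDROZOO_DOWNLOAD_URL = "https://androzoo.uni.lu/api/download"


def build_download_url(sha256, apikey):
    """Return the download URL for one app, identified by its sha256."""
    # A sha256 digest is 64 hexadecimal characters; reject anything else early.
    if len(sha256) != 64 or any(c not in "0123456789abcdefABCDEF" for c in sha256):
        raise ValueError(f"not a sha256 hex digest: {sha256!r}")
    query = urlencode({"apikey": apikey, "sha256": sha256})
    return f"{ANDROZOO_DOWNLOAD_URL}?{query}"


def download_apk(sha256, apikey, out_dir="."):
    """Download one APK into out_dir and return the local file path."""
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{sha256}.apk")
    urlretrieve(build_download_url(sha256, apikey), path)
    return path
```

Calling `download_apk(sha256, apikey, "apks")` for each sha256 in a metadata file would fetch the corresponding samples. AndroZoo limits concurrent downloads, so a large batch should be throttled.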
The Android samples span a 10-year period from 2011 to 2020. The dataset is divided into two parts: benign and malicious samples. We put the metadata (e.g., sha256, md5, market, package name, size) of the samples from different periods into different folders.
For the ground truth on when each feature (i.e., permissions and API calls) was introduced, researchers can refer to the recent Android Developer Documentation. We include the ground truth for Android SDK 30 (i.e., data/api-versions.xml) in the data folder.
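One possible way to read the introduction level of each API from this file is sketched below. It assumes the usual layout of the SDK's api-versions.xml: `class` elements that contain `method` and `field` children, each with an optional `since` attribute giving the API level, where a member without `since` inherits the level of its class. The function name `load_since_levels` and the `class->member` key format are illustrative choices, not part of the replication code.

```python
import xml.etree.ElementTree as ET


def load_since_levels(source):
    """Map each class and class member to the API level that introduced it.

    `source` is a file path or a file object, e.g. "data/api-versions.xml".
    """
    root = ET.parse(source).getroot()
    levels = {}
    for cls in root.iter("class"):
        cls_name = cls.get("name")
        # Assumption: a class without an explicit level dates from API level 1.
        cls_since = int(cls.get("since", "1"))
        levels[cls_name] = cls_since
        for member in cls:
            # Skip <extends> and <implements>, which describe the type hierarchy.
            if member.tag not in ("method", "field"):
                continue
            # Assumption: members without `since` inherit the class level.
            since = int(member.get("since", cls_since))
            levels[f"{cls_name}->{member.get('name')}"] = since
    return levels
```

With such a mapping, a feature extracted from an app can be compared against the app's release period, for example to flag API calls that did not yet exist when the sample was created.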