A Python script for generating descriptors from a directory of .mol files.
This script requires the RDKit and Mordred packages to be installed. This is best done with conda:
conda install -c rdkit -c mordred-descriptor mordred rdkit
Invoke it on a folder of .mol files to get RDKit descriptors in a CSV file.
Minimal example:
python ./free_descriptors.py -i /path/to/molfiles -o output_name
This reads all .mol files in /path/to/molfiles and creates output_name.csv inside the same folder.
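To preview which files a run would touch, the behaviour described above can be approximated in a few lines of Python. This is a minimal sketch, not code from free_descriptors.py: the helper name plan_run is hypothetical, and it assumes the script matches the lowercase .mol extension only and does not descend into subfolders.

```python
from pathlib import Path


def plan_run(input_dir, output_name):
    """Return the .mol files that would be read and the CSV path that would be written.

    Hypothetical helper: it mirrors the behaviour described in this README,
    not the actual implementation of free_descriptors.py.
    """
    folder = Path(input_dir)
    # Assumption: only files ending in lowercase ".mol" in the top-level folder count.
    mol_files = sorted(
        p for p in folder.iterdir() if p.is_file() and p.suffix == ".mol"
    )
    # The README states that the CSV is created inside the same folder.
    output_csv = folder / f"{output_name}.csv"
    return mol_files, output_csv
```

Calling plan_run("/path/to/molfiles", "output_name") on an existing folder returns the sorted list of .mol paths and the path /path/to/molfiles/output_name.csv, which is handy for confirming the input before a long descriptor run.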
Several options exist to compute additional descriptors:
- fragments: -f or --fragments flag
- MACCS keys: -M or --MACCS flag
- ECFP6 fingerprints: -E or --ECFP6 flag
- Mordred descriptors: -m or --mordred flag
- Macrocycle descriptors: -c or --macrocycle flag

Optional descriptors have a tendency to take a long time to compute and to fail on some molecules. Be sure to check output files.
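One way to check an output file is to scan it for rows with missing values. The sketch below uses only the standard library. It rests on assumptions that this README does not confirm: that the CSV has a header row, and that a failed descriptor shows up as an empty cell, a "nan" entry, or a short row. The function name find_incomplete_rows is hypothetical; adjust missing_markers to whatever the script actually writes.

```python
import csv


def find_incomplete_rows(csv_path, missing_markers=("", "nan", "NaN")):
    """Return (row_number, [column names]) for every data row with missing values.

    Assumption: failures appear as empty cells, NaN strings, or truncated rows.
    Row numbers count data rows from 1 and exclude the header.
    """
    problems = []
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        for row_number, row in enumerate(reader, start=1):
            missing = [
                column
                for column, value in row.items()
                # Skip the None key that DictReader uses for surplus fields;
                # a None value means the row was shorter than the header.
                if column is not None
                and (value is None or value.strip() in missing_markers)
            ]
            if missing:
                problems.append((row_number, missing))
    return problems
```

An empty result means every row has a value in every column under these assumptions. It does not prove the values are chemically sensible, so spot-checking a few molecules by hand is still worthwhile.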
- This script is a modification of the rdkit_descriptors.py script by Petr Škoda
- This script also includes code from an article by Phyo Phyo Kyaw Zin