SDF_Redundancy_Eliminator is a Python 3 script to:
- Generate canonical SMILES for each of the compounds in an .SD/.SDF compound library,
- Detect redundant ligands/structural isomers in the library,
- Generate lists of the names of unique and redundant compounds and, optionally,
- Move redundant ligands/structural isomers to a separate file to produce a library of unique compounds.
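The steps above can be illustrated with a minimal sketch. It is not the code in RedundancyEliminator.py: the function names `read_canonical_smiles` and `split_unique_redundant` are hypothetical, and it assumes that two records are redundant when RDKit assigns them the same canonical SMILES, with the first occurrence kept as the unique one.

```python
def read_canonical_smiles(sdf_path):
    """Yield (name, canonical SMILES) pairs from an SD file (hypothetical helper)."""
    # Imported lazily so the grouping logic below also runs without RDKit.
    from rdkit import Chem

    for index, mol in enumerate(Chem.SDMolSupplier(sdf_path)):
        if mol is None:  # RDKit could not parse this record
            continue
        # "_Name" holds the title line of the record; fall back to an index.
        name = mol.GetProp("_Name") if mol.HasProp("_Name") else f"record_{index}"
        yield name, Chem.MolToSmiles(mol)  # MolToSmiles is canonical by default


def split_unique_redundant(records):
    """Return (unique_names, redundant_names) from (name, SMILES) pairs.

    Assumption: the first record seen with a given SMILES is kept as unique;
    every later record with the same SMILES is reported as redundant.
    """
    seen = set()
    unique, redundant = [], []
    for name, smiles in records:
        if smiles in seen:
            redundant.append(name)
        else:
            seen.add(smiles)
            unique.append(name)
    return unique, redundant


# Toy input standing in for the output of read_canonical_smiles().
demo = [("lig1", "CCO"), ("lig2", "c1ccccc1"), ("lig3", "CCO")]
unique_names, redundant_names = split_unique_redundant(demo)
print(unique_names)     # ['lig1', 'lig2']
print(redundant_names)  # ['lig3']
```

With RDKit installed, `split_unique_redundant(read_canonical_smiles("library.sdf"))` would apply the same grouping to a real library; `library.sdf` is a placeholder file name.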
Requirements:
- Python3
- RDKit
Option 1: Install RDKit in a Conda environment (if you have Conda installed)
Create a Python 3 environment (tested with Python 3.7 and 3.8)
conda create -n py38_rdkit python=3.8
Install RDKit with pip (after activating the environment with conda activate py38_rdkit) using
pip install rdkit
or
pip install rdkit-pypi
Option 2: Install RDKit from the system repositories (Debian/Ubuntu)
sudo apt-get install python3-rdkit librdkit1 rdkit-data
or run
pip3 install rdkit
Option 3: Build from Source
You may follow the instructions here, or here
- Create a folder and copy in your .SD/.SDF compound library
- Copy RedundancyEliminator.py into the same directory
- Run the script; it will walk you through the steps:
python3 RedundancyEliminator.py
- Generates canonical SMILES for the compounds in the library if they are not annotated with SMILES strings
- Produces another copy of the library with SMILES string annotation
- Interactively walks the user through the steps
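As a rough illustration of the first feature, the sketch below checks whether each record in an SD file already carries a SMILES data field. It relies only on the SD file layout (records end with a `$$$$` line, and data item headers start with `>` and give the field name in angle brackets). The tag name `SMILES` and the function names are assumptions made for this example; the tag that the script actually reads or writes may differ.

```python
def split_records(sdf_text):
    """Split SD file text into records, each terminated by a '$$$$' line."""
    records, current = [], []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    return records


def has_data_field(record_text, tag="SMILES"):
    """Return True if the record has a data item header naming `tag`.

    The default tag "SMILES" is an assumption, not taken from the script.
    """
    for line in record_text.splitlines():
        # Data item headers look like ">  <TAG>".
        if line.startswith(">") and f"<{tag}>" in line:
            return True
    return False


# Abbreviated records (atom and bond blocks omitted) used only for the demo.
demo_sdf = (
    "lig1\nM  END\n>  <SMILES>\nCCO\n\n$$$$\n"
    "lig2\nM  END\n$$$$\n"
)
flags = [has_data_field(record) for record in split_records(demo_sdf)]
print(flags)  # [True, False]
```

Records flagged `False` are the ones for which a canonical SMILES string would still need to be generated.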
If you use this code in your work, kindly cite it as:
Yekeen, A. A. (2022). SDF_Redundancy_Eliminator: A python code to remove redundant ligands in a .SD/.SDF compound library. https://github.com/abeebyekeen/SDF_Redundancy_Eliminator, DOI: https://doi.org/10.5281/zenodo.7049711