mtap-research/PACMAN-charge

PACMAN

A Partial Atomic Charge Predicter for Porous Materials based on Graph Convolutional Neural Network (PACMAN)

Requires Python 3.9. License: MIT. Platforms: Linux, Windows.

Developed by: Guobin Zhao

Installation

pip

pip install PACMAN-charge

Git clone

git clone https://github.com/mtap-research/PACMAN-charge.git
cd PACMAN-charge
pip install -r requirements.txt

How to Use PACMAN charge

Jupyter notebook (using pip)

from PACMANCharge import pmcharge
pmcharge.predict(cif_file="./test/Cu-BTC.cif",charge_type="DDEC6",digits=10,atom_type=True,neutral=True,keep_connect=True)
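
To process several structures from a notebook, the same call can be wrapped in a loop. The sketch below is only an illustration: `predict_folder` is a hypothetical helper (not part of PACMAN), and it assumes `pmcharge.predict` accepts the keyword arguments shown in the call above. The prediction function is passed in as an argument so the helper can be tried without PACMAN installed.

```python
from pathlib import Path


def predict_folder(folder, predict_fn, charge_type="DDEC6", digits=6):
    """Call predict_fn once per .cif file in folder and return the file names.

    predict_folder is a hypothetical helper for illustration only.
    predict_fn is expected to take the same keywords as pmcharge.predict.
    """
    processed = []
    for cif in sorted(Path(folder).glob("*.cif")):
        predict_fn(
            cif_file=str(cif),
            charge_type=charge_type,
            digits=digits,
            atom_type=True,
            neutral=True,
            keep_connect=True,
        )
        processed.append(cif.name)
    return processed


# With PACMAN installed, the call would presumably look like:
# from PACMANCharge import pmcharge
# predict_folder("./test", pmcharge.predict)
```

Files are visited in sorted order so that repeated runs process the structures in the same sequence.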

Terminal

python pmcharge.py folder-name[path] --charge_type[DDEC6/Bader/CM5/REPEAT] --digits[int] --atom_type[bool] --neutral[bool] --keep_connect[bool]

Example command: python pmcharge.py test_file/test-1/ --charge_type DDEC6 --digits 10

Help usage information: python pmcharge.py -h

  • folder-name: relative path to a folder containing CIF files without partial atomic charges
  • charge_type (default: DDEC6): DDEC6, Bader, CM5, or REPEAT
  • digits (default: 6): number of decimal places printed for partial atomic charges. The ML models were trained on a 6-digit dataset
  • atom_type (default: True): assign the same partial atomic charge to atoms of the same type (atom types are identified by partial atomic charges that agree to 3 decimal places)
  • neutral (default: True): keep the net charge of the structure at zero. The "mean" method is used to neutralize the system: the excess charge is distributed equally across all atoms
  • keep_connect (default: True): retain the atomic and connection information (such as _atom_site_adp_type and bonds) in the structure
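
The atom-type and neutralization options can be illustrated with a short sketch. This is not PACMAN's implementation: the function names are hypothetical, and treating an "atom type" as an element plus its charge rounded to 3 decimal places is an assumption based on the description above.

```python
from collections import defaultdict


def neutralize_mean(charges):
    """'Mean' neutralization: subtract the average excess charge from every atom."""
    excess_per_atom = sum(charges) / len(charges)
    return [q - excess_per_atom for q in charges]


def average_by_type(elements, charges, places=3):
    """Give atoms of one assumed type a single shared charge.

    Assumption: two atoms share a type when they are the same element and
    their charges agree after rounding to `places` decimal places.
    """
    groups = defaultdict(list)
    for i, (element, q) in enumerate(zip(elements, charges)):
        groups[(element, round(q, places))].append(i)

    averaged = list(charges)
    for indices in groups.values():
        mean_q = sum(charges[i] for i in indices) / len(indices)
        for i in indices:
            averaged[i] = mean_q
    return averaged


# Illustrative values only, not PACMAN output.
elements = ["Cu", "Cu", "O"]
raw = [1.0001, 1.0003, -0.5]
final = neutralize_mean(average_by_type(elements, raw))
```

Because the neutralization shift is the same for every atom, atoms that share a charge after averaging still share it after neutralization. The order in which PACMAN applies the two steps is not stated here, so the order in the sketch is an assumption.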

Website & Zenodo

  • Predict partial atomic charges using an online app 👉 link
  • Full code and dataset can be downloaded from 👉 link
  • Note: All future releases will be uploaded to GitHub and pip only

Reference

If you use PACMAN charge, please consider citing this paper:

@article{,
    title={PACMAN: A Robust Partial Atomic Charge Predicter for Nanoporous Materials based on Crystal Graph Convolution Network},
    DOI={10.1021/acs.jctc.4c00434},
    journal={Journal of Chemical Theory and Computation},
    author={Zhao, Guobin and Chung, Yongchul},
    year={2024},
    volume = {20},
    number = {12},
    pages={5368-5380}
}

Databases with partial atomic charges    URL     Size
QMOF                                     link    16,779
CoRE MOF 2014 DDEC                       link    2,932
CoRE MOF 2014 DFT-optimized              link    502
CURATED-COFs                             link    612
ARC-MOF                                  link    279,118

Bugs and Issues

If you encounter any problems while using PACMAN, please email [email protected] or open an issue on GitHub.

Repository Structure

Model Architecture

workflow

Directory Organization

.
├── ..
├── figs                                                # Figures used for introduction
│   ├── toc.jpg                                         # Table of Contents
│   └── workflow.png                                    # Workflow of this project
│
├── model                                               # Python files used for dataset preparation & GCN training
│   ├── GCN_E.py                                        # Network model for energy/bandgap training
│   ├── GCN_charge.py                                   # Network model for atomic charge training
│   ├── cif2data.py                                     # Convert QMOF database to dataset
│   ├── data_E.py                                       # Convert cif to graph & target (energy/bandgap)
│   ├── data_charge.py                                  # Convert cif to graph & target (atomic charge)
│   └── utils.py                                        # Normalizer, sampling, AverageMeter, save_checkpoint
│
├── model4pre                                           # Python files used for prediction
│   ├── GCN_E.py                                        # Network model for energy/bandgap prediction
│   ├── GCN_charge.py                                   # Network model for atomic charge prediction
│   ├── atom_init.json                                  # JSON file that stores the initialization vector for each element
│   ├── cif2data.py                                     # Read/write cif file
│   ├── data.py                                         # Convert cif to graph & target (energy/bandgap)
│   ├── data_charge.py                                  # Convert cif to graph & target (atomic charge)
│   └── utils.py                                        # Normalizer, sampling, AverageMeter, save_checkpoint
│
├── pth                                                 # Models of this project
│   ├── best_bader                                      # Bader
│   │   ├── bader.pth                                   # Bader charge model
│   │   └── normalizer-bader.pkl                        # Normalizer of Bader charge
│   ├── best_bandgap                                    # Bandgap
│   │   ├── bandgap.pth                                 # Bandgap model
│   │   └── normalizer-bandgap.pkl                      # Normalizer of bandgap
│   ├── best_cm5                                        # CM5
│   │   ├── bandgap.pth                                 # ///
│   │   └── normalizer-bandgap.pkl                      # ///
│   ├── best_ddec                                       # ///
│   │   ├── ddec.pth                                    # ///
│   │   └── normalizer-ddec.pkl                         # ///
│   ├── best_pbe                                        # ///
│   │   ├── pbe-atom.pth                                # ///
│   │   └── normalizer-pbe.pkl                          # ///
│   ├── best_repeat                                     # ///
│   │   ├── repeat.pth                                  # ///
│   │   └── normalizer-repeat.pkl                       # ///
│   ├── chk_bader                                       # Bader
│   │   └── checkpoint.pth                              # Checkpoint of Bader
│   ├── chk_bandgap                                     # Bandgap
│   │   └── checkpoint.pth                              # Checkpoint of bandgap
│   ├── chk_cm5                                         # CM5
│   │   └── checkpoint.pth                              # ///
│   ├── chk_ddec                                        # ///
│   │   └── checkpoint.pth                              # ///
│   ├── chk_pbe                                         # ///
│   │   └── checkpoint.pth                              # ///
│   └── chk_repeat                                      # ///
│       └── checkpoint.pth                              # ///
│
├── pmcharge.py                                         # Main Python file for atomic charge assignment from the command line
├── LICENSE.txt                                         # MIT license
├── README.md                                           # Usage/Source
├── requirements.txt                                    # Packages that need to be installed
├── train_E.py                                          # Main Python file for energy/bandgap training
└── train_charge.py                                     # Main Python file for atomic charge training

Supported elements

(Elements that were seen during model training, which is not every element contained in the databases)

  • DDEC6/CM5/Bader Charges

QMOF-Element

  • REPEAT Charges

ARC-MOF-Element

Model Performance

  • DDEC6 Charges
    Parity plot of partial atomic charges from DDEC6 and PACMAN on the test set (QMOF).

DDEC6

  • CM5 Charges
    Parity plot of partial atomic charges from CM5 and PACMAN on the test set (QMOF).

CM5

  • Bader Charges
    Parity plot of partial atomic charges from Bader and PACMAN on the test set (QMOF).
    For the Bader model, use caution with Th-MOF predictions because only 2 Th data points were in the training set. The large error visible in the figure below comes from Th.

Bader

  • REPEAT Charges
    Parity plot of partial atomic charges from REPEAT and PACMAN on the test set (ARC-MOF).

REPEAT
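
A parity plot compares reference and predicted charges atom by atom. Two numbers commonly used to summarize such a plot are the mean absolute error (MAE) and the coefficient of determination (R²). The sketch below computes both in plain Python; the function name is hypothetical, the values are made up for illustration and are not PACMAN results, and the README does not state which metrics the authors report.

```python
def parity_metrics(reference, predicted):
    """Return (MAE, R^2) for paired reference and predicted charges."""
    n = len(reference)
    mae = sum(abs(r - p) for r, p in zip(reference, predicted)) / n
    mean_ref = sum(reference) / n
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return mae, 1.0 - ss_res / ss_tot


# Illustrative values only, not taken from any PACMAN model.
reference = [1.0, -0.5, 0.25, -0.75]
predicted = [0.9, -0.5, 0.35, -0.75]
mae, r2 = parity_metrics(reference, predicted)
print(f"MAE = {mae:.3f} e, R^2 = {r2:.4f}")
```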

AUTHORS

Maintainer

Project Contributors
