Optimize code #6

Open
mrshirts opened this issue Feb 3, 2022 · 5 comments

Comments

@mrshirts

mrshirts commented Feb 3, 2022

Look at ways to optimize the derivative code (and other parts of the code) in Python.

@Yu-Tang-Lin
Collaborator

Note for me:
supercell_generation.py
Line 41, mol_sc = next(pybel.readfile('pdb', path)), is really slow (12 hours to extract text from around 300 files).
Need to find a faster approach for this step later.

@mrshirts
Author

mrshirts commented Feb 9, 2022

Line 41 mol_sc = next(pybel.readfile('pdb', path)) is really slow. (12 hours for extracting text from around 300 files)
Need to find a better algorithm to accelerate it later.

Interesting. I would think there are OpenFF tools for reading PDB files. To confirm that this line is the bottleneck, try timing a single file read: if it is, one read should take about 12 hours / 300 files = 2.4 min. We do want to use OpenFF tools (or RDKit?) to avoid extra external dependencies.
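The single-file check suggested above could be sketched as follows. This is only an illustration: time_call and expected_minutes_per_file are hypothetical helper names, not part of the project, and the pybel usage in the comments assumes the Open Babel 3 import path (from openbabel import pybel).

```python
import time


def time_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def expected_minutes_per_file(total_hours, n_files):
    """Average minutes per file if the total time is spread evenly over all files."""
    return total_hours * 60.0 / n_files


# Hypothetical usage for the line in question (assumes Open Babel 3 and an
# existing PDB file at `path`):
#
#   from openbabel import pybel
#   mol_sc, seconds = time_call(lambda: next(pybel.readfile('pdb', path)))
#   print(f"read took {seconds / 60:.2f} min, "
#         f"expected about {expected_minutes_per_file(12, 300):.1f} min")
```

If a single read comes out far below the expected average, the 12 hours are probably being spent somewhere other than line 41.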

@Yu-Tang-Lin
Collaborator

Yes, I am also confused about why it takes so long to read the file.

Note for @Yu-Tang-Lin :
Another possible reason may be that my GPU driver crashed again. Even though I do not think line 41 uses the GPU, it is still better to eliminate problems one by one.

@Yu-Tang-Lin
Collaborator

Yu-Tang-Lin commented Feb 15, 2022

Note for @Yu-Tang-Lin :
All of the code, including PDB file generation and the OpenMM energy minimization, currently runs on the CPU. Is it possible to use the GPU instead?

I guess running the OpenMM energy minimization on the GPU is possible, since the tutorial also uses the GPU.

@mrshirts
Author

GPU optimization is the last thing to do, because it significantly complicates compilation and installation. The most important thing is to first push everything down to NumPy and SciPy operations, replacing explicit Python loops with vectorized calls.
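As a rough illustration of what replacing a loop with a vectorized call can look like, here is a minimal sketch using a central finite difference. It is not the project's derivative code; both function names are hypothetical and the example assumes uniformly spaced samples.

```python
import numpy as np


def central_difference_loop(y, h):
    """Central difference at interior points, written with an explicit Python loop."""
    n = len(y)
    dy = np.empty(n - 2)
    for i in range(1, n - 1):
        dy[i - 1] = (y[i + 1] - y[i - 1]) / (2.0 * h)
    return dy


def central_difference_vectorized(y, h):
    """Same result, expressed as one NumPy slice operation."""
    y = np.asarray(y, dtype=float)
    return (y[2:] - y[:-2]) / (2.0 * h)


if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 101)
    h = x[1] - x[0]
    y = x ** 2
    # Both versions should agree to floating-point tolerance.
    print(np.allclose(central_difference_loop(y, h),
                      central_difference_vectorized(y, h)))
```

The vectorized version does the same arithmetic but moves the loop into compiled NumPy code, which is usually where the speedup for large arrays comes from.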
