Generalize download_opls_xml Function #1054

Open
wants to merge 2 commits into
base: main
Conversation

shehan807

@shehan807 shehan807 commented Nov 12, 2024

Summary

Include a summary of major changes in bullet points:

  • Feature 1: src/atomate2/openmm/utils.py accesses the LigParGen server through the download_opls_xml function, which currently accepts only SMILES strings as input. This feature generalizes the input from a flat dictionary (dict[str, str]) to a dictionary of dictionaries (dict[str, dict[str, str]]), so that each molecule can also specify its charge and the number of optimization iterations. A working input dictionary looks like this:
mols = {
    "benzene": {
        "smiles": "c1ccccc1",
    },
    "TMA": {
        "smiles": "C[N+](C)(C)C",
        "checkopt": 3,
        "charge": "+1",
    },
}
  • Fix 1: Moving the .xml file downloaded from the LigParGen server to its final location with Path(file).rename(final_file) raised an error; using shutil.copy(file, final_file) instead resolves this issue.
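
The nested input format described under Feature 1 can coexist with the older SMILES-only format if the input is normalized first. The following is a minimal sketch of that idea, not the code in this PR: the helper name `normalize_mols` and the default values in `DEFAULT_SPEC` are hypothetical, since the PR text does not state which defaults download_opls_xml uses.

```python
from __future__ import annotations

from typing import Any

# Hypothetical defaults (an assumption for illustration, not taken from the PR).
DEFAULT_SPEC: dict[str, Any] = {"checkopt": 0, "charge": "0"}


def normalize_mols(mols: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Return a nested spec for every molecule.

    Accepts both the old shape ({name: smiles}) and the new shape
    ({name: {"smiles": ..., "checkopt": ..., "charge": ...}}).
    """
    normalized: dict[str, dict[str, Any]] = {}
    for name, spec in mols.items():
        if isinstance(spec, str):
            # Old-style entry: the value is the SMILES string itself.
            spec = {"smiles": spec}
        elif "smiles" not in spec:
            raise ValueError(f"Molecule {name!r} is missing a 'smiles' entry")
        # Explicit values override the defaults.
        normalized[name] = {**DEFAULT_SPEC, **spec}
    return normalized
```

Calling `normalize_mols({"benzene": "c1ccccc1", "TMA": {"smiles": "C[N+](C)(C)C", "checkopt": 3, "charge": "+1"}})` gives both molecules the same set of keys, so downstream code would not need to branch on the input shape.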

Additional dependencies introduced (if any)

  • shutil: code like Path(file).rename(final_file) can fail in environments where the .xml file from LigParGen is created in a /tmp folder. shutil performs a standard copy instead; in a high-throughput use case, the extra copy may cost some performance.
  • selenium.webdriver.support.ui.WebDriverWait and selenium.webdriver.support.expected_conditions: both are part of selenium and are used to guard against LigParGen server crashes.

TODO (if any)

If this is a work-in-progress, write something about what else needs to be done.

  • Feature 1 is implemented in utils.py, but the tests in tests/openmm_md/test_utils.py have not yet been updated to match.

Checklist

Work-in-progress pull requests are encouraged, but please put [WIP] in the pull request
title.

Before a pull request can be merged, the following items must be checked:

  • Code is in the standard Python style.
    The easiest way to handle this is to run the following in the correct sequence on
    your local machine. Start with running ruff and ruff format on your new code. This will
    automatically reformat your code to PEP8 conventions and fix many linting issues.
  • Doc strings have been added in the Numpy docstring format.
    Run ruff on your code.
  • Type annotations are highly encouraged. Run mypy to
    type check your code.
  • Tests have been added for any new functionality or bug fixes.
  • All linting and tests pass.

Note that the CI system will run all the above checks. But it will be much more
efficient if you already fix most errors prior to submitting the PR. It is highly
recommended that you use the pre-commit hook provided in the repository. Simply run
pre-commit install and a check will be run prior to allowing commits.

@utf
Member

utf commented Nov 12, 2024

Thanks @shehan807. This looks good to me. Can you install and run the linter on your changes to ensure they match the style guidelines: https://materialsproject.github.io/atomate2/dev/dev_install.html#installing-pre-commit

pip install pre-commit
pre-commit run --all

@shehan807
Author

Hi @utf, I just wanted to follow up on this. It seems the only remaining issue concerns my changes to the time.sleep call: all I've done is make the wait depend on the number of optimization iterations selected in LigParGen, since more iterations may increase the time it takes to output the .xml/.pdb files.
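
As an illustration of the trade-off discussed here, one alternative to a fixed time.sleep is to poll the download directory until a finished file appears, with a timeout that grows with the number of optimization iterations. This is only a sketch under stated assumptions: `wait_for_download` and `download_timeout` are hypothetical helpers, the base and per-iteration times are made-up numbers, and the list of partial-download suffixes assumes a browser that marks unfinished files that way (Chrome uses .crdownload).

```python
import time
from pathlib import Path


def download_timeout(checkopt: int, base: float = 60.0, per_iteration: float = 30.0) -> float:
    """Scale the allowed wait with the number of optimization iterations.

    The base and per-iteration values are illustrative assumptions.
    """
    return base + per_iteration * checkopt


def wait_for_download(
    directory,
    timeout: float = 60.0,
    poll_interval: float = 0.5,
    ignore_suffixes: tuple = (".crdownload", ".part"),
) -> Path:
    """Poll `directory` until a finished file appears, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        finished = sorted(
            p
            for p in Path(directory).iterdir()
            if p.is_file() and p.suffix not in ignore_suffixes
        )
        if finished:
            return finished[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No file appeared in {directory} within {timeout} s")
        time.sleep(poll_interval)
```

With this approach a fast server response returns immediately, while a slow optimization is still bounded by `download_timeout(checkopt)` rather than waited out unconditionally.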

On a slightly separate note, I wanted to raise the issue of version control for LigParGen. Based on my issue on the LigParGen repository (Isra3l/ligpargen#31 (comment)), I learned that the server has only kept up with BOSS v4.9 (the program maintained by the Jorgensen group that runs under the hood of LigParGen). Perhaps @orionarcher could comment here: for high-throughput OPLS-AA simulations, I wonder whether extending the current download_opls_xml function to interface with the BOSS source code is a feasible option. The BOSS v5.1 source code is available online (https://zarbi.chem.yale.edu/software.html), but I am unsure how licensing would work if it were incorporated into atomate2. I'm happy to contribute to whatever extent this may be useful!

@@ -86,7 +110,7 @@ def download_opls_xml(

 file = next(Path(tmpdir).iterdir())
 # copy downloaded file to output_file using os
-Path(file).rename(final_file)
+shutil.copy(file, final_file)
Contributor


Will this change leave extra files in the downloads directory?

Author


That's correct! I will change this to shutil.move, which should avoid leaving any extra files in the download_dir.
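
A minimal sketch of the difference between the two calls, using throwaway temporary directories rather than the real download_dir (the helper `leftover_after` and the file names are hypothetical). For context, Path.rename relies on os.rename, which can raise OSError when source and destination are on different filesystems; shutil.move falls back to copying and then deleting the source in that case.

```python
import shutil
from pathlib import Path
from tempfile import TemporaryDirectory


def leftover_after(transfer) -> list:
    """Transfer one file out of a scratch download directory.

    Returns the names of the files still left in that directory afterwards.
    """
    with TemporaryDirectory() as download_dir, TemporaryDirectory() as out_dir:
        src = Path(download_dir) / "molecule.xml"
        src.write_text("<ForceField/>")
        transfer(src, Path(out_dir) / "final.xml")
        return sorted(p.name for p in Path(download_dir).iterdir())


print(leftover_after(shutil.copy))  # ['molecule.xml'] (the source file remains)
print(leftover_after(shutil.move))  # [] (the source file is gone)
```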

@orionarcher
Contributor

One comment but LGTM, thanks @shehan807.

It should be fine to build an integration that works if BOSS is available as an executable. Atomate2 has integrations with other closed-source codes like VASP.

@shehan807
Author

The changes in this PR are now merged with #1111

3 participants