# mdlearn


mdlearn is a Python library for analyzing molecular dynamics with machine learning. It contains PyTorch implementations of several deep learning methods, such as autoencoders, as well as preprocessing functions, including the Kabsch alignment algorithm and higher-order statistical methods like quasi-anharmonic analysis.

For a list of currently supported models, more details, and specific examples of how to use mdlearn, please see our documentation.

## Table of Contents

1. Installation
2. Usage
3. Contributing
4. Acknowledgments
5. License

## Installation

### Install latest version with PyPI

If you have access to an NVIDIA GPU, we highly recommend installing mdlearn into a Conda environment that contains RAPIDS, which accelerates the t-SNE computations used to visualize model results during training. For the latest RAPIDS version, see here. If you don't have GPU support, mdlearn will still work on CPU by falling back to the scikit-learn implementation.
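For context, the snippet below is a minimal, illustrative sketch of the GPU-first, CPU-fallback pattern such an environment enables; it is not mdlearn's internal code and only assumes that either cuml (from RAPIDS) or scikit-learn is importable:

```python
import numpy as np

try:
    # GPU-accelerated t-SNE from RAPIDS, if available
    from cuml.manifold import TSNE
except ImportError:
    # CPU fallback from scikit-learn
    from sklearn.manifold import TSNE

# Placeholder data standing in for high-dimensional model outputs
embeddings = np.random.rand(1000, 8).astype(np.float32)

# Both implementations expose the same fit_transform interface
coords = TSNE(n_components=2).fit_transform(embeddings)
print(coords.shape)  # (1000, 2)
```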

Run the following commands with updated versions to create a conda environment:

```bash
conda create -p conda-env -c rapidsai -c nvidia -c conda-forge cuml=0.19 python=3.7 cudatoolkit=11.2
conda activate conda-env
export IBM_POWERAI_LICENSE_ACCEPT=yes
pip install -U scikit-learn
```

Then install mdlearn via `pip install mdlearn`.

Some systems require PyTorch to be built from source instead of installed via PyPI or Conda; for this reason, we made torch an optional dependency. For convenience, it can be installed alongside mdlearn by running `pip install 'mdlearn[torch]'`. Installing this way will also install the wandb package. Please make sure that your torch version is >= 1.7.
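To confirm which torch build ended up in your environment, a quick check such as the following (assuming torch is now importable) reports the version and GPU availability:

```python
# Quick sanity check of the installed torch build
import torch

print(torch.__version__)          # should report a version >= 1.7
print(torch.cuda.is_available())  # True if a CUDA-enabled build can see a GPU
```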

## Usage

Train an autoencoder model with only a few lines of code!

```python
import numpy as np

from mdlearn.nn.models.ae.linear import LinearAETrainer

# Placeholder data standing in for a featurized trajectory of N frames,
# each represented by a 40-dimensional feature vector
X = np.random.rand(5000, 40).astype(np.float32)

# Initialize autoencoder model
trainer = LinearAETrainer(
    input_dim=40, latent_dim=3, hidden_neurons=[32, 16, 8], epochs=100
)

# Train autoencoder on (N, 40) dimensional data
trainer.fit(X, output_path="./run")

# Generate latent embeddings in inference mode
z, loss = trainer.predict(X)
```
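Since latent_dim=3 in this example, the embeddings can be inspected directly. The sketch below is a hypothetical follow-on, not part of mdlearn's API, that continues from the snippet above and assumes matplotlib is installed:

```python
# Illustrative follow-on: scatter the (N, 3) latent coordinates `z`
# returned by trainer.predict(X) above
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(projection="3d")  # latent_dim=3, so a 3D scatter works
ax.scatter(z[:, 0], z[:, 1], z[:, 2], s=2)
ax.set_xlabel("z1")
ax.set_ylabel("z2")
ax.set_zlabel("z3")
plt.savefig("latent_space.png")
```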

### Preprocessing

We provide a CLI for collecting common data products from simulations. Currently, we support the following preprocessing methods:

- Coordinates
- Contact maps
- Root mean square deviation (RMSD)

Run the following command for details on how to use the CLI:

```bash
mdlearn preprocess --help
```

## Contributing

Please report bugs, enhancement requests, or questions through the Issue Tracker.

If you are looking to contribute, please see CONTRIBUTING.md.

## Acknowledgments

## License

mdlearn has an MIT license, as seen in the LICENSE file.