If you have any problems with this installation, please file an issue describing them so we can improve the instructions.
Download and run the Python 3.7 Anaconda installer.
git clone https://github.com/sbl-sdsc/mmtf-genomics.git
cd mmtf-genomics
conda env create -f binder/environment.yml
conda activate mmtf-genomics
jupyter lab
conda deactivate
Any time you want to use the environment, activate it again and start Jupyter Lab.
conda env remove -n mmtf-genomics
When running PySpark on many cores (e.g., more than 8), the memory for the Spark driver and workers may need to be increased. If necessary, set the environment variable SPARK_CONF_DIR to the conf directory provided in this repository in your .bashrc (Linux) or .bash_profile (Mac) file.
export SPARK_CONF_DIR=<path>/mmtf-genomics/conf
Then close and reopen the terminal window so that the environment variable takes effect.
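If you would rather script this step, the following is a minimal sketch. The function name add_spark_conf_dir is hypothetical and not part of this repository. It appends the export line to a startup file only if that file does not already set SPARK_CONF_DIR, so it is safe to run more than once.

```shell
# Hypothetical helper (not part of this repository): append the
# SPARK_CONF_DIR export to a shell startup file unless it is already there.
# $1 = startup file (e.g. ~/.bashrc or ~/.bash_profile)
# $2 = path to the conf directory of your clone
add_spark_conf_dir() {
    rc_file="$1"
    conf_dir="$2"
    # Create the startup file if it does not exist yet.
    touch "$rc_file"
    if grep -q '^export SPARK_CONF_DIR=' "$rc_file"; then
        echo "SPARK_CONF_DIR is already set in $rc_file"
    else
        printf 'export SPARK_CONF_DIR=%s\n' "$conf_dir" >> "$rc_file"
        echo "Added SPARK_CONF_DIR to $rc_file"
    fi
}

# Example call (replace <path> with the location of your clone):
# add_spark_conf_dir "$HOME/.bashrc" "<path>/mmtf-genomics/conf"
```

After calling the function, running echo "$SPARK_CONF_DIR" in a new terminal should print the conf path.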
The conf directory contains the file spark-env.sh with the following default settings.
SPARK_DRIVER_MEMORY=4G
SPARK_WORKER_MEMORY=4G
When running this repository on a 24-core machine, you may need to increase the memory settings to 20G.
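The memory settings can be changed by editing spark-env.sh in any text editor. As an alternative, here is a minimal sketch of a scripted edit. The function name set_spark_memory is hypothetical, and the sketch assumes the two settings appear at the start of a line exactly as shown above.

```shell
# Hypothetical helper (not part of this repository): set both memory
# values in a spark-env.sh file.
# $1 = path to spark-env.sh, $2 = new value (e.g. 20G)
set_spark_memory() {
    env_file="$1"
    memory="$2"
    tmp_file="$(mktemp)"
    # Write to a temporary file first, because the in-place flag of sed
    # differs between GNU sed (Linux) and BSD sed (Mac).
    sed -e "s/^SPARK_DRIVER_MEMORY=.*/SPARK_DRIVER_MEMORY=$memory/" \
        -e "s/^SPARK_WORKER_MEMORY=.*/SPARK_WORKER_MEMORY=$memory/" \
        "$env_file" > "$tmp_file" \
        && cat "$tmp_file" > "$env_file"   # keeps the original file permissions
    rm -f "$tmp_file"
}

# Example call (replace <path> with the location of your clone):
# set_spark_memory "<path>/mmtf-genomics/conf/spark-env.sh" 20G
```

Make sure the machine has enough physical memory for the values you choose.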