NoLabs is an open source biolab that lets you run experiments with the latest state-of-the-art models and a workflow engine for bio research.
The goal of the project is to accelerate bio research by making inference models easy to use for everyone. We currently support protein workflow components (predicting useful protein properties such as solubility, localisation, gene ontology, folding, etc.), drug discovery components (constructing ligands and testing their binding to target proteins) and small-molecule design components (designing small molecules for a given protein target and checking drug-likeness and binding affinity).
We are working on expanding these and adding cell and genetic components, and we would appreciate your support and contributions.
Let's accelerate bio research!
Workflow Engine:
- Create workflows combining different models and data
- Schedule jobs and observe results for big data processing
- Adjust input parameters for particular jobs
BioBuddy - drug discovery copilot:
BioBuddy is a drug discovery copilot that supports:
- Downloading data from ChemBL
- Downloading data from RcsbPDB
- Questions about the drug discovery process, targets, chemical components, etc.
- Writing review reports based on published papers
For example, you can ask
- "Can you pull me some latest approved drugs?"
- "Can you download me 1000 rhodopsins?"
- "What does an aspirin molecule look like?"
BioBuddy will carry out these requests and answer other questions.
To enable BioBuddy, run this command when starting NoLabs:
$ ENABLE_BIOBUDDY=true docker compose up nolabs mongo redis
# mongo is required
And also start the biobuddy microservice:
$ OPENAI_API_KEY=your_openai_api_key TAVILY_API_KEY=your_tavily_api_key docker compose up biobuddy
NoLabs uses GPT-4 for the best performance. You can change the model in microservices/biobuddy/biobuddy/services.py
You can ignore OPENAI_API_KEY warnings when running other services using docker compose.
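If you prefer to start BioBuddy from a script, the following minimal sketch checks that both API keys are set before running the same `docker compose up biobuddy` command shown above. The helper names `missing_keys` and `start_biobuddy` are hypothetical and not part of NoLabs.

```python
import os
import subprocess

# Variable names taken from the command above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "TAVILY_API_KEY")


def missing_keys(env):
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_KEYS if not env.get(name)]


def start_biobuddy(env=None):
    """Check the API keys, then run `docker compose up biobuddy`.

    Hypothetical helper: it only wraps the command from this README.
    """
    env = dict(os.environ if env is None else env)
    missing = missing_keys(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # The keys reach docker compose through the process environment.
    return subprocess.run(["docker", "compose", "up", "biobuddy"], env=env, check=True)
```

Calling `start_biobuddy()` with no arguments uses the current shell environment, so export the two keys first.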
# Clone this project
$ git clone https://github.com/BasedLabs/nolabs
$ cd nolabs
Generate a new token for the Docker registry at https://github.com/settings/tokens/new and select the 'read:packages' scope.
$ docker login ghcr.io -u username -p ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
If you want to run a single feature (recommended):
$ docker compose up nolabs mongo redis
# mongo and redis are required
$ docker compose up esmfold_light
$ docker compose up diffdock
$ docker compose up p2rank
...
OR if you want to run everything on one machine:
$ docker compose up
The server will be available at http://localhost:9000
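The containers can take a while to start, so a script may need to wait until the server answers. The sketch below polls the address given above using only the standard library. The function name `wait_for_server` and the timeout values are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request

# Opener that ignores proxy settings, since the server runs locally.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_server(url="http://localhost:9000", timeout_s=120.0, interval_s=2.0):
    """Poll url until it returns any HTTP response or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with _OPENER.open(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status.
            return True
        except OSError:
            pass  # Connection refused or timed out: not reachable yet.
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For example, `wait_for_server()` returns `True` once http://localhost:9000 responds and `False` if nothing answers within two minutes.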
We provide individual Docker containers backed by FastAPI for each feature, which are available in the /microservices folder. You can use them individually as APIs.
For example, to run the esmfold service, you can use Docker Compose:
$ docker compose up esmfold
Once the service is up, you can make a POST request to perform a task, such as predicting a protein's folded structure. Here's a simple Python example:
import requests
# Define the API endpoint
url = 'http://127.0.0.1:5736/run-folding'
# Specify the protein sequence in the request body
data = {
'protein_sequence': 'YOUR_PROTEIN_SEQUENCE_HERE'
}
# Make the POST request and get the response
response = requests.post(url, json=data)
# Extract the PDB content from the response
pdb_content = response.json().get('pdb_content', '')
print(pdb_content)
This Python script makes a POST request to the esmfold microservice with a protein sequence and prints the predicted PDB content.
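A slightly more defensive variant might validate the sequence before sending it and fail loudly when the response has no PDB content. The sketch below uses only the standard library; the endpoint and field names come from the example above. The helper names and the choice to accept only the 20 standard amino acid letters are assumptions, and the service itself may accept more.

```python
import json
import urllib.request

# Assumption of this sketch: only the 20 standard amino acids are accepted.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw):
    """Strip whitespace, upper-case, and reject non-standard residue letters."""
    seq = "".join(raw.split()).upper()
    bad = sorted(set(seq) - VALID_RESIDUES)
    if not seq or bad:
        raise ValueError(f"invalid protein sequence (unexpected: {bad})")
    return seq


def fold(sequence, base_url="http://127.0.0.1:5736", timeout_s=600):
    """POST the sequence to /run-folding and return the PDB text."""
    body = json.dumps({"protein_sequence": clean_sequence(sequence)}).encode()
    request = urllib.request.Request(
        base_url + "/run-folding",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        payload = json.load(response)
    pdb = payload.get("pdb_content", "")
    if not pdb:
        raise RuntimeError("response did not contain 'pdb_content'")
    return pdb
```

The long default timeout is a guess; folding time depends on sequence length and hardware.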
Because each feature ships as its own FastAPI-backed Docker container (see the /microservices folder), you can run the services on separate machines. This setup is particularly useful if you are developing on a computer without GPU support but have access to a VM with a GPU for tasks like folding, docking, etc.
For instance, to run the diffdock service on your server, VM, or computer with a GPU, run:
$ docker compose up diffdock
Once the service is up, check that you can reach it from your computer by navigating to http://<gpu_machine_ip>:5737/docs
If everything is correct, you should see the FastAPI page with diffdock's API surface.
Next, update the nolabs/infrastructure/appsettings.local.json file on your primary machine to include the IP address of the service (replace 127.0.0.1 with your GPU machine's IP):
...
"p2rank": {
"microservice": "http://127.0.0.1:5731"
},
"msa_light": {
"microservice": "http://127.0.0.1:5734",
"msa_server_url": "http://207.246.89.242:8000/generate-msa"
},
"umol": {
"microservice": "http://127.0.0.1:5735"
},
"diffdock": {
"microservice": "http://127.0.0.1:5737" -> http://74.82.28.227:5737
}
...
And now you are ready to use this service hosted on a separate machine!
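The same edit can be scripted. The sketch below swaps the host of one service URL and keeps its port. It works on a dictionary shaped like the fragment shown above. The excerpt is elided, so where these entries sit in the real file is an assumption, and `point_service_at` is a hypothetical helper.

```python
import json
from urllib.parse import urlsplit, urlunsplit


def point_service_at(settings, service, host):
    """Return a copy of settings with the service's microservice URL moved to host."""
    updated = json.loads(json.dumps(settings))  # cheap deep copy of JSON data
    parts = urlsplit(updated[service]["microservice"])
    # Keep the original port, replace only the host.
    netloc = f"{host}:{parts.port}" if parts.port else host
    updated[service]["microservice"] = urlunsplit(parts._replace(netloc=netloc))
    return updated


# Usage sketch (assumes the service entries are top-level keys of the file):
#   path = "nolabs/infrastructure/appsettings.local.json"
#   with open(path) as f:
#       settings = json.load(f)
#   settings = point_service_at(settings, "diffdock", "74.82.28.227")
#   with open(path, "w") as f:
#       json.dump(settings, f, indent=2)
```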
Model: RFdiffusion
RFdiffusion is an open source method for structure generation, with or without conditional information (a motif, target etc).
docker compose up protein_design
Swagger UI will be available on http://localhost:5789/docs
or install as a python package
Model: ESMFold - Evolutionary Scale Modeling
docker compose up esmfold
Swagger UI will be available on http://localhost:5736/docs
or install as a python package
Model: ESMAtlas
docker compose up esmfold_light
Swagger UI will be available on http://localhost:5733/docs
or install as a python package
Model: Hugging Face
docker compose up gene_ontology
Swagger UI will be available on http://localhost:5788/docs
or install as a python package
Model: Hugging Face
docker compose up localisation
Swagger UI will be available on http://localhost:5787/docs
or install as a python package
Model: p2rank
docker compose up p2rank
Swagger UI will be available on http://localhost:5731/docs
or install as a python package
Model: Hugging Face
docker compose up solubility
Swagger UI will be available on http://localhost:5786/docs
Model: DiffDock
docker compose up diffdock
Swagger UI will be available on http://localhost:5737/docs
Model: RoseTTAFold
docker compose up rosettafold
Swagger UI will be available on http://localhost:5737/docs
WARNING: To use RoseTTAFold, you must change the volumes '.' to point to the specified folders.
Model: REINVENT4
Misc: DockStream, QED, AutoDock Vina
docker compose up reinvent
Performs zero-shot cell type classification and computes cell embeddings based on their genes.
docker compose up sc_gpt
Swagger UI will be available on http://localhost:5790/docs
Service that allows users to perform searches using various BLAST databases. It supports nucleotide and protein BLAST queries through different endpoints.
docker compose up blast_query
Swagger UI will be available on http://localhost:5743/docs
WARNING: Do not change the number of guvicorn workers (1); doing so will lead to microservice issues.
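To see which of the services above are currently running, you could probe their Swagger UI addresses. The sketch below collects the ports exactly as listed in this section and requests each `/docs` page. The helper names are hypothetical. Because rosettafold is listed with the same port as diffdock, the probe cannot tell those two apart.

```python
import urllib.request

# Ports as listed in this section. reinvent has no port listed, so it is
# omitted; rosettafold is listed with the same port as diffdock.
SERVICE_PORTS = {
    "protein_design": 5789,
    "esmfold": 5736,
    "esmfold_light": 5733,
    "gene_ontology": 5788,
    "localisation": 5787,
    "p2rank": 5731,
    "solubility": 5786,
    "diffdock": 5737,
    "rosettafold": 5737,
    "sc_gpt": 5790,
    "blast_query": 5743,
}

# Opener that ignores proxy settings, since the services are usually local.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def docs_url(service, host="localhost"):
    """Build the Swagger UI address for a service on the given host."""
    return f"http://{host}:{SERVICE_PORTS[service]}/docs"


def running_services(host="localhost", timeout_s=2.0):
    """Return the names of services whose /docs page answers with HTTP 200."""
    up = []
    for name in SERVICE_PORTS:
        try:
            with _OPENER.open(docs_url(name, host), timeout=timeout_s) as response:
                if response.status == 200:
                    up.append(name)
        except OSError:
            pass  # Refused, timed out, or an HTTP error: treat as not running.
    return up
```

Passing the IP of a GPU machine as `host` checks services hosted there instead of on localhost.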
The following tools were used in this project:
[Recommended for laptops] If you are using a laptop, use the --test argument (you do not need a lot of compute):
- RAM > 16GB
- [Optional] GPU memory >= 16GB (REALLY speeds up the inference)
[Recommended for powerful workstations] Otherwise, if you want to host everything on your machine and have faster inference (also required for folding sequences longer than 400 amino acids):
- RAM > 30GB
- [Optional] GPU memory >= 40GB (REALLY speeds up the inference)
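As a rough illustration, the two profiles above can be expressed as a small check. The thresholds come from the lists above. The function names, the profile labels, and the RAM lookup via `os.sysconf` (available on Linux and some other Unix systems, not on Windows) are assumptions of this sketch.

```python
import os


def total_ram_gb():
    """Best-effort total RAM in GB via os.sysconf (Unix only)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3


def recommended_profile(ram_gb, longest_sequence=0):
    """Map RAM (GB) and the longest sequence to fold onto the profiles above."""
    if ram_gb > 30:
        return "workstation"
    if longest_sequence > 400:
        # Folding sequences longer than 400 amino acids needs the workstation profile.
        return "insufficient"
    if ram_gb > 16:
        return "laptop"  # run with the --test argument
    return "insufficient"
```

The GPU is optional in both profiles, so this check leaves it out.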