This folder contains example scripts.

  • To run the MOSES benchmark example, first install the molsets package by following the instructions here, then execute the Python script:
$ python run_moses.py --datadir={YOUR_MOSES_DATASET_FOLDER} --samplestep=100
  • To run the GuacaMol benchmark example, first install the guacamol package, then execute the Python script:
$ python run_guacamol.py --datadir={YOUR_GUACAMOL_DATASET_FOLDER} --samplestep=100
  • To run the ZINC250k benchmark example, first download the dataset here, then execute the Python script:
$ python run_zinc250k.py --datadir={YOUR_ZINC250K_DATASET_FOLDER} --train_mode={normal,sar} --target={parp1,fa7,5ht1b,braf,jak2} --samplestep=1000
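
The ZINC250k command can also be assembled from Python, which allows the option values to be checked before a run is launched. The following is a minimal sketch, assuming that the braces in `--train_mode={normal,sar}` and `--target={parp1,fa7,5ht1b,braf,jak2}` enumerate the allowed choices and that one value is passed per flag. The helper name `zinc250k_command` and its default values are hypothetical and not part of this repository.

```python
import shlex
import sys

# Choices copied from the usage line; treating them as the complete
# set of allowed values is an assumption.
TRAIN_MODES = ("normal", "sar")
TARGETS = ("parp1", "fa7", "5ht1b", "braf", "jak2")


def zinc250k_command(datadir, train_mode="normal", target="parp1", samplestep=1000):
    """Return the argv list for run_zinc250k.py (hypothetical helper)."""
    if train_mode not in TRAIN_MODES:
        raise ValueError(f"train_mode must be one of {TRAIN_MODES}")
    if target not in TARGETS:
        raise ValueError(f"target must be one of {TARGETS}")
    return [
        sys.executable,  # the Python interpreter currently running
        "run_zinc250k.py",
        f"--datadir={datadir}",
        f"--train_mode={train_mode}",
        f"--target={target}",
        f"--samplestep={samplestep}",
    ]


cmd = zinc250k_command("YOUR_ZINC250K_DATASET_FOLDER", train_mode="sar", target="jak2")
# Show the script and its arguments (interpreter path omitted).
print(shlex.join(cmd[1:]))
```

From the folder containing the script, the list could then be passed to `subprocess.run(cmd, check=True)`.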

To switch to the SELFIES version, pass the flag --version=selfies; this requires the selfies package to be installed.
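
Because the SELFIES version depends on an optional package, it can help to check for it before launching a long run. The sketch below uses only the standard library; `selfies_available` and `version_args` are hypothetical helper names, and only the `--version=selfies` flag itself comes from this README.

```python
import importlib.util


def selfies_available():
    """Return True if the `selfies` package can be found in this environment."""
    return importlib.util.find_spec("selfies") is not None


def version_args(use_selfies):
    """Return the extra command-line arguments for the chosen version (hypothetical helper)."""
    if not use_selfies:
        return []
    if not selfies_available():
        raise RuntimeError(
            "--version=selfies requires the selfies package (pip install selfies)"
        )
    return ["--version=selfies"]


print(version_args(use_selfies=False))
```

The returned list can be appended to the arguments of any of the three example scripts.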

JIT version?

Our implementation supports TorchScript.

import torch
from bayesianflow_for_chem import ChemBFN
from bayesianflow_for_chem.data import smiles2vec
from bayesianflow_for_chem.tool import sample, inpaint

model = ChemBFN.from_checkpoint("YOUR_MODEL.pt").eval().to("cuda")
model = torch.jit.freeze(torch.jit.script(model), ["sample", "inpaint", "ode_sample", "ode_inpaint"])
# or model.compile()
# ------- generate molecules -------
smiles = sample(model, 1, 60, 100)
# ------- inpaint (scaffold extension) -------
scaffold = r"Cc1cc(OC5)cc(C6)c1."
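tokens = smiles2vec(scaffold)
# prepend token 1, then pad with 0 to a total length of 85;
# pad by token count, which can differ from the string length
x = torch.tensor([1] + tokens + [0] * (84 - len(tokens)), dtype=torch.long)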
x = x[None, ...].repeat(5, 1).to("cuda")
smiles = inpaint(model, x, 100)
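
The inpainting input built above is a fixed-length row: token 1, then the scaffold tokens, then zeros. A minimal sketch of that layout in plain Python follows. `build_inpaint_row` is a hypothetical helper, the parameter names `start_id` and `pad_id` are assumed labels for the ids 1 and 0, the token ids in the usage line are made-up placeholders rather than real vocabulary ids, and the default length of 85 simply mirrors `1 + 84` from the example.

```python
def build_inpaint_row(tokens, total_len=85, start_id=1, pad_id=0):
    """Lay out [start_id] + tokens + padding (hypothetical helper)."""
    n_pad = total_len - 1 - len(tokens)
    if n_pad < 0:
        raise ValueError(
            f"scaffold has {len(tokens)} tokens, at most {total_len - 1} fit"
        )
    return [start_id] + list(tokens) + [pad_id] * n_pad


# Placeholder token ids; in practice they would come from smiles2vec(scaffold).
row = build_inpaint_row([7, 8, 9])
print(len(row), row[:5])  # 85 [1, 7, 8, 9, 0]
```

Such a row can then be converted with `torch.tensor(row, dtype=torch.long)` and repeated along the batch dimension as in the example.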

SAR version?

To use the semi-autoregressive (SAR) version, set model.semi_autoregressive = True before training and/or sampling.

Enable LoRA parameters

from bayesianflow_for_chem import ChemBFN

model = ChemBFN.from_checkpoint("YOUR_MODEL.pt")
model.enable_lora(r=4, ...)