-
Notifications
You must be signed in to change notification settings - Fork 20
Building reactions
The first step to building reactions in PIKAChU is to find the atoms or bonds that should be targeted. PIKAChU has two classes and two functions devoted to this: BondDefiner and GroupDefiner, and find_bonds and find_atoms.
from pikachu.reactions.functional_groups import BondDefiner, GroupDefiner, find_bonds, find_atoms
Both BondDefiner and GroupDefiner take a name and a SMILES string as first and second argument. The name is a string and user-defined, for instance 'peptide bond'. The SMILES string represents the substructure that will be searched for in the parent structure. For instance, the SMILES string 'C(=O)NC' represents a peptide bond.
The remaining arguments are integers pointing to the atoms involved in the bond or group. The integers point to the atoms in order of occurrence in their SMILES string, using 0-indexing. For instance, in the case of 'C(=O)NC', integers 0 and 2 would point to the first C and the first N.
In the case of the BondDefiner, two integers are passed, indicating two atoms adjacent to a bond of interest. For instance, if we would want to define a peptide bond using BondDefiner to later break it with hydrolysis, we would do so as follows:
peptide_bond = BondDefiner('peptide bond', 'C(=O)NC', 0, 2)
Sometimes, it may be desirable to define a bond adjacent to a hydrogen atom. This is possible, either by making the hydrogen explicit in the SMILES string, or by making use of the way in which PIKAChU numbers hydrogen atoms: hydrogens are added in the order of the atoms they are attached to:
n_h_bond = BondDefiner("n_h_bond", "[H]NCC(=O)", 0, 1)
print(n_h_bond.atom_1)
print(n_h_bond.atom_2)
----------------------
H_0
N_1
n_h_bond = BondDefiner("n_h_bond", "NCC(=O)", 0, 4)
print(n_h_bond.atom_1)
print(n_h_bond.atom_2)
----------------------
N_0
H_4
from pikachu.general import read_smiles
structure = read_smiles("NCC(=O)")
structure.print_graph()
-----------------------
{N_0: [C_1, H_4, H_5],
C_1: [N_0, C_2, H_6, H_7],
C_2: [C_1, O_3, H_8],
O_3: [C_2],
H_4: [N_0],
H_5: [N_0],
H_6: [C_1],
H_7: [C_1],
H_8: [C_2]}
Notice how the index '4', the first index not occurring in the SMILES string, is assigned the a hydrogen that is attached to the nitrogen atom which has index 0.
In contrast, GroupDefiner only takes a single integer, pointing at a single atom. For instance, if we were interested in methylating the nitrogen of the peptide backbone, we could define the nitrogen in peptide backbones but not in the side chains of lysines and ornithines as follows:
peptide_n = GroupDefiner('peptide N', 'NCC(=O)', 0)
Here, too, hydrogen atoms can be selected as group of interest if desired.
peptide_nh = GroupDefiner('peptide NH', 'NCC(=O)', 4)
peptide_nh = GroupDefiner('peptide NH', '[H]NCC(=O)', 0)
Next, occurrences of these bond and atom neighbourhoods can be easily detected in structures using the find_bonds and find_atoms functions respectively:
from pikachu.general import read_smiles
daptomycin = "CCCCCCCCCC(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@H]3[C@H](OC(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@H](NC(=O)CNC(=O)[C@@H](NC(=O)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](NC(=O)CNC3=O)CCCN)CC(=O)O)C)CC(=O)O)CO)[C@H](C)CC(=O)O)CC(=O)C4=CC=CC=C4N)C"
daptomycin_structure = read_smiles(daptomycin)
peptide_bonds = find_bonds(peptide_bond, daptomycin_structure)
print(peptide_bonds)
--------------------
[single_10:C_9_N_11, single_27:C_24_N_26, single_36:C_33_N_35, single_45:C_42_N_44, single_56:N_54_C_55, single_61:N_59_C_60, single_66:N_64_C_65, single_70:N_68_C_69, single_75:N_73_C_74, single_80:N_78_C_79, single_85:N_83_C_84, single_90:N_88_C_89, single_94:N_92_C_93]
peptide_ns = find_atoms(peptide_n, daptomycin_structure)
print(peptide_ns)
-----------------
[N_11, N_26, N_35, N_44, N_54, N_59, N_64, N_68, N_73, N_78, N_83, N_88, N_92]
Next, the resulting atoms and bonds can be used in reaction pathways, where bonds are broken or created, and atoms added or removed. PIKAChU hosts a range of methods for this in the Structure class, including add_atom, make_bond, break_bond and remove_atom. These functions automatically update electron positions in valence shells. For more complicated reactions, such as the reduction of a double bond to a single bond, more fine-tuned control is needed. An example of such a reaction can be found in example_reactions/ketoreductase.py.
cd example_reactions
python ketoreductase.py
PIKAChU already boasts two in-built reactions: hydrolysis and condensation. These functions take Structure and Bond objects as input, and return the reaction products as Structure objects.
from pikachu.reactions.basic_reactions import hydrolysis
from pikachu.general import read_smiles, draw_structure
from pikachu.reactions.functional_groups import BondDefiner, find_bonds
smiles = "NCC(=O)NCC(=O)O"
structure = read_smiles(smiles)
draw_structure(structure)
peptide_bond = BondDefiner("peptide bond", "C(=O)N", 0, 2)
peptide_bonds = find_bonds(peptide_bond, structure)
structures = hydrolysis(structure, peptide_bonds[0])
for structure in structures:
draw_structure(structure)
from pikachu.reactions.basic_reactions import condensation
from pikachu.general import read_smiles, draw_structure
from pikachu.reactions.functional_groups import BondDefiner, find_bonds
glycine = "NCC(=O)O"
alanine = "NC(C)C(=O)O"
c_oh_bond = BondDefiner("c_oh_bond", "C(=O)O", 0, 2)
n_h_bond = BondDefiner("n_h_bond", "NCC(=O)", 0, 4)
structure_1 = read_smiles(glycine)
structure_2 = read_smiles(alanine)
c_oh_glycine = find_bonds(c_oh_bond, structure_1)[0]
n_h_alanine = find_bonds(n_h_bond, structure_2)[0]
product, water = condensation(structure_1, structure_2, c_oh_glycine, n_h_alanine)
draw_structure(product)