Dev See merge request cdd/DrugEx!96
Showing 44 changed files with 1,995 additions and 3,803 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,52 +1,26 @@ | ||
# Change Log | ||
From v3.3.0 to v3.4.0 | ||
From v3.4.0 to v3.4.1 | ||
|
||
## Fixes | ||
|
||
None. | ||
|
||
- Content of output files during model training and molecule generation (broken due to refactoring in `v3.4.0`): | ||
- During fine-tuning, the training (`train_loss`) and validation (`valid_loss`) losses and the ratios of valid (`valid_ratio`) and accurate (`accurate_ratio`, only for transformers) molecules are saved in `_fit.tsv` | ||
- During RL, the ratios of valid (`valid_ratio`), accurate (`accurate_ratio`, only for transformers), unique (`unique_ratio`) and desired (`desired_ratio`) molecules and the average arithmetic (`avg_amean`) and geometric (`avg_gmean`) means of the modified scores are saved in `_fit.tsv` | ||
- In `DrugExEnvironment.getScores()` set all modified scores to 0 for invalid molecules (fixes bug resulting from refactoring in `v3.4.0`) | ||
- Fixed the CLI so that it supports new QSPRPred models. | ||
- Fixed the tutorial for scaffold-based generation. | ||
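The score-zeroing fix and the RL summary columns listed above can be illustrated with a small, self-contained sketch. It is not the DrugEx implementation: the helper names `zero_invalid` and `summarize` are hypothetical, and it assumes that `avg_amean` and `avg_gmean` are the per-molecule arithmetic and geometric means of the modified scores, averaged over all generated molecules.

```python
from math import prod


def zero_invalid(scores, valid):
    """Set every modified score to 0 for molecules flagged as invalid."""
    return [row if ok else [0.0] * len(row) for row, ok in zip(scores, valid)]


def summarize(scores, valid, desired):
    """Compute a few of the summary values named in the change log.

    Assumption: avg_amean / avg_gmean average the per-molecule means
    over all molecules, with invalid molecules contributing 0.
    """
    n = len(scores)
    cleaned = zero_invalid(scores, valid)
    amean = [sum(row) / len(row) for row in cleaned]
    gmean = [prod(row) ** (1.0 / len(row)) for row in cleaned]
    return {
        "valid_ratio": sum(valid) / n,
        "desired_ratio": sum(desired) / n,
        "avg_amean": sum(amean) / n,
        "avg_gmean": sum(gmean) / n,
    }


# Three molecules, two objectives each; the third molecule is invalid.
scores = [[0.5, 0.5], [1.0, 0.25], [0.9, 0.9]]
valid = [True, True, False]
desired = [False, True, False]
summary = summarize(scores, valid, desired)
```

Because the invalid molecule's scores are zeroed before averaging, it lowers both averages instead of inflating them with scores that were never meaningful.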
|
||
## Changes | ||
|
||
Major refactoring of `drugex.training` | ||
|
||
- Moving generators from `drugex.training.models` to `drugex.training.generators`, and harmonizing and renaming them | ||
- `RNN` -> `SequenceRNN` | ||
- `GPT2Model` -> `SequenceTransformer` | ||
- `GraphModel` -> `GraphTransformer` | ||
|
||
- Moving explorers from `drugex.training.models` to `drugex.training.explorers`, harmonizing and renaming them | ||
- `SmilesExplorerNoFrag` -> `SequenceExplorer` | ||
- `SmilesExplorer` -> `FragSequenceExplorer` | ||
- `GraphExplorer` -> `FragGraphExplorer` | ||
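For code that still uses the old class names, the renames listed above can be applied mechanically. The sketch below is a hypothetical migration helper, not part of DrugEx: it only rewrites the six class names from this change log and does not update import paths (`drugex.training.models` to `drugex.training.generators` or `drugex.training.explorers`), which would still need to be changed by hand.

```python
import re

# Old -> new class names, taken from the change log entries above.
RENAMES = {
    "RNN": "SequenceRNN",
    "GPT2Model": "SequenceTransformer",
    "GraphModel": "GraphTransformer",
    "SmilesExplorerNoFrag": "SequenceExplorer",
    "SmilesExplorer": "FragSequenceExplorer",
    "GraphExplorer": "FragGraphExplorer",
}

# Match whole identifiers only, so that e.g. "SequenceRNN" is left alone.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(RENAMES, key=len, reverse=True)) + r")\b"
)


def migrate_names(source: str) -> str:
    """Replace old DrugEx class names in a piece of source text."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


old_code = "model = GPT2Model(voc); agent = SmilesExplorerNoFrag(model)"
new_code = migrate_names(old_code)
```

Matching on word boundaries makes the helper idempotent: running it again on already migrated code changes nothing.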
|
||
- Removal of all obsolete modules related to the two discontinued fragment-based LSTM models from [DrugEx v3](https://doi.org/10.26434/chemrxiv-2021-px6kz). | ||
|
||
- The generators' `sample_smiles()` has been replaced by a `generate()` function | ||
|
||
- Clarification of the terms qualifying the generated molecules, which now have the following unique and constant definitions (replacing the ambiguous `VALID` and `DESIRE` terms) | ||
- `Valid` : molecule can be parsed with rdkit | ||
- `Accurate` : molecule contains given input fragments | ||
- `Desired` : molecule fulfils all given objectives | ||
|
||
- Minimal supported version of QSPRPred compatible with the tutorial and CLI is now `v1.3.0.dev0`. | ||
- The `train` CLI script now uses the `'-p', '--predictor'` option to specify the QSPRPred model to use. It takes a path to the model's `_meta.json` file. More models can be specified this way. | ||
- This changes the original meaning of the `'-ta', '--active_targets'`, `'-ti', '--inactive_targets'` and `'-tw', '--window_targets'` options. These now serve to link the models to the particular type of target. The name of the QSPRPred model is used to determine the type of target it represents. For example, if the QSPRPred model is called `A2AR_RandomForestClassifier`, then the `'-ta', '--active_targets'` option will be used to link to the `A2AR_RandomForestClassifier` as a predictor predicting activity towards a target. | ||
- Standard crowding distance is now the default ranking method for the `train` script (equiv. to `--scheme PRCD`, previously was `--scheme PRTD`). | ||
|
||
- Revise implementation of Tanimoto distance-based Pareto ranking scheme (`SimilarityRanking`) to correspond to the method described in [DrugEx v2](https://doi.org/10.1186/s13321-021-00561-9). Add option to use the minimum Tanimoto distance between molecules in a front instead of the mean distance. | ||
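The difference between the mean and the minimum Tanimoto distance within a front can be sketched as follows. This is an illustration under stated assumptions, not the `SimilarityRanking` code: fingerprints are represented as plain sets of on-bit indices, and the function names and the `use_min` parameter are hypothetical.

```python
def tanimoto_distance(a, b):
    """Tanimoto distance, 1 - |A & B| / |A | B|, between two sets of on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def front_distances(fingerprints, use_min=False):
    """Distance of each molecule to the other molecules in the same front.

    With use_min=False the mean distance to the others is returned,
    with use_min=True the distance to the nearest neighbour.
    """
    result = []
    for i, fp in enumerate(fingerprints):
        others = [
            tanimoto_distance(fp, other)
            for j, other in enumerate(fingerprints)
            if j != i
        ]
        if not others:
            result.append(0.0)
        elif use_min:
            result.append(min(others))
        else:
            result.append(sum(others) / len(others))
    return result


front = [{1, 2, 3}, {1, 2, 3, 4}, {7, 8}]
mean_dist = front_distances(front)
min_dist = front_distances(front, use_min=True)
```

In this toy front the first two molecules are near-duplicates. The mean distance still looks moderate for them because the third molecule is far away, while the minimum distance exposes the close neighbour directly.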
|
||
- Remove all references to NN-based RAscore (already discontinued) | ||
|
||
Refactoring of CLI | ||
|
||
- Refactoring `dataset.py` and `train.py` to be object-based | ||
- Writing a single `.txt.vocab` file per dataset preprocessing instead of separate (duplicate) files for each subset in `dataset.py` | ||
|
||
## Removed | ||
|
||
- `--save_voc` argument in `dataset.py` as redundant | ||
- `--pretrained_model` argument in `train.py` (merged with `--agent_path`) | ||
- `memory` parameter and all associated code from `SequenceRNN` | ||
## Removed Features | ||
|
||
None. | ||
|
||
## New Features | ||
|
||
- GRU-based RNN added to the CLI | ||
- added another possible implementation of similarity ranking (`MutualSimilaritySortRanking`), this is based on the code in the original repository of [DrugEx](https://github.com/XuhanLiu/DrugEx/blob/cd384f4a8ed4982776e92293f77afd4ea78644f9/utils/nsgaii.py#L92) | ||
None. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,4 +5,4 @@ | |
On: 24.06.22, 10:36 | ||
""" | ||
|
||
VERSION = "3.4.0" | ||
VERSION = "3.4.1" |