-
Notifications
You must be signed in to change notification settings - Fork 7
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
fix bug feature_standardizer not applied
- Loading branch information
Showing
2 changed files
with
7 additions
and
39 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,49 +1,17 @@ | ||
# Change Log | ||
|
||
From v2.0.1 to v2.1.0.a2 | ||
From v2.1.0 to v2.1.1 | ||
|
||
## Fixes | ||
|
||
- fixed error with serialization of the `DataFrameDescriptorSet` (#63) | ||
- Papyrus descriptors are not fetched by default anymore from the `Papyrus` adapter, which caused fetching of unnecessary data. | ||
- A [potential bug in new version of pandas](https://github.com/pandas-dev/pandas/issues/55009) broke scaffold generation so a workaround was implemented. | ||
- ⚠️ Important! ⚠️ Fixed bug in `predictMols` where the `feature_standardizer` was | ||
not being applied to the calculated features. This bug was introduced in v2.1.0. | ||
Models trained with v2.1.0 are compatible with v2.1.1, make sure to update | ||
QSPRpred to v2.1.1 to ensure that the `feature_standardizer` is applied when | ||
predicting on new molecules. | ||
|
||
## Changes | ||
- `QSPRModel.evaluate` moved to a separate class `EvaluationMethod` in `qsprpred.models.interfaces`, with subclasses for cross-validation and making predictions on a test set in `qsprpred.models.evaluation_methods` (`CrossValidation` and `EvaluateTestSetPerformance` respectively). | ||
- `QSPRModel` attribute `scoreFunc` is removed. | ||
- 'qspr/models' is no longer added to the output path of `QSPRModel.save`, allowing for complete control over the output path. | ||
- `SKlearnMetrics.supportsTask` now uses a dictionary like dict[ModelTasks, list[str]] to map tasks to supported metric names. (#53) | ||
- `GBMTRandomSplit` and `ScaffoldSplit` now use the `GBMTDataSplit` to create balanced splits. `RandomSplit` still functions the same way as a completely random test split. | ||
- `PCMSplit` replaces `StratifiedPerTarget` and is compatible with `RandomSplit`, `ScaffoldSplit` and `ClusterSplit`. | ||
- `DuplicatesFilter` refactored to`RepeatsFilter`, as it also captures scenarios where triplicates/quadruplicates are found in the dataset. These scenarios are now also covered by the respective UnitTest. | ||
- The versioning scheme of development snapshots has changed from `devX` to `alphaX`/`betaX`, where `X` is an integer that increments with each release. | ||
- The following model class have been renamed and moved: | ||
- `models.models.QSPRsklearn` > `models.sklearn.SklearnModel` | ||
- `deep.models.QSPRDNN` > `extra.gpu.models.dnn.DNNModel` | ||
- `extra.models.pcm.ModelPCM` > `extra.models.pcm.PCMModel` | ||
- `extra.models.pcm.QSPRsklearnPCM` > `extra.models.pcm.SklearnPCMModel` | ||
- The command line interface modules now use input and output file paths instead | ||
of automatically placing all files in a subfolder `qspr`, allowing for more | ||
control over the output and input paths. | ||
|
||
## New Features | ||
- `GBMTDataSplit` - parent class to create globally balanced splits with the [gbmt-split](https://github.com/sohviluukkonen/gbmt-splits) package. | ||
- `ClusterSplit` - splits data based clustering of molecular fingerprints (uses `GBMTDataSplit`). | ||
- Raise error if search space for optuna optimization is missing search space type annotation or if type not in list. | ||
- When installing package with pip, the commit hash and date of the installation is saved into `qsprpred._version` | ||
- `HyperParameterOptimization` classes now accept a `evaluation_method` argument, which is an instance of `EvaluationMethod` (see above). This allows for hyperparameter optimization to be performed on a test set, or on a cross-validation set. (#11) | ||
- `HyperParameterOptimization` now accepts `score_aggregation` argument, which is a function that takes a list of scores and returns a single score. This allows for the use of different aggregation functions, such as `np.mean` or `np.median` to combine scores from different folds. (#45) | ||
- A new tutorial `adding_new_components.ipynb` has been added to the `tutorials` folder, which demonstrates how to add new model to QSPRpred. | ||
- A new function `Metrics.checkMetricCompatibility` has been added, which checks if a metric is compatible with a given task and a given prediction methods (i.e. `predict` or `predictProba`) | ||
- In `EvaluationMethod` (see above), an attribute `use_proba` has been added, which determines whether the `predict` or `predictProba` method is used to make predictions (#56). | ||
- Add new descriptorset `SmilesDesc` to use the smiles strings as a descriptor. | ||
- New module `early_stopping` with classes `EarlyStopping` and `EarlyStoppingMode` has been added. This module allows for more control over early stopping in models that support it. | ||
- Add new descriptorset `SmilesDesc` to use the smiles strings as a descriptor. | ||
- Refactoring of the test suite under `qsprpred.data` and improvement of temporary file handling (!114). | ||
- `PyBoostModel` - QSPRpred wrapper for py-boost models. Requires optional `pyboost` dependencies. | ||
- `ChempropModel` - QSPRpred wrapper for Chemprop models. Requires optional `deep` dependencies. | ||
- The `data_CLI` argument `--log_transform` (`-lt`) has been changed to `--transform_data` (`-t`), which now accepts a number of transformations to apply to the target data. Available transformations are `log`, `log10`, `log2`, `sqrt`, `cbrt`, `exp`, `exp2`, `exp10`, `square`, `cube`, `reciprocal`. | ||
- New `data_CLI`, `model_CLI` and `predict_CLI` argument `--skip_backup` (`-sb`) to skip the backup of the output files. WARNING: This will overwrite existing files. | ||
|
||
## Removed Features | ||
- `StratifiedPerTarget` is replaced by `PCMSplit`. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters