Releases: scikit-adaptation/skada
0.4.0
Skada v0.4.0 Release Highlights
This update brings significant enhancements and new features:
- New Shallow Methods: MongeAlignment and JCPOT
- New Deep Methods: CAN, MCC, MDD, SPA, SourceOnly, and TargetOnly models.
- Scorers: Introduced MixValScorer and improved scorer compatibility with deep models.
- Subsampling Transformers: Added StratifiedDomainSubsampler and DomainSubsampler.
- Deep Models: Enhanced batch handling, fixed predict_proba, stabilized MDD loss, and fixed Deep Coral.
- Docs & Design: Added a contributor guide, new logo, and documentation updates.
What's Changed
- Update README.md with zenodo badge by @rflamary in #216
- [MRG] Add multi-domain Monge alignment and JCPOT Target shift method by @rflamary in #180
- [MRG] Add a parameter base_criterion to deep models by @tgnassou in #217
- [MRG] Add new scorer: MixValScorer by @YanisLalou in #221
- [MRG] Fix mixval by @antoinecollas in #222
- [MRG] Fix batch issue when generating features + add sample_weight in deep models by @YanisLalou in #220
- [MRG] Allow model selection cv to handle nd inputs by @YanisLalou in #225
- [MRG] In DEV, reshape features to 2D instead of input by @YanisLalou in #226
- [MRG] Add utilities functions to the doc by @antoinecollas in #227
- Add new logo! by @tgnassou in #223
- Fix ImportanceWeightedScorer compatibility with deep learning models by @YanisLalou in #232
- [MRG] fix param for Deepjdot by @tgnassou in #234
- [MRG] Add SourceOnly and TargetOnly models by @tgnassou in #233
- [MRG] Fix docstring for the regularization parameter of DA loss by @tgnassou in #230
- [MRG] Fix order of feature acquisition for deep module by @tgnassou in #235
- [MRG] Add recentering in DeepCoral by @tgnassou in #242
- [MRG] Add DomainOnlySampler and DomainOnlyDataloader for SourceOnly or TargetOnly deep methods by @tgnassou in #243
- [MRG] Modify sampler to take the max of the two domains by @tgnassou in #241
- Fix: Dev scorer wasn't working with SourceOnly and TargetOnly by @YanisLalou in #244
- [MRG] Fix deep coral by @antoinecollas in #246
- [MRG] Harmonize fixtures by @antoinecollas in #248
- [MRG] Bug fix when None in make_da_pipeline by @antoinecollas in #256
- [MRG] Handle edge case Mixvalscorer by @YanisLalou in #257
- [MRG] Add CAN Method by @YanisLalou in #251
- [MRG] Uncomment MMDTarSReweightAdapter tests by @YanisLalou in #260
- [MRG] Enhancements to DomainAwareNet and Scorers to handle allow_source arg by @YanisLalou in #258
- [MRG] Subsampling transformer by @rflamary in #259
- [MRG] Add MCC method by @tgnassou in #250
- [MRG] Fix callback issue in CAN by @YanisLalou in #265
- [MRG] fix predict_proba for deep method by @tgnassou in #247
- Batchnormfix2 by @antoinedemathelin in #266
- [MRG] Handle scalar sample domain by @antoinecollas in #267
- [MRG] Add DomainAndLabelStratifiedSubsampleTransformer + Fix DomainStratifiedSubsampleTransformer by @YanisLalou in #268
- [MRG] Check if sample_domain have only unique domains indexes in check_*_domain by @apmellot in #261
- [MRG] Add epsilon in MCC to prevent log(0) by @YanisLalou in #270
- [MRG] Handle edge case for DAN by @YanisLalou in #271
- [MRG] Handle edge cases for CAN by @YanisLalou in #269
- [MRG] Add MDD method by @ambroiseodt in #263
- [MRG] Fix dissimilarities computations of Deep CAN by @antoinecollas in #274
- [MRG] Remove redundant centroid computation in spherical k-means by @YanisLalou in #275
- [MRG] Fix mdd loss by @antoinecollas in #277
- [MRG] Apply label smoothing to stabilize MDD by @antoinecollas in #279
- [MRG] do not try to complete when X_source is empty by @antoinecollas in #280
- [MRG] Add SPA method by @tgnassou in #276
- [MRG] Add contributor guide by @tgnassou in #282
Full Changelog: 0.3.0...0.4.0
0.3.0
First release of SKADA!
The following algorithms are currently implemented.
Domain adaptation algorithms
- Sample reweighting methods (Gaussian [1], Discriminant [2], KLIEPReweight [3], DensityRatio [4], TarS [21], KMMReweight [23])
- Sample mapping methods (CORAL [5], Optimal Transport DA OTDA [6], LinearMonge [7], LS-ConS [21])
- Subspace methods (SubspaceAlignment [8], TCA [9], Transfer Subspace Learning [27])
- Other methods (JDOT [10], DASVM [11], OT Label Propagation [28])
Any method that can be cast as an adaptation of the input data can be used in one of two ways:
- a scikit-learn transformer (Adapter) which provides a full Classifier/Regressor estimator
- or an Adapter that can be used in a DA pipeline with make_da_pipeline.
Refer to the examples below and visit the gallery for more details.
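To make the idea of "an adaptation of the input data" concrete, here is a minimal sketch that does not use skada at all. It assumes a single feature and maps source samples onto the target's mean and standard deviation, which is the closed-form Monge map between two 1D Gaussians. The class name MeanStdAdapter and its fit/transform signature are hypothetical and only mimic the adapter-then-estimator pattern described above; they are not part of the skada API.

```python
from statistics import fmean, pstdev


class MeanStdAdapter:
    """Hypothetical 1D adapter (not skada API).

    Maps source samples with T(x) = m_t + (s_t / s_s) * (x - m_s),
    so the adapted source matches the target mean and std.
    """

    def fit(self, x_source, x_target):
        # Estimate first and second moments of each domain.
        self.m_s_, self.s_s_ = fmean(x_source), pstdev(x_source)
        self.m_t_, self.s_t_ = fmean(x_target), pstdev(x_target)
        if self.s_s_ == 0:
            raise ValueError("source feature has zero variance")
        return self

    def transform(self, x_source):
        # Rescale and shift source samples toward the target distribution.
        scale = self.s_t_ / self.s_s_
        return [self.m_t_ + scale * (x - self.m_s_) for x in x_source]


# Toy data: the target is a shifted and stretched copy of the source.
x_source = [0.0, 1.0, 2.0, 3.0, 4.0]
x_target = [10.0, 12.0, 14.0, 16.0, 18.0]

adapter = MeanStdAdapter().fit(x_source, x_target)
x_adapted = adapter.transform(x_source)
```

After this step, any ordinary estimator could be trained on x_adapted with the source labels and applied to target data. In skada, that chaining is what a DA pipeline built with make_da_pipeline is for.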
Deep learning domain adaptation algorithms
- Deep Correlation alignment (DeepCORAL [12])
- Deep joint distribution optimal (DeepJDOT [13])
- Divergence minimization (MMD/DAN [14])
- Adversarial/discriminator based DA (DANN [15], CDAN [16])
DA metrics
- Importance Weighted [17]
- Prediction entropy [18]
- Soft neighborhood density [19]
- Deep Embedded Validation (DEV) [20]
- Circular Validation [11]
References
[1] Shimodaira Hidetoshi. "Improving predictive inference under covariate shift by weighting the log-likelihood function." Journal of statistical planning and inference 90, no. 2 (2000): 227-244.
[2] Sugiyama Masashi, Taiji Suzuki, and Takafumi Kanamori. "Density-ratio matching under the Bregman divergence: a unified framework of density-ratio estimation." Annals of the Institute of Statistical Mathematics 64 (2012): 1009-1044.
[3] Sugiyama Masashi, Taiji Suzuki, Shinichi Nakajima, Hisashi Kashima, Paul Von Bünau, and Motoaki Kawanabe. "Direct importance estimation for covariate shift adaptation." Annals of the Institute of Statistical Mathematics 60 (2008): 699-746.
[4] Sugiyama Masashi, and Klaus-Robert Müller. "Input-dependent estimation of generalization error under covariate shift." (2005): 249-279.
[5] Sun Baochen, Jiashi Feng, and Kate Saenko. "Correlation alignment for unsupervised domain adaptation." Domain adaptation in computer vision applications (2017): 153-171.
[6] Courty Nicolas, Flamary Rémi, Tuia Devis, and Alain Rakotomamonjy. "Optimal transport for domain adaptation." IEEE Trans. Pattern Anal. Mach. Intell 1, no. 1-40 (2016): 2.
[7] Flamary, R., Lounici, K., & Ferrari, A. (2019). Concentration bounds for linear monge mapping estimation and optimal transport domain adaptation. arXiv preprint arXiv:1905.10155.
[8] Fernando, B., Habrard, A., Sebban, M., & Tuytelaars, T. (2013). Unsupervised visual domain adaptation using subspace alignment. In Proceedings of the IEEE international conference on computer vision (pp. 2960-2967).
[9] Pan, S. J., Tsang, I. W., Kwok, J. T., & Yang, Q. (2010). Domain adaptation via transfer component analysis. IEEE transactions on neural networks, 22(2), 199-210.
[10] Courty, N., Flamary, R., Habrard, A., & Rakotomamonjy, A. (2017). Joint distribution optimal transportation for domain adaptation. Advances in neural information processing systems, 30.
[11] Bruzzone, L., & Marconcini, M. (2009). Domain adaptation problems: A DASVM classification technique and a circular validation strategy. IEEE transactions on pattern analysis and machine intelligence, 32(5), 770-787.
[12] Sun, B., & Saenko, K. (2016). Deep coral: Correlation alignment for deep domain adaptation. In Computer Vision–ECCV 2016 Workshops: Amsterdam, The Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part III 14 (pp. 443-450). Springer International Publishing.
[13] Damodaran, B. B., Kellenberger, B., Flamary, R., Tuia, D., & Courty, N. (2018). Deepjdot: Deep joint distribution optimal transport for unsupervised domain adaptation. In Proceedings of the European conference on computer vision (ECCV) (pp. 447-463).
[14] Long, M., Cao, Y., Wang, J., & Jordan, M. (2015, June). Learning transferable features with deep adaptation networks. In International conference on machine learning (pp. 97-105). PMLR.
[15] Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., ... & Lempitsky, V. (2016). Domain-adversarial training of neural networks. Journal of machine learning research, 17(59), 1-35.
[16] Long, M., Cao, Z., Wang, J., & Jordan, M. I. (2018). Conditional adversarial domain adaptation. Advances in neural information processing systems, 31.
[17] Sugiyama, M., Krauledat, M., & Müller, K. R. (2007). Covariate shift adaptation by importance weighted cross validation. Journal of Machine Learning Research, 8(5).
[18] Morerio, P., Cavazza, J., & Murino, V. (2017). Minimal-entropy correlation alignment for unsupervised deep domain adaptation. arXiv preprint arXiv:1711.10288.
[19] Saito, K., Kim, D., Teterwak, P., Sclaroff, S., Darrell, T., & Saenko, K. (2021). Tune it the right way: Unsupervised validation of domain adaptation via soft neighborhood density. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 9184-9193).
[20] You, K., Wang, X., Long, M., & Jordan, M. (2019, May). Towards accurate model selection in deep unsupervised domain adaptation. In International Conference on Machine Learning (pp. 7124-7133). PMLR.
[21] Zhang, K., Schölkopf, B., Muandet, K., Wang, Z. (2013). Domain Adaptation under Target and Conditional Shift. In International Conference on Machine Learning (pp. 819-827). PMLR.
[22] Loog, M. (2012). Nearest neighbor-based importance weighting. In 2012 IEEE International Workshop on Machine Learning for Signal Processing, pages 1–6. IEEE (https://arxiv.org/pdf/2102.02291.pdf)
[23] Bruzzone, L., & Marconcini, M. (2009). Domain adaptation problems: A DASVM classification technique and a circular validation strategy. IEEE transactions on pattern analysis and machine intelligence, 32(5), 770-787. (https://rslab.disi.unitn.it/papers/R82-PAMI.pdf)
[24] Loog, M. (2012). Nearest neighbor-based importance weighting. In 2012 IEEE International Workshop on Machine Learning for Signal Processing, pages 1–6. IEEE (https://arxiv.org/pdf/2102.02291.pdf)
[25] J. Huang, A. Gretton, K. Borgwardt, B. Schölkopf and A. J. Smola. Correcting sample selection bias by unlabeled data. In NIPS, 2007. (https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=07117994f0971b2fc2df95adb373c31c3d313442)
[26] Long, M., Wang, J., Ding, G., Sun, J., and Yu, P. (2014). Transfer joint matching for unsupervised domain adaptation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1410–1417
[27] Si, S., Tao, D., & Geng, B. (2010). Bregman divergence-based regularization for transfer subspace learning. IEEE Transactions on Knowledge and Data Engineering.
[28] Solomon, J., Rustamov, R., Guibas, L., & Butscher, A. (2014, January). Wasserstein propagation for semi-supervised learning. In International Conference on Machine Learning (pp. 306-314). PMLR.
What's Changed
- Update previously used dataset fixture by @kachayev in #117
- Remove masked inputs only if estimator does not accept sample_domain by @kachayev in #123
- Fix DiscriminatorReweightDensity and ReweightDensity by @antoinedemathelin in #118
- [TO_REVIEW] _find_y_type return enum by @YanisLalou in #125
- [MRG] Implement CircularValidation as a scorer by @YanisLalou in #124
- No need for mark-as-final operation by @kachayev in #128
- [MRG] Add TarS method by @antoinecollas in #93
- Selector to avoid filtering out masked samples when fitting transformer by @kachayev in https://github.com/sciki...
0.2.3
0.2.1
0.2
This is the first tag for SKADA. The library is still under heavy development and should not be used in production. The API will definitely change in the future.
What's Changed
- Fix merge conflict in losses module by @kachayev in #23
- Follow github block-quote markdown syntax for the README by @kachayev in #22
- [MRG] CircleCI documentation by @rflamary in #26
- [MRG] Doc circleCI by @rflamary in #30
- Pipeline to respect default_selector parameters by @kachayev in #29
- [MRG] Debug examples by @rflamary in #31
- The selector to pass the params to the base estimator by @kachayev in #35
- Add code cells markup for dataset examples to make them interactive by @kachayev in #39
- Fix label masking for training dataset by @kachayev in #40
- make_da_pipeline helper to allow named estimators by @kachayev in #37
- Rename pack_flatten to pack_for_lodo by @kachayev in #27
- [DOC] Corrections README and string for DomainAware dataset by @YanisLalou in #42
- Rename Bunch keys by @YanisLalou in #44
- [DOC] Small README fix by @YanisLalou in #52
- [WIP] Add testing with minimal install and update it to use pip by @rflamary in #48
- [FIX] Switch NotImplementedError to ValueError + New test cases by @YanisLalou in #50
- Flake8 correction for _samples_generator.py by @YanisLalou in #54
- Additional flake8 fixes by @kachayev in #55
- Remove version for POT by @kachayev in #57
- DomainAwareDataset str repr edge case handling by @YanisLalou in #49
- Switch from dev0 to stable sklearn 1.4.0 by @kachayev in #60
- [FIX] Unwrap explicitly given selector before generating the name for the pipeline by @YanisLalou in #51
- [MRG] Make sure all API methods accept sample_domain as None by @YanisLalou in #53
- Fix flake8 errors by @kachayev in #62
- Doc fix by @YanisLalou in #66
- [MRG] Add test cases for the Reweight class by @YanisLalou in #70
- [TO_REVIEW] Using global variables instead of number by @YanisLalou in #68
- [TO_REVIEW] Switch allow_source to True by default by @YanisLalou in #64
- [Fix] Update flake8.yaml to actually run! by @rflamary in #72
- Fix flake8 for utils and tests by @kachayev in #73
- [MRG] Regression label for 2d classification data generation by @BuenoRuben in #69
- [WIP] Update test suite for base selector functionality by @kachayev in #74
- Fix masked inputs filtering in the base selector for regression tasks by @kachayev in #86
- [MRG] JDOT Regressor by @rflamary in #76
- Properly process sample_weight when using reweight adapters by @kachayev in #90
- Remove target labels in the method comparison example by @antoinedemathelin in #92
- [MRG] Modification source_target_merge function behaviours by @YanisLalou in #71
- [MRG] PredictionEntropyScorer output negative scores by @YanisLalou in #63
New Contributors
- @kachayev made their first contribution in #23
- @rflamary made their first contribution in #26
- @YanisLalou made their first contribution in #42
- @BuenoRuben made their first contribution in #69
- @antoinedemathelin made their first contribution in #92
Full Changelog: https://github.com/scikit-adaptation/skada/commits/0.2