Releases: xhluca/dl-translate
dl-translate v0.2.0
Add m2m100 as the new default model to support 100 languages
Added
- dlt.lang.m2m100 module: Now has variables for over 100 languages, also auto-complete ready. Example: dlt.lang.m2m100.ENGLISH.
- dlt.utils.available_languages, dlt.utils.available_codes: Now support the argument "m2m100"
- Available languages for each model family
- Script and template to generate available languages
Changed
- [BREAKING] dlt.TranslationModel: A new parameter called model_family in the initialization function. Either "mbart50" or "m2m100". By default, it will be inferred based on model_or_path. Needs to be explicitly set if model_or_path is a path.
- [BREAKING] Default model changed to m2m100
- Docs and readme about mbart50 were reframed to take into account the new model
- dlt.TranslationModel.translate: Improved docstring to be more general.
- Tests pertaining to m2m100
- scripts/generate_langs.py: Renamed, mechanism now changed to loading from json files
- docs/index.md: Expand the "Usage" and "Advanced" sections
- README.md: Add acknowledgement about m2m100, significantly trim "Advanced" section, make "Usage" more concise
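The model_family rule described above (inferred from model_or_path by default, required explicitly when model_or_path is a path) can be illustrated with a minimal, library-free sketch. The function name resolve_model_family and the constant KNOWN_FAMILIES are hypothetical and are not part of dl-translate; the library's actual inference logic may recognize more names than the two short ones used here.

```python
from typing import Optional

# Assumption: only the two family names mentioned in the release notes.
KNOWN_FAMILIES = ("mbart50", "m2m100")


def resolve_model_family(model_or_path: str, model_family: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the rule described in the release notes."""
    if model_family is not None:
        # An explicit value always wins, but it must be a known family.
        if model_family not in KNOWN_FAMILIES:
            raise ValueError(f"model_family must be one of {KNOWN_FAMILIES}")
        return model_family
    if model_or_path in KNOWN_FAMILIES:
        # A short model name doubles as its own family.
        return model_or_path
    # Anything else is treated as a path, where the family cannot be guessed.
    raise ValueError("model_family must be set explicitly when model_or_path is a path")


print(resolve_model_family("m2m100"))
print(resolve_model_family("./my-local-model", model_family="mbart50"))
```

In terms of the library itself, this suggests that code loading a model from a local path would now need to pass model_family alongside model_or_path when constructing dlt.TranslationModel.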
Fixed
- dlt.TranslationModel.available_codes() was returning the languages instead of the codes. It will now correctly return the codes.
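To make the difference between languages and codes concrete, here is a small illustrative sketch. The mapping LANG_TO_CODE and all three functions are hypothetical stand-ins written for this example; they are not the library's implementation, and the mapping is only a tiny assumed subset.

```python
# Hypothetical subset of a language-name -> language-code mapping.
LANG_TO_CODE = {"English": "en", "French": "fr", "Hindi": "hi"}


def available_languages():
    # Human-readable language names.
    return list(LANG_TO_CODE.keys())


def available_codes_buggy():
    # The kind of bug described above: names are returned instead of codes.
    return list(LANG_TO_CODE.keys())


def available_codes():
    # Fixed behavior: return the codes themselves.
    return list(LANG_TO_CODE.values())


print(available_languages())
print(available_codes())
```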
Removed
- Output type hints for TranslationModel.get_transformers_model and TranslationModel.get_tokenizer
- [BREAKING] dlt.TranslationModel.bart_model and dlt.TranslationModel.tokenizer are no longer available to be used directly. Please use dlt.TranslationModel.get_transformers_model and dlt.TranslationModel.get_tokenizer instead.
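For code that previously read the bart_model and tokenizer attributes, the migration could look roughly like the sketch below. The helper get_model_and_tokenizer is hypothetical; it only assumes that the object passed in is a dlt.TranslationModel instance exposing the two accessor methods named in these notes, called without arguments.

```python
def get_model_and_tokenizer(mt):
    """Hypothetical helper; `mt` is assumed to be a dlt.TranslationModel instance."""
    # Before (no longer available): mt.bart_model, mt.tokenizer
    model = mt.get_transformers_model()
    tokenizer = mt.get_tokenizer()
    return model, tokenizer
```

Going through the accessor methods keeps calling code independent of how the model and tokenizer are stored internally.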
dl-translate v0.2.0rc1
Add m2m100 as an alternative to mbart50
m2m100 supports more languages (~110), and its authors also report absolute BLEU scores.
Added
- dlt.lang.m2m100 module: Now has variables for over 100 languages, also auto-complete ready. Example: dlt.lang.m2m100.ENGLISH.
- dlt.utils.available_languages, dlt.utils.available_codes: Now support the argument "m2m100"
Changed
- [BREAKING] dlt.TranslationModel: A new parameter called model_family in the initialization function. Either "mbart50" or "m2m100". By default, it will be inferred based on model_or_path. Needs to be explicitly set if model_or_path is a path.
- dlt.TranslationModel.translate: Improved docstring to be more general.
- Tests pertaining to m2m100
- scripts/generate_langs.py: Renamed, mechanism now changed to loading from json files
Fixed
- dlt.TranslationModel.available_codes() was returning the languages instead of the codes. It will now correctly return the codes.
Removed
- Output type hints for TranslationModel.get_transformers_model and TranslationModel.get_tokenizer
- [BREAKING] dlt.TranslationModel.bart_model and dlt.TranslationModel.tokenizer are no longer available to be used directly. Please use dlt.TranslationModel.get_transformers_model and dlt.TranslationModel.get_tokenizer instead.
dl-translate v0.1.0
Initial Release
This is the initial release of dl-translate, a deep learning-based translation library built on Huggingface transformers and Facebook's mBART-Large. To install, run:
pip install dl-translate
Check out the user guide to get started, or use one of the following links:
💻 GitHub Repository
📚 Documentation / Readthedocs
🐍 PyPi project
🧪 Colab Demo / Kaggle Demo