Flashlight and Pyctcdecode decoders #8428

Open
wants to merge 95 commits into
base: main

Collaborator

@karpnv karpnv commented Feb 15, 2024

Preserve Flashlight and Pyctcdecode beamsearch with Ngram LM

Support Flashlight and Pyctcdecode decoding with pure KenLM and NeMo KenLM
Standardize API of CLI inference scripts

Collection: ASR

Changelog

  • Fix install script install_beamsearch_decoders.sh
  • Create flashlight_lexicon file during scripts/asr_language_modeling/ngram_lm/train_kenlm.py and tar it with kenlm.bin
  • Unify parameters for eval_beamsearch_ngram_ctc.py, speech_to_text_eval.py and training
    -- Get logprobs from Hypothesis
    -- Use "pyctcdecode" strategy as default beamsearch algorithm denoted as "beam"
    -- Remove default seq2seq strategy
    -- Check decoding_type and search_type combinations
    -- Support empty string in nemo_kenlm_path and word_kenlm_path for beamsearch without LM (ZeroLM)
  • Fix bug with EncDecHybridRNNTCTCModel in examples/asr/transcribe_speech.py
  • Support AggregateTokenizer in scripts/asr_language_modeling/ngram_lm/create_lexicon_from_arpa.py
python3 scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py \
model_path=am_model.nemo  \
dataset_manifest=manifest.json  \
preds_output_folder=/tmp   \
ctc_decoding.strategy=flashlight \
ctc_decoding.beam.kenlm_path=am_model.kenlm \
ctc_decoding.beam.beam_size=[4]   \
ctc_decoding.beam.beam_alpha=[0.5]   \
ctc_decoding.beam.beam_beta=[0.5] \
batch_size=32  \
beam_batch_size=1 \
cuda=1
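
The list-valued overrides above (`beam_size=[4]`, `beam_alpha=[0.5]`, `beam_beta=[0.5]`) suggest a parameter sweep. Assuming each combination of the three lists is evaluated once, the grid could be enumerated roughly as follows. The helper name `expand_grid` is hypothetical and is not taken from the script.

```python
from itertools import product


def expand_grid(beam_size, beam_alpha, beam_beta):
    """Return one dict per (beam_size, beam_alpha, beam_beta) combination."""
    return [
        {"beam_size": size, "beam_alpha": alpha, "beam_beta": beta}
        for size, alpha, beta in product(beam_size, beam_alpha, beam_beta)
    ]


# Two beam sizes x two alphas x one beta gives four decoding runs.
grid = expand_grid([4, 8], [0.5, 1.0], [0.5])
for params in grid:
    print(params)
```

With single-element lists, as in the command above, the grid contains exactly one configuration.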

# Note for flashlight_cfg.lexicon_path: DEFAULT_TOKEN_OFFSET
python3 examples/asr/speech_to_text_eval.py \
model_path=am_model.nemo \
dataset_manifest=manifest.json \
decoder_type=ctc \
ctc_decoding.strategy=flashlight \
ctc_decoding.beam.nemo_kenlm_path=kenlm_model.bin \
ctc_decoding.beam.beam_size=4 \
ctc_decoding.beam.beam_alpha=0.5 \
ctc_decoding.beam.beam_beta=0.5 \
ctc_decoding.beam.flashlight_cfg.lexicon_path=am_model.flashlight_lexicon \
ctc_decoding.beam.return_best_hypothesis=true \
batch_size=32 \
output_filename=/tmp/manifest_out.json \
cuda=1

PR Type:

  • [ V] New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Additional Information

karpnv and others added 25 commits January 24, 2024 00:26
Signed-off-by: Nikolay Karpov <[email protected]>
@github-actions github-actions bot added the ASR label Feb 15, 2024
Contributor

github-actions bot commented Mar 1, 2024

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Mar 1, 2024
Contributor

github-actions bot commented Mar 9, 2024

This PR was closed because it has been inactive for 7 days since being marked as stale.

@github-actions github-actions bot closed this Mar 9, 2024
@github-actions github-actions bot removed the stale label Oct 18, 2024
Contributor

github-actions bot commented Nov 1, 2024

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Nov 1, 2024
@tbartley94 tbartley94 removed the stale label Nov 4, 2024
@tbartley94
Collaborator

@karpnv could you fix merge conflicts so this can be merged?

Signed-off-by: Nikolay Karpov <[email protected]>
@karpnv karpnv dismissed stale reviews from andrusenkoau and artbataev via 4f4212c November 7, 2024 14:24
@karpnv karpnv added Run CICD and removed Run CICD labels Nov 7, 2024
lexicon_path = os.path.join(tmpdir.name, lexicon[0].name)
SaveRestoreConnector._unpack_nemo_file(path2file=kenlm_path, out_folder=tmpdir.name, members=members)
cfg = OmegaConf.load(config_path)
return tmpdir, cfg.encoding_level, kenlm_model_path, lexicon_path

Check failure

Code scanning / CodeQL

Potentially uninitialized local variable Error

Local variable 'lexicon_path' may be used before it is initialized.
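
A minimal sketch of the pattern this check flags. The full function is not shown here, so the sketch assumes `lexicon_path` is assigned only when a lexicon member is found in the archive. The helper name `pick_lexicon_path` and the `.flashlight_lexicon` suffix test are hypothetical. Binding the variable to `None` before the conditional assignment is one common way to satisfy the check.

```python
import os
from typing import List, Optional


def pick_lexicon_path(member_names: List[str], out_folder: str) -> Optional[str]:
    """Return the path of the first lexicon member, or None if there is none."""
    # Bind the name up front so every code path reaches `return` with a value.
    lexicon_path = None
    # Assumption: lexicon files are recognised by this suffix.
    lexicon = [name for name in member_names if name.endswith(".flashlight_lexicon")]
    if lexicon:
        lexicon_path = os.path.join(out_folder, lexicon[0])
    return lexicon_path


print(pick_lexicon_path(["kenlm.bin"], "/tmp"))
print(pick_lexicon_path(["kenlm.bin", "am_model.flashlight_lexicon"], "/tmp"))
```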
try:
self.tmpdir, self.kenlm_encoding_level, self.kenlm_path, lexicon_path = get_nemolm(kenlm_path)
if not self.flashlight_cfg.lexicon_path:
self.flashlight_cfg.lexicon_path = lexicon_path

Check failure

Code scanning / CodeQL

Potentially uninitialized local variable Error

Local variable 'lexicon_path' may be used before it is initialized.
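
On the caller side, the same warning typically disappears once the name is bound before the `try` block. The sketch below assumes the loader may raise when the language model is not a packed archive. The names `resolve_lexicon_path`, `packed_lm`, and `plain_lm`, and the choice of caught exceptions, are hypothetical and only illustrate the control flow.

```python
from typing import Callable, Optional, Tuple


def resolve_lexicon_path(
    configured_path: Optional[str],
    load_lm: Callable[[], Tuple[str, str]],
) -> Optional[str]:
    """Prefer an explicitly configured lexicon, else the one shipped with the LM."""
    lexicon_path = None  # bound before `try`, so it is defined on every path
    try:
        _kenlm_path, lexicon_path = load_lm()
    except (OSError, ValueError):
        # Hypothetical fallback: no bundled lexicon is available.
        pass
    # An empty or missing configured path falls through to the bundled lexicon.
    return configured_path or lexicon_path


def packed_lm() -> Tuple[str, str]:
    """Stand-in loader that finds a bundled lexicon."""
    return "/tmp/kenlm.bin", "/tmp/am_model.flashlight_lexicon"


def plain_lm() -> Tuple[str, str]:
    """Stand-in loader that fails because the file is not a packed archive."""
    raise ValueError("not a packed archive")


print(resolve_lexicon_path("", packed_lm))
print(resolve_lexicon_path("custom.lexicon", packed_lm))
print(resolve_lexicon_path(None, plain_lm))
```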
Contributor

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

@tbartley94
Collaborator

@karpnv is this going to be completed or should we close it?

@github-actions github-actions bot added the stale label Jan 10, 2025
Contributor

This PR was closed because it has been inactive for 7 days since being marked as stale.

@github-actions github-actions bot closed this Jan 18, 2025
@tbartley94 tbartley94 removed the stale label Jan 21, 2025
@tbartley94 tbartley94 reopened this Jan 21, 2025
@tbartley94
Collaborator

@karpnv Can you finalize this PR?

Contributor

beep boop 🤖: 🙏 The following files have warnings. In case you are familiar with these, please try helping us to improve the code base.


Your code was analyzed with PyLint. The following annotations have been identified:

************* Module speech_to_text_eval
examples/asr/speech_to_text_eval.py:86:0: C0115: Missing class docstring (missing-class-docstring)
examples/asr/speech_to_text_eval.py:113:0: C0116: Missing function or method docstring (missing-function-docstring)
************* Module transcribe_speech
examples/asr/transcribe_speech.py:174:0: C0301: Line too long (139/119) (line-too-long)
examples/asr/transcribe_speech.py:382:0: C0301: Line too long (134/119) (line-too-long)
examples/asr/transcribe_speech.py:111:0: C0115: Missing class docstring (missing-class-docstring)
examples/asr/transcribe_speech.py:118:0: C0115: Missing class docstring (missing-class-docstring)
examples/asr/transcribe_speech.py:210:0: C0116: Missing function or method docstring (missing-function-docstring)
examples/asr/transcribe_speech.py:15:0: W0611: Unused import contextlib (unused-import)
************* Module nemo.collections.asr.models.hybrid_rnnt_ctc_models
nemo/collections/asr/models/hybrid_rnnt_ctc_models.py:123:0: C0301: Line too long (266/119) (line-too-long)
nemo/collections/asr/models/hybrid_rnnt_ctc_models.py:224:0: C0301: Line too long (120/119) (line-too-long)
nemo/collections/asr/models/hybrid_rnnt_ctc_models.py:674:0: C0301: Line too long (141/119) (line-too-long)
nemo/collections/asr/models/hybrid_rnnt_ctc_models.py:675:0: C0301: Line too long (139/119) (line-too-long)
nemo/collections/asr/models/hybrid_rnnt_ctc_models.py:16:0: W0611: Unused import json (unused-import)
nemo/collections/asr/models/hybrid_rnnt_ctc_models.py:17:0: W0611: Unused import os (unused-import)
nemo/collections/asr/models/hybrid_rnnt_ctc_models.py:18:0: W0611: Unused import tempfile (unused-import)
nemo/collections/asr/models/hybrid_rnnt_ctc_models.py:24:0: W0611: Unused tqdm imported from tqdm.auto (unused-import)
************* Module nemo.collections.asr.modules.flashlight_decoder
nemo/collections/asr/modules/flashlight_decoder.py:49:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/modules/flashlight_decoder.py:53:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/modules/flashlight_decoder.py:57:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/modules/flashlight_decoder.py:61:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/modules/flashlight_decoder.py:65:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/modules/flashlight_decoder.py:76:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/modules/flashlight_decoder.py:83:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/modules/flashlight_decoder.py:292:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/modules/flashlight_decoder.py:17:0: W0611: Unused Iterable imported from typing (unused-import)
nemo/collections/asr/modules/flashlight_decoder.py:17:0: W0611: Unused Tuple imported from typing (unused-import)
nemo/collections/asr/modules/flashlight_decoder.py:23:0: W0611: Unused typecheck imported from nemo.core.classes (unused-import)
nemo/collections/asr/modules/flashlight_decoder.py:24:0: W0611: Unused LengthsType imported from nemo.core.neural_types (unused-import)
nemo/collections/asr/modules/flashlight_decoder.py:24:0: W0611: Unused LogprobsType imported from nemo.core.neural_types (unused-import)
nemo/collections/asr/modules/flashlight_decoder.py:24:0: W0611: Unused NeuralType imported from nemo.core.neural_types (unused-import)
nemo/collections/asr/modules/flashlight_decoder.py:24:0: W0611: Unused PredictionsType imported from nemo.core.neural_types (unused-import)
************* Module nemo.collections.asr.parts.submodules.ctc_beam_decoding
nemo/collections/asr/parts/submodules/ctc_beam_decoding.py:197:0: C0301: Line too long (123/119) (line-too-long)
nemo/collections/asr/parts/submodules/ctc_beam_decoding.py:38:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/parts/submodules/ctc_beam_decoding.py:62:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/parts/submodules/ctc_beam_decoding.py:210:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/parts/submodules/ctc_beam_decoding.py:898:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/asr/parts/submodules/ctc_beam_decoding.py:910:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/asr/parts/submodules/ctc_beam_decoding.py:920:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/asr/parts/submodules/ctc_beam_decoding.py:937:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/asr/parts/submodules/ctc_beam_decoding.py:944:0: C0115: Missing class docstring (missing-class-docstring)
************* Module nemo.collections.asr.parts.submodules.ctc_decoding
nemo/collections/asr/parts/submodules/ctc_decoding.py:263:0: C0301: Line too long (209/119) (line-too-long)
nemo/collections/asr/parts/submodules/ctc_decoding.py:342:0: C0301: Line too long (136/119) (line-too-long)
nemo/collections/asr/parts/submodules/ctc_decoding.py:736:0: C0301: Line too long (151/119) (line-too-long)
nemo/collections/asr/parts/submodules/ctc_decoding.py:893:0: C0301: Line too long (125/119) (line-too-long)
nemo/collections/asr/parts/submodules/ctc_decoding.py:33:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/parts/submodules/ctc_decoding.py:967:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/parts/submodules/ctc_decoding.py:978:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/parts/submodules/ctc_decoding.py:989:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/asr/parts/submodules/ctc_decoding.py:1449:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/asr/parts/submodules/ctc_decoding.py:1496:0: C0115: Missing class docstring (missing-class-docstring)
************* Module scripts.asr_language_modeling.ngram_lm.create_lexicon_from_arpa
scripts/asr_language_modeling/ngram_lm/create_lexicon_from_arpa.py:33:0: C0116: Missing function or method docstring (missing-function-docstring)
************* Module scripts.asr_language_modeling.ngram_lm.eval_beamsearch_ngram_ctc
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py:19:0: C0301: Line too long (125/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py:54:0: C0301: Line too long (141/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py:55:0: C0301: Line too long (160/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py:175:0: C0301: Line too long (121/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py:205:0: C0301: Line too long (125/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py:284:0: C0301: Line too long (120/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py:371:0: C0301: Line too long (122/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py:372:0: C0301: Line too long (148/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py:422:0: C0301: Line too long (138/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py:424:0: C0301: Line too long (157/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py:135:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py:283:0: C0116: Missing function or method docstring (missing-function-docstring)
************* Module scripts.asr_language_modeling.ngram_lm.eval_beamsearch_ngram_transducer
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py:18:0: C0301: Line too long (129/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py:108:0: C0301: Line too long (126/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py:109:0: C0301: Line too long (140/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py:111:0: C0301: Line too long (149/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py:112:0: C0301: Line too long (162/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py:118:0: C0301: Line too long (139/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py:164:0: C0301: Line too long (245/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py:225:0: C0301: Line too long (149/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py:229:0: C0301: Line too long (132/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py:232:0: C0301: Line too long (124/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py:384:0: C0301: Line too long (121/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py:124:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py:240:0: C0116: Missing function or method docstring (missing-function-docstring)
************* Module scripts.asr_language_modeling.ngram_lm.kenlm_utils
scripts/asr_language_modeling/ngram_lm/kenlm_utils.py:20:0: C0301: Line too long (129/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/kenlm_utils.py:21:0: C0301: Line too long (124/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/kenlm_utils.py:80:0: C0301: Line too long (129/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/kenlm_utils.py:53:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/asr_language_modeling/ngram_lm/kenlm_utils.py:58:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/asr_language_modeling/ngram_lm/kenlm_utils.py:104:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/asr_language_modeling/ngram_lm/kenlm_utils.py:130:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/asr_language_modeling/ngram_lm/kenlm_utils.py:180:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/asr_language_modeling/ngram_lm/kenlm_utils.py:189:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/asr_language_modeling/ngram_lm/kenlm_utils.py:216:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/asr_language_modeling/ngram_lm/kenlm_utils.py:230:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/asr_language_modeling/ngram_lm/kenlm_utils.py:240:0: C0116: Missing function or method docstring (missing-function-docstring)
************* Module scripts.asr_language_modeling.ngram_lm.ngram_merge
scripts/asr_language_modeling/ngram_lm/ngram_merge.py:65:0: C0301: Line too long (122/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/ngram_merge.py:111:0: C0301: Line too long (395/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/ngram_merge.py:115:0: C0301: Line too long (201/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/ngram_merge.py:117:0: C0301: Line too long (161/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/ngram_merge.py:118:0: C0301: Line too long (216/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/ngram_merge.py:250:0: C0301: Line too long (156/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/ngram_merge.py:274:0: C0301: Line too long (147/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/ngram_merge.py:437:0: C0301: Line too long (140/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/ngram_merge.py:444:0: C0301: Line too long (144/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/ngram_merge.py:473:0: C0301: Line too long (129/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/ngram_merge.py:59:0: C0115: Missing class docstring (missing-class-docstring)
************* Module scripts.asr_language_modeling.ngram_lm.train_kenlm
scripts/asr_language_modeling/ngram_lm/train_kenlm.py:63:0: C0301: Line too long (133/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/train_kenlm.py:64:0: C0301: Line too long (128/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/train_kenlm.py:68:0: C0301: Line too long (169/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/train_kenlm.py:81:0: C0301: Line too long (128/119) (line-too-long)
scripts/asr_language_modeling/ngram_lm/train_kenlm.py:87:0: C0116: Missing function or method docstring (missing-function-docstring)

-----------------------------------
Your code has been rated at 9.57/10

Mitigation guide:

  • Add sensible and useful docstrings to functions and methods
  • For trivial methods like getter/setters, consider adding # pylint: disable=C0116 inside the function itself
  • To disable multiple functions/methods at once, put a # pylint: disable=C0116 before the first and a # pylint: enable=C0116 after the last.
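
A short sketch of the two suppression styles described in this guide, applied to a hypothetical class (`BeamConfig` and its accessors are illustrative and not part of NeMo). The first accessor carries the pragma inside its body; the last two are wrapped in a disable/enable pair.

```python
class BeamConfig:
    """Hypothetical container for two beam search settings."""

    def __init__(self, beam_size: int, beam_alpha: float):
        self._beam_size = beam_size
        self._beam_alpha = beam_alpha

    def get_beam_size(self) -> int:
        # pylint: disable=C0116
        return self._beam_size

    # pylint: disable=C0116
    def get_beam_alpha(self) -> float:
        return self._beam_alpha

    def set_beam_alpha(self, value: float) -> None:
        self._beam_alpha = value
    # pylint: enable=C0116


cfg = BeamConfig(beam_size=4, beam_alpha=0.5)
cfg.set_beam_alpha(1.0)
print(cfg.get_beam_size(), cfg.get_beam_alpha())
```

Non-trivial functions are still better served by a real docstring than by a suppression comment.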

By applying these rules, we reduce the occurrence of this message in the future.

Thank you for improving NeMo's documentation!

7 participants