Add slice_with_offset and dry_run Support for Tar Dataset Creation; New Script for Partial Conversion #10511

Closed
wants to merge 18 commits into from

Conversation

ssh-meister
Collaborator

@ssh-meister ssh-meister commented Sep 17, 2024

What does this PR do ?

  1. slice_with_offset Feature:
  • Added support for creating tar datasets from audio segments when the slice_with_offset flag is enabled and the manifest contains an offset field, so that each segment is sliced at the offset given in its manifest entry.
  2. dry_run Flag:
  • Introduced a dry_run flag that creates the sharded manifests without generating the tar files. This is useful for testing and validation: users can review the manifests before committing to the full dataset creation process.
  3. New Script - partial_conversion_to_tarred_audio_dataset.py:
  • Added a new script that creates individual shards from manifests produced in dry-run mode, so a dataset can be tarred shard by shard from previously generated manifests.
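
As a rough illustration of what slicing by offset involves, the sketch below cuts a segment out of a WAV file given an offset and a duration in seconds. It uses Python's standard wave module as a stand-in for soundfile, which the script itself imports; the helper name read_segment and its signature are hypothetical and not taken from the PR.

```python
import io
import wave
from typing import Optional


def read_segment(wav_bytes: bytes, offset: float = 0.0, duration: Optional[float] = None) -> bytes:
    """Return a WAV-encoded segment starting at `offset` seconds (hypothetical helper)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        # Clamp the start so setpos() never points past the end of the file.
        start = min(int(round(offset * rate)), total)
        # duration=None means "read to the end of the file".
        count = total - start if duration is None else int(round(duration * rate))
        src.setpos(start)
        frames = src.readframes(count)

    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setnchannels(params.nchannels)
        dst.setsampwidth(params.sampwidth)
        dst.setframerate(params.framerate)
        dst.writeframes(frames)
    return out.getvalue()
```

With an 8 kHz file, offset=0.25 and duration=0.5 would select frames 2000 through 5999.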

Changelog

  • Functions:
    -- create_shards: Updated to support the slice_with_offset flag for segment-based tar creation and dry_run functionality.
    -- Added new functionality to handle sharded manifests when dry_run is enabled.
  • New Script:
    -- partial_conversion_to_tarred_audio_dataset.py: Processes dry-run manifests to generate individual shards.

PR Type:

  • [v] New Feature

Additional Information

  • These changes give more control over tar dataset creation: manifests can be generated and reviewed first, and individual shards can then be tarred from them.

karpnv
karpnv previously approved these changes Sep 18, 2024
Collaborator

@karpnv karpnv left a comment

thanks!

encoded_audio = BytesIO()
if codec == "opus":
    kwargs = {"format": "ogg", "subtype": "opus"}
if codec is not None:
Collaborator

Actually we could do a fast-path here: if codec=None and offset=0 and duration=None, instead of reading+writing with soundfile (triggering encoding and decoding) just read the raw audio file bytes and write them to tar file directly. Will likely speed up data tarring by a non-trivial factor. Up to you.
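
A minimal sketch of the fast path described above, assuming the decision can be isolated in a small predicate. The names can_copy_raw and add_raw_file are hypothetical; only the standard tarfile calls are real, and the PR's own _write_to_tar may be organized differently.

```python
import os
import tarfile
from typing import Optional


def can_copy_raw(codec: Optional[str], offset: float, duration: Optional[float]) -> bool:
    """True when neither re-encoding nor slicing is requested (hypothetical helper)."""
    return codec is None and offset == 0 and duration is None


def add_raw_file(tar: tarfile.TarFile, audio_filepath: str, arcname: str) -> None:
    """Copy the file's bytes into the tar unchanged, skipping decode and encode."""
    info = tarfile.TarInfo(name=arcname)
    # TarInfo.size must be set explicitly; addfile() reads exactly that many bytes.
    info.size = os.path.getsize(audio_filepath)
    with open(audio_filepath, "rb") as f:
        tar.addfile(info, fileobj=f)
```

A caller would check can_copy_raw first and fall back to the soundfile read-and-write path otherwise.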

def _write_to_tar(
    self, tar, audio_filepath: str, squashed_filename: str, duration: float = None, offset: float = 0
) -> None:
    if ((codec := self.config.force_codec) is None or audio_filepath.endswith(f".{codec}")) and not duration:
Collaborator

There are so many if/else here now that I think we should have some unit test coverage for this script (or just the function).
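
One possible shape for such coverage, sketched under the assumption that the quoted condition is first extracted into a pure function. The name skips_reencoding and the case table are hypothetical; the expression itself is copied from the quoted line. Note that the quoted line does not reference offset, which is the kind of case a test like this would make explicit.

```python
from typing import Optional


def skips_reencoding(force_codec: Optional[str], audio_filepath: str, duration: Optional[float]) -> bool:
    """Mirror of the quoted condition, extracted so it can be tested in isolation."""
    return (force_codec is None or audio_filepath.endswith(f".{force_codec}")) and not duration


CASES = [
    # (force_codec, audio_filepath, duration, expected)
    (None, "a.wav", None, True),     # no codec forced, whole file
    ("flac", "a.flac", None, True),  # file is already in the forced codec
    ("flac", "a.wav", None, False),  # file must be re-encoded
    (None, "a.wav", 2.5, False),     # a segment was requested
    (None, "a.wav", 0.0, True),      # duration 0.0 is falsy, so it behaves like None
]


def run_cases() -> None:
    """Check every row of the table; fails with the offending inputs on mismatch."""
    for codec, path, duration, expected in CASES:
        assert skips_reencoding(codec, path, duration) is expected, (codec, path, duration)
```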

    ),
)
parser.add_argument(
    "--dry_run",
Collaborator

dry_run is not an accurate name since the script will create some output: the manifests without tar files. I suggest renaming it to --only_manifest instead.

BTW dry_run would also be a nice option, telling you very quickly how many shards with how much data per shard are going to be created without actually reading any audio or writing anything.
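
To make the distinction concrete, here is a hedged sketch of how the two flags could sit side by side. The --only_manifest flag follows the rename proposed above and does not exist in the PR; plan_shards is a hypothetical helper; the equal-sized contiguous split with the remainder dropped is an assumption about sharding, not a description of the script.

```python
import argparse
from typing import Dict, List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("--num_shards", type=int, default=2)
    # Writes sharded manifests but no tar files (what the PR currently calls dry_run).
    parser.add_argument("--only_manifest", action="store_true")
    # Reads no audio and writes nothing; only reports the planned shards.
    parser.add_argument("--dry_run", action="store_true")
    return parser


def plan_shards(entries: List[Dict], num_shards: int) -> List[Dict]:
    """Summarize entry count and total duration per shard without touching audio."""
    # Assumption: equal-sized contiguous shards, leftover entries are dropped.
    per_shard = len(entries) // num_shards
    plan = []
    for shard_id in range(num_shards):
        chunk = entries[shard_id * per_shard : (shard_id + 1) * per_shard]
        plan.append(
            {
                "shard_id": shard_id,
                "num_entries": len(chunk),
                "duration": sum(e["duration"] for e in chunk),
            }
        )
    return plan
```

Because plan_shards only needs the duration field of each manifest entry, it can run over a large manifest in seconds.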

"""
# Partial Tarred Audio Dataset Creator

## Overview
Collaborator

I'm not sure I understand the purpose of this script. Why would you want to only tar specific shards? Can you add some comments / docs about the intended use cases?

Contributor

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Oct 14, 2024
Contributor

This PR was closed because it has been inactive for 7 days since being marked as stale.

@github-actions github-actions bot closed this Oct 21, 2024
@karpnv karpnv reopened this Nov 21, 2024
@github-actions github-actions bot removed the stale label Nov 22, 2024
@karpnv karpnv changed the title Add slice_by_offset and dry_run Support for Tar Dataset Creation; New Script for Partial Conversion Add slice_with_offset and dry_run Support for Tar Dataset Creation; New Script for Partial Conversion Nov 22, 2024
Contributor

github-actions bot commented Dec 7, 2024

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Dec 7, 2024
Contributor

This PR was closed because it has been inactive for 7 days since being marked as stale.

@github-actions github-actions bot closed this Dec 15, 2024
@NVIDIA NVIDIA deleted a comment from Jorjeous Dec 17, 2024
@karpnv karpnv reopened this Dec 27, 2024
Contributor

beep boop 🤖: 🙏 The following files have warnings. In case you are familiar with these, please try helping us to improve the code base.


Your code was analyzed with PyLint. The following annotations have been identified:

************* Module partial_convertion_to_tarred_audio_dataset
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:32:0: C0301: Line too long (292/100) (line-too-long)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:36:0: C0301: Line too long (158/100) (line-too-long)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:46:0: C0301: Line too long (130/100) (line-too-long)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:68:0: C0301: Line too long (110/100) (line-too-long)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:72:0: C0301: Line too long (108/100) (line-too-long)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:75:0: C0301: Line too long (136/100) (line-too-long)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:98:0: C0301: Line too long (154/100) (line-too-long)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:136:0: C0301: Line too long (112/100) (line-too-long)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:148:0: C0301: Line too long (109/100) (line-too-long)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:152:0: C0301: Line too long (136/100) (line-too-long)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:156:0: C0301: Line too long (116/100) (line-too-long)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:163:0: C0301: Line too long (138/100) (line-too-long)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:169:0: C0301: Line too long (110/100) (line-too-long)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:1:0: C0114: Missing module docstring (missing-module-docstring)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:20:0: E0401: Unable to import 'hydra' (import-error)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:22:0: E0401: Unable to import 'hydra.core.config_store' (import-error)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:23:0: E0401: Unable to import 'joblib' (import-error)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:24:0: E0401: Unable to import 'omegaconf' (import-error)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:25:0: E0401: Unable to import 'tqdm' (import-error)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:27:0: W0105: String statement has no effect (pointless-string-statement)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:88:9: W1514: Using open without explicitly specifying an encoding (unspecified-encoding)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:161:4: R1720: Unnecessary "else" after "raise", remove the "else" and de-indent the code inside it (no-else-raise)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:178:20: W0212: Access to a protected member _create_shard of a client class (protected-access)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:183:0: C0206: Consider iterating with .items() (consider-using-dict-items)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:188:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/speech_recognition/partial_convertion_to_tarred_audio_dataset.py:195:4: E1120: No value for argument 'cfg' in function call (no-value-for-parameter)
************* Module convert_to_tarred_audio_dataset
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:24:0: C0301: Line too long (111/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:28:0: C0301: Line too long (128/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:30:0: C0301: Line too long (114/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:61:0: C0301: Line too long (103/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:129:0: C0301: Line too long (110/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:169:0: C0301: Line too long (108/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:185:0: C0301: Line too long (129/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:186:0: C0301: Line too long (138/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:187:0: C0301: Line too long (110/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:188:0: C0301: Line too long (117/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:189:0: C0301: Line too long (119/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:196:0: C0301: Line too long (108/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:197:0: C0301: Line too long (105/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:201:0: C0301: Line too long (125/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:206:0: C0301: Line too long (112/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:217:0: C0301: Line too long (113/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:220:0: C0301: Line too long (115/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:222:0: C0301: Line too long (117/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:271:0: C0301: Line too long (112/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:282:0: C0301: Line too long (106/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:321:0: C0301: Line too long (106/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:331:0: C0301: Line too long (106/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:337:0: C0301: Line too long (105/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:360:0: C0301: Line too long (111/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:362:0: C0301: Line too long (113/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:364:0: C0301: Line too long (106/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:365:0: C0301: Line too long (142/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:366:0: C0301: Line too long (115/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:367:0: C0301: Line too long (119/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:370:0: C0301: Line too long (112/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:373:0: C0301: Line too long (121/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:374:0: C0301: Line too long (110/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:377:0: C0301: Line too long (116/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:379:0: C0301: Line too long (123/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:380:0: C0301: Line too long (123/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:411:0: C0301: Line too long (103/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:417:0: C0301: Line too long (107/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:421:0: C0301: Line too long (121/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:441:0: C0301: Line too long (106/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:460:0: C0301: Line too long (115/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:474:0: C0301: Line too long (115/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:484:0: C0301: Line too long (106/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:502:0: C0301: Line too long (105/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:552:0: C0301: Line too long (120/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:555:0: C0301: Line too long (117/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:558:0: C0301: Line too long (101/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:560:0: C0301: Line too long (108/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:574:0: C0301: Line too long (105/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:576:0: C0301: Line too long (113/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:603:0: C0301: Line too long (111/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:616:0: C0301: Line too long (112/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:640:0: C0301: Line too long (105/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:692:0: C0301: Line too long (115/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:696:0: C0301: Line too long (144/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:822:0: C0301: Line too long (109/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:825:0: C0301: Line too long (114/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:834:0: C0301: Line too long (103/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:842:0: C0301: Line too long (124/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:864:0: C0301: Line too long (112/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:870:0: C0301: Line too long (112/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:875:0: C0301: Line too long (103/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:881:0: C0301: Line too long (136/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:905:0: C0301: Line too long (124/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:909:0: C0301: Line too long (118/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:938:0: C0301: Line too long (128/100) (line-too-long)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:90:0: E0401: Unable to import 'numpy' (import-error)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:91:0: E0401: Unable to import 'soundfile' (import-error)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:92:0: E0401: Unable to import 'joblib' (import-error)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:93:0: E0401: Unable to import 'omegaconf' (import-error)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:94:0: E0401: Unable to import 'tqdm' (import-error)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:105:0: C0115: Missing class docstring (missing-class-docstring)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:105:0: R0902: Too many instance attributes (14/7) (too-many-instance-attributes)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:123:0: C0115: Missing class docstring (missing-class-docstring)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:129:77: W0108: Lambda may not be necessary (unnecessary-lambda)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:135:4: C0116: Missing function or method docstring (missing-function-docstring)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:139:4: C0116: Missing function or method docstring (missing-function-docstring)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:145:4: C0116: Missing function or method docstring (missing-function-docstring)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:171:4: R0913: Too many arguments (7/5) (too-many-arguments)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:171:4: R0917: Too many positional arguments (7/5) (too-many-positional-arguments)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:171:4: R0914: Too many local variables (41/15) (too-many-locals)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:171:4: R0912: Too many branches (20/12) (too-many-branches)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:171:4: R0915: Too many statements (69/50) (too-many-statements)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:321:4: C0116: Missing function or method docstring (missing-function-docstring)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:322:8: E0401: Unable to import 'lhotse' (import-error)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:322:8: C0415: Import outside toplevel (lhotse.CutSet) (import-outside-toplevel)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:323:8: E0401: Unable to import 'lhotse.dataset.sampling.dynamic_bucketing' (import-error)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:323:8: C0415: Import outside toplevel (lhotse.dataset.sampling.dynamic_bucketing.estimate_duration_buckets) (import-outside-toplevel)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:325:8: E0401: Unable to import 'nemo.collections.common.data.lhotse.nemo_adapters' (import-error)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:325:8: C0415: Import outside toplevel (nemo.collections.common.data.lhotse.nemo_adapters.LazyNeMoIterator) (import-outside-toplevel)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:340:15: R1735: Consider using '{"use_lhotse": True, "use_bucketing": True, "num_buckets": num_buckets, ... }' instead of a call to 'dict'. (use-dict-literal)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:347:4: R0913: Too many arguments (7/5) (too-many-arguments)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:347:4: R0917: Too many positional arguments (7/5) (too-many-positional-arguments)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:347:4: R0914: Too many local variables (41/15) (too-many-locals)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:410:8: C0200: Consider using enumerate instead of iterating with range and len (consider-using-enumerate)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:347:4: R0912: Too many branches (18/12) (too-many-branches)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:347:4: R0915: Too many statements (86/50) (too-many-statements)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:573:4: R0913: Too many arguments (6/5) (too-many-arguments)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:573:4: R0917: Too many positional arguments (6/5) (too-many-positional-arguments)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:603:4: R0913: Too many arguments (6/5) (too-many-arguments)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:603:4: R0917: Too many positional arguments (6/5) (too-many-positional-arguments)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:603:4: R0914: Too many local variables (17/15) (too-many-locals)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:614:16: R1735: Consider using '{}' instead of a call to 'dict'. (use-dict-literal)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:603:4: R0912: Too many branches (15/12) (too-many-branches)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:612:18: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:672:4: C0116: Missing function or method docstring (missing-function-docstring)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:684:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:684:9: W0621: Redefining name 'args' from outer scope (line 942) (redefined-outer-name)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:705:0: C0116: Missing function or method docstring (missing-function-docstring)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:705:0: R0913: Too many arguments (19/5) (too-many-arguments)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:705:0: R0917: Too many positional arguments (19/5) (too-many-positional-arguments)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:705:0: R0914: Too many local variables (27/15) (too-many-locals)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:728:22: R1719: The if expression can be replaced with 'not test' (simplifiable-if-expression)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:749:8: R1722: Consider using 'sys.exit' instead (consider-using-sys-exit)
scripts/speech_recognition/convert_to_tarred_audio_dataset.py:804:8: E1123: Unexpected keyword argument 'slice_with_offset' in method call (unexpected-keyword-arg)

-----------------------------------
Your code has been rated at 5.75/10

Mitigation guide:

  • Add sensible and useful docstrings to functions and methods
  • For trivial methods like getter/setters, consider adding # pylint: disable=C0116 inside the function itself
  • To disable multiple functions/methods at once, put a # pylint: disable=C0116 before the first and a # pylint: enable=C0116 after the last.
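
For illustration, the disable/enable pair from the last bullet might be placed as follows. The ShardConfig class is hypothetical and exists only to show where the comments go.

```python
class ShardConfig:
    """Hypothetical container used only to illustrate the comment placement."""

    def __init__(self, num_shards: int) -> None:
        self._num_shards = num_shards

    # pylint: disable=C0116
    def get_num_shards(self) -> int:
        return self._num_shards

    def set_num_shards(self, value: int) -> None:
        self._num_shards = value

    # pylint: enable=C0116

    def describe(self) -> str:
        """Non-trivial methods still get a real docstring."""
        return f"{self._num_shards} shards"
```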

By applying these rules, we reduce the occurrence of this message in future.

Thank you for improving NeMo's documentation!

@github-actions github-actions bot removed the stale label Dec 28, 2024
Contributor

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Jan 11, 2025
Contributor

This PR was closed because it has been inactive for 7 days since being marked as stale.

@github-actions github-actions bot closed this Jan 19, 2025
3 participants