-
Notifications
You must be signed in to change notification settings - Fork 580
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement additional CLI repo functionality #1671
base: main
Are you sure you want to change the base?
Implement additional CLI repo functionality #1671
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. |
Thinking of this... maybe |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #1671 +/- ##
==========================================
- Coverage 81.52% 81.21% -0.32%
==========================================
Files 62 63 +1
Lines 7198 7228 +30
==========================================
+ Hits 5868 5870 +2
- Misses 1330 1358 +28
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @martinbrose thanks for taking this issue! I've made a high-level review focused on how we should build and think the CLI in my opinion.
- first, that was a good idea to take
create_repo
from the UsersCommand and create a new dedicated one for create/list/delete - now that we reboot the CLI, I think it's time to get rid of most of the legacy code and structure. In particular:
- no mention to local git whatsoever
- trying to clean the output so that it can be piped to another command
- parse the args and statically type them in
__init__
, when possible. No need to do it when it's not possible. - keep the logic as lightweight as possible. Ideally most of the logic must be in the methods we call (create_repo, delete_repo, list_models,...). The only remaining logic to implement is how to parse and forward the values to those methods.
- I also made some suggestion on the argument names (repo_id, --repo-type, --private, --token)... Let's try to be as consistent as possible with the other CLIs we are building (especially upload/download).
In general, please let me know if you have any questions or want to challenge the comments below. If you think some things can be improved in my suggestion, please let me know! Rebooting this CLI is a great opportunity to (re-)design it and do what we feel is best for the users, knowing the current state of huggingface_hub
. The previous iteration dates back ~2y ago so the expectation/usage were quite different! 🤗
@@ -36,6 +37,7 @@ def main(): | |||
LfsCommands.register_subcommand(commands_parser) | |||
ScanCacheCommand.register_subcommand(commands_parser) | |||
DeleteCacheCommand.register_subcommand(commands_parser) | |||
RepoCommands.register_subcommand(commands_parser) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you move this line up between UserCommand
and UploadCommand
? Looks more important than the lfs and cache commands IMO
default="model", | ||
help='Optional: type: set to "dataset" or "space" if creating a dataset or space, default is model.', | ||
) | ||
repo_create_parser.add_argument("--organization", type=str, help="Optional: organization namespace.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Organization is a bit of a legacy argument. Let's remove it as it is now possible to directly set a repo_id.
repo_create_parser.add_argument( | ||
"name", | ||
type=str, | ||
help="Name for your repo. Will be namespaced under your username to build the repo id.", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would renamed it repo_id
for consistency with the create_repo
method.
repo_create_parser.add_argument( | ||
"--type", | ||
choices=["model", "dataset", "space"], | ||
default="model", | ||
help='Optional: type: set to "dataset" or "space" if creating a dataset or space, default is model.', | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename it --repo-type
for consistency as well (same for other parsers)
repo_create_parser.add_argument( | ||
"--space_sdk", | ||
type=str, | ||
help='Optional: Hugging Face Spaces SDK type. Required when --type is set to "space".', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
help='Optional: Hugging Face Spaces SDK type. Required when --type is set to "space".', | |
help='Choice of SDK to use if --repo-type="space"', |
No need to say it's optional (--
args are basically all optional). Same for the other descriptions :)
print("\nYour repo now lives at:") | ||
print(f" {ANSI.bold(url)}") | ||
print("\nYou can clone it locally with the command below, and commit/push as usual.") | ||
print(f"\n git clone {url}") | ||
print("") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print("\nYour repo now lives at:") | |
print(f" {ANSI.bold(url)}") | |
print("\nYou can clone it locally with the command below, and commit/push as usual.") | |
print(f"\n git clone {url}") | |
print("") | |
print("\nYour repo now lives at:") | |
print(f" {ANSI.bold(url)}") | |
print("\nYou can clone it locally with the command below, and commit/push as usual.") | |
print(f"\n git clone {url}") | |
print("") |
That's very legacy as well. Who wants to clone models nowadays? 😄
self.type = self.args.type | ||
self.author = self.args.author | ||
self.search = self.args.search |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be moved to __init__
and typed.
print("\nYour repos:") | ||
for repo in repos: | ||
print(f" {ANSI.bold(repo.id)}") | ||
print("") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print("\nYour repos:") | |
for repo in repos: | |
print(f" {ANSI.bold(repo.id)}") | |
print("") | |
for repo in repos: | |
print(repo.id) |
I would go straight to the point => only print the ids. The advantage of not being verbose is that the output can be piped to another command directly.
For example:
huggingface-cli repo list --author=Wauplin | wc -l
=> counts the number of models on the Hub for author "Wauplin".
This is much harder to do when there's more colors/formatting.
class RepoToggleCommand(BaseRepoCommand): | ||
def run(self): | ||
self.private = self.args.private | ||
user = self._api.whoami(self.token)["name"] | ||
namespace = self.args.organization if self.args.organization is not None else user |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking of this... maybe huggingface-cli repo toggle should be huggingface-cli repo edit [] --visibility public / private like in GitHub gh repo edit?
That's a good question. I would personally be in favor of
huggingface-cli repo set-public Wauplin/my-cool-model
huggingface-cli repo set-private Wauplin/my-cool-dataset --repo-type="dataset"
This way set-public
and set-private
are explicit commands. Both commands can share the same logic I think. Looking at gh
CLI is a good idea to take inspiration from. In this special case I still think we'll have less possible settings to play so being explicit is fine.
user = self._api.whoami(self.token)["name"] | ||
namespace = self.args.organization if self.args.organization is not None else user | ||
|
||
repo_id = f"{namespace}/{self.args.name}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing a whoami
to get the namespace (if not provided) by the user is really useful here as it cannot be implicitly used (in opposition to create_repo/delete_repo)! So good to do it this way :)
References #1656
Key features
huggingface-cli repo list
: List repos on huggingface.co, with additional filter for author and string matchhuggingface-cli repo delete
: Delete a repo on huggingface.cohuggingface-cli repo toggle
: Toggle a repo on huggingface.co private or publichuggingface-cli repo create
fromuser.py
to the repo focussedrepo.py
within thesrc/huggingface_hub/commands
folder