GH1033 Add overloads of engine for pd.read_json #1035
I'm a little concerned about the misuse of ellipses with default arguments. An ellipsis should only be used on a parameter when that parameter is optional in that overload. When you want a specific result to happen because the argument was specified, you don't use an ellipsis. The overloads that require values to be specified (i.e., the ones without ellipses) should come before the ones that use ellipses, and the ones with ellipses should have "broad" types. So writing something like `engine: Literal["pyarrow"] = ...` can't be correct, because the default value of `engine` is `ujson`; if the stub is to work without that parameter being specified, it would have to be `engine: Literal["ujson", "pyarrow"] = ...`.
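To make the ordering rule concrete, here is a minimal sketch. It uses a hypothetical `load_json` function, not the actual pandas-stubs signature for `read_json`, and the string return value is only there so the example runs. The narrow overload (no ellipses, both arguments required) comes first; the broad overload with `= ...` comes second and only admits the value that the real default can take.

```python
from __future__ import annotations

from typing import Literal, overload


# Narrow overload first: both keywords are required, so there is no "= ...".
@overload
def load_json(
    path: str, *, lines: Literal[True], engine: Literal["pyarrow"]
) -> str: ...


# Broad overload second: both keywords may be omitted, so "= ..." is allowed,
# and the type of `engine` includes its runtime default ("ujson").
@overload
def load_json(
    path: str, *, lines: bool = ..., engine: Literal["ujson"] = ...
) -> str: ...


# Hypothetical implementation, for illustration only.
def load_json(path: str, *, lines: bool = False, engine: str = "ujson") -> str:
    if engine == "pyarrow" and not lines:
        raise ValueError("engine='pyarrow' requires lines=True")
    return f"{path}:{engine}:lines={lines}"
```

With this layout a type checker should accept `load_json("a.json")` and `load_json("a.json", lines=True, engine="pyarrow")`, and should report that no overload matches `load_json("a.json", engine="pyarrow")`.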
lines: Literal[True] = ...,
chunksize: None = ...,
compression: CompressionOptions = ...,
nrows: int | None = ...,
storage_options: StorageOptions = ...,
dtype_backend: DtypeBackend | NoDefault = ...,
engine: Literal["pyarrow"] = ...,
Since the default value of `lines` is `False`, if you have `...` for both `lines` and `engine`, it means that you don't have to specify either. So I think you don't want the ellipses here on either argument.
@@ -72,6 +98,7 @@ def read_json(
nrows: int | None = ...,
storage_options: StorageOptions = ...,
dtype_backend: DtypeBackend | NoDefault = ...,
engine: Literal["pyarrow"] = ...,
I think the ellipses here should be removed.
lines: Literal[True],
chunksize: int,
compression: CompressionOptions = ...,
nrows: int | None = ...,
storage_options: StorageOptions = ...,
dtype_backend: DtypeBackend | NoDefault = ...,
engine: Literal["pyarrow"] = ...,
remove ellipses
lines: Literal[True] = ...,
chunksize: None = ...,
compression: CompressionOptions = ...,
nrows: int | None = ...,
storage_options: StorageOptions = ...,
dtype_backend: DtypeBackend | NoDefault = ...,
engine: Literal["pyarrow"] = ...,
same comment about the ellipses
check(
    assert_type(
        pd.read_json(dd, lines=True, engine="pyarrow"),
        pd.DataFrame,
    ),
    pd.DataFrame,
)
You should add a test with `TYPE_CHECKING_INVALID_USAGE` that makes sure that we disallow `lines=False` with `engine="pyarrow"`.
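Something along these lines might work as a starting point. It is a sketch under assumptions: the flag is redefined locally so the snippet stands alone (in the test suite it would be imported rather than redefined), the test name and the `"data.json"` path are placeholders, and the ignore codes shown are guesses that may need adjusting for the mypy and pyright versions in use.

```python
from typing import TYPE_CHECKING

# Assumption: redefined here only so the sketch is self-contained; the real
# test module would import this flag instead.
TYPE_CHECKING_INVALID_USAGE = TYPE_CHECKING

if TYPE_CHECKING:
    import pandas as pd


def test_read_json_pyarrow_requires_lines() -> None:
    # The guarded block is never executed at runtime; it exists so that the
    # type checkers confirm the call is rejected by the overloads.
    if TYPE_CHECKING_INVALID_USAGE:
        # "data.json" is a placeholder path; the ignore codes are assumptions.
        pd.read_json("data.json", lines=False, engine="pyarrow")  # type: ignore[call-overload] # pyright: ignore[reportCallIssue]
```

If the overloads are correct, both checkers flag the guarded call and the ignore comments are consumed; if the call were accidentally allowed, checkers configured to report unused ignores would fail instead.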
`assert_type()` to assert the type of any return value