FEAT-#7203: Make sure modin works correctly with pandas, which uses pyarrow as a backend #7204
Merged
…ndas dataframes Signed-off-by: Anatoly Myachev <[email protected]>
… which uses pyarrow as a backend Signed-off-by: Anatoly Myachev <[email protected]>
…sue7203 Signed-off-by: Anatoly Myachev <[email protected]>
anmyachev requested review from devin-petersohn, mvashishtha, RehanSD, YarShev, vnlitvinov, dchigarev and a team as code owners, May 14, 2024 09:13
anmyachev commented May 14, 2024
# meaning that we shouldn't try computing a new dtype for this column,
# so marking it as 'unknown'
i: (
pandas.api.types.pandas_dtype(float)
IIUC there is no point in filling nans with this value, since this will be done later in the code.
YarShev reviewed May 14, 2024
Co-authored-by: Iaroslav Igoshev <[email protected]>
YarShev previously approved these changes May 14, 2024
LGTM! Left some minor comments.
YarShev approved these changes May 14, 2024
What do these changes do?
Blocked by #7199

The main idea is to disable type precomputation for a dataframe created using the pyarrow backend for pandas. For this purpose, additional information obtained when creating the internal `PandasDataframe` is used. Dataframes created from such a dataframe inherit this information. Type precomputation is enabled again after the types are materialized, if the backend is the default one. This can also happen when calling the `astype` function. This solution is not the fastest in terms of performance, but it is the most reliable and the easiest to support.

flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
git commit -s
docs/development/architecture.rst is up-to-date
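
The idea in the description above (mark a frame as pyarrow-backed, let derived frames inherit the mark, and re-enable dtype precomputation once materialized dtypes show the default backend) can be illustrated with a small toy model. This is only a sketch under assumptions: `ToyFrame`, `pandas_backend`, `derive` and `materialize_dtypes` are hypothetical names invented for illustration and are not taken from the Modin code base. Dtypes are represented as plain strings such as `"int64[pyarrow]"`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ToyFrame:
    """Hypothetical stand-in for an internal dataframe object (not Modin's class)."""

    # Precomputed column dtypes; None means "unknown, compute lazily".
    dtypes: Optional[Dict[str, str]]
    # None stands for the default backend, "pyarrow" for the pyarrow backend.
    pandas_backend: Optional[str] = None

    def derive(self, precomputed_dtypes: Dict[str, str]) -> "ToyFrame":
        """Build a child frame; the backend marker is inherited from the parent."""
        if self.pandas_backend == "pyarrow":
            # Precomputation is disabled: the child's dtypes stay unknown.
            return ToyFrame(dtypes=None, pandas_backend="pyarrow")
        return ToyFrame(dtypes=precomputed_dtypes, pandas_backend=None)

    def materialize_dtypes(self, actual_dtypes: Dict[str, str]) -> "ToyFrame":
        """Record the dtypes observed once the data has actually been computed."""
        uses_arrow = any("pyarrow" in dt for dt in actual_dtypes.values())
        backend = "pyarrow" if uses_arrow else None
        return ToyFrame(dtypes=dict(actual_dtypes), pandas_backend=backend)


# A pyarrow-backed frame: children do not get precomputed dtypes.
arrow_frame = ToyFrame(dtypes={"a": "int64[pyarrow]"}, pandas_backend="pyarrow")
child = arrow_frame.derive({"a": "float64"})

# After materialization shows only default-backend dtypes (for example
# following an astype-like conversion), precomputation is allowed again.
restored = child.materialize_dtypes({"a": "float64"})
grandchild = restored.derive({"a": "float64"})
```

The trade-off matches the one stated in the description: a pyarrow-backed frame gives up the speed of precomputed dtypes in exchange for never reporting a dtype that differs from what pandas actually produces.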