Rethink "eager" vs. "lazy" #473

Open
krlmlr opened this issue Jan 17, 2025 · 11 comments · May be fixed by #487

Member

krlmlr commented Jan 17, 2025

Already used elsewhere: https://duckdb.org/2024/04/02/duckplyr.html#eager-vs-lazy-materialization

Perhaps auto_collect? Or auto_mat?

krlmlr added this to the 1.0.0 milestone Jan 17, 2025

Member Author

krlmlr commented Jan 18, 2025

Or strict?

Member Author

krlmlr commented Jan 19, 2025

Or "cautious" and "daring"/"fearless"?

Member Author

krlmlr commented Jan 20, 2025

We should be able to restrict automatic materialization to results that stay below a certain size (number of rows/cells/bytes):

  • bound (or perhaps bounded):
    • bound = TRUE
    • bound = c(cells = 1000)
    • bound = c(rows = 100)
  • tether:
    • tether = TRUE
    • tether = c(cells = 1000)
    • tether = c(rows = 100)

I'll go with "tether" because this is easy to search and replace later.
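
To make the proposed semantics concrete, here is a minimal base R sketch of how such a `tether` value could be interpreted. The helper `can_materialize()` is hypothetical and not part of duckplyr. It assumes that `tether = TRUE` forbids automatic materialization altogether, that `tether = FALSE` always allows it, and that named limits are inclusive. A bytes limit is left out.

```r
# Hypothetical helper, not duckplyr API: decide whether a result with
# `n_rows` rows and `n_cols` columns may be materialized automatically.
can_materialize <- function(n_rows, n_cols, tether = TRUE) {
  # Assumption: tether = FALSE means untethered, always allowed.
  if (identical(tether, FALSE)) {
    return(TRUE)
  }
  # Assumption: tether = TRUE means no automatic materialization at all.
  if (identical(tether, TRUE)) {
    return(FALSE)
  }
  # Otherwise expect a named numeric vector such as c(rows = 100).
  stopifnot(is.numeric(tether), !is.null(names(tether)))
  stopifnot(all(names(tether) %in% c("rows", "cells")))
  sizes <- c(rows = n_rows, cells = n_rows * n_cols)
  # Allowed only if every stated limit is respected (inclusive).
  all(sizes[names(tether)] <= tether)
}

can_materialize(50, 4, tether = c(rows = 100))
can_materialize(500, 4, tether = c(cells = 1000))
```

Under these assumptions, the first call is allowed and the second is refused because 500 rows by 4 columns gives 2000 cells.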

Collaborator

maelle commented Jan 20, 2025

@krlmlr so do you agree that, with the current usage of "lazy" at the R and C levels, it sounds like there are two levels of laziness? Did I understand this correctly?

I think aligning the terms in the R interface with what dtplyr/dbplyr do makes them easier for users to pick up.

Collaborator

maelle commented Jan 20, 2025

Maybe the opportunity to create a diagram 😸

Member Author

krlmlr commented Jan 20, 2025

Yes, there are two distinctions: eager vs. lazy (duckplyr is always lazy, dplyr is eager), and tethered vs. untethered (restrictions on automatic materialization; eager is always untethered by design).
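
The two distinctions can be illustrated with a small base R sketch. The names `lazy_result()`, `collect` and `auto` are hypothetical and only model the idea; they are not how duckplyr is implemented. An eager result would simply call `compute()` right away, whereas the lazy result below defers the work, and the tether only governs whether automatic materialization is permitted.

```r
# Hypothetical sketch, not duckplyr API: a lazy result that computes on
# demand and optionally refuses automatic materialization.
lazy_result <- function(compute, tether = FALSE) {
  cache <- NULL
  materialize <- function() {
    # Compute at most once, then reuse the cached value.
    if (is.null(cache)) cache <<- compute()
    cache
  }
  list(
    # Explicit materialization is always allowed.
    collect = function() materialize(),
    # Automatic materialization (think printing or column access)
    # is refused while the result is tethered.
    auto = function() {
      if (isTRUE(tether)) {
        stop("Tethered: materialize explicitly with collect().")
      }
      materialize()
    }
  )
}

res <- lazy_result(function() data.frame(x = 1:3), tether = TRUE)
nrow(res$collect())
```

Here nothing is computed when `res` is created, `res$auto()` would raise an error because the result is tethered, and `res$collect()` triggers the computation explicitly.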

Collaborator

maelle commented Jan 20, 2025

I think a diagram would help then! Happy to help with the vignette but will need to ask more questions.

Collaborator

maelle commented Jan 23, 2025

The word "tethered" here would explain something that is potentially surprising for users, and that at least requires some thought on their side. So I'd recommend choosing a fairly common word, so that a non-native speaker who doesn't know "tethered" doesn't face one more obstacle on the way to understanding the concept. Also, "tethered" might be easy to mistype ("thetered", etc.).

Based on word frequency in Google's books corpus, "bound" sounds like a better choice but it has the drawback of being used for other things like limits.

https://books.google.com/ngrams/graph?content=tethered%2Crestrained%2Cbound%2Ctied%2Croped%2Cblocked%2Cinhibited%2Ctempered&year_start=1800&year_end=2022&corpus=en&smoothing=3

Collaborator

maelle commented Jan 23, 2025

I find "strict" not specific enough.

Collaborator

maelle commented Jan 23, 2025

"frugal" is probably not common enough.

Collaborator

maelle commented Jan 23, 2025

"prudent", like "cautious"?

krlmlr linked a pull request Jan 24, 2025 that will close this issue