large data sets #1751
mtoy-googly-moogly started this conversation in What's Next: Language Proposals
Replies: 1 comment, 1 reply
-
"Default wheres" would be great, and going further, they could be set for specific fields. Fact tables are essentially time-based, and very often only a short time window is analyzed. Given this, handing some users Malloy as a playground may lead to an expensive query if they forget to filter on time. This is not a Malloy-specific problem, but Malloy could certainly help by preventing it.
-
Problem
Many data sets are so large that giving unfiltered query access creates the possibility of very expensive queries being constructed.
Solutions
Malloy does cost analysis of queries
Perhaps there could be metadata on a model or on a query indicating how many resources the user is willing to allocate to a query.
Pre-filtering on large tables
This approach provides no guarantee that query cost will be bounded, but it does provide a mechanism for controlling one of the most common ways a query can explode.
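As a rough sketch of the idea (the connection, table, and field names here are invented, and the exact Malloy syntax varies by version):

```malloy
// A pre-filtered source: every query written against `recent_events`
// inherits the time-window filter, so an ad-hoc query cannot
// accidentally scan the entire fact table.
source: recent_events is db.table('analytics.events') extend {
  where: event_time > @2024-01-01
}
```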
Another way to express a pre-filtered table would be to use the planned, but not yet implemented, source parameterization.
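A hypothetical sketch of what that parameterization might look like (this syntax is invented, since the feature was not implemented when this was written, and all names are illustrative):

```malloy
// Hypothetical: the caller must supply a start date, so there is
// no way to express an unfiltered query against the fact table.
source: events(start is @2024-01-01) is db.table('analytics.events') extend {
  where: event_time > start
}

// A query would then supply its own window:
run: events(start is @2024-06-01) -> { aggregate: event_count is count() }
```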
The open question with pre-filtering is whether there are places where making the pre-filter an explicit property of a source would allow better code than treating the source as something more like a source template. Operations like joining a pre-filtered source are where I wonder whether this distinction matters.
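For instance (again with invented names, and Malloy syntax that may differ by version), a join against a pre-filtered source would carry the filter along with it:

```malloy
// Illustrative only: `db`, `analytics.events`, and `analytics.users`
// are hypothetical. The filter baked into `recent_events` travels
// with it into the join, so queries through `users` also avoid
// scanning the full fact table.
source: recent_events is db.table('analytics.events') extend {
  where: event_time > @2024-01-01
}

source: users is db.table('analytics.users') extend {
  join_many: recent_events on id = recent_events.user_id
}
```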