
Track the maximum partition size in dataset types #145

Open
facundominguez opened this issue Sep 11, 2018 · 0 comments

facundominguez commented Sep 11, 2018

Spark programs scale as long as the partition sizes of their inputs are bounded. This issue is to explore whether it would be possible to track, in the type of a dataset, the maximum size of its partitions. That way, the type of an algorithm could ensure that it doesn't grow the partitions, or doesn't grow them beyond some constant factor of the input's partition sizes.

This could also be a nice application for Liquid Haskell.
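One way such a bound might be expressed in plain GHC (short of Liquid Haskell refinements) is with a type-level natural indexing the dataset type. The following is a minimal sketch, not an existing API; all names (`BoundedDataset`, `mapBounded`, `fromPartitions`) are hypothetical:

```haskell
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables #-}

import GHC.TypeLits (Nat, KnownNat, natVal)
import Data.Proxy (Proxy (..))

-- | A dataset whose partitions each hold at most @n@ elements.
-- The bound @n@ lives only at the type level.
newtype BoundedDataset (n :: Nat) a =
  BoundedDataset { partitions :: [[a]] }

-- | Mapping is element-wise, so it cannot grow any partition:
-- the bound @n@ is preserved in the result type.
mapBounded :: (a -> b) -> BoundedDataset n a -> BoundedDataset n b
mapBounded f (BoundedDataset ps) = BoundedDataset (map (map f) ps)

-- | Smart constructor: check the bound once at runtime, then
-- carry it in the type from there on.
fromPartitions :: forall n a. KnownNat n
               => [[a]] -> Maybe (BoundedDataset n a)
fromPartitions ps
  | all withinBound ps = Just (BoundedDataset ps)
  | otherwise          = Nothing
  where
    withinBound p =
      fromIntegral (length p) <= natVal (Proxy :: Proxy n)
```

Operations that can grow partitions (e.g. a group-by) would instead have to return a dataset with a larger, or unknown, bound in its type, which is where refinement types could add precision.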

@facundominguez facundominguez changed the title Track the maximum partition size in dataset sizes Track the maximum partition size in dataset types Sep 11, 2018