Replace UDFs/UDAs with Spark's Catalog #361

Open
okennedy opened this issue Dec 19, 2019 · 0 comments
Member
At present, user-defined functions (UDFs) and user-defined aggregates (UDAs) can be defined either in Mimir-land or in Spark-land. Moreover,

  1. Spark's UDF/UDA catalog implementation is virtually identical to Mimir's.
  2. There's a mountain of libraries that already support Spark.
  3. Function and aggregate management is a non-trivial chunk of Mimir's codebase (1k lines of code or more).
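To illustrate point 1, here is a minimal sketch of what registering and resolving a function through Spark's catalog looks like, assuming a local SparkSession; the function name `plusOne` is purely illustrative:

```scala
import org.apache.spark.sql.SparkSession

object CatalogSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("udf-catalog-sketch")
      .getOrCreate()

    // Register a UDF directly with Spark's catalog, rather than with
    // a Mimir-side function registry.
    spark.udf.register("plusOne", (x: Int) => x + 1)

    // The catalog can then answer the existence/lookup queries that
    // Mimir's own function manager currently handles.
    assert(spark.catalog.functionExists("plusOne"))

    spark.sql("SELECT plusOne(41)").show()
    spark.stop()
  }
}
```

Registration, lookup, and invocation all go through the same catalog, which is exactly the surface Mimir's function manager duplicates today.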

I propose that we defer to Spark's catalog and cut a ton of redundant code out of Mimir. This would require the following changes:

  1. RAToSpark: Could use the Spark catalog directly to instantiate functions (see the new MimirSQL for a few examples of how this might work).
  2. Typechecker: Would need to use Spark's catalog to check types. This could get a little awkward, since Spark's and Mimir's type systems differ; RAToSQL would probably need to handle some translations.
  3. Eval / EvalInline: Would now talk to Spark for function execution.
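The translation layer in point 2 might look something like the sketch below. The `MimirType` hierarchy here is hypothetical, standing in for Mimir's actual type enum; only the Spark side (`org.apache.spark.sql.types`) is real:

```scala
import org.apache.spark.sql.types._

// Hypothetical stand-in for Mimir's type enum, for illustration only.
sealed trait MimirType
case object TInt    extends MimirType
case object TFloat  extends MimirType
case object TString extends MimirType
case object TBool   extends MimirType

object TypeTranslation {
  // One direction of the translation the Typechecker would need:
  // map a Spark DataType (e.g. a UDF's return type from the catalog)
  // into Mimir's type system.
  def sparkToMimir(t: DataType): MimirType = t match {
    case IntegerType | LongType | ShortType => TInt
    case FloatType | DoubleType             => TFloat
    case StringType                         => TString
    case BooleanType                        => TBool
    case other =>
      throw new IllegalArgumentException(s"No Mimir type for Spark type $other")
  }
}
```

The awkward cases are the Spark types with no clean Mimir counterpart (and vice versa), which is where RAToSQL would have to insert casts or wrappers.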