Replace UDFs/UDAs with Spark's Catalog #361

Open
okennedy opened this issue Dec 19, 2019 · 0 comments
Member
At present, user-defined functions (UDFs) and user-defined aggregates (UDAs) can be defined either in Mimir-land or in Spark-land. Moreover,

  1. Spark's UDF/UDA catalog implementation is virtually identical to Mimir's.
  2. There's a mountain of libraries that already support Spark.
  3. Function and aggregate management is a non-trivial chunk of Mimir's codebase (1k lines of code or more).
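To illustrate point 1, here is a minimal sketch of what registering and resolving a function through Spark's catalog looks like, assuming a local SparkSession; the function name `plusOne` is purely illustrative:

```scala
import org.apache.spark.sql.SparkSession

object CatalogSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("udf-catalog-sketch")
      .getOrCreate()

    // Register a UDF directly with Spark's catalog, rather than with
    // a Mimir-side function registry.
    spark.udf.register("plusOne", (x: Int) => x + 1)

    // The catalog can then answer the existence/lookup queries that
    // Mimir's own function manager currently handles.
    assert(spark.catalog.functionExists("plusOne"))

    spark.sql("SELECT plusOne(41)").show()
    spark.stop()
  }
}
```

Registration, lookup, and invocation all go through the same catalog, which is exactly the surface Mimir's function manager duplicates today.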

I propose that we defer to Spark's catalog and cut a ton of redundant code out of Mimir. This would require the following changes:

  1. RAToSpark: Could use the Spark catalog directly to instantiate functions (see the new MimirSQL for a few examples of how this might work).
  2. Typechecker: Would need to use Spark's catalog to check types. This could get a little awkward, since Spark's and Mimir's type systems differ; RAToSQL would probably need to handle some translations.
  3. Eval / EvalInline: Would now talk to Spark for function execution.
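The translation layer in point 2 might look something like the sketch below. The `MimirType` hierarchy here is hypothetical, standing in for Mimir's actual type enum; only the Spark side (`org.apache.spark.sql.types`) is real:

```scala
import org.apache.spark.sql.types._

// Hypothetical stand-in for Mimir's type enum, for illustration only.
sealed trait MimirType
case object TInt    extends MimirType
case object TFloat  extends MimirType
case object TString extends MimirType
case object TBool   extends MimirType

object TypeTranslation {
  // One direction of the translation the Typechecker would need:
  // map a Spark DataType (e.g. a UDF's return type from the catalog)
  // into Mimir's type system.
  def sparkToMimir(t: DataType): MimirType = t match {
    case IntegerType | LongType | ShortType => TInt
    case FloatType | DoubleType             => TFloat
    case StringType                         => TString
    case BooleanType                        => TBool
    case other =>
      throw new IllegalArgumentException(s"No Mimir type for Spark type $other")
  }
}
```

The awkward cases are the Spark types with no clean Mimir counterpart (and vice versa), which is where RAToSQL would have to insert casts or wrappers.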