Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get query's tables and columns #3

Merged
merged 15 commits into from
Feb 20, 2024
26 changes: 26 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,3 +11,29 @@ Macaw is a limited Clojure wrapper for
[JSqlParser](https://github.com/JSQLParser/JSqlParser). Similar to its parrot
namesake, it's intelligent, can be taught to speak SQL, and has many colors
(supports many dialects).


## Building

To build a local JAR, use

```
clj -T:build jar
```

This will create a JAR in the `target` directory.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be expanded once I set up Clojars.


## Working with the Java files

To compile the Java files, use

```
clj -T:build compile
```

If you're working on Macaw and make changes to a Java file, you must:

1. Recompile
2. Restart your Clojure REPL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there no way to improve this workflow in 2024? 😢 (outside scope of this PR)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possibly with the right ns refresh, but I haven't worked it out :(


for the changes to take effect.
47 changes: 47 additions & 0 deletions build.clj
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
;; Further info: https://clojure.org/guides/tools_build#_mixed_java_clojure_build

(ns build
(:require
[clojure.java.shell :as sh]
[clojure.string :as str]
[clojure.tools.build.api :as b]))

(def lib 'metabase/macaw)

(def major-minor-version "0.1")

(defn commit-number []
(or (-> (sh/sh "git" "rev-list" "HEAD" "--count")
:out
str/trim
parse-long)
"9999-SNAPSHOT"))
tsmacdonald marked this conversation as resolved.
Show resolved Hide resolved

(def version (str major-minor-version \. (commit-number)))
(def target "target")
(def class-dir (format "%s/classes" target))

(def jar-file (format "target/%s-%s.jar" (name lib) version))

(def basis (delay (b/create-basis {:project "deps.edn"})))

(defn clean [_]
(b/delete {:path target}))

(defn compile [_]
(b/javac {:src-dirs ["java"]
:class-dir class-dir
:basis @basis
:javac-opts ["--release" "11"]}))

(defn jar [_]
(compile nil)
(b/write-pom {:class-dir class-dir
:lib lib
:version version
:basis @basis
:src-dirs ["src"]})
(b/copy-dir {:src-dirs ["src" "resources"]
:target-dir class-dir})
(b/jar {:class-dir class-dir
:jar-file jar-file}))
6 changes: 3 additions & 3 deletions deps.edn
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
{:paths
["src" "resources"]
["src" "java" "resources" "target/classes"]

:deps
{com.github.jsqlparser/jsqlparser {:mvn/version "4.8"}} ; The actual SQL Parser to wrap!
Expand Down Expand Up @@ -41,8 +41,8 @@
{:deps {com.github.camsaul/whitespace-linter {:sha "e35bc252ccf5cc74f7d543ef95ad8a3e5131f25b"}}
:ns-default whitespace-linter
:exec-fn whitespace-linter/lint
:exec-args {:paths ["deps.edn" "src" "test" ".github"]
:include-patterns ["\\.clj[cs]?$" "\\.edn$" "\\.yaml$" "\\.md$"]}}
:exec-args {:paths ["deps.edn" "src" "java" "test" ".github"]
:include-patterns ["\\.clj[cs]?$" "\\.edn$" "\\.java$" "\\.yaml$" "\\.md$"]}}

;; Run tests
;;
Expand Down
48 changes: 48 additions & 0 deletions java/com/metabase/SqlVisitor.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
package com.metabase.macaw;

import net.sf.jsqlparser.schema.Column;
import net.sf.jsqlparser.schema.Table;

/**
* Clojure is not good at working with Java Visitors. They require over<em>riding</em> various over<em>loaded</em>
* methods and, in the case of walking a tree (exactly what we want to do here) can be counted on to call `visit()`
* recursively.
*
* Clojure's two main ways of dealing with this are `reify`, which does not permit overloading, and `proxy`, which does.
* However, creating a proxy object creates a completely new object that does not inherit behavior defined in the parent
* class. Therefore, if you have code like this:
*
* <code>
(proxy
[TablesNamesFinder]
[]
(visit [visitable]
(if (instance? Column visitable)
(swap! columns conj (.getColumnName ^Column visitable))
(let [^StatementVisitor this this]
(proxy-super visit visitable)))))
</code>

* the call to `proxy-super` does <em>not</em> call `TablesNamesFinder.visit()` with the non-`Column` `visitable`; that
* definition has been lost.
tsmacdonald marked this conversation as resolved.
Show resolved Hide resolved
*
* <hr>
*
* Therefore, this interface was created to provide a more convenient escape hatch for Clojure. It removes the
* overloading requirement for the conventional visitor pattern, instead providing differently-named methods for each
* type. This lets Clojure code use `reify` to implement each method with the necessary behavior. The recursive
* tree-walking is handled by a different class, which calls the methods defined here along the way. Think of them as
* hooks for Clojure-land that don't affect the main behavior of the visitor.
*/
tsmacdonald marked this conversation as resolved.
Show resolved Hide resolved
public interface SqlVisitor {

/**
* Called for every `Column` encountered, presumably for side effects.
*/
public void visitColumn(Column column);

/**
* Called for every `Table` encountered, presumably for side effects.
*/
public void visitTable(Table table);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the kind of interface I imagined. I'd prefer to get away from needing to extend or implement anything.

The idea I had was for there to a concrete class, your "copy" of TNF, with the major difference being that it takes a Map<String, Consumer<Visitable>> argument in its constructor.

For each overloading of visit, it would check if the given map has an entry for the corresponding type, and if so call it with its argument before calling the rest of its own traversal code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my bad for misunderstanding.

I was going to start bikeshedding about whether this was all a moot point after the context/results object was added above, but on further reflection I think that's orthogonal.

I think the two approaches are maybe not so different? Here are the pros (in bold) and cons I can see:

SqlVisitor interface Map of fns
Type safety: can't forget or mis-type a visit___ method (was it :columns or :column?)
Avoids the faff of making a helper to turn fns into a Consumer instance. Although I suppose we could use IFn directly
verbose for the client code More Clojure-y interface; using reify is kind of a pain; avoids providing blank implementations for unneeded methods
The gather sugar is nice

I don't think there's a slam-dunk argument in either direction, so my bias is to leave it for now and use a) accumulating other things (we've done it…once!) or b) needing the query context as the opportunity to refactor as wanted?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the table, here are my thoughts:

  1. "Can't forget a visit method" - the list we support may get quite long, painful to be exhaustive.
  2. "Was it :columns or :column" - spec / malli / kondo / assert to the rescue
  3. "Avoids the faff of turning fns to Consumer" - whoops, I thought Clojure magically made this work now. This is a Clojure oriented library so we could definitely use IFn in the map, and maybe even Keyword too, but I think converting to String before passing to the constructor will be more ergonomic.

For me the key thing is how pleasant this is to work with as we start to care about more kinds of leaves and clauses. I'm OK to go with what works and rework as we add more requirements.

Will give this a closer look tomorrow, just did a high level flyby. Also interested to hear others opinions before I commit to a tick 😄

Copy link
Collaborator

@crisptrutski crisptrutski Feb 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this looks quite nice to me

(defn- kv->java [[k f]]
  [(name k)
   (reify java.util.function.Consumer
     (accept [_this item]
       (f item)))])

;; Malli stuff on the args, also forgotten what the visitor interface looked like
(defn walk-query [visitor-map parsed-query]
  (.run (MyFancyVisitor. (into {} (map kv->java) visitor-map)
        parsed-query)

Copy link
Collaborator

@crisptrutski crisptrutski Feb 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, just thought of a few more potential benefits to the "data oriented" approach:

  1. Fusion - we can (merge-with call-both) over all our walkers, and do a single walk for efficiency.
  2. Decoration - easy to wrap functions, e.g. for debugging or logging.
  3. Derivation - maybe we can derive more complete mappings from incomplete maps.
  4. Generalization - perhaps nice register functions for some group of entities in a hierarchy

I just love data, man 🌈

Copy link
Collaborator

@crisptrutski crisptrutski Feb 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I'd heard something! The design and how they reached it is really interesting too.

Unfortunately it won't cover the case of pulling an IFn out of the map in Java, no automatic conversion there. It would make kv->java a one liner though, and lambda conversions look so nice and light compared to reify 😍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated! There's some Clojure lib-specific voodoo on the Java side, but that doesn't matter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think converting to String before passing to the constructor will be more ergonomic

I thought about this, but then the look of

{:column (fn [column] ...)}

was just too clean to pass up. So you see my Keyword->String shuffling in the ASTWalker constructor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation is very much about making the Clojure code look as clean as possible, to the detriment of cute Java.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Was it :columns or :column" - spec / malli / kondo / assert to the rescue

Malli is on the radar, but…not in this PR ;)

Loading
Loading