Migrate from spansql to memefish #66

morikuni · 2024-12-17T06:29:21Z

fix: #65

There are some DDLs that cannot be parsed by spansql, so I migrated to memefish while trying to maintain as much compatibility as possible. Due to differences in SQL formatting between spansql and memefish, there are significant diffs in the outputs. However, I believe the functionality remains unchanged.

apstndb

I don't see anything wrong in the implementation.
memefish itself doesn't target the SQL-builder use case, so I think building SQL dynamically is OK.
I expect some child nodes are not handled yet, but they can be handled in issues and follow-up PRs.

daichirata

Great work! Thank you for the changes to memefish. I just left comments on a few minor parts.

apstndb · 2024-12-23T13:20:32Z

internal/hammer/diff.go

-			m[stmt.Name] = t
-		case *spansql.CreateIndex:
-			if t, ok := m[stmt.Table]; ok {
+			m[stmt.Name.SQL()] = t


FYI
(*ast.Path).SQL() emits quoted identifier like

`group`.`table`

so it may be better to check len(path.Idents()) == 1 and add test cases with named schemas and with names that are the same as keywords.

What is expected after checking len(path.Idents()) == 1 ?
I think the keys of m := make(map[string]*Table) can include identifiers with named schemas, not just table names.

add test cases with named schemas and with names that are the same as keywords.

I added test cases for a named schema and for table names that are keywords! (Although CREATE SCHEMA is not yet supported by hammer.)

What I want to say is that hammer originally did not support named schemas.
Do you really want to support them in this memefish migration PR?
If that support is broken, I recommend rejecting general paths for now and fixing it in a separate PR. If it can be done properly, I think it is fine to do it in this PR.
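
A minimal sketch of what "rejecting the general path" could look like. The `Ident` and `Path` types are hypothetical stand-ins that only mirror the shape discussed in this thread (a path holding a list of identifiers); they are not memefish's actual types, and `tableName` is an illustrative helper, not code from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Ident and Path are hypothetical stand-ins for a parsed identifier path.
type Ident struct{ Name string }

type Path struct{ Idents []Ident }

// errNamedSchema marks paths that are not a single bare identifier.
var errNamedSchema = errors.New("named schemas are not supported yet")

// tableName returns the bare table name and rejects qualified paths
// such as schema.table until they are handled deliberately.
func tableName(p Path) (string, error) {
	if len(p.Idents) != 1 {
		return "", fmt.Errorf("path with %d identifiers: %w", len(p.Idents), errNamedSchema)
	}
	return p.Idents[0].Name, nil
}

func main() {
	paths := []Path{
		{Idents: []Ident{{Name: "order"}}},
		{Idents: []Ident{{Name: "group"}, {Name: "order"}}},
	}
	for _, p := range paths {
		name, err := tableName(p)
		fmt.Printf("name=%q err=%v\n", name, err)
	}
}
```

With a guard like this, a qualified path fails loudly instead of silently becoming a map key whose format nobody has agreed on.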

As a result, named schemas may end up being supported, but that only means the identifier is now treated as an identifier, whether it is just a table name or includes a named schema.

I don't know if the original spansql.ID included named schema or not. However, I believe that the intention of the code here is not to use the table name, but to use the identifier. There may be differences in behaviour for named schema, but the intent seems to be consistent and backwards compatible.

We could intentionally remove support for named schema, but then we would have more code to remove named schema, including indexes and so on. IMO, I think that if the case of len(path.Idents()) == 3 were to appear in the future, we wouldn't have to exclude it here.

Anyway, I leave the final decision to @daichirata.

FYI
m[stmt.Name.SQL()] can be m["`group`.`order`"] for CREATE TABLE `group`.`order` (group and order are keywords).

But currently, GetDatabaseDdl will emit CREATE TABLE `group.order` for that table definition so the map will contain m["`group.order`"].

https://issuetracker.google.com/issues/385901554

I believe this inconsistency can break diffs that use hammer's spanner: prefix.

I recommend unquoting path identifiers, like strings.Join(path.Name.Idents[].Name, ".") (pseudo code).

IMO, I can't understand your opinion that path.SQL() is more suitable as a map key than dot-separated raw identifiers.

Quoted identifiers are only the GoogleSQL lexical representation for reserved identifiers.
In other places, like INFORMATION_SCHEMA, no quoted identifiers are used.

This reminds me of the slogan "サニタイズ言うな" (Don't say sanitize) that was popular in the Japanese engineering community in the past.
The slogan says that we shouldn't store escaped representations; escaping (using .SQL()) should be performed only at the last mile.
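
To make the pseudo code concrete, here is a small sketch comparing the two kinds of map key. `Ident`, `Path`, `quoted`, and `rawKey` are hypothetical names written for this illustration, not memefish's API; `quoted` only imitates a backtick-quoted rendering and does not implement real GoogleSQL quoting rules. It also assumes that the quoted name `group.order` emitted by GetDatabaseDdl parses into a single identifier whose raw name is `group.order`.

```go
package main

import (
	"fmt"
	"strings"
)

// Ident and Path are hypothetical stand-ins for a parsed identifier path.
type Ident struct{ Name string }

type Path struct{ Idents []Ident }

// quoted wraps every identifier in backticks and joins them with dots.
// It only imitates a quoted SQL rendering for illustration.
func quoted(p Path) string {
	parts := make([]string, len(p.Idents))
	for i, id := range p.Idents {
		parts[i] = "`" + id.Name + "`"
	}
	return strings.Join(parts, ".")
}

// rawKey joins the unquoted identifier names with dots.
func rawKey(p Path) string {
	names := make([]string, len(p.Idents))
	for i, id := range p.Idents {
		names[i] = id.Name
	}
	return strings.Join(names, ".")
}

func main() {
	// CREATE TABLE `group`.`order` as written in a DDL file.
	fromFile := Path{Idents: []Ident{{Name: "group"}, {Name: "order"}}}
	// CREATE TABLE `group.order` as reported for the same table (assumption above).
	fromDatabase := Path{Idents: []Ident{{Name: "group.order"}}}

	fmt.Println(quoted(fromFile) == quoted(fromDatabase)) // false: the keys differ
	fmt.Println(rawKey(fromFile) == rawKey(fromDatabase)) // true: both are "group.order"
}
```

Under these assumptions the raw key makes both spellings land on the same map entry, while the quoted rendering treats them as two different tables.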

However, it's clear that we won't be addressing named schema support in this PR. Regarding how to handle map keys, I'll defer to the owner's review on that.

Thanks for the good point about named schema!

Therefore, there shouldn’t be any breaking changes until hammer supports the CREATE SCHEMA DDL, and we can wait for GetDatabaseDdl to be fixed before adding support for CREATE SCHEMA.

I agree with this opinion. Since the CREATE SCHEMA statement cannot currently be parsed, I believe this issue does not manifest for now.

The key of this map is not required to be a correct Cloud Spanner table name (or table identifier); it only has to identify the table.
That means we can use anything that satisfies "the same table yields the same key, and different tables yield different keys", like "<schema>.<table>", [2]string{"<schema>", "<table>"}, or even "<schema>🔧<table>".
So, at least for me, it doesn't matter how SQL() actually formats the table names. I just expect SQL() to output the same string for the same table (and ASTs generally satisfy that).

I agree with both sides on many points. Still, even if using SQL() works in practice, I think it is better that a table's identity does not depend on how it is spelled in SQL.

If you were to write code on the hammer side that performs joins each time using a map with Idents as keys, would the scope of changes become significant? @morikuni

I pushed a commit to stop using SQL() for comparison: 18a5bd3

Revert it if not needed.

apstndb · 2025-01-10T12:07:09Z

go.mod


 require (
 	cloud.google.com/go/spanner v1.62.0
+	github.com/cloudspannerecosystem/memefish v0.0.0-20241219043423-1efca7ff9732


Maybe it is better to use a released version.

Migrate from spansql to memefish

3e445b0

morikuni force-pushed the master branch from 5cf00da to 3e445b0 on December 17, 2024 06:45

apstndb reviewed Dec 17, 2024

morikuni mentioned this pull request Dec 17, 2024

Support Search Index #67

Draft

Update memefish to the latest version

52a5ee0

daichirata reviewed Dec 19, 2024

morikuni added 2 commits December 19, 2024 21:46

Fix typo

d2f03e5

Remove unnecessary nil check

dce9176

morikuni requested a review from daichirata December 19, 2024 12:47

apstndb reviewed Dec 23, 2024

morikuni added 3 commits December 25, 2024 14:26

Add test case for named schema and keyword identifier

1aa78b6

Show actual statement if it was not supported

5e6ece4

Stop using SQL() to compare identifiers

18a5bd3

apstndb reviewed Jan 10, 2025
