Define a uast.ContentOf helper #377

dennwc · 2019-03-14T15:48:25Z

Define a uast.ContentOf helper as requested in #376.

Fixes #376

Signed-off-by: Denys Smirnov [email protected]

creachadair

Functionally this change seems fine, but I have a couple of comments for discussion.

uast/types.go

creachadair · 2019-03-14T17:45:43Z

uast/uast.go

@@ -245,6 +245,11 @@ func RolesOf(n nodes.Node) role.Roles {
 }

 // TokenOf is a helper for getting node token (see KeyToken).
+//
+// The token is an exact code snippet that represents a given AST node. It only works for


Not necessarily in this change, but we should think about how to document expectations for the user of the UAST, about which nodes can be expected to have exact snippets and which do not. (For example: Do numeric literals have one? I think so, but the reader could not be sure, and I couldn't find a table of rules anywhere).

Anyway, I think that's something we should figure out how to document.

Relatedly: I'm worried about the expectations set up by the native/annotated vs. semantic split: If someone calls TokenOf(n) on a Semantic node n, will they get "" always? Or will it just magically work sometimes and not others? Or will it work but return canonicalized text?

For the caller, I think that disparity is going to be frustrating. Maybe we could fix it by making the API distinguish ("TokenOf always returns an empty string in Semantic mode") or maybe we should just have a single API (TextOf?) whose return value is canonicalized or not depending on mode. I don't know the right answer here—and we don't necessarily need to figure it out right now—but I am concerned that we are adding to the API surface only to satisfy a particular use case.

TokenOf by definition should always either return a code snippet or nothing. Unfortunately, listing all the node types that will have a token is impossible, since it differs between languages even for trivial nodes like string literals (C# has no token in the literal node, but has a separate token node).

In general, I plan for TokenOf to always return a code snippet for any native AST. But it needs access to the source file to do it and a different kind of API. In SDK v3 I think it makes sense to replace all node types with opaque *Node that will also keep a pointer to the original source file, so we can always get snippets based on the node position.
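A minimal sketch of the position-based idea, to make it concrete. The `Source` and `Node` types, their fields, and the `Snippet` method are all hypothetical and are not part of the current SDK; offsets are assumed to be byte offsets with an exclusive end.

```go
package main

import "fmt"

// Source is a hypothetical holder for the original file contents.
type Source struct {
	Name string
	Text string
}

// Node is a hypothetical opaque node that keeps a pointer to its source.
// Start and End are assumed to be byte offsets into Source.Text (End exclusive).
type Node struct {
	src   *Source
	Start int
	End   int
}

// Snippet returns the exact source text covered by the node. It returns ""
// when the node has no source attached or its offsets are out of range.
func (n *Node) Snippet() string {
	if n == nil || n.src == nil {
		return ""
	}
	if n.Start < 0 || n.End > len(n.src.Text) || n.Start > n.End {
		return ""
	}
	return n.src.Text[n.Start:n.End]
}

func main() {
	src := &Source{Name: "main.py", Text: "x = 42\n"}
	lit := &Node{src: src, Start: 4, End: 6}
	fmt.Printf("%q\n", lit.Snippet()) // prints "42"
}
```

With this shape the snippet never has to be stored on the node itself, so it does not depend on whether a given driver emits a token for that node type.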

Regarding Semantic nodes, it's a bit tricky. We can still get the snippet based on positions, but Semantic nodes might have a completely different structure, and the snippet for an inner node might be larger than the snippet for its parent node. Because of this, it may make sense to give a different API promise. We may allow storing both Native and Semantic nodes (plus the source) and clearly communicate in the docs that to get the code snippet (token) for a subtree, the user first needs to jump from the Semantic node to the associated Native one. This may help to draw a clear and easy-to-understand boundary for the end user.

As for ContentOf, as explained in the original issue, it's not meant to be as precise as TokenOf. It's just a quick way to get some text content for a specific node: it first tries any canonical text field of Semantic nodes and, if none is available, falls back to TokenOf. So the two really serve different purposes.
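Roughly, the fallback described above could look like the sketch below. It is self-contained and does not use the SDK types: `node`, `tokenOf`, `contentOf`, the key names, and the type and field names (`uast:Identifier`/`Name`, `uast:String`/`Value`) are assumptions chosen for illustration, not the actual implementation in this change.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a UAST object: field name -> value.
type node map[string]interface{}

// Key names assumed for this sketch only.
const (
	keyType  = "@type"
	keyToken = "@token"
)

// tokenOf returns the exact code snippet stored on the node, or "" if none.
func tokenOf(n node) string {
	s, _ := n[keyToken].(string)
	return s
}

// contentOf prefers the canonical text field of known Semantic types and
// falls back to tokenOf for everything else.
func contentOf(n node) string {
	typ, _ := n[keyType].(string)
	switch typ {
	case "uast:Identifier":
		if s, ok := n["Name"].(string); ok {
			return s
		}
	case "uast:String":
		if s, ok := n["Value"].(string); ok {
			return s
		}
	}
	return tokenOf(n)
}

func main() {
	ident := node{keyType: "uast:Identifier", "Name": "foo"}
	native := node{keyType: "NumLiteral", keyToken: "0x2A"}

	fmt.Println(contentOf(ident))     // foo (canonical field)
	fmt.Println(contentOf(native))    // 0x2A (falls back to the token)
	fmt.Println(tokenOf(ident) == "") // true: no exact snippet stored
}
```

The last line shows the difference in contract: `tokenOf` reports only an exact snippet, while `contentOf` returns whatever text is most useful for the node.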

Signed-off-by: Denys Smirnov <[email protected]>

bzz

I see the motivation for this helper and have read the code, but I have not really understood how it works.

And here is my +1 not to block the merge 😆

On a serious note, though: it's not at all clear to me how a user is expected to discover this, especially regarding the difference between annotated and semantic modes 😕

dennwc · 2019-03-15T16:18:12Z

I see that there is no shared understanding of how it works or of the difference between the two, so let's discuss it in more detail. The change can wait a bit.

bzz · 2019-03-19T10:33:22Z

Makes sense to me.

Also, from the API user's perspective, we should most probably have a clear shared understanding of how the developer experience of using go-client with these helpers is "transferable" to the python and jvm client APIs.

I did not have a chance to dig deeper yet, but I assume that the user needs will be the same (like #376) while the solution for different clients might differ.

dennwc · 2019-03-20T16:38:29Z

Right, but this is a change to the SDK to allow this kind of functionality in general. Client libraries may still expose/document it differently.

So, are there any suggestions for better naming, better docs, or anything else that would clarify the API contract for those two functions?

dennwc · 2019-03-26T09:26:12Z

@bzz @creachadair If there are no specific suggestions here, I'm merging this.

creachadair · 2019-03-26T14:35:23Z

> @bzz @creachadair If there are no specific suggestions here, I'm merging this.

I think it's OK to go ahead and merge it. The more general issue of how to go back and forth between the UAST structure and the underlying tokens & text is a larger one we'll have to continue to think about, but this solves an existing problem.

bzz · 2019-03-26T15:10:15Z

> Right, but this is a change to the SDK to allow this kind of functionality in general. Client libraries may still expose/document it differently.

Yes, but this change is clearly driven by a go-client user feature request. My only concern is that we basically keep "enhancing" only one language client with such changes. My only suggestion is not just to do it, but also to have a shared bigger plan for how we expose things like this in other clients and how we manage these expectations for their users.

E.g., shall we have an issue somewhere documenting the need to expose this to clients, so that users beyond Go could benefit from it?

dennwc · 2019-03-26T15:44:06Z

The general notion is to port/expose all the features of the Go client to other clients as well, assuming some adaptation to the target language.

As a note, I consider everything exposed by the uast package in the SDK a "development preview". Once we expose it in libuast or the Go client, it becomes a supported API. But this is not mentioned anywhere in the docs.

Opened an issue for this: #387

bzz

👏 Thank you for the clarifications, that makes perfect sense now!

dennwc self-assigned this Mar 14, 2019

dennwc requested review from creachadair and bzz March 14, 2019 15:48

creachadair reviewed Mar 14, 2019

uast: define a ContentOf helper; fixes bblfsh#376

df9e4af

Signed-off-by: Denys Smirnov <[email protected]>

dennwc force-pushed the content_of branch from 285e39a to df9e4af Compare March 15, 2019 13:33

dennwc requested a review from creachadair March 15, 2019 13:34

bzz approved these changes Mar 15, 2019

bzz approved these changes Mar 26, 2019

dennwc merged commit bbb1495 into bblfsh:master Mar 26, 2019

dennwc deleted the content_of branch March 26, 2019 15:54
