-
Notifications
You must be signed in to change notification settings - Fork 27
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Functionally this change seems fine, but I have a couple of comments for discussion.
@@ -245,6 +245,11 @@ func RolesOf(n nodes.Node) role.Roles { | |||
} | |||
|
|||
// TokenOf is a helper for getting node token (see KeyToken). | |||
// | |||
// The token is an exact code snippet that represents a given AST node. It only works for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily in this change, but we should think about how to document expectations for the user of the UAST, about which nodes can be expected to have exact snippets and which do not. (For example: Do numeric literals have one? I think so, but the reader could not be sure, and I couldn't find a table of rules anywhere).
Anyway, I think that's something we should figure out how to document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relatedly: I'm worried about the expectations set up by the native/annotated vs. semantic split: If someone calls TokenOf(n)
on a Semantic node n
, will they get ""
always? Or will it just magically work sometimes and not others? Or will it work but return canonicalized text?
For the caller, I think that disparity is going to be frustrating. Maybe we could fix it by making the API distinguish ("TokenOf
always returns an empty string in Semantic mode") or maybe we should just have a single API (TextOf
?) whose return value is canonicalized or not depending on mode. I don't know the right answer here—and we don't necessarily need to figure it out right now—but I am concerned that we are adding to the API surface only to satisfy a particular use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TokenOf
by definition should always either return a code snippet or nothing. Unfortunately defining all the node types that will have the token is impossible since it differs between the languages even for trivial nodes like string literals (C# has no token in the literal node, but has a separate token node).
In general, I plan for TokenOf
to always return a code snippet for any native AST. But it needs access to the source file to do it and a different kind of API. In SDK v3 I think it makes sense to replace all node types with opaque *Node
that will also keep a pointer to the original source file, so we can always get snippets based on the node position.
Regarding Semantic nodes, it's a bit tricky. We can still get the snippet based on positions, but Semantic nodes might have a completely different structure and the snippet for inner node might be larger than a snipped for the parent node. Because of this, it may make sense to give a different API promise. We may allow storing both Native and Semantic nodes (+ the source) and clearly communicate in docs that to get the code snippet (token) for a subtree, the user will first need to jump from Semantic node to associated Native. This will may help to draw a clear and easy to understand boundary for the end user.
For the ContentOf
, as explained in the original issue, it's not meant to be very precise in contrast to TokenOf
. It's just a quick way to get any text content for a specific node, thus it will try to first use any canonical text field of Semantic nodes, and if they are not available - fallback to TokenOf
. So they really serve a different purpose.
Signed-off-by: Denys Smirnov <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the motivation for this helper, have read the code but have not really understood how this works.
And here is my +1 not to block the merge 😆
On the serious note though - it's completely not clear to me how a user is expected to discover this, esp with regarding the difference in annotated/semantic modes 😕
I see that there is no understanding in the way how it works and/or the difference between the two, so let's discuss it in more details. The change may wait a bit. |
Makes sense to me. Also, form the API user's perspective, most probably we should have a clear shared understanding on how developer's experience of using go-client with these helpers is "transferable" to using python and jvm client APIs. I did not have a chance to dig deeper yet, but just assume that the user needs will be the same (like #376) but the solution for different clients might be different. |
Right, but this is change to SDK to allow this kind of functionality in general. Client libraries may still expose/document it differently. So, is there any suggestions for the better naming, better docs, or anything else that will clarify the API contract for those two functions? |
@bzz @creachadair If there are no specific suggestions here, I'm merging this. |
I think it's OK to go ahead and merge it. The more general issue of how to go back and forth between the UAST structure and the underlying tokens & text is a larger one we'll have to continue to think about, but this solves an existing problem. |
yes, but this change is clearly driven by the go-client user feature request. My only concern is that we basically keep "enhancing" only one language client with such changes, and the only suggestion is not to just do it, but also to have some shared bigger plan on how do we make things like that exposed in other clients and how do we manage these expectations for there users. E.g shall we have an issue somewhere, documenting the need of exposing this to clients so users beyond go could benefit from that? |
The general notion is to port/expose all the features of the Go client to other clients as well, assuming some adaptation to the target language. As a note, I consider everything exposed by Opened an issue for this: #387 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 thank you for clarifications, that makes perfect sense now!
Define a
uast.ContentOf
helper as requested in #376.Fixes #376
Signed-off-by: Denys Smirnov [email protected]