Could you please expand the description before merging to say what cases this PR is fixing?
This looks fine so far, but since it's marked WIP I'll hold off on approving till you're ready.
758380a to e5d8d05
Rebased on latest master and amended per the early feedback. Right now it's still WIP, as it only contains the reproducible example + tests but not a fix yet.
Signed-off-by: Alexander Bezzubov <[email protected]>
5c17858 to 13d607f
Rebased on latest master.
We have now discovered many edge cases where escape sequence handling in the native driver is very different from Go, so the current approach from #64 does not seem to scale well. The solution that @dennwc and I discussed yesterday is:
That would allow us to avoid re-implementing complex JS escape sequence handling on the Go side. Although this seems to be consistent with the initial intent in #32 (comment) and seems to be the case for other drivers, I'm not sure how that would affect clients' expectations (e.g. see #32 and esp. bblfsh/sdk#291), and I want all of us to be on board, so cc @creachadair and @dennwc
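To illustrate why decoding JavaScript literals with Go's rules is fragile, here is a minimal sketch that feeds a few raw literals to Go's `strconv.Unquote`. The helper name `goAccepts` and the sample inputs are illustrative assumptions, not code from this PR; the `"\ "` case mirrors the `invalid syntax` error reported for #75.

```go
package main

import (
	"fmt"
	"strconv"
)

// goAccepts reports whether Go's strconv.Unquote can decode the raw literal.
func goAccepts(raw string) bool {
	_, err := strconv.Unquote(raw)
	return err == nil
}

func main() {
	// Raw literals as they might appear in JavaScript source (illustrative inputs).
	literals := []string{
		`"b\nc"`, // escape understood by both languages
		`"\ "`,   // valid in JavaScript, but "\ " is not a Go escape
		`'abc'`,  // a Go single-quoted literal holds exactly one character
	}
	for _, raw := range literals {
		fmt.Printf("%s accepted by strconv.Unquote: %v\n", raw, goAccepts(raw))
	}
}
```

Only the first literal decodes under Go's rules, while all three are legal JavaScript, which is the kind of mismatch that motivates leaving the decoding to the native driver.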
That seems reasonable to me. I also like that it's easy to understand what to expect: if you're upstream of canonicalization, you get the ambient encoding; downstream, you get the normalized form. Does this mean, though, that we will have to explicitly define the string encoding rules we use inside the canonical AST? (I didn't find them anywhere, and it looks like right now we implicitly assume Go's rules.)
I guess that is something I'm trying to understand here as well. After a series of experiments, I learned that go, java, bash, cpp and csharp are the only drivers where the original quotes used for literals are preserved in annotated mode. php and python do not preserve that information and thus
The format below is:
modes.
go
java
javascript
bash
cpp
const char* b = 'b';
const char* bc = "b\nc";
csharp
php
python
Everything else looks good. In Annotated the string should match the source file (including quotes). In Semantic there should be no quotes and the string should be unescaped according to the language rules (same as if you print it from that language).
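To make the distinction concrete, here is a small sketch of the two forms for the literal `"b\nc"` from the cpp example. The `literalForms` type and its field names are hypothetical and only serve the illustration; they are not part of the SDK.

```go
package main

import "fmt"

// literalForms is a hypothetical pair of the two representations of one literal.
type literalForms struct {
	Annotated string // exact source text, quotes and escapes included
	Semantic  string // decoded value, as the program would print it
}

// exampleForms returns both forms for the source literal "b\nc".
func exampleForms() literalForms {
	return literalForms{
		Annotated: `"b\nc"`, // raw Go string: 6 bytes, backslash kept
		Semantic:  "b\nc",   // interpreted Go string: 3 bytes, real newline
	}
}

func main() {
	f := exampleForms()
	fmt.Printf("annotated: %d bytes, semantic: %d bytes\n", len(f.Annotated), len(f.Semantic))
}
```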
I'm using this as a kind of umbrella issue for these types of problems, so the same approach will be used as the solution for:
The proposed solution:
Fixes bblfsh#75
Annotated:
- New field `normValue` added to the string literals. It is used as the value only in Semantic mode.
Explicit Go-side strconv between annotated and semantic is not needed this way.
Signed-off-by: Alexander Bezzubov <[email protected]>
Signed-off-by: Alexander Bezzubov <[email protected]>
Fix applied in 683324b:
All fixtures updated in fbed224.
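As a rough sketch of the idea in the commit message above: the Go side can take the already-decoded `normValue` in Semantic mode instead of unquoting the raw text itself. The `node` type, the `semanticValue` helper and the sample values are hypothetical; only the `value` and `normValue` keys come from this thread.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a string literal node in Annotated mode.
type node map[string]interface{}

// semanticValue returns the string to expose in Semantic mode: the
// driver-provided normValue, with no Go-side unquoting.
func semanticValue(n node) (string, bool) {
	s, ok := n["normValue"].(string)
	return s, ok
}

func main() {
	lit := node{
		"value":     `"\ "`, // source text, kept as-is in Annotated mode
		"normValue": " ",    // assumed to be decoded by the native driver
	}
	if v, ok := semanticValue(lit); ok {
		fmt.Printf("semantic value: %q\n", v)
	}
}
```

With this shape, a literal that Go's `strconv` cannot parse never needs to pass through it.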
One small change required but looks good otherwise.
Yes, exactly what I had in mind, thanks!
Signed-off-by: Alexander Bezzubov <[email protected]>
fc57c53 to f20fe64
Thank you, everyone, for the reviews! Merging after CI is green.
Fixes the rest of #75: the case of
check: key "value": invalid syntax ("\ ")"
Details of the applied solution are described in #81 (comment)