feature request: provide Unicode positions #102
Comments
Not sure how generally applicable that is across all the drivers, but I know @creachadair raised this concern before and @dennwc did some work on it, e.g. in the JS driver, to make this happen: bblfsh/javascript-driver#51 |
This was the reverse operation: Babelfish expects byte positions, but JS was writing Unicode positions. The feature request makes total sense, but I think we can't add Unicode positions as a field, since it would make UAST files even larger. So I propose to send the whole source file as part of the UAST. Then we can recalculate Unicode positions and even tokens on demand. |
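A minimal sketch of that on-demand recalculation, assuming the source is available as a decoded Python string and that positions are UTF-8 byte offsets. The helper names `build_char_starts` and `byte_to_char_offset` are hypothetical and not part of any Babelfish client:

```python
from bisect import bisect_right


def build_char_starts(source: str) -> list[int]:
    """Return the UTF-8 byte offset at which each code point starts.

    The final entry is the total byte length, so an end-of-file
    position can be mapped as well.
    """
    starts = [0]
    for ch in source:
        starts.append(starts[-1] + len(ch.encode("utf-8")))
    return starts


def byte_to_char_offset(starts: list[int], byte_offset: int) -> int:
    """Map a UTF-8 byte offset to a code point offset.

    Raises ValueError when the offset is out of range or falls in
    the middle of a multi-byte character.
    """
    idx = bisect_right(starts, byte_offset) - 1
    if idx < 0 or starts[idx] != byte_offset:
        raise ValueError(f"byte offset {byte_offset} is not a character boundary")
    return idx


source = "héllo = 'мир'"
starts = build_char_starts(source)
char_offset = byte_to_char_offset(starts, 10)  # byte 10 is where "м" starts
```

The table is built once per file in linear time and each lookup is a binary search, so converting every node position of a large UAST stays cheap.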
It is also possible to add a parameter to the parse request that the client wants Unicode positions. Normally you don't need both at the same time. |
This makes sense. We can extend the parsing protocol with one more option for this. Still, the |
As per discussion on Slack (non-public link), this sounds like a "wontfix" for the SDK:
|
@dennwc Shall we move this request to the Python client then? As I wrote, somebody has to calculate the positions due to the "business logic" requirements, and I am not happy that this is currently the ML team's responsibility. I mean, we can say that this is wontfix for clever and legitimate technical reasons (which happen in 0.01% of all the cases though), but then something will break on our side, and the product will be broken, too. cc @smola |
@vmarkovtsev I only meant the SDK. This should probably be in clients or |
BTW I am perfectly fine if the positions calculator sometimes responds that it cannot infer Unicode positions reliably (because of crazy normalization, etc.) and returns |
The problem is only that we can't provide those positions in the UAST, but we can always calculate them separately. This way the user won't be able to change positions directly and we won't hit cases mentioned by @creachadair. |
@dennwc @creachadair what do you think would be the best place to move this issue? |
That seems like a reasonable approach. As long as we're careful not to make the API too specific to Python, we should be able to shift it into the shared library later. |
This will be done in |
Not sure why, but I don't have permissions to move it to @creachadair Can you please check that the permissions on SDK, libuast, bblfshd, clients and drivers are the same? Thanks! |
Done. It turns out |
Signed-off-by: Denys Smirnov <[email protected]>
As we have recently found out, the positions of UAST nodes must be measured in bytes, not in runes. However, we need to work with strings, so we had to build the conversion ourselves. This code is not straightforward.
Besides, all kinds of crazy things may happen, e.g. the line number can change.
I suggest adding a function that recalculates positions to Babelfish, either as an API extension or in the clients. It is not cool to calculate them by hand, and they differ from the positions an IDE shows (the expected ones) anyway.
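To illustrate what such a function has to deal with, here is a rough sketch in Python. It assumes positions carry a 1-based line and a 1-based column counted in UTF-8 bytes, and that lines are separated by `\n` only; both are assumptions made for the example, and `byte_col_to_char_col` is a hypothetical name. The last lines show one way a line number can shift in Python: `str.splitlines()` also breaks on characters such as U+2028, while splitting the raw bytes does not.

```python
def byte_col_to_char_col(source: bytes, line: int, byte_col: int) -> int:
    """Convert a 1-based byte column on a 1-based line to a 1-based
    code point column.

    Assumes UTF-8 input and "\n" as the only line separator. A column
    that falls inside a multi-byte character raises UnicodeDecodeError.
    """
    lines = source.split(b"\n")
    prefix = lines[line - 1][: byte_col - 1]
    return len(prefix.decode("utf-8")) + 1


src = "x = 'é'\ny = 'мир' + z\n".encode("utf-8")
char_col = byte_col_to_char_col(src, 2, 16)  # byte column of "z" on line 2

# Line counting depends on which characters count as line breaks.
text = "a\u2028b\n"
lines_as_str = len(text.splitlines())                    # breaks on U+2028 too
lines_as_bytes = len(text.encode("utf-8").splitlines())  # breaks on \n, \r only
```

Because of that mismatch, a calculator that works on decoded strings should use the same line-splitting rule as whatever produced the byte positions.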