Skip to content
This repository has been archived by the owner on Mar 8, 2020. It is now read-only.

Support conversion to Unicode line-column positions #405

Merged
merged 7 commits into from
Jun 6, 2019

Conversation

dennwc
Copy link
Member

@dennwc dennwc commented Jun 5, 2019

Support conversions from byte offset to Unicode/UTF-16 byte line-column pairs. Also, refactor the code for consistency and add more test cases.

This allows using positional info generated by Babelfish to calculate Unicode positions used in code editors.

Required by bblfsh/libuast#102.


This change is Reviewable

@dennwc dennwc requested review from creachadair and bzz June 5, 2019 16:17
@dennwc dennwc self-assigned this Jun 5, 2019
Copy link
Contributor

@creachadair creachadair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r1.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @bzz and @dennwc)


uast/transformer/positioner/positions.go, line 261 at r1 (raw file):

}

// LineCol returns a one-based line and col given a zero-based byte offset.

Suggestion: "…zero-based byte offset within the line."


uast/transformer/positioner/positions.go, line 291 at r1 (raw file):

		return -1, err
	}
	const minCol = 1

Is this right? I thought column offsets were 0-based.


uast/transformer/positioner/positions.go, line 410 at r1 (raw file):

}

// toUnicodeLineCol returns a one-based Unicode character line-col or a UTF-16 code unit line-col given a zero-based byte offset.

Here again—are the columns 0-indexed too?

Copy link
Member Author

@dennwc dennwc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @bzz, @creachadair, and @dennwc)


uast/transformer/positioner/positions.go, line 291 at r1 (raw file):

Previously, creachadair (M. J. Fromberger) wrote…

Is this right? I thought column offsets were 0-based.

They are 1-based. It looks like the function comment should be clearer.

I hoped the "and" in "one-based line and column" would imply this. "one-based line and one-based column" may be a bit too verbose.


uast/transformer/positioner/positions.go, line 410 at r1 (raw file):

Previously, creachadair (M. J. Fromberger) wrote…

Here again—are the columns 0-indexed too?

Would clarify the docs here as well.

Copy link
Contributor

@creachadair creachadair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @bzz, @creachadair, and @dennwc)


uast/transformer/positioner/positions.go, line 291 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

They are 1-based. It looks like the function comment should be clearer.

I hoped the "and" in "one-based line and column" would imply this. "one-based line and one-based column" may be a bit too verbose.

Ah, I see—I misunderstood this above as well. I would have expected 0-indexed columns, but do not mind that much either way as long as it's consistent. 😄

Copy link
Member Author

@dennwc dennwc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @bzz and @creachadair)


uast/transformer/positioner/positions.go, line 261 at r1 (raw file):

Previously, creachadair (M. J. Fromberger) wrote…

Suggestion: "…zero-based byte offset within the line."

Looking at this more carefully, the byte offset here is an absolute one (from the beginning of the file).


uast/transformer/positioner/positions.go, line 410 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

Would clarify the docs here as well.

Seems to be resolved as well :)

Copy link
Contributor

@creachadair creachadair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @bzz, @creachadair, and @dennwc)


uast/transformer/positioner/positions.go, line 261 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

Looking at this more carefully, the byte offset here is an absolute one (from the beginning of the file).

Yes, I misunderstood. I think your suggestion below was a good one, though:

// LineCol returns a one-based line number and one-based column offset given
// a zero-based byte offset within the file.

There might be other people as easily confused as I am 😁

Copy link
Contributor

@bzz bzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r1.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @creachadair and @dennwc)

Copy link
Contributor

@bzz bzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @creachadair and @dennwc)

Copy link
Member Author

@dennwc dennwc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 2 files reviewed, 1 unresolved discussion (waiting on @bzz and @creachadair)


uast/transformer/positioner/positions.go, line 261 at r1 (raw file):

Previously, creachadair (M. J. Fromberger) wrote…

Yes, I misunderstood. I think your suggestion below was a good one, though:

// LineCol returns a one-based line number and one-based column offset given
// a zero-based byte offset within the file.

There might be other people as easily confused as I am 😁

Done.


uast/transformer/positioner/positions.go, line 410 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

Seems to be resolved as well :)

Done.

Copy link
Contributor

@creachadair creachadair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r2.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved


uast/transformer/positioner/positions.go, line 410 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

Done.

Thank you! 👍

@dennwc dennwc merged commit c75dbe7 into bblfsh:master Jun 6, 2019
@dennwc dennwc deleted the unicode_line_col branch June 6, 2019 13:37
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants