Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRAFT - Add Trivia.commentValues for sanitised comments #2578

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

adammcarter
Copy link
Contributor

@adammcarter adammcarter commented Mar 29, 2024

Motivation

Original issue here: #1890

Currently when pulling comments out of Trivia or TriviaPieces we get the comment delimiters such as //, /* */ etc. then we have to manually remove these to get the underlying contents of the comments.

Additionally, when extracting multiple TriviaPieces we might want to then concatenate the values in to one string, stripping any new lines too.

Proposed solution

As proposed in the original issue, this PR introduces a commentValues property on Trivia which does all of the above for us to produce a nice looking comment string, regardless of whether we span multiple lines or TriviaPieces

There are some edge cases where, for example you might have...

  • Block comments with/without leading * on new lines in between start/end delimiters. Here we strip the delimiters and new lines to concatenate the string
  • A line comment within a comment block. In this case we keep the comment delimiters eg // within the inner comment and just concatenate the comment within the block comment
  • As above re embedded comments, but spanning multiple trivia pieces. In this case we have the same outcome as above

For other real life use cases they should be covered in the unit tests inside of TriviaTests.swift, but if there's any I've missed that you can think of feel free to let me know and I can add them in too

This holds an array of comments made from the trivia's pieces, with
one line per element of the array
@adammcarter adammcarter force-pushed the adammcarter/trivia-piece-comment-value branch 2 times, most recently from b6d106d to 4844ec7 Compare March 29, 2024 15:56
Blocks of comments are now concatenated with their related lines within
the same block comment and added to the commentValues property as one
stripped, concatenated string

If there are n code blocks, there will be n stripped, concatenated elements
in the commentValues (plus x line comments)
@adammcarter adammcarter force-pushed the adammcarter/trivia-piece-comment-value branch from 4844ec7 to e8cd9f9 Compare March 29, 2024 15:59
This also keeps comment lines within comment blocks with their leading
comment markers in tact
@adammcarter adammcarter force-pushed the adammcarter/trivia-piece-comment-value branch from e5718e0 to 36df0fc Compare March 29, 2024 18:50
Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for setting up the PR @adammcarter. I have some initial review comments inline.

/// The string contents of all the comment pieces with any comments tokens trimmed.
///
/// Each element in the array is the trimmed contents of a line comment, or, in the case of a multi-line comment a trimmed, concatenated single string.
public var commentValues: [String] {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would prefer if this returned a single string instead of an array of strings because I would expect that’s what most clients would want to work with. It’s always possible to create the array by splitting the string on \n.


Could you also add an entry to Release Notes/600.md?


for piece in pieces {
switch piece {
case .blockComment(let text), .docBlockComment(let text):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I’m not mistaking completely a blockComment and docBlockComment will always span from the opening /* to the closing */. So I don’t think we need all the foundStartOfCodeBlock/foundEndOfCodeBlock/isInCodeBlock handling.

Comment on lines +342 to +347
/**
* Some doc block comment
* // spread on many lines
* with a line comment
*/
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it’s quite common for the * to be indented by a single space to align with the first * of /*. Could you also add a test case for something like

          /**
           *  Some doc block comment
           *  with a line comment
          */
          """

let sanitized =
text
.split(separator: "\n")
.map { $0.trimmingAnyCharacters(in: "/*").trimmingAnyCharacters(in: " ") }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would also trim any slashes at the start of a line. And I don’t think we want that. Eg. in

/* 
/abc
*/

I would expect the comment value to be /.

I would just trim the following:

  • At the start of the comment: /* or /**
  • Inside the block comment: Any number of spaces followed by an optional *
  • At the end of the comment: */

text
.split(separator: "\n")
.map { $0.trimmingAnyCharacters(in: "/*").trimmingAnyCharacters(in: " ") }
.filter { !$0.isEmpty }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this drop empty lines in the comment? Eg. in the following? Because I don’t think we want to. For example if you interpret the comment as markdown, empty lines have meaning.

/*
Paragraph 1

Paragraph 2
*/

.split(separator: "\n")
.map { $0.trimmingAnyCharacters(in: "/*").trimmingAnyCharacters(in: " ") }
.filter { !$0.isEmpty }
.joined(separator: " ")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here, I don’t think we should concatenate lines using a space. Again, if interpreting a doc comment as markdown, the newlines have meaning.

@@ -69,4 +69,325 @@ class TriviaTests: XCTestCase {
XCTAssertNotEqual(TriviaPiece.unexpectedText("e"), .unexpectedText("f"))
XCTAssertNotEqual(TriviaPiece.unexpectedText("e"), .lineComment("e"))
}

func testTriviaCommentValues() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it’s fine if we tailor commentValue to the kind of trivia that the parser produces. If we do that we can construct the Trivia from a string literal and have something like the following:

func assertCommentValue(
  _ trivia: Trivia,
  commentValue: String,
  file: StaticString = #filePath,
  line: UInt = #line
) {
  XCTAssertEqual(trivia.commentValue, commentValue, file: file, line: line)
}

func testExample() {
  assertCommentValue(
    "// line comment",
    commentValue: "line comment"
  )
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants