Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

provideCompletionItems function is too big and requires splitting #896

Open
gavbarnett opened this issue Feb 7, 2021 · 14 comments · May be fixed by #905
Open

provideCompletionItems function is too big and requires splitting #896

gavbarnett opened this issue Feb 7, 2021 · 14 comments · May be fixed by #905
Labels
Area: Auto completion Code completion and suggestion, aka IntelliSense. Area: Meta Pertaining to build system, test system, infrastructure, code health, and the project itself. Issue: Task Not a bug. Even not a new feature. But we really need to do something.
Milestone

Comments

@gavbarnett
Copy link
Contributor

Proposal

provideCompletionItems is big at almost 200 lines, It could be broken down into 5 main sub-function (one per if-else branch).

In doing that some smaller functions may become clear for making common (file-scraping for links & image links might being one possibly).

The math functions have been mostly broken out already (mathEnvCheck / this.mathCompletions)

The main benefits would be:

  • readability, The if-else structure is easier to see
  • no need for large comment rectangles as using named functions
           /* ┌───────────────────────┐
              │ Reference link labels │
              └───────────────────────┘ */
                                //vs
            completionReferenceLinkLabels(args)
  • reduce code repetition
  • code testing
@Lemmingh Lemmingh added Area: Auto completion Code completion and suggestion, aka IntelliSense. Area: Meta Pertaining to build system, test system, infrastructure, code health, and the project itself. Issue: Task Not a bug. Even not a new feature. But we really need to do something. labels Feb 8, 2021
gavbarnett added a commit to gavbarnett/vscode-markdown that referenced this issue Feb 11, 2021
Moved all image path completion code from within the if-else statement to a seperate function as part of refactoring of provideCompletionItems(). yzhang-gh#896

No functional changes, purely structural.

Tested by adding a local images to a .md file and confirming image link completion is still functional
@gavbarnett
Copy link
Contributor Author

Just some thoughts for review.

Major function split

The following functions can be separated out from provideCompletionItems(document, position, _token, _context) in the first pass review of the large if-else statement.
Names can be reviewed, these all exist within MdCompletionItemProvider class

  • imageLinkCompletion()
  • mathCompletion()
  • linkReferenceLabelCompletion()
  • anchorTagsCompletion()
  • filePathsCompletion()

But this is a fairly simple view, but this should probably be the initial pull request to keep things clean. After that the following should be considered.


Discussion of functions

  1. Some functions use promises like linkReferenceLabelCompletion() and - anchorTagsCompletion() but others do not. Seems like there should be a consistent approach.

    1. Promises would seem to be useful for anything with a potentially large search such as imageLinkCompletion() & filePathsCompletion()
  2. It seems like imageLinkCompletion() & filePathsCompletion() overlap significantly so could either be combined or call a common function.

    1. Currently imageLinkCompletion() has handling for links with and without <> which filePathsCompletion() does not, but should.
    2. They have slightly different handling for removing space's (%20) from uris. They get the same result so it's just implementation details.
    3. imageLinkCompletion() limits search to only image file formats, as you would expect.
    4. imageLinkCompletion() adds an image preview, as you would expect..
  3. anchorTagsCompletion() has the following bracket completion code which no other function seems to do. It is also the only place where the variable lineTextAfter is used. If this is required then all functions should implement it using a common function.... but it's probably not needed.

    let addClosingParen = false;
    if (/^([^\) ]+\s*|^\s*)\)/.test(lineTextAfter)) {
        // try to detect if user wants to replace a link (i.e. matching closing paren and )
        // Either: ... <CURSOR> something <whitespace> )
        //     or: ... <CURSOR> <whitespace> )
        //     or: ... <CURSOR> )     (endPosition assignment is a no-op for this case)

        // in every case, we want to remove all characters after the cursor and before that first closing paren
        endPosition = position.with({ character: + endPosition.character + lineTextAfter.indexOf(')') });
    } else {
        // If no closing paren is found, replace all trailing non-white-space chars and add a closing paren
        // distance to first non-whitespace or EOL
        const toReplace = (lineTextAfter.search(/(?<=^\S+)(\s|$)/))
        endPosition = position.with({ character: + endPosition.character + toReplace });
        if (!linkRefDefinition) {
            addClosingParen = true;
        }
    }
  1. Currently these functions need checked in a very specific order to operate correctly. It seems like a better regex would be able to avoid that.
    1. Take all the link variations: [linkLabel], [linkLabel][], [linkLabel](link), [linkLabel](link#anchor), ![linkLabel], ![linkLabel][], ![linkLabel](link), [linkLabel]:(link), [linkLabel]:(link#anchor).. and others, could be identified with a single regex using named capture groups.
  2. Completions, for the most part, shouldn't apply inside code blocks or quotes. There should be a common function to ignore these.
  3. Other completions currently missing:
    1. info string in fenced code blocks (typically the code language) (javascript, C, python...)
      • Technically info strings use is not specified in CM or GFM, instead it states:

        Although this spec doesn’t mandate any particular treatment of the info string, the first word is typically used to specify the language of the code block

      • And this seems fairly well supported
      • GFM uses linguist to render code blocks, it's list of languages is available here
    2. Autolinks between < & >
      • GFM supports autolinks without < > but this would be harder to support. We'd have to recognize the path as valid
      • Worth noting that currently this extension doesn't render links inside < > as links.
    3. Setext heading underlines to match heading length see issue 794
    4. Tables
    5. Html tag completion
      • This could be a can of worms!
      • maybe start simple with:
        • comment tags `<--! comment -->

@Lemmingh
Copy link
Collaborator

My draft

Overview

+--------------------+
|       Engine       |
|                    |
| +----------------+ |
| | Document model | |
| +----------------+ |
|                    |
| +--------------+   |
| | Symbol table |   |
| +--------------+   |
+--------------------+
    |   ↑       |   ↑
    |   |       ↓   |
    |   |   +------------+   +---------+
    |   |   | Completion |←--| VS Code |
    |   |   |  provider  |--→| (User)  |
    |   |   +------------+   +---------+
    |   |       |   ↑
    ↓   |       ↓   |
+-------------------------------+
|           Providers           |
|                               |
| +---------------------+       |
| | Heading information |       |
| +---------------------+       |
|                               |
| +---------------------------+ |
| | Link reference definition | |
| +---------------------------+ |
|                               |
| +--------------------+        |
| | Workspace resource |        |
| +--------------------+        |
|                               |
| +----------------------+      |
| | Cross file reference |      |
| | and complex routing  |      |
| +----------------------+      |
|                               |
| +------+                      |
| | Math |                      |
| +------+                      |
+-------------------------------+
  1. VS Code requests completion items.
  2. Completion provider asks Engine for possible states at the cursor position.
  3. Completion provider asks Providers for information.
  4. Completion provider generates CompletionItems from each response, and merge results.
  5. Completion provider returns the results to VS Code.

Notes

  • All systems should be implemented with cooperative cancellation support.
  • An item under Providers may not be single module, but a group of modules, even external things.
  • Engine is critical. From our experience, a huge number of requests will hit it, and take most of the time.

@gavbarnett
Copy link
Contributor Author

gavbarnett commented Feb 15, 2021

Nice ASCII art! Much more rip the whole thing down kind of approach, but i understand your desire for a cleaner overall design.

How/when do you see the document model updating? Guessing it wouldn't update from scratch on each typed character but instead somehow update dynamically. Else it could get really slow on big files. Maybe you have a plan, or is this one of the tricky bits?

I guess you need to keep track of the line/char position of the entry & exit point for each document element. And have that quickly searchable.

@gavbarnett
Copy link
Contributor Author

I guess the document model extends outside the completion item scope and would be used as a basis for many other functions (for example you get syntax highlighting and checking almost for free).

The only thing that you wouldn't get is "beautify" functionality for tables etc. But it would make scoping such things more straightforward.

@yzhang-gh
Copy link
Owner

yzhang-gh commented Feb 16, 2021

Thanks.

@Lemmingh's design can be the long-term target which, as @gavbarnett said, goes somehow beyond the "completion" feature.

Specific to this issue, what we should do now is

  • The splitting of different types of completions ("image", "math", "link" etc.)

    After splitting the main function only contains the code to determine which type of completions to show. Now we are using regexp, in the (far) future we can consult the Engine.

  • The unification of underlying "services" (or Providers)

    e.g. "Workspace resource" is used by both imagePath and filePath,
    "Heading information" is not only used by the "completion" feature but also the TOC.

By doing this, we pave the way for replacing the regexp-based solution with the Engine/Providers one in the future. (We don't have to do all these in one go.)

@gavbarnett
Copy link
Contributor Author

gavbarnett commented Feb 16, 2021

May still be worth defining the interfaces a bit here before we start.

@gavbarnett
Copy link
Contributor Author

Have made a start at this in my local master branch https://github.com/gavbarnett/vscode-markdown/blob/master/src/completion.ts

Just the simple step of splitting out the high level provider functions. Not sure I fully get how best to use the async cancel token (without changing too much else) but I've tried?

Still got the Math bit to do something with.
Then I'm going to extract the regex for each if-else as named variables so it's easier to read.

That's probably a good enough point to get to for the next Pull request.

Following that I'll look at extracting common provider parts.

@yzhang-gh
Copy link
Owner

You can open a draft PR so we can see the diff view and also may have some ideas 😉.

@Lemmingh
Copy link
Collaborator

Have made a start at this in my local master branch

yzhang-gh:388af61205180fa1e04565ca35f7e8f12cdde11b...gavbarnett:35f4fa4aca2030d958b22d81c6af98c10de5c789

Looks like you didn't refresh your working branch, so it's a bit hard to read the diff.


Not sure I fully get how best to use the async cancel token

You can take the decoration manager as a reference.
https://github.com/yzhang-gh/vscode-markdown/tree/388af61205180fa1e04565ca35f7e8f12cdde11b/src/theming

@gavbarnett
Copy link
Contributor Author

Looks like you didn't refresh your working branch, so it's a bit hard to read the diff.

Doh!

@gavbarnett
Copy link
Contributor Author

Looks like you didn't refresh your working branch, so it's a bit hard to read the diff.

Think I've sorted that mess out now.

Not sure how that happened. I thought I was up to date when I started, guess not. I did a diff on all the sections that it wasn't happy with and i'm convinced the code is the same.
The diff still doesn't look nice but I've honestly just copied code from the main function to the new functions.
yzhang-gh:master...gavbarnett:master


Now to look at the decorations manager and work out what to do with the cancel toekn.

@yzhang-gh
Copy link
Owner

I usually do this

git remote add upstream /url/to/original/repo # should already be okay, you can check with `git remote -v`
git fetch upstream
git checkout master
git merge upstream/master

This will keep your own changes.

Or you can use the GitHub UI (https://www.sitepoint.com/quick-tip-sync-your-fork-with-the-original-without-the-cli/)

@Lemmingh
Copy link
Collaborator

The problem is
@gavbarnett was working on master, and didn't hard reset his branch after the last PR was merged.

Three-way merge cannot update his branch. It needs detached checkout and then cherry pick.

I'll take care of it.

@Lemmingh
Copy link
Collaborator

what to do with the cancel token

You can also learn cooperative cancellation model at
https://docs.microsoft.com/en-us/dotnet/standard/threading/cancellation-in-managed-threads

@Lemmingh Lemmingh added this to the v3.6.0 milestone Nov 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Auto completion Code completion and suggestion, aka IntelliSense. Area: Meta Pertaining to build system, test system, infrastructure, code health, and the project itself. Issue: Task Not a bug. Even not a new feature. But we really need to do something.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants