provideCompletionItems function is too big and requires splitting #896

gavbarnett · 2021-02-07T21:21:09Z

Proposal

provideCompletionItems is big at almost 200 lines, It could be broken down into 5 main sub-function (one per if-else branch).

In doing that some smaller functions may become clear for making common (file-scraping for links & image links might being one possibly).

The math functions have been mostly broken out already (mathEnvCheck / this.mathCompletions)

The main benefits would be:

readability, The if-else structure is easier to see
no need for large comment rectangles as using named functions

           /* ┌───────────────────────┐
              │ Reference link labels │
              └───────────────────────┘ */
                                //vs
            completionReferenceLinkLabels(args)

reduce code repetition
code testing

Moved all image path completion code from within the if-else statement to a seperate function as part of refactoring of provideCompletionItems(). yzhang-gh#896 No functional changes, purely structural. Tested by adding a local images to a .md file and confirming image link completion is still functional

gavbarnett · 2021-02-14T22:11:58Z

Just some thoughts for review.

Major function split

The following functions can be separated out from provideCompletionItems(document, position, _token, _context) in the first pass review of the large if-else statement.
Names can be reviewed, these all exist within MdCompletionItemProvider class

imageLinkCompletion()
mathCompletion()
linkReferenceLabelCompletion()
anchorTagsCompletion()
filePathsCompletion()

But this is a fairly simple view, but this should probably be the initial pull request to keep things clean. After that the following should be considered.

Discussion of functions

Some functions use promises like linkReferenceLabelCompletion() and - anchorTagsCompletion() but others do not. Seems like there should be a consistent approach.
1. Promises would seem to be useful for anything with a potentially large search such as imageLinkCompletion() & filePathsCompletion()
It seems like imageLinkCompletion() & filePathsCompletion() overlap significantly so could either be combined or call a common function.
1. Currently imageLinkCompletion() has handling for links with and without <> which filePathsCompletion() does not, but should.
2. They have slightly different handling for removing space's (%20) from uris. They get the same result so it's just implementation details.
3. imageLinkCompletion() limits search to only image file formats, as you would expect.
4. imageLinkCompletion() adds an image preview, as you would expect..
anchorTagsCompletion() has the following bracket completion code which no other function seems to do. It is also the only place where the variable lineTextAfter is used. If this is required then all functions should implement it using a common function.... but it's probably not needed.

    let addClosingParen = false;
    if (/^([^\) ]+\s*|^\s*)\)/.test(lineTextAfter)) {
        // try to detect if user wants to replace a link (i.e. matching closing paren and )
        // Either: ... <CURSOR> something <whitespace> )
        //     or: ... <CURSOR> <whitespace> )
        //     or: ... <CURSOR> )     (endPosition assignment is a no-op for this case)

        // in every case, we want to remove all characters after the cursor and before that first closing paren
        endPosition = position.with({ character: + endPosition.character + lineTextAfter.indexOf(')') });
    } else {
        // If no closing paren is found, replace all trailing non-white-space chars and add a closing paren
        // distance to first non-whitespace or EOL
        const toReplace = (lineTextAfter.search(/(?<=^\S+)(\s|$)/))
        endPosition = position.with({ character: + endPosition.character + toReplace });
        if (!linkRefDefinition) {
            addClosingParen = true;
        }
    }

Currently these functions need checked in a very specific order to operate correctly. It seems like a better regex would be able to avoid that.
1. Take all the link variations: [linkLabel], [linkLabel][], [linkLabel](link), [linkLabel](link#anchor), ![linkLabel], ![linkLabel][], ![linkLabel](link), [linkLabel]:(link), [linkLabel]:(link#anchor).. and others, could be identified with a single regex using named capture groups.
Completions, for the most part, shouldn't apply inside code blocks or quotes. There should be a common function to ignore these.
Other completions currently missing:
1. info string in fenced code blocks (typically the code language) (javascript, C, python...)
  - Technically info strings use is not specified in CM or GFM, instead it states:
    
    Although this spec doesn’t mandate any particular treatment of the info string, the first word is typically used to specify the language of the code block
  - And this seems fairly well supported
  - GFM uses linguist to render code blocks, it's list of languages is available here
2. Autolinks between < & >
  - GFM supports autolinks without < > but this would be harder to support. We'd have to recognize the path as valid
  - Worth noting that currently this extension doesn't render links inside < > as links.
3. Setext heading underlines to match heading length see issue 794
4. Tables
  - new rows could be a completion option based on previous/headers
  - I think more complex table editing is outside scope (Tools to create Tables #23, Table auto-formatting #100), and if this was added then this completion would be redundant.
5. Html tag completion
  - This could be a can of worms!
  - maybe start simple with:
    - comment tags `<--! comment -->

Lemmingh · 2021-02-15T18:33:11Z

My draft

Overview

+--------------------+
|       Engine       |
|                    |
| +----------------+ |
| | Document model | |
| +----------------+ |
|                    |
| +--------------+   |
| | Symbol table |   |
| +--------------+   |
+--------------------+
    |   ↑       |   ↑
    |   |       ↓   |
    |   |   +------------+   +---------+
    |   |   | Completion |←--| VS Code |
    |   |   |  provider  |--→| (User)  |
    |   |   +------------+   +---------+
    |   |       |   ↑
    ↓   |       ↓   |
+-------------------------------+
|           Providers           |
|                               |
| +---------------------+       |
| | Heading information |       |
| +---------------------+       |
|                               |
| +---------------------------+ |
| | Link reference definition | |
| +---------------------------+ |
|                               |
| +--------------------+        |
| | Workspace resource |        |
| +--------------------+        |
|                               |
| +----------------------+      |
| | Cross file reference |      |
| | and complex routing  |      |
| +----------------------+      |
|                               |
| +------+                      |
| | Math |                      |
| +------+                      |
+-------------------------------+

VS Code requests completion items.
Completion provider asks Engine for possible states at the cursor position.
Completion provider asks Providers for information.
Completion provider generates CompletionItems from each response, and merge results.
Completion provider returns the results to VS Code.

Notes

All systems should be implemented with cooperative cancellation support.
An item under Providers may not be single module, but a group of modules, even external things.
Engine is critical. From our experience, a huge number of requests will hit it, and take most of the time.

gavbarnett · 2021-02-15T20:55:33Z

Nice ASCII art! Much more rip the whole thing down kind of approach, but i understand your desire for a cleaner overall design.

How/when do you see the document model updating? Guessing it wouldn't update from scratch on each typed character but instead somehow update dynamically. Else it could get really slow on big files. Maybe you have a plan, or is this one of the tricky bits?

I guess you need to keep track of the line/char position of the entry & exit point for each document element. And have that quickly searchable.

gavbarnett · 2021-02-15T21:19:46Z

I guess the document model extends outside the completion item scope and would be used as a basis for many other functions (for example you get syntax highlighting and checking almost for free).

The only thing that you wouldn't get is "beautify" functionality for tables etc. But it would make scoping such things more straightforward.

yzhang-gh · 2021-02-16T02:51:40Z

Thanks.

@Lemmingh's design can be the long-term target which, as @gavbarnett said, goes somehow beyond the "completion" feature.

Specific to this issue, what we should do now is

The splitting of different types of completions ("image", "math", "link" etc.)

After splitting the main function only contains the code to determine which type of completions to show. Now we are using regexp, in the (far) future we can consult the Engine.
The unification of underlying "services" (or Providers)

e.g. "Workspace resource" is used by both imagePath and filePath,
"Heading information" is not only used by the "completion" feature but also the TOC.

By doing this, we pave the way for replacing the regexp-based solution with the Engine/Providers one in the future. (We don't have to do all these in one go.)

gavbarnett · 2021-02-16T06:31:01Z

May still be worth defining the interfaces a bit here before we start.

gavbarnett · 2021-02-17T23:29:26Z

Have made a start at this in my local master branch https://github.com/gavbarnett/vscode-markdown/blob/master/src/completion.ts

Just the simple step of splitting out the high level provider functions. Not sure I fully get how best to use the async cancel token (without changing too much else) but I've tried?

Still got the Math bit to do something with.
Then I'm going to extract the regex for each if-else as named variables so it's easier to read.

That's probably a good enough point to get to for the next Pull request.

Following that I'll look at extracting common provider parts.

yzhang-gh · 2021-02-18T01:58:52Z

You can open a draft PR so we can see the diff view and also may have some ideas 😉.

Lemmingh · 2021-02-18T06:33:04Z

Have made a start at this in my local master branch

yzhang-gh:388af61205180fa1e04565ca35f7e8f12cdde11b...gavbarnett:35f4fa4aca2030d958b22d81c6af98c10de5c789

Looks like you didn't refresh your working branch, so it's a bit hard to read the diff.

Not sure I fully get how best to use the async cancel token

You can take the decoration manager as a reference.
https://github.com/yzhang-gh/vscode-markdown/tree/388af61205180fa1e04565ca35f7e8f12cdde11b/src/theming

gavbarnett · 2021-02-18T10:57:47Z

Looks like you didn't refresh your working branch, so it's a bit hard to read the diff.

Doh!

gavbarnett · 2021-02-18T20:31:41Z

Looks like you didn't refresh your working branch, so it's a bit hard to read the diff.

Think I've sorted that mess out now.

Not sure how that happened. I thought I was up to date when I started, guess not. I did a diff on all the sections that it wasn't happy with and i'm convinced the code is the same.
The diff still doesn't look nice but I've honestly just copied code from the main function to the new functions.
yzhang-gh:master...gavbarnett:master

Now to look at the decorations manager and work out what to do with the cancel toekn.

yzhang-gh · 2021-02-19T03:07:39Z

I usually do this

git remote add upstream /url/to/original/repo # should already be okay, you can check with `git remote -v`
git fetch upstream
git checkout master
git merge upstream/master

This will keep your own changes.

Or you can use the GitHub UI (https://www.sitepoint.com/quick-tip-sync-your-fork-with-the-original-without-the-cli/)

Lemmingh · 2021-02-19T03:57:24Z

The problem is
@gavbarnett was working on master, and didn't hard reset his branch after the last PR was merged.

Three-way merge cannot update his branch. It needs detached checkout and then cherry pick.

I'll take care of it.

Lemmingh · 2021-02-19T04:09:36Z

what to do with the cancel token

You can also learn cooperative cancellation model at
https://docs.microsoft.com/en-us/dotnet/standard/threading/cancellation-in-managed-threads

gavbarnett mentioned this issue Feb 7, 2021

Reference link autocomplete #893

Closed

Lemmingh added Area: Auto completion Code completion and suggestion, aka IntelliSense. Area: Meta Pertaining to build system, test system, infrastructure, code health, and the project itself. Issue: Task Not a bug. Even not a new feature. But we really need to do something. labels Feb 8, 2021

gavbarnett mentioned this issue Feb 13, 2021

Add tests for completion.ts #901

Open

gavbarnett mentioned this issue Feb 14, 2021

✨ Improve auto-completion of links #898

Merged

gavbarnett linked a pull request Feb 18, 2021 that will close this issue

🏗 Split CompletionItemProvider into smaller modules #905

Draft

Lemmingh added this to the v3.6.0 milestone Nov 23, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

provideCompletionItems function is too big and requires splitting #896

provideCompletionItems function is too big and requires splitting #896

gavbarnett commented Feb 7, 2021

gavbarnett commented Feb 14, 2021

Lemmingh commented Feb 15, 2021

gavbarnett commented Feb 15, 2021 •

edited

Loading

gavbarnett commented Feb 15, 2021

yzhang-gh commented Feb 16, 2021 •

edited

Loading

gavbarnett commented Feb 16, 2021 •

edited

Loading

gavbarnett commented Feb 17, 2021

yzhang-gh commented Feb 18, 2021

Lemmingh commented Feb 18, 2021

gavbarnett commented Feb 18, 2021

gavbarnett commented Feb 18, 2021

yzhang-gh commented Feb 19, 2021

Lemmingh commented Feb 19, 2021

Lemmingh commented Feb 19, 2021

provideCompletionItems function is too big and requires splitting #896

provideCompletionItems function is too big and requires splitting #896

Comments

gavbarnett commented Feb 7, 2021

Proposal

gavbarnett commented Feb 14, 2021

Major function split

Discussion of functions

Lemmingh commented Feb 15, 2021

Overview

Notes

gavbarnett commented Feb 15, 2021 • edited Loading

gavbarnett commented Feb 15, 2021

yzhang-gh commented Feb 16, 2021 • edited Loading

gavbarnett commented Feb 16, 2021 • edited Loading

gavbarnett commented Feb 17, 2021

yzhang-gh commented Feb 18, 2021

Lemmingh commented Feb 18, 2021

gavbarnett commented Feb 18, 2021

gavbarnett commented Feb 18, 2021

yzhang-gh commented Feb 19, 2021

Lemmingh commented Feb 19, 2021

Lemmingh commented Feb 19, 2021

gavbarnett commented Feb 15, 2021 •

edited

Loading

yzhang-gh commented Feb 16, 2021 •

edited

Loading

gavbarnett commented Feb 16, 2021 •

edited

Loading