Background
Solve the latency caused by LSP compilation speed in large-scale scenarios. The goal is to reduce the response time of every request and notification to less than 50 ms.
Bench
Compile about 300-400 KCL files.
kclvm/tools/src/LSP/src/util.rs
/// Get compile unit (files and options) from a single file
pub fn lookup_compile_unit(file: &str, load_pkg: bool) -> (Vec<String>, Option<LoadProgramOptions>) {}
Optimization Solution
lookup_compile_unit cache
lookup_compile_unit looks up the compile configuration files (kcl.yaml, kcl.mod), starting from the file being edited. The LSP caches the results of lookup_compile_unit and clears the cache when a configuration file changes.
Rebuilding the cache is not particularly expensive (about 100 ms each time), so the whole cache is cleared whenever a config file is edited.
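The caching strategy described above could look roughly like the following. This is a minimal sketch, not the code from #1188: `CompileUnitCache`, `get_or_lookup`, `on_file_changed`, and the stand-in `LoadProgramOptions` struct are all hypothetical names introduced for illustration.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Stand-in for the real `LoadProgramOptions`; only here to keep the sketch self-contained.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct LoadProgramOptions {
    pub work_dir: String,
}

/// Result type of `lookup_compile_unit`: (files, options).
pub type CompileUnit = (Vec<String>, Option<LoadProgramOptions>);

/// Hypothetical cache keyed by the path of the file being edited.
#[derive(Default)]
pub struct CompileUnitCache {
    entries: HashMap<String, CompileUnit>,
}

impl CompileUnitCache {
    /// Return the cached unit for `file`, calling `lookup` only on a miss.
    pub fn get_or_lookup<F>(&mut self, file: &str, lookup: F) -> CompileUnit
    where
        F: FnOnce(&str) -> CompileUnit,
    {
        self.entries
            .entry(file.to_string())
            .or_insert_with(|| lookup(file))
            .clone()
    }

    /// Called for every changed path; drops the whole cache if it is a config file.
    /// Returns true when the cache was cleared.
    pub fn on_file_changed(&mut self, path: &str) -> bool {
        let is_config = matches!(
            Path::new(path).file_name().and_then(|n| n.to_str()),
            Some("kcl.yaml") | Some("kcl.mod")
        );
        if is_config {
            self.entries.clear();
        }
        is_config
    }

    /// Number of cached compile units.
    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Clearing everything keeps the invalidation logic trivial, which matches the reasoning that a rebuild costs only about 100 ms.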
lsp lookup_compile_unit cache implementation #1188
Add a watcher to the configuration file on the client side kcl-lang/vscode-kcl#41
todo:
The implementation of #1188 relies on the DidChangeWatchedFiles event, which is backed by the file watcher that VSCode provides. Other IDEs may not offer this capability, in which case the server never receives the notification. The server therefore needs its own complete config file watcher instead of relying on client notifications.
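One possible shape for a server-side fallback is to poll the modification times of the known config files and diff successive snapshots. The sketch below uses only the standard library and is an assumption about how this could be done, not the planned design; `Snapshot`, `take_snapshot`, and `changed_paths` are hypothetical names, and a real implementation might prefer OS-level file notifications over polling.

```rust
use std::collections::HashMap;
use std::fs;
use std::path::PathBuf;
use std::time::SystemTime;

/// Modification times of the watched config files, keyed by path.
pub type Snapshot = HashMap<PathBuf, SystemTime>;

/// Record the modification time of every config file that currently exists.
pub fn take_snapshot(config_files: &[PathBuf]) -> Snapshot {
    config_files
        .iter()
        .filter_map(|p| {
            let mtime = fs::metadata(p).and_then(|m| m.modified()).ok()?;
            Some((p.clone(), mtime))
        })
        .collect()
}

/// Paths that were created, deleted, or modified between two snapshots, sorted.
pub fn changed_paths(old: &Snapshot, new: &Snapshot) -> Vec<PathBuf> {
    // Created or modified: present in `new` with a different (or no) old mtime.
    let mut changed: Vec<PathBuf> = new
        .iter()
        .filter(|(path, mtime)| old.get(*path) != Some(*mtime))
        .map(|(path, _)| path.clone())
        .collect();
    // Deleted: present in `old` but missing from `new`.
    changed.extend(old.keys().filter(|p| !new.contains_key(*p)).cloned());
    changed.sort();
    changed
}
```

The server would take a snapshot on a timer and clear the lookup_compile_unit cache whenever `changed_paths` returns a non-empty list.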
AdvancedResolver Cache
Currently, the LSP runs AdvancedResolver in full on every compilation. It consists of two main parts: Namer::find_symbols() and AdvancedResolver::resolve_program(). The result is a GlobalState, which contains the semantic information and is saved in the db to serve requests. The whole run takes about 180 ms.
Consider the following scenario: two files, main1.k and main2.k, both depend on base.k. This is a common setup, where different configuration definitions depend on a shared schema model.
main1.k
import base
main2.k
import base
base/base.k
schema X:
    name: str
We need to handle three cases:
Case 1: Edit main1.k: reuse the cached base.k and recompile only main1.k. This is the most important case.
Case 2: After compiling main1.k, reuse the cached base.k and compile main2.k incrementally.
Case 3 (optional, may not be needed): Edit base.k and update the information of main1.k and main2.k.
Optimization of walkers in Namer and AdvancedResolver
The most important part of the Resolver result is scope_map in ProgramScope, a hashmap keyed by pkg name.
When the resolver compiles incrementally, it takes the main package as the entry, determines which pkgs were updated or affected by an update, and removes them from the scope map. It then re-walks these pkgs and generates new scopes.
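To make "updated and affected pkgs" concrete, here is a minimal sketch of how that set could be computed from the import graph and then evicted from a pkg-keyed map. It is an illustration under assumptions, not the Resolver's actual code: `invalid_pkgs`, `evict`, and the `imports` map layout are hypothetical.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// `imports` maps each pkg to the pkgs it imports directly.
/// Returns the changed pkgs plus every pkg that depends on them, directly or transitively.
pub fn invalid_pkgs(
    imports: &HashMap<String, Vec<String>>,
    changed: &[&str],
) -> HashSet<String> {
    // Reverse the import edges: pkg -> pkgs that import it.
    let mut importers: HashMap<&str, Vec<&str>> = HashMap::new();
    for (pkg, deps) in imports {
        for dep in deps {
            importers.entry(dep.as_str()).or_default().push(pkg.as_str());
        }
    }

    // Breadth-first walk over the reversed edges, starting from the changed pkgs.
    let mut invalid: HashSet<String> = HashSet::new();
    let mut queue: VecDeque<&str> = changed.iter().copied().collect();
    while let Some(pkg) = queue.pop_front() {
        if !invalid.insert(pkg.to_string()) {
            continue; // already visited
        }
        if let Some(parents) = importers.get(pkg) {
            queue.extend(parents.iter().copied());
        }
    }
    invalid
}

/// Drop the invalid pkgs from a map keyed by pkg name so they can be re-walked.
pub fn evict<V>(scope_map: &mut HashMap<String, V>, invalid: &HashSet<String>) {
    scope_map.retain(|pkg, _| !invalid.contains(pkg));
}
```

With the example files, a change to the pkg containing main1.k invalidates only that pkg, while a change to base invalidates base and everything that imports it.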
Namer::find_symbols() and AdvancedResolver::resolve_program() can likewise update only the invalid pkgs (those that were updated or affected by an update) instead of traversing all pkgs every time:
In Resolver, record the invalid pkgs and pass them to Namer and AdvancedResolver.
In GlobalState, clear the semantic information of the invalid pkgs; otherwise old and new information will overlap. scopes, packages and sema_db are all indexed by pkg or file, so they can be cleared by invalid pkg name. symbols, however, is a global symbol set, so a pkg -> HashSet map must be added: when a symbol is added, record the pkg it belongs to.
Clear the cache before Namer::find_symbols(). The clearing order should be the reverse of the build order (symbols are cleared last).
In addition to the invalid pkgs, Namer::find_symbols() and AdvancedResolver::resolve_program() also need to walk any new pkgs.
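The pkg -> HashSet index for the global symbol set could be sketched as follows. `SymbolTable`, `SymbolId`, and the method names are hypothetical stand-ins; the real GlobalState stores richer symbol data and uses its own reference type.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical symbol handle; the real GlobalState uses its own symbol reference type.
pub type SymbolId = usize;

#[derive(Default)]
pub struct SymbolTable {
    /// Global symbol set: id -> symbol name.
    symbols: HashMap<SymbolId, String>,
    /// Extra index: pkg -> ids of the symbols that belong to it.
    by_pkg: HashMap<String, HashSet<SymbolId>>,
    next_id: SymbolId,
}

impl SymbolTable {
    /// Add a symbol and record the pkg it belongs to.
    pub fn add(&mut self, pkg: &str, name: &str) -> SymbolId {
        let id = self.next_id;
        self.next_id += 1;
        self.symbols.insert(id, name.to_string());
        self.by_pkg.entry(pkg.to_string()).or_default().insert(id);
        id
    }

    /// Remove every symbol owned by `pkg` before the pkg is walked again.
    pub fn clear_pkg(&mut self, pkg: &str) {
        if let Some(ids) = self.by_pkg.remove(pkg) {
            for id in ids {
                self.symbols.remove(&id);
            }
        }
    }

    /// Look up a symbol name by id.
    pub fn get(&self, id: SymbolId) -> Option<&str> {
        self.symbols.get(&id).map(String::as_str)
    }

    /// Number of live symbols.
    pub fn len(&self) -> usize {
        self.symbols.len()
    }
}
```

Because ids are never reused in this sketch, a stale id from a cleared pkg simply resolves to nothing rather than to an unrelated symbol.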
Main pkg name and in-place GlobalState changes
KCL's Program takes the main pkg as the entry point and analyzes the other pkgs it depends on.
All programs share the package name main. In the second scenario, consider this sequence:
open main1.k -> open main2.k -> switch to main1.k -> request
Computing the semantic information of main2.k overwrites that of main1.k. When the user switches back to main1.k in the IDE, main1.k is not recompiled because no file has changed. At this point gs holds the semantics of main2.k, so analysis of main1.k fails.
GlobalState therefore cannot hold the main pkg information of every compilation entry at the same time. Previously, gs was cloned after compilation and stored in the db, and each request fetched the gs for its entry from the db.
Cloning gs takes about 20-30 ms. Modifying a single, globally shared gs in place removes the need for this clone. We need to:
Provide a way to configure different main packages, because main1.k and main2.k need separate spaces in gs.
Use mutable references to gs in Namer and AdvancedResolver and modify it in place, so that only one gs exists globally.
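The two requirements above can be illustrated with a heavily simplified model. Everything here is an assumption made for the sketch: the `main_pkg_key` naming scheme, the reduced `GlobalState` struct, and its methods are hypothetical and do not reflect the real type.

```rust
use std::collections::HashMap;

/// Hypothetical: derive a distinct main-pkg key from the entry file, so that
/// main1.k and main2.k no longer both map to the shared name `main`.
pub fn main_pkg_key(entry_file: &str) -> String {
    format!("__main__:{}", entry_file)
}

/// Heavily simplified stand-in for GlobalState: pkg key -> symbol names.
#[derive(Default)]
pub struct GlobalState {
    packages: HashMap<String, Vec<String>>,
}

impl GlobalState {
    /// Re-analyse one entry in place; data of other entries is left untouched.
    pub fn update_entry(&mut self, entry_file: &str, symbols: Vec<String>) {
        self.packages.insert(main_pkg_key(entry_file), symbols);
    }

    /// Semantic data recorded for one entry, if it has been compiled.
    pub fn symbols_of(&self, entry_file: &str) -> Option<&[String]> {
        self.packages.get(&main_pkg_key(entry_file)).map(Vec::as_slice)
    }
}
```

In this model, compiling main2.k after main1.k adds a second entry instead of overwriting the first, so switching back to main1.k needs neither a recompile nor a per-entry clone of gs.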
Reverse dependent updates
todo
Case 3: Edit base.k and update the information of main1.k and main2.k.
Other optimizations
Watch out for clones of big structures such as db and node_ty_map; use Rc/Arc clones instead. #1174
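As a small illustration of why an Arc clone is cheaper than a deep clone, the sketch below shares a map behind an `Arc` and only copies it when a write happens while another handle is still alive. `NodeTyMap` here is a simplified stand-in (string keys and values), and `share` and `set_type` are hypothetical helpers.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for a large map such as node_ty_map (node id -> type name).
pub type NodeTyMap = HashMap<String, String>;

/// Hand out a shared handle: this copies a pointer and bumps a counter, not the map.
pub fn share(map: &Arc<NodeTyMap>) -> Arc<NodeTyMap> {
    Arc::clone(map)
}

/// Copy-on-write update: the map is deep-copied only if another handle still exists.
pub fn set_type(map: &mut Arc<NodeTyMap>, node: &str, ty: &str) {
    Arc::make_mut(map).insert(node.to_string(), ty.to_string());
}
```

A request handler can hold the handle returned by `share` as a consistent snapshot while the compiler keeps updating its own handle through `set_type`.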
I would like to know how IDEs for other languages such as Python and TypeScript handle incremental compilation, since Python and KCL have similar syntax and semantics. I would also like to know how the Rust IDE handles incremental compilation, since the implementations are similar.
lookup_compile_unit 321,993 117,994 87,132
I think the time unit for testing should be clearly marked.
For Namer::find_symbols() and AdvancedResolver::resolve_program(), we can also update only the invalid pkg instead of traversing all pkgs every time.
What does "invalid pkg" mean?
In addition to the invalid pkg, Namer::find_symbols() and AdvancedResolver::resolve_program() also need to walk the new pkg
What is the calculation logic for __main1__ and __main2__? How are multi-file compilation entries handled, and how does the IDE handle entry files that are not in the main package? Will they conflict? How about an abstract package path definition?
pub enum PackagePath {
    Logic(<Some md5 hash>),
    Real(<real filepath or pkg path>),
}