diff --git a/README.md b/README.md index 3e512dd..c7a9458 100644 --- a/README.md +++ b/README.md @@ -244,8 +244,7 @@ Each type of node owns a set of properties which affect the computational perfor * `Input Prefix`: append before every user input * `Input Suffix`: append after every user input * `Should Output prompt`: whether the input prompt should be included in the output -* `Should Output Bos`: whether the special bos (beginning of sequence) token should be included in the output -* `Should Output Eos`: whether the special eos (ending of sequence) token should be included in the output +* `Should Output Special`: whether the special (e.g., beginning of sequence and ending of sequence) token should be included in the output * `Context Size`: number of tokens the model can process at a time * `N Predict`: number of new tokens to generate, generate infinite sequence if -1 * `N Keep`: when the model run out of `context size`, it starts to forget about earlier context, set this variable to force the model to keep a number of the earliest tokens to keep the conversation relevant @@ -258,6 +257,8 @@ Each type of node owns a set of properties which affect the computational perfor * `Min P`: only sample from tokens with at least this probability, disabledd if 0.0 * `N Thread`: number of cpu threads to use * `N GPU Layer`: number of layer offloaded to GPU +* `Main GPU`: the main GPU for computation +* `Split Mode`: how the computation will be distributed if there are multiple GPU in your systemm (0: None, 1: Layer, 2: Row) * `Escape`: process escape character in input prompt * `N Batch`: maximum number of tokens per iteration during continuous batching * `N Ubatch`: maximum batch size for computation @@ -315,11 +316,10 @@ Suppose you want to generate a character with: * `weapon`: either "sword", "bow", or "wand * `description`: a text with minimum 10 character -You should first create a GDLlama node, and turn `Should Output prompt`, `Should Output Bos`, and `Should Output Eos` off either by inspector or by script: +You should first create a GDLlama node, and turn `Should Output prompt` and `Should Output Special` off either by inspector or by script: ``` should_output_prompt = false -should_output_bos = false -should_output_eos = false +should_output_special = false ``` @@ -408,15 +408,16 @@ Besides the functions and signals from GDEmbedding, LlmDB has a few more functio * `has_id(id: String, p_table_name: String) -> bool`: whether the table has a specific id stored * `split_text(text: String) -> PackedStringArray`: split a piece of text first by all `Absolute Separators`, then by one of the appropiate `Chunk Separators`, such that any text chunk is shorter than `Chunk Size` (measured in character), and the overlap is close to but not greater than `Chunk Overlap`. If the algorithm failed to satisfy the contraints, there will be an error message printed out and the returned chunk will be greater than the `Chunk Size` * `store_text_by_id(id: String, text: String)`: split the text and store the chunks in the database, be aware that `store_meta` should have been called previously such that the `id` with the corresponding meta is already in the database -* `run_store_text_by_id(id: String, text: String) -> Error`: run `store_text_by_id` in background +* `run_store_text_by_id(id: String, text: String) -> Error`: run `store_text_by_id` in background, emits `store_text_finished` signal when finished * `store_text_by_meta(meta_dict: Dictionary, text: String)`: split the text and store the chunks in the database with the metadata defined in `meta_dict`, be aware that the metadata should be valid, every key should be a name stored in the `.meta` property and the corresponding type should be correct -* `run_store_text_by_meta(meta_dict: Dictionary, text: String) -> Error` run `store_text_by_meta` in background +* `run_store_text_by_meta(meta_dict: Dictionary, text: String) -> Error` run `store_text_by_meta` in background, emits `store_text_finished` signal when finished * `retrieve_similar_texts(text: String, where: String, n_results: int) -> PackedStringArray`: retrieve `n_results` most similar text chunks to `text`, `where` should be empty or an sql WHERE clause to filter the chunks by metadata * `run_retrieve_similar_texts(text: String, where: String, n_results: int) -> Error`: run `retrieve_similar_texts` in background, and emits a `retrieve_similar_texts_finished` signal once it is done ### Signals +* `store_text_finished`: emitted when `run_store_text_by_id` or `run_store_text_by_meta` is finished * `retrieve_similar_texts_finished(array: PackedStringArray)`: contains an array of `String`, emitted when `run_retrieve_similar_texts` is finished ## LlmDBMetaData @@ -436,14 +437,26 @@ There are 4 static functions to create LlmDBMetaData # FAQ -1. Does it support languages other than English? +1. You have an issue, how to get more debug message? + +Turn on `Verbose stdout` in `Project Settings`, consider running Godot from a terminal to get additional logging messages. + +2. Does it support languages other than English? Yes, the plugin uses utf8 encoding so it has multilingual support naturally. However, a language model may be trained with English data only and it won't be able to generate text other than English, choose the language model based on your need. -2. Observing strange tokens in generated text, such as `` when `Should Output Eos` is off. +3. Observing strange tokens in generated text, such as `` when `Should Output Special` is off. You are always welcome to open an issue. However, be aware that the standard of GGUF format can be changed to support new features and models, such that the bug can come from the model side instead of within this plugin. For example, some older llama 3 GGUF model may not be compatible with the latest format, you may try to search for a newer model with fixes such as [this](https://huggingface.co/NikolayKozloff/Meta-Llama-3-8B-Instruct-bf16-correct-pre-tokenizer-and-EOS-token-Q8_0-Q6_k-Q4_K_M-GGUF/tree/main). +4. You are runningg Arch linux (or its derivatives such as Manjaro) and your Godot Editor crash. + +The Arch build of Godot is bugged, download Godot from the official website instead. + +5. You have a discrete GPU and you see `unable to load model` error, you have make sure that the model parameters are correctly set. + +There is currently a bug on vulkan backend if you have multiple drivers installed for the same GPU, try to turn `Split Mode` to `NONE` (0) and set your `Main GPU` manually (starting from 0) to see if it works. + # Compile from source Install build tools and Vulkan SDK for your operating system, then clone this repository ```