add: more docs

gptscript-ai · Jul 10, 2024 · 80c7bf2
1 parent 69e7d16
Showing 19 changed files with 86 additions and 1 deletion.
diff --git a/docs/docs/01-datasets/_category_.json b/docs/docs/01-datasets/_category_.json
@@ -0,0 +1,4 @@
+{
+  "label": "Ingestion and Retrieval Flows"
+}
+
diff --git a/docs/docs/02-flows/02-usage.md b/docs/docs/02-flows/02-usage.md
@@ -0,0 +1,48 @@
+---
+title: Usage
+---
+
+# Using Knowledge
+
+The knowledge tool has two modes of operation: Standalone Mode and Server Mode. See the sections below for details.
+
+Both modes are configured the same way, via environment variables or command-line flags:
+
+## Configuration
+
+### Embedding Model Provider (required)
+
+The model provider supplies the embedding model that is used to encode ingested documents.
+Currently, only **OpenAI** and **Azure OpenAI** are supported, via the following flags / environment variables:
+
+```bash
+--openai-api-base string           OpenAI API base ($OPENAI_BASE_URL) (default "https://api.openai.com/v1")
+--openai-api-key string            OpenAI API key ($OPENAI_API_KEY) (default "sk-foo")
+--openai-api-type string           OpenAI API type (OPEN_AI, AZURE, AZURE_AD) ($OPENAI_API_TYPE) (default "OPEN_AI")
+--openai-api-version string        OpenAI API version (for Azure) ($OPENAI_API_VERSION) (default "2024-02-01")
+--openai-azure-deployment string   Azure OpenAI deployment name (overrides openai-embedding-model, if set) ($OPENAI_AZURE_DEPLOYMENT)
+--openai-embedding-model string    OpenAI Embedding model ($OPENAI_EMBEDDING_MODEL) (default "text-embedding-ada-002")
+```
+
+These are persistent flags, so they can be set on any `knowledge` subcommand.
+
+
+## 1. Standalone Mode (Default)
+
+In standalone mode, Knowledge uses an embedded database and an embedded vector database, which the client connects to directly.
+This is the default and simplest mode of operation. It is useful for local usage and integrates well with GPTScript.
+
+### Run the Client
+
+Any `knowledge` command (except `knowledge server`) uses standalone client mode if the `KNOW_SERVER_URL` environment variable is not set.
+
+## 2. Server Mode
+
+In server mode, Knowledge uses a separate server for the vector database and the document database.
+This mode is useful when you want to share data across multiple clients or run the vector database on a more powerful server.
+
+### Run the Server
+
+```bash
+knowledge server
+```
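A minimal sketch of the provider configuration described in `02-usage.md` above, assuming an Azure OpenAI setup. Only the variable names, the `AZURE` type, and the default API version come from the flag list; the endpoint, key, and deployment name are hypothetical placeholders and are not part of this commit.

```shell
#!/bin/sh
# Assumed example values: replace the placeholders with your own resource details.
export OPENAI_API_TYPE="AZURE"                                      # one of OPEN_AI, AZURE, AZURE_AD
export OPENAI_BASE_URL="https://example-resource.openai.azure.com"  # hypothetical endpoint
export OPENAI_API_KEY="replace-me"                                  # hypothetical key
export OPENAI_API_VERSION="2024-02-01"                              # default from the flag list
export OPENAI_AZURE_DEPLOYMENT="my-embedding-deployment"            # hypothetical; overrides OPENAI_EMBEDDING_MODEL

# Print the non-secret settings so they can be checked before running a knowledge subcommand.
env | grep '^OPENAI_' | grep -v '^OPENAI_API_KEY=' | sort
```

Because the flags are persistent, the same settings could instead be passed as `--openai-api-type`, `--openai-api-base`, and so on to any `knowledge` subcommand.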
diff --git a/docs/docs/02-flows/03-architecture.md b/docs/docs/02-flows/03-architecture.md
@@ -0,0 +1,33 @@
+---
+title: Architecture
+---
+
+# Knowledge Architecture
+
+![Knowledge Architecture](/img/knowledge_architecture.png)
+
+Knowledge consists of the following components:
+
+## 1. Knowledge Client
+
+The Knowledge Client is the main interface to interact with your knowledge bases.
+In standalone mode, it uses the embedded databases directly and runs fully locally.
+It's also the default entrypoint for the CLI.
+
+## 2. Knowledge Server [Optional]
+
+The Knowledge Server is a REST API server that can be used to provide a (shared) HTTP Endpoint for your knowledge bases.
+You can use it from the CLI by setting the `KNOW_SERVER_URL` environment variable for all client commands.
+
+## 3. Index Database
+
+The index database is an additional (relational) metadata database which keeps track of all datasets and ingested files and their relationships.
+It enables some extra convenience features but does not store the actual data (embeddings).
+The current implementation uses **SQLite**.
+It's fully embedded and does not require any additional setup.
+
+## 4. Vector Database
+
+The vector database is the main storage for the embeddings of the ingested documents along with some metadata (e.g. source file information).
+The current implementation uses [**chromem-go**](https://github.com/philippgille/chromem-go).
+It's fully embedded and does not require any additional setup.
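The client/server split documented above hinges on a single environment variable. The sketch below illustrates that selection rule only; `knowledge_mode` is a hypothetical helper, not part of the knowledge CLI, and the server address is an assumed placeholder because the docs in this commit do not state a port. It also assumes that an empty `KNOW_SERVER_URL` is treated like an unset one.

```shell
#!/bin/sh
# Hypothetical helper: report which client mode a knowledge command
# (other than `knowledge server`) would use, per the documented rule.
knowledge_mode() {
  if [ -n "${KNOW_SERVER_URL:-}" ]; then
    echo "server"
  else
    echo "standalone"
  fi
}

unset KNOW_SERVER_URL
knowledge_mode    # prints "standalone"

KNOW_SERVER_URL="http://localhost:8080"   # assumed address, not taken from the docs
export KNOW_SERVER_URL
knowledge_mode    # prints "server"
```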
diff --git a/docs/docs/03-cmd/_category_.json b/docs/docs/99-cmd/_category_.json
diff --git a/docs/docs/03-cmd/knowledge.md b/docs/docs/99-cmd/knowledge.md
diff --git a/docs/docs/03-cmd/knowledge_askdir.md b/docs/docs/99-cmd/knowledge_askdir.md
diff --git a/docs/docs/03-cmd/knowledge_create-dataset.md b/docs/docs/99-cmd/knowledge_create-dataset.md
diff --git a/docs/docs/03-cmd/knowledge_delete-dataset.md b/docs/docs/99-cmd/knowledge_delete-dataset.md
diff --git a/docs/docs/03-cmd/knowledge_edit-dataset.md b/docs/docs/99-cmd/knowledge_edit-dataset.md
diff --git a/docs/docs/03-cmd/knowledge_export.md b/docs/docs/99-cmd/knowledge_export.md
diff --git a/docs/docs/03-cmd/knowledge_get-dataset.md b/docs/docs/99-cmd/knowledge_get-dataset.md
diff --git a/docs/docs/03-cmd/knowledge_import.md b/docs/docs/99-cmd/knowledge_import.md
diff --git a/docs/docs/03-cmd/knowledge_ingest.md b/docs/docs/99-cmd/knowledge_ingest.md
diff --git a/docs/docs/03-cmd/knowledge_list-datasets.md b/docs/docs/99-cmd/knowledge_list-datasets.md
diff --git a/docs/docs/03-cmd/knowledge_retrieve.md b/docs/docs/99-cmd/knowledge_retrieve.md
diff --git a/docs/docs/03-cmd/knowledge_server.md b/docs/docs/99-cmd/knowledge_server.md
diff --git a/docs/docs/03-cmd/knowledge_version.md b/docs/docs/99-cmd/knowledge_version.md
diff --git a/docs/gendocs/main.go b/docs/gendocs/main.go
@@ -31,7 +31,7 @@ func main() {
 		}
 	}
 
-	err = doc.GenMarkdownTreeCustom(cmd, "docs/docs/03-cmd", filePrepender, linkHandler)
+	err = doc.GenMarkdownTreeCustom(cmd, "docs/docs/99-cmd", filePrepender, linkHandler)
 	if err != nil {
 		log.Fatal(err)
 	}

diff --git a/docs/static/img/knowledge_architecture.png b/docs/static/img/knowledge_architecture.png