This repository has been archived by the owner on Oct 30, 2024. It is now read-only.
1 parent 69e7d16, commit 80c7bf2
Showing 19 changed files with 86 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"label": "Ingestion and Retrieval Flows" | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
--- | ||
title: Usage | ||
--- | ||
|
||
# Using Knowledge | ||
|
||
The knowledge tool has two modes of operation: Standalone Mode and Server Mode. See the sections below to learn more about them. | ||
|
||
Both modes are configured the same way, via environment variables or command line flags: | ||
|
||
## Configuration | ||
|
||
### Embedding Model Provider (required) | ||
|
||
The model provider supplies the embedding model that is used to encode ingested documents. | ||
Currently, only **OpenAI** and **Azure OpenAI** are supported, via the following flags / environment variables: | ||
|
||
```bash | ||
--openai-api-base string OpenAI API base ($OPENAI_BASE_URL) (default "https://api.openai.com/v1") | ||
--openai-api-key string OpenAI API key ($OPENAI_API_KEY) (default "sk-foo") | ||
--openai-api-type string OpenAI API type (OPEN_AI, AZURE, AZURE_AD) ($OPENAI_API_TYPE) (default "OPEN_AI") | ||
--openai-api-version string OpenAI API version (for Azure) ($OPENAI_API_VERSION) (default "2024-02-01") | ||
--openai-azure-deployment string Azure OpenAI deployment name (overrides openai-embedding-model, if set) ($OPENAI_AZURE_DEPLOYMENT) | ||
--openai-embedding-model string OpenAI Embedding model ($OPENAI_EMBEDDING_MODEL) (default "text-embedding-ada-002") | ||
``` | ||
These are persistent flags, so they can be set on any `knowledge` subcommand.
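As a rough sketch of the environment-variable route, the snippet below sets the Azure-related variables from the flag list before any `knowledge` command is run. The endpoint, key, and deployment name are placeholders assumed for illustration; they do not come from the project.

```bash
# Placeholder values (assumptions): replace them with your own Azure OpenAI details.
export OPENAI_API_TYPE="AZURE"
export OPENAI_BASE_URL="https://example-resource.openai.azure.com"
export OPENAI_API_KEY="replace-me"
export OPENAI_API_VERSION="2024-02-01"
# Per the flag help, a deployment name overrides OPENAI_EMBEDDING_MODEL when set.
export OPENAI_AZURE_DEPLOYMENT="example-embedding-deployment"

# Summarize the non-secret settings that a later `knowledge` command would inherit.
config_summary="type=${OPENAI_API_TYPE} version=${OPENAI_API_VERSION} deployment=${OPENAI_AZURE_DEPLOYMENT}"
echo "$config_summary"
```

Because the variables are exported, every `knowledge` subcommand started from the same shell session would see them; the equivalent command line flags can be passed per invocation instead.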
## 1. Standalone Mode (Default) | ||
In standalone mode, Knowledge uses an embedded database and an embedded vector database, which the client connects to directly.
This is the default and simplest mode of operation. It is useful for local usage and integrates well with GPTScript.
### Run the Client | ||
Any `knowledge` command (except for `knowledge server`) uses the standalone client mode if the `KNOW_SERVER_URL` environment variable is not set.
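The rule above can be sketched as a small shell helper. This only illustrates the documented behavior and is not the tool's actual implementation: `knowledge_mode` is a hypothetical function, the URL is a placeholder, and treating an empty variable like an unset one is an assumption.

```bash
# Hypothetical helper that mirrors the documented rule; not part of the knowledge CLI.
knowledge_mode() {
  # Assumption: an empty KNOW_SERVER_URL is treated the same as an unset one.
  if [ -n "${KNOW_SERVER_URL:-}" ]; then
    echo "remote"
  else
    echo "standalone"
  fi
}

unset KNOW_SERVER_URL
knowledge_mode   # prints "standalone"

# Placeholder address; the real host and port depend on your setup.
KNOW_SERVER_URL="http://localhost:8000"
knowledge_mode   # prints "remote"
```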
## 2. Server Mode | ||
In server mode, Knowledge uses a separate server for the Vector Database and the Document Database. | ||
This mode is useful when you want to share the data with multiple clients or when you want to use a more powerful server for the Vector Database. | ||
### Run the Server | ||
```bash | ||
knowledge server | ||
``` |
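Clients are pointed at a running server through `KNOW_SERVER_URL`. The sketch below shows two ways to scope that variable in a shell: for a single command, or for the whole session. It uses `sh -c` as a stand-in for a `knowledge` client command, and the address is a placeholder, since the server's actual host and port are not stated here.

```bash
# Start from a clean state so the standalone default applies.
unset KNOW_SERVER_URL

# Per-command: the variable is visible to this one invocation only.
# `sh -c` stands in for a knowledge client command; the URL is a placeholder.
one_off=$(KNOW_SERVER_URL="http://localhost:8000" sh -c 'echo "${KNOW_SERVER_URL:-unset}"')

# The surrounding shell is unchanged, so the next command would run standalone.
after=$(sh -c 'echo "${KNOW_SERVER_URL:-unset}"')

# Session-wide: every command started from this shell now sees the server URL.
export KNOW_SERVER_URL="http://localhost:8000"
session=$(sh -c 'echo "${KNOW_SERVER_URL:-unset}"')

echo "one_off=$one_off after=$after session=$session"
```

Scoping the variable per command keeps standalone mode as the default for everything else, while `export` is more convenient when all commands in a session should use the shared server.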
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
--- | ||
title: Architecture | ||
--- | ||
|
||
# Knowledge Architecture | ||
|
||
![Knowledge Architecture](/img/knowledge_architecture.png) | ||
|
||
Knowledge consists of the following components: | ||
|
||
## 1. Knowledge Client | ||
|
||
The Knowledge Client is the main interface to interact with your knowledge bases. | ||
In standalone mode, it uses the embedded databases directly and runs fully locally. | ||
It's also the default entrypoint for the CLI. | ||
|
||
## 2. Knowledge Server [Optional] | ||
|
||
The Knowledge Server is a REST API server that can be used to provide a (shared) HTTP Endpoint for your knowledge bases. | ||
You can use it from the CLI by setting the `KNOW_SERVER_URL` environment variable for all client commands. | ||
|
||
## 3. Index Database | ||
|
||
The index database is an additional (relational) metadata database which keeps track of all datasets and ingested files and their relationships. | ||
It enables some extra convenience features but does not store the actual data (embeddings). | ||
The current implementation uses **SQLite**. | ||
It's fully embedded and does not require any additional setup. | ||
|
||
## 4. Vector Database | ||
|
||
The vector database is the main storage for the embeddings of the ingested documents along with some metadata (e.g. source file information). | ||
The current implementation uses [**chromem-go**](https://github.com/philippgille/chromem-go). | ||
It's fully embedded and does not require any additional setup. |
14 files renamed without changes.