This repository has been archived by the owner on Oct 30, 2024. It is now read-only.

add: more docs
iwilltry42 committed Jul 10, 2024
1 parent 69e7d16 commit 80c7bf2
Showing 19 changed files with 86 additions and 1 deletion.
4 changes: 4 additions & 0 deletions docs/docs/01-datasets/_category_.json
@@ -0,0 +1,4 @@
{
"label": "Ingestion and Retrieval Flows"
}

48 changes: 48 additions & 0 deletions docs/docs/02-flows/02-usage.md
@@ -0,0 +1,48 @@
---
title: Usage
---

# Using Knowledge

The `knowledge` tool has two modes of operation: standalone mode and server mode. The sections below describe each of them.

Both modes are configured the same way, via environment variables or command-line flags.

## Configuration

### Embedding Model Provider (required)

The model provider supplies the embedding model that is used to encode ingested documents.
Currently, only **OpenAI** and **Azure OpenAI** are supported. They are configured via the following flags or environment variables:

```bash
--openai-api-base string           OpenAI API base ($OPENAI_BASE_URL) (default "https://api.openai.com/v1")
--openai-api-key string            OpenAI API key ($OPENAI_API_KEY) (default "sk-foo")
--openai-api-type string           OpenAI API type (OPEN_AI, AZURE, AZURE_AD) ($OPENAI_API_TYPE) (default "OPEN_AI")
--openai-api-version string        OpenAI API version (for Azure) ($OPENAI_API_VERSION) (default "2024-02-01")
--openai-azure-deployment string   Azure OpenAI deployment name (overrides openai-embedding-model, if set) ($OPENAI_AZURE_DEPLOYMENT)
--openai-embedding-model string    OpenAI Embedding model ($OPENAI_EMBEDDING_MODEL) (default "text-embedding-ada-002")
```
These are persistent flags, so they can be set on any `knowledge` subcommand.

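As a minimal sketch of the environment-variable route, the snippet below configures Azure OpenAI using only the variables listed above. The endpoint URL, API key, and deployment name are placeholders chosen for illustration, not values from this project. The snippet is meant to be pasted or sourced into the shell from which `knowledge` is run afterwards.

```shell
# Placeholder values (assumptions): substitute your own endpoint, key, and deployment.
export OPENAI_API_TYPE="AZURE"
export OPENAI_BASE_URL="https://example-resource.openai.azure.com"  # hypothetical endpoint
export OPENAI_API_KEY="replace-with-your-key"                        # never commit a real key
export OPENAI_API_VERSION="2024-02-01"                               # the documented default
export OPENAI_AZURE_DEPLOYMENT="example-embedding-deployment"        # hypothetical name; overrides the embedding model
```

Because the flags are persistent, the same values could instead be passed on a subcommand, for example `knowledge server --openai-api-type AZURE`.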

## 1. Standalone Mode (Default)

In standalone mode, Knowledge uses an embedded database and an embedded vector database, which the client connects to directly.
This is the default and simplest mode of operation. It is well suited to local usage and integrates well with GPTScript.

### Run the Client

Any `knowledge` command (except `knowledge server`) uses the standalone client mode if the `KNOW_SERVER_URL` environment variable is not set.
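
The selection rule above can be sketched as a small shell function. `knowledge_mode` is a hypothetical helper written for illustration and is not part of the CLI. It only mirrors the documented rule that client commands run standalone when `KNOW_SERVER_URL` is not set. Treating an empty value like an unset one is an assumption.

```shell
# Hypothetical helper (not part of the knowledge CLI): reports which mode a
# client command would use, following the rule described above.
knowledge_mode() {
  # Assumption: an empty KNOW_SERVER_URL is treated the same as an unset one.
  if [ -n "${KNOW_SERVER_URL:-}" ]; then
    echo "server"
  else
    echo "standalone"
  fi
}
```

In a shell where `KNOW_SERVER_URL` has never been set, `knowledge_mode` prints `standalone`.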

## 2. Server Mode

In server mode, Knowledge uses a separate server for the vector database and the document database.
This mode is useful when you want to share data among multiple clients or run the vector database on a more powerful server.

### Run the Server

```bash
knowledge server
```
33 changes: 33 additions & 0 deletions docs/docs/02-flows/03-architecture.md
@@ -0,0 +1,33 @@
---
title: Architecture
---

# Knowledge Architecture

![Knowledge Architecture](/img/knowledge_architecture.png)

Knowledge consists of the following components:

## 1. Knowledge Client

The Knowledge Client is the main interface for interacting with your knowledge bases.
In standalone mode, it uses the embedded databases directly and runs fully locally.
It is also the default entrypoint for the CLI.

## 2. Knowledge Server [Optional]

The Knowledge Server is a REST API server that can provide a (shared) HTTP endpoint for your knowledge bases.
To use it from the CLI, set the `KNOW_SERVER_URL` environment variable for all client commands.
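
The following is a minimal sketch of the two usual ways to scope `KNOW_SERVER_URL` in a POSIX shell. The address is a placeholder, since this page does not state where `knowledge server` listens, and `printenv` stands in for a client command so that the snippet runs without the CLI installed.

```shell
# Hypothetical address: replace with wherever your Knowledge Server is reachable.
SERVER_URL="http://localhost:8000"

# Option 1: set the variable for a single command only.
# `printenv KNOW_SERVER_URL` is a stand-in for any knowledge client command.
seen_by_command="$(KNOW_SERVER_URL="$SERVER_URL" printenv KNOW_SERVER_URL)"

# Option 2: export it so every later command in this shell session sees it.
export KNOW_SERVER_URL="$SERVER_URL"

# Unsetting the variable returns client commands to standalone mode:
# unset KNOW_SERVER_URL
```

The per-command form is handy for one-off calls against a shared server, while the exported form suits a session that works against the server throughout.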

## 3. Index Database

The index database is an additional relational metadata database that keeps track of all datasets, the ingested files, and the relationships between them.
It enables some extra convenience features but does not store the actual data (the embeddings).
The current implementation uses **SQLite**.
It's fully embedded and does not require any additional setup.

## 4. Vector Database

The vector database is the main storage for the embeddings of the ingested documents along with some metadata (e.g. source file information).
The current implementation uses [**chromem-go**](https://github.com/philippgille/chromem-go).
It's fully embedded and does not require any additional setup.
7 files renamed without changes.
2 changes: 1 addition & 1 deletion docs/gendocs/main.go
@@ -31,7 +31,7 @@ func main() {
}
}

- err = doc.GenMarkdownTreeCustom(cmd, "docs/docs/03-cmd", filePrepender, linkHandler)
+ err = doc.GenMarkdownTreeCustom(cmd, "docs/docs/99-cmd", filePrepender, linkHandler)
if err != nil {
log.Fatal(err)
}
Binary file added docs/static/img/knowledge_architecture.png