- Copy .env.example to .env and modify content
- Create ngrok domain
- Set up ngrok agent auth
- Set up Google access to the LLM and add the keys to .env
- Set up LangSmith in .env
- Start everything with docker compose, with code hot reload:
docker compose --env-file .env -p aishe_ai up
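Since the first step depends on .env covering everything in .env.example, a quick check before starting the stack can catch a forgotten key. The following is a minimal sketch, not part of the project: the helper names `env_keys` and `missing_env_keys` are hypothetical, and the parser only handles simple `KEY=value` lines (optionally prefixed with `export`).

```python
from pathlib import Path


def env_keys(path):
    """Return the set of variable names defined in a dotenv-style file."""
    keys = set()
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        keys.add(line.split("=", 1)[0].strip())
    return keys


def missing_env_keys(example_path=".env.example", env_path=".env"):
    """List keys that appear in the example file but not in the real one."""
    return sorted(env_keys(example_path) - env_keys(env_path))
```

Calling `missing_env_keys()` from the repository root returns an empty list when .env defines every key found in .env.example. It does not check whether the values are valid.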
- Copy .env.example to .env and modify content
- Install tesseract-ocr for your system (with apt etc.)
- Install python deps:
pip3 install -r requirements.txt
or update the current ones:
pip install -r requirements.txt --upgrade
- Install chromium
pip install -q playwright beautifulsoup4
playwright install
- Create ngrok domain
- Install ngrok
- Set up ngrok agent auth
- Set up Google access to the LLM and add the keys to .env
- Set up LangSmith in .env
- Start fastapi:
uvicorn app:app --reload
- Start ngrok:
ngrok http --domain=DOMAIN 8000
DOMAIN must be the same domain that was used during bot creation.
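ngrok forwards to port 8000, which is where uvicorn listens by default, so it can help to confirm that the app is accepting connections before opening the tunnel. A minimal sketch using only the standard library; the helper names `port_is_open` and `wait_for_port` are hypothetical and not part of the project:

```python
import socket
import time


def port_is_open(host="127.0.0.1", port=8000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host="127.0.0.1", port=8000, attempts=10, delay=0.5):
    """Poll until the port accepts connections or the attempts run out."""
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False
```

If `wait_for_port()` returns False, uvicorn is probably not running yet or is bound to a different port than the one passed to ngrok.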
- Browser is not starting for web scraping, for example within the webpage_tool:
  - add to the browser launch parameters:
    args=["--disable-gpu"]
    -> browser = await p.chromium.launch(headless=True, args=["--disable-gpu"])
  - only observed on WSL2 systems
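Because the problem has only been seen on WSL2, the flag could be applied conditionally instead of unconditionally. The sketch below is one possible approach, not the project's actual webpage_tool code: `is_wsl`, `chromium_launch_kwargs`, and `fetch_title` are hypothetical names, and the WSL check is a heuristic based on the kernel release string containing "microsoft".

```python
import platform


def is_wsl() -> bool:
    """Heuristic: WSL kernels report 'microsoft' in their release string."""
    return "microsoft" in platform.uname().release.lower()


def chromium_launch_kwargs(wsl: bool) -> dict:
    """Build keyword arguments for p.chromium.launch()."""
    kwargs = {"headless": True}
    if wsl:
        # Workaround from the troubleshooting note above.
        kwargs["args"] = ["--disable-gpu"]
    return kwargs


async def fetch_title(url: str) -> str:
    """Open a page with the adjusted launch parameters and return its title."""
    # Imported lazily so the helpers above work without playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(**chromium_launch_kwargs(is_wsl()))
        try:
            page = await browser.new_page()
            await page.goto(url)
            return await page.title()
        finally:
            await browser.close()
```

Running `asyncio.run(fetch_title("https://example.com"))` is a quick way to confirm that Chromium starts at all on the affected machine.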
black FOLDER_NAME
tbd
Public image repo
docker run -d -p 80:80 --env-file .env aishe-ai
For prompts regarding internal company data, which is scraped regularly. When a user prompts the system, the following happens:
- get member from given email (search)
- get memberships from member (join)
- get documents from memberships (join)
- iterate over the accessible documents and add their embeddings to the vector space for a similarity search against the given user prompt
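The lookup steps above can be sketched as a single joined query followed by a similarity ranking. This is an illustrative sketch under loud assumptions, not the project's implementation: it uses an in-memory SQLite database instead of Postgres, stores embeddings as JSON text instead of a vector column, keeps only the columns needed for the joins, and ranks with a plain-Python cosine similarity. All function names and the sample rows are made up.

```python
import json
import math
import sqlite3


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def accessible_documents(conn, email):
    """Member by email -> memberships -> documents, as one joined query."""
    rows = conn.execute(
        """
        SELECT DISTINCT d.uuid, d.name, d.embeddings
        FROM members m
        JOIN memberships ms ON ms.member_uuid = m.uuid
        JOIN documents d ON d.uuid = ms.document_uuid
        WHERE m.email = ?
        """,
        (email,),
    ).fetchall()
    return [(uuid, name, json.loads(emb)) for uuid, name, emb in rows]


def rank_documents(conn, email, prompt_embedding, top_k=3):
    """Score only the documents this member may access, best match first."""
    scored = [
        (cosine_similarity(prompt_embedding, emb), name)
        for _uuid, name, emb in accessible_documents(conn, email)
    ]
    return sorted(scored, reverse=True)[:top_k]


def demo_connection():
    """Build a tiny in-memory dataset (hypothetical sample rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE members (uuid TEXT PRIMARY KEY, email TEXT, name TEXT);
        CREATE TABLE documents (uuid TEXT PRIMARY KEY, name TEXT, embeddings TEXT);
        CREATE TABLE memberships (
            uuid TEXT PRIMARY KEY, member_uuid TEXT, document_uuid TEXT
        );
        INSERT INTO members VALUES ('m1', 'alice@example.com', 'Alice');
        INSERT INTO members VALUES ('m2', 'bob@example.com', 'Bob');
        INSERT INTO documents VALUES ('d1', 'handbook', '[1.0, 0.0]');
        INSERT INTO documents VALUES ('d2', 'payroll', '[0.0, 1.0]');
        INSERT INTO memberships VALUES ('s1', 'm1', 'd1');
        INSERT INTO memberships VALUES ('s2', 'm1', 'd2');
        INSERT INTO memberships VALUES ('s3', 'm2', 'd1');
        """
    )
    return conn
```

With this sample data, a prompt embedding of `[0.0, 1.0]` ranks "payroll" first for alice@example.com, while bob@example.com only ever sees "handbook" because he has no membership for the other document. Filtering by membership before the similarity search is what keeps inaccessible documents out of the results.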
erDiagram
organizations ||--|{ data_sources : belongs_to
organizations ||--|{ members : belongs_to
data_sources ||--|{ documents : belongs_to
members ||--|| memberships : belongs_to
data_sources ||--|| memberships : belongs_to
documents ||--|| memberships : belongs_to
organizations {
uuid uuid PK
name string
description string
}
data_sources {
uuid uuid PK
name text
description text
bot_auth_data jsonb
organization_uuid uuid FK
}
members {
uuid uuid PK
email text
name text
organization_uuid uuid FK
}
documents {
uuid uuid PK
data_source_uuid uuid FK
name text
description text
url text
metadata jsonb
embeddings vector[]
content text
}
memberships {
uuid uuid PK
data_source_role text
data_source_uuid uuid FK
namespace_user_name text
member_uuid uuid FK
document_uuid uuid FK
}
erDiagram
langchain_pg_collection ||--o{ langchain_pg_embedding : belongs_to
langchain_pg_collection {
uuid uuid PK
name varchar()
cmetadata json
}
langchain_pg_embedding {
uuid uuid PK
embedding vector
document varchar()
cmetadata json
custom_id varchar()
collection_id uuid FK
}