Skip to content

Latest commit

 

History

History
93 lines (70 loc) · 2.44 KB

README.md

File metadata and controls

93 lines (70 loc) · 2.44 KB

Site RAG

Demo video

Screenshot of Site RAG Chrome extension

A Chrome extension for asking questions over websites. Site RAG can index a single page of the website, or crawl the entire site. It then generates embeddings for the indexed documents, and stores them in a vector store database.

When a user asks a question, Site RAG will either fetch relevant documents from the current page, or the entire site (customizable).

Requirements

Setup

First, clone the repository:

git clone https://github.com/bracesproul/site-rag.git
cd site-rag

Then, install the dependencies:

yarn install

and build:

yarn build

Vector store

To setup the vector store, you need to create a Supabase database. Then, inside the SQL editor, run the following:

-- Enable the pgvector extension to work with embedding vectors
create extension vector;

-- Create a table to store your documents
create table documents (
  id bigserial primary key,
  content text, -- corresponds to Document.pageContent
  metadata jsonb, -- corresponds to Document.metadata
  embedding vector(3072) -- 3072 works for OpenAI embeddings, change if needed
);

-- Create a function to search for documents
create function match_documents (
  query_embedding vector(3072),
  match_count int DEFAULT null,
  filter jsonb DEFAULT '{}'
) returns table (
  id bigint,
  content text,
  metadata jsonb,
  embedding jsonb,
  similarity float
)
language plpgsql
as $$
#variable_conflict use_column
begin
  return query
  select
    id,
    content,
    metadata,
    (embedding::text)::jsonb as embedding,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where metadata @> filter
  order by documents.embedding <=> query_embedding
  limit match_count;
end;
$$;

Usage

To use the extension, go to chrome://extensions/ and click "Load unpacked". From there, select the dist directory of this repository.

Once loaded, open the extension and visit the settings page. Here you can add your API keys, and Supabase credentials. You can also customize the indexing settings, such as chunk size and overlap.