Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Database Schema #9

Merged
merged 38 commits into from
Sep 13, 2024

Conversation

shiv810
Copy link
Collaborator

@shiv810 shiv810 commented Sep 9, 2024

Resolves #8

  • Updated migrations for the database
  • Supports NULL comment body for private repository.
  • Has timestamps now for creation and modification.
  • PKey is Github's global node id. No need for Organization or Repo Name for referencing it.
  • Schema has author_id now.

@shiv810 shiv810 changed the base branch from development to main September 9, 2024 05:34
src/handlers/add-comments.ts Outdated Show resolved Hide resolved
src/types/database.ts Show resolved Hide resolved
tests/__mocks__/strings.ts Outdated Show resolved Hide resolved
src/adapters/supabase/helpers/comment.ts Outdated Show resolved Hide resolved
src/adapters/supabase/helpers/comment.ts Outdated Show resolved Hide resolved
@0x4007
Copy link
Member

0x4007 commented Sep 9, 2024

Why did you switch to main from development that doesn't seem right

@shiv810 shiv810 changed the base branch from main to development September 9, 2024 07:21
@shiv810
Copy link
Collaborator Author

shiv810 commented Sep 9, 2024

Why did you switch to main from development that doesn't seem right

Fixed

@shiv810
Copy link
Collaborator Author

shiv810 commented Sep 11, 2024

@0x4007 Updated the schema, comments use voyageai for embeddings now.

@shiv810 shiv810 requested a review from 0x4007 September 11, 2024 05:57
Copy link
Member

@0x4007 0x4007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems generally okay. Let me see how your database looks

src/adapters/openai/helpers/embedding.ts Outdated Show resolved Hide resolved
src/adapters/supabase/helpers/comment.ts Outdated Show resolved Hide resolved
tests/__mocks__/adapter.ts Outdated Show resolved Hide resolved
Copy link
Contributor

github-actions bot commented Sep 11, 2024

Unused files (4)

src/handlers/add-issue.ts, src/handlers/delete-issues.ts, src/handlers/issue-deduplication.ts, src/handlers/update-issue.ts

@shiv810
Copy link
Collaborator Author

shiv810 commented Sep 11, 2024

@0x4007 Could you please check the updated changes ? Have removed OpenAI adapter, corrected the syntax and updated the dimension for the embedding column ?

@shiv810
Copy link
Collaborator Author

shiv810 commented Sep 13, 2024

@0x4007 Have Added Two Separate Tables as per the schema mentioned in the previous comment.

Screen.Recording.2024-09-13.at.1.01.47.AM.mov

embedding Vector(1024) not null,
payload jsonb,
author_id VARCHAR not null,
type text not null default 'issue',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the default issue on the comments table? Why even add the type column if it's in the comments table?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its there to ignore, bot comments, and chore issues created. So, while doing vector search we can ignore such issues and comments. I can remove that, but its a feature that could be useful while performing vector search.

@0x4007
Copy link
Member

0x4007 commented Sep 13, 2024

@0x4007 Have Added Two Separate Tables as per the schema mentioned in the previous comment.

Screen.Recording.2024-09-13.at.1.01.47.AM.mov

Seems mostly good but I didn't see all the headers on the first table. Does it have the payload? Just try and normalize the columns as much as you can.

Otherwise not sure why you added the type column if they are separated by type per table.

@shiv810
Copy link
Collaborator Author

shiv810 commented Sep 13, 2024

@0x4007 Have Added Two Separate Tables as per the schema mentioned in the previous comment.

Screen.Recording.2024-09-13.at.1.01.47.AM.mov

Seems mostly good but I didn't see all the headers on the first table. Does it have the payload? Just try and normalize the columns as much as you can.

Otherwise not sure why you added the type column if they are separated by type per table.

Yes, it includes a payload. I've retained the type column to distinguish between bot comments and bot issues. However, I can remove it if needed, as implementing that feature might be beyond the scope of this issue.

@0x4007
Copy link
Member

0x4007 commented Sep 13, 2024

Yes I think its unnecessary if they are separated by type on different tables.

@shiv810
Copy link
Collaborator Author

shiv810 commented Sep 13, 2024

Removed the type from schema. Payload is stored for both comments and issues.

Screen.Recording.2024-09-13.at.2.31.36.AM.mov

@0x4007
Copy link
Member

0x4007 commented Sep 13, 2024

Thanks for the thorough QA. You don't need to make a new video on every change! But generally when opening a pull or making major changes a video is useful.

The last idea I have (sorry for the last second changes) is to have two columns for the text plaintext and markdown

This is so we can easily do testing in the near future to compare the performance of the plaintext and markdown versions of each comment when reasoning with the LLMs. However the new ChatGPT model o1 just came out today and is supposed to be very good at reasoning, with built-in chain-of-thought reasoning capabilities. This makes me more optimistic about working with the raw markdown source code, as it provides more context (i.e. blockquotes)

Once that is implemented, you don't need to make a QA video. Just let me know and we can merge. Do that for both tables please.

@shiv810
Copy link
Collaborator Author

shiv810 commented Sep 13, 2024

@0x4007 I have added markdown and plaintext column. o1 has a great reasoning capabilities, I think ChatGPT-Plus members have access to it already, probably will take a lot of time to be GA and be available on API.

src/handlers/add-issue.ts Outdated Show resolved Hide resolved
src/handlers/add-issue.ts Outdated Show resolved Hide resolved
@0x4007
Copy link
Member

0x4007 commented Sep 13, 2024

I think ChatGPT-Plus members have access to it already, probably will take a lot of time to be GA and be available on API.

I think that only tier5 subscribers can use right now via API. I believe that we are tier4.

@shiv810 shiv810 requested a review from 0x4007 September 13, 2024 08:01
Comment on lines +73 to +88
async findSimilarIssues(markdown: string, threshold: number): Promise<IssueType[] | null> {
const embedding = await this.context.adapters.voyage.embedding.createEmbedding(markdown);
const { data, error } = await this.supabase
.from("issues")
.select("*")
.eq("type", "issue")
.textSearch("embedding", embedding.join(","))
.order("embedding", { foreignTable: "issues", ascending: false })
.lte("embedding", threshold);
if (error) {
this.context.logger.error("Error finding similar issues", error);
return [];
}
return data;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems out of scope?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be used for issue deduplication and stuff. I think this should be here m.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disagree but no need to slow down this pull further.

@0x4007 0x4007 merged commit 60c1c3e into ubiquity-os-marketplace:development Sep 13, 2024
2 checks passed
@ubiquity-os ubiquity-os bot mentioned this pull request Sep 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix Database Schema
3 participants