Check overhead for comment evaluation #174
Note: The following contributors may be suitable for this task: gentlementlegen, 0x4007
/start
Was thinking about this and maybe there would be a few available approaches:
Any ideas? @sshivaditya2019 RFC
I think high accuracy is the best choice from your selection. I think costs continue to decline with these LLMs as well.
Let me test results with TF-IDF first and see how accurate it gets, because it would also most likely be much simpler to implement than a summary of all the comments. Will run some tests and post them here.
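For reference, a TF-IDF pass over the conversation can be prototyped in a few dozen lines: score each comment against the issue specification by cosine similarity of TF-IDF vectors. This is only a sketch to illustrate the idea; the tokenizer, smoothing, and example documents are simplifying assumptions, not the plugin's implementation.

```ts
// Minimal TF-IDF relevance sketch: score each comment against the issue spec.
// Tokenization and weighting are simplified assumptions for illustration.
type Doc = { id: string; text: string };

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
}

function termFrequencies(tokens: string[]): Map<string, number> {
  const tf = new Map<string, number>();
  for (const t of tokens) tf.set(t, (tf.get(t) ?? 0) + 1);
  return tf;
}

function tfIdfVectors(docs: Doc[]): Map<string, Map<string, number>> {
  const tokenized = docs.map((d) => ({ id: d.id, tokens: tokenize(d.text) }));
  // Document frequency per term, used for the IDF weight.
  const df = new Map<string, number>();
  for (const d of tokenized) {
    for (const term of new Set(d.tokens)) df.set(term, (df.get(term) ?? 0) + 1);
  }
  const n = docs.length;
  const vectors = new Map<string, Map<string, number>>();
  for (const d of tokenized) {
    const tf = termFrequencies(d.tokens);
    const len = d.tokens.length || 1;
    const vec = new Map<string, number>();
    for (const [term, count] of tf) {
      const idf = Math.log((n + 1) / ((df.get(term) ?? 0) + 1)) + 1; // smoothed IDF
      vec.set(term, (count / len) * idf);
    }
    vectors.set(d.id, vec);
  }
  return vectors;
}

function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0, na = 0, nb = 0;
  for (const [term, w] of a) { dot += w * (b.get(term) ?? 0); na += w * w; }
  for (const w of b.values()) nb += w * w;
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}

// Usage: relevance of each comment relative to the issue specification.
const docs: Doc[] = [
  { id: "spec", text: "Check overhead for comment evaluation ..." },
  { id: "comment-1", text: "Let me test results with TF-IDF first ..." },
];
const vectors = tfIdfVectors(docs);
const spec = vectors.get("spec")!;
for (const [id, vec] of vectors) {
  if (id !== "spec") console.log(id, cosine(spec, vec).toFixed(3));
}
```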
Contributions Overview
View | Contribution | Count | Reward
---|---|---|---
Issue | Task | 1 | 100
Issue | Specification | 1 | 15.94
Conversation Incentives
Comment | Formatting | Relevance | Priority | Reward
---|---|---|---|---
> @gentlementlegen perhaps we have too m… | 7.97 (p×2=0, em×1=0, a×1=5 → 5; 54 words → 2.97) | 1 | 2 | 15.94
[ 47.998 WXDAI ]
@0x4007
Contributions Overview
View | Contribution | Count | Reward |
---|---|---|---|
Issue | Comment | 1 | 0 |
Review | Comment | 22 | 47.998 |
Conversation Incentives
Comment | Formatting | Relevance | Priority | Reward
---|---|---|---|---
I think high accuracy is the best choice from your selection. I … | 1.38 (p×1=0; 22 words → 1.38) | 0 | 2 | 0
Very skeptical of tfidf approach. We should go simpler and filte… | 0.94 (p×1=0; 14 words → 0.94) | 0.7 | 2 | 1.316
This depends on the model and possibly should be an environment … | 1.11 (p×1=0; 17 words → 1.11) | 0.8 | 2 | 1.776
We should also filter out slash commands? And minimized comments? | 0.71 (p×1=0; 10 words → 0.71) | 0.6 | 2 | 0.852
I'm skeptical about this whole TFIDF approach: 1. The tokenizer a… | 7.64 (p×1=0, ol×1=0, li×3=1.5 → 1.5; 127 words → 6.14) | 0.9 | 2 | 13.752
Can you articulate the weaknesses or concerns | 0.52 (p×1=0; 7 words → 0.52) | 0.5 | 2 | 0.52
Hard coding the 12400 doesn't seem like a solution there either | 0.83 (p×1=0; 12 words → 0.83) | 0.7 | 2 | 1.162
Line 179 is hard coded | 0.39 (p×1=0; 5 words → 0.39) | 0.6 | 2 | 0.468
Yes if we don't have it saved in our library or collection of kn… | 1.28 (p×1=0; 20 words → 1.28) | 0.6 | 2 | 1.536
It shouldn't affect it at all. I would proceed with implicit app… | 1.44 (p×1=0; 23 words → 1.44) | 0.4 | 2 | 1.152
Manually get the numbers from their docs then | 0.59 (p×1=0; 8 words → 0.59) | 0.6 | 2 | 0.708
Why is this a constant? Makes more sense to use let and directly… | 1.28 (p×1=0; 20 words → 1.28) | 0.7 | 2 | 1.792
```suggestion``` | 0 (no content; 0 words → 0) | 0.2 | 2 | 0
Add more chunks if the request to OpenAI fails for being too lon… | 2.1 (p×1=0; 36 words → 2.1) | 0.8 | 2 | 3.36
@shiv810 rfc | 0.18 (p×1=0; 2 words → 0.18) | 0.3 | 2 | 0.108
Separate is fine then just as long as the current code is stable. | 0.88 (p×1=0; 13 words → 0.88) | 0.5 | 2 | 0.88
More careful filtering of comments like removal of bot commands … | 2.05 (p×1=0; 35 words → 2.05) | 0.8 | 2 | 3.28
Doing multiple calls to score everything and then concatenate re… | 1.17 (p×1=0; 18 words → 1.17) | 0.7 | 2 | 1.638
Divide into two and do 150 each call. Receive the results array … | 1.06 (p×1=0; 16 words → 1.06) | 0.6 | 2 | 1.272
Surely it's a bit of a trade off without all of the comments in … | 5.58 (p×1=0, ol×1=0, li×2=1 → 1; 90 words → 4.58) | 0.75 | 2 | 8.37
It's hard for me to tell from the QA example but if it works it … | 2.2 (p×1=0; 38 words → 2.2) | 0.5 | 2 | 2.2
For your low token limit example, I think your config was wrong … | 1.49 (p×1=0; 24 words → 1.49) | 0.4 | 2 | 1.192
Relevance 1 is not expected of course unless it's the spec. | 0.83 (p×1=0; 12 words → 0.83) | 0.4 | 2 | 0.664
[ 20.936 WXDAI ]
@whilefoo
Contributions Overview
View | Contribution | Count | Reward |
---|---|---|---|
Review | Comment | 3 | 20.936 |
Conversation Incentives
Comment | Formatting | Relevance | Priority | Reward
---|---|---|---|---
Are we sending user comments twice? Maybe we could decrease toke… | 2.25 (p×2=0; 39 words → 2.25) | 0.9 | 2 | 4.05
Yeah that's why we could leave `allComments` so it makes… | 3.02 (p×2=0; 55 words → 3.02) | 0.8 | 2 | 4.832
I didn't know [prompt caching](https://platform.openai.com/docs/… | 8.61 (p×3=0, a×1=5 → 5; 68 words → 3.61) | 0.7 | 2 | 12.054
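Reading the numbers in these tables, each Formatting cell shows a total made of two parts: the HTML element scores (e.g. a×1=5 means one link scored 5, while p and em score 0) and a word-count contribution. The totals appear consistent with formatting = Σ element scores + wordValue × wordCount^0.85 with wordValue = 0.1, and reward = formatting × relevance × priority (7.97 × 1 × 2 = 15.94; 8.61 × 0.7 × 2 = 12.054). The 0.85 exponent is inferred from the values above, not taken from the plugin source, so treat this as a reconstruction:

```ts
// Reconstructed reward arithmetic, inferred from the tables above (not the plugin source).
// wordExponent = 0.85 is an assumed value that matches the listed numbers.
function formattingScore(
  elementScores: number[],
  wordCount: number,
  wordValue = 0.1,
  wordExponent = 0.85
): number {
  const elements = elementScores.reduce((sum, s) => sum + s, 0);
  const words = Math.pow(wordCount, wordExponent) * wordValue;
  return elements + words;
}

function commentReward(formatting: number, relevance: number, priority: number): number {
  return formatting * relevance * priority;
}

// Example row: one link (a: 5), 54 words, relevance 1, priority 2.
const formatting = formattingScore([5], 54);              // ≈ 5 + 2.97 = 7.97
console.log(formatting.toFixed(2));                       // "7.97"
console.log(commentReward(formatting, 1, 2).toFixed(2));  // "15.94"
```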
@gentlementlegen is this expected? I don't see a permit generated for my id.
@shiv810 Yes, because your profile is private, you do not appear as part of the organization.
I thought we fixed that by checking collaborators on the repository level.
I suppose the right thing to do is to add every core team member to every organization. I wanted to experiment with not having to do this in order to operate like a "real DAO", but I realize that there is a need for a distinction between "trusted" and "not trusted" contributors, especially for:
In the future, an XP system should be able to handle this dynamically.
@shiv810 I can regenerate once you accept your invitation: https://github.com/ubiquity-os-marketplace
@gentlementlegen I'm assuming that the status is inherited from the organization level.
There are currently two ways to be considered a collaborator:
So then this is the only solution if the collaborator has a private profile. I wonder if there is a solution for them to be added to the org. |
Well, they can be added to the organization, but the API won't be able to retrieve the information for that user, since "private" mode hides all of it.
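For illustration, here is a hedged sketch of that constraint using Octokit: the org-membership check only resolves for users visible to the token (public membership, or a token that belongs to the org), so falling back to a repository-level collaborator check covers the private-profile case. The org/owner/repo names and the helper itself are placeholders, not the plugin's code.

```ts
import { Octokit } from "@octokit/rest";

// Hypothetical helper: treat a user as trusted if the token can see their org
// membership, otherwise fall back to a repository-level collaborator check.
const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

async function isTrustedContributor(
  org: string,
  owner: string,
  repo: string,
  username: string
): Promise<boolean> {
  try {
    // Resolves (204) when the user is an org member visible to this token.
    await octokit.rest.orgs.checkMembershipForUser({ org, username });
    return true;
  } catch {
    // Private membership (or non-membership) falls through to the repo-level check.
  }
  try {
    // Resolves (204) when the user is a repository collaborator.
    await octokit.rest.repos.checkCollaborator({ owner, repo, username });
    return true;
  } catch {
    return false;
  }
}
```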
I wonder if we can build a shim for this problem in the form of some type of persistent JSON storage. Synchronizing would be difficult to do in real time though, which matters more if they were removed from the team.
This would mean that we would have to keep that list updated manually, and third parties most likely wouldn't enjoy that either. I am not sure I understand why people don't create burner accounts that are public and used only in our organization, for example.
Well, to be more specific, my vision was to append to this cache if another module detects that they are among the organization's collaborators, or if they set a label or performed some privileged action.
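A minimal sketch of that idea, assuming a local append-only JSON file keyed by username. The file path, record shape, and the "label-set" trigger are illustrative assumptions, and the real-time removal problem mentioned above is not solved here.

```ts
import { promises as fs } from "node:fs";

// Hypothetical append-only cache of users observed acting as collaborators.
type CollaboratorCache = Record<string, { lastSeen: string; source: string }>;
const CACHE_PATH = "collaborator-cache.json";

async function readCache(): Promise<CollaboratorCache> {
  try {
    return JSON.parse(await fs.readFile(CACHE_PATH, "utf8")) as CollaboratorCache;
  } catch {
    return {}; // a missing or unreadable file starts an empty cache
  }
}

// Called whenever another module observes a privileged action by the user.
async function recordCollaborator(username: string, source: string): Promise<void> {
  const cache = await readCache();
  cache[username] = { lastSeen: new Date().toISOString(), source };
  await fs.writeFile(CACHE_PATH, JSON.stringify(cache, null, 2));
}

// Example: another module noticed this user setting a label.
recordCollaborator("some-private-user", "label-set").catch(console.error);
```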
@gentlementlegen perhaps we have too much overhead with each pull? And by that I mean headers and such, not the main content? Because I don't imagine that each pull actually has that much "body" content. This can easily be optimized, as I see some have barely any comments.
Originally posted by @0x4007 in ubiquity-os/ubiquity-os-kernel#80 (comment)
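One way to check that overhead concretely is to compare the size of the full payload sent for evaluation against the size of the comment bodies alone. The sketch below is not from the plugin; it uses a rough four-characters-per-token estimate rather than a real tokenizer, so the numbers are only indicative.

```ts
// Rough overhead check: how much of the evaluation payload is comment text
// versus surrounding metadata. The chars/4 token estimate is an approximation.
interface CommentPayload {
  body: string;
  [metadata: string]: unknown; // headers, author info, diff hunks, etc.
}

const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function reportOverhead(payloads: CommentPayload[]): void {
  const fullTokens = estimateTokens(JSON.stringify(payloads));
  const bodyTokens = payloads.reduce((sum, p) => sum + estimateTokens(p.body), 0);
  const overhead = fullTokens - bodyTokens;
  const pct = ((overhead / fullTokens) * 100).toFixed(1);
  console.log(`full: ${fullTokens}, bodies: ${bodyTokens}, overhead: ${overhead} (${pct}%)`);
}
```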