Add OFV market resolver #225
pyproject.toml
py-multicodec = "==0.2.1"
grpcio = "==1.53.0"
python = ">=3.10,<3.12"
open-autonomy = { git = "https://github.com/kongzii/open-autonomy.git", rev = "13344d6551222224492024623cd10aa79ad1a13e" }
What's the plan for the dependency updates in related repositories, please?
And I know Evan was working on some dependency-related stuff as well, I'll get in touch with him about the current status.
What's the issue you're facing? Did you have to make adjustments to open-autonomy? If so, feel free to make a PR on the main open-autonomy repo, we can take a look.
Yes, PR for open-autonomy is open: valory-xyz/open-autonomy#2223
And for tomte too: valory-xyz/tomte#27
The issue we are facing is that, with the current strict versioning, it's almost impossible to use it alongside other libraries.
If we were to merge this, it means that we need to keep using your fork until we reflect your changes on our framework.
Then maybe you could merge and release changes in these two PRs so I can use them properly here?
@richardblythman Could you please explain why this pattern of doing
or having
is called in the same way by doing
@kongzii I think this would have worked, but it's not what you had originally.
@richardblythman Yeah, I see the changes in arguments, that's clear to me that it needs to have the same interface as other tools to make it simpler for calling from agents/mech. I just didn't understand why the **kwargs.
@kongzii I just copied and pasted. Not sure if the team at Valory prefer one over the other.
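For readers following the `**kwargs` question, here is a minimal sketch of why a uniform keyword-only entry point can be convenient for a caller such as an agent. Every name in it (`resolve_market`, `summarize`, `TOOLS`, `dispatch`, and the payload keys) is hypothetical and made up for illustration; it is not the mech's actual tool interface.

```python
from typing import Any, Callable, Dict


# Hypothetical tools, for illustration only; these are not the real mech tools.
def resolve_market(**kwargs: Any) -> str:
    """Read the arguments this tool needs and ignore everything else."""
    question = kwargs["question"]  # required by this tool
    api_key = kwargs.get("api_key", "")  # optional, falls back to a default
    return f"resolve:{question}:{'key' if api_key else 'no-key'}"


def summarize(**kwargs: Any) -> str:
    """A second tool that needs a different subset of the same payload."""
    return f"summary:{kwargs['prompt']}"


# Hypothetical registry mapping tool names to their entry points.
TOOLS: Dict[str, Callable[..., str]] = {
    "resolve_market": resolve_market,
    "summarize": summarize,
}


def dispatch(tool_name: str, payload: Dict[str, Any]) -> str:
    """Call any registered tool the same way, without knowing its parameters."""
    return TOOLS[tool_name](**payload)


payload = {"question": "Q?", "prompt": "P", "api_key": "secret", "extra": 1}
print(dispatch("resolve_market", payload))
print(dispatch("summarize", payload))
```

With explicit parameters, `dispatch` would have to filter the payload per tool, because an unexpected key such as `extra` raises a `TypeError`. The trade-off is that a `**kwargs` signature hides which arguments a tool really requires, so a missing key only surfaces at call time.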
# Conflicts:
#   packages/packages.json
#   packages/valory/agents/mech/aea-config.yaml
#   packages/valory/customs/prediction_request/component.yaml
#   packages/valory/services/mech/service.yaml
#   packages/valory/skills/task_execution/skill.yaml
07ce9a2 to 3e5f5b7
4bd1df3 to df9df8e
Proposed changes
Implements a new tool that can be used for market resolution.
Types of changes
What types of changes does your code introduce? (A breaking change is a fix or feature that would cause existing functionality and APIs to not work as expected.)
Put an x in the box that applies
Checklist
Put an x in the boxes that apply.
main branch (left side). Also you should start your branch off our main.
Further comments
I used a customised fork of OpenFactVerifier (a PR is currently open against their main branch) to create a new market resolver.
The steps are:
is_predictable_binary function.
"Will former Trump Organization CFO Allen Weisselberg be sentenced to jail by 15 April 2024?" would be rewritten to "Former Trump Organization CFO Allen Weisselberg was sentenced to jail by 15 April 2024."
I also implemented a benchmark where I compare (1) the current resolution visible on Omen, (2) the resolution obtained by running packages.napthaai.customs.resolve_market_reasoning.resolve_market_reasoning today, and (3) the resolution by OFV. The results are:
However, I don't understand why (2) is so low. I expected it to be at least a little better than (1), because new information is available on the internet. If you can point out a bug in my implementation, I'll re-run the benchmark.
The benchmark is run over 50 markets that I resolved manually and the full results are available here.
I will pick just the ones where OFV made mistakes:
As we can see, 3 out of 4 mistakes are questionable.
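The comparison described above, scoring each resolution source against the manual resolutions and listing the disagreements, could be sketched as follows. The rows are made-up placeholders, not the benchmark's data, and the field names (`manual`, `omen`, `reasoning`, `ofv`) and helper names are assumptions for illustration.

```python
from typing import Dict, List

Row = Dict[str, bool]

# Made-up placeholder rows; these are NOT the benchmark's data or results.
rows: List[Row] = [
    {"manual": True, "omen": True, "reasoning": False, "ofv": True},
    {"manual": False, "omen": False, "reasoning": False, "ofv": False},
    {"manual": True, "omen": False, "reasoning": True, "ofv": True},
    {"manual": False, "omen": False, "reasoning": True, "ofv": True},
]


def accuracy(rows: List[Row], source: str) -> float:
    """Fraction of markets where `source` agrees with the manual resolution."""
    if not rows:
        raise ValueError("no markets to score")
    hits = sum(1 for row in rows if row[source] == row["manual"])
    return hits / len(rows)


def mistakes(rows: List[Row], source: str) -> List[int]:
    """Indices of markets where `source` disagrees with the manual resolution."""
    return [i for i, row in enumerate(rows) if row[source] != row["manual"]]


for source in ("omen", "reasoning", "ofv"):
    print(source, accuracy(rows, source), mistakes(rows, source))
```

Listing the mistake indices next to the accuracy makes it easy to review each disagreement by hand, which matters here because a manual label can itself be questionable.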