-
Notifications
You must be signed in to change notification settings - Fork 672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement support to BatchAPIs to gather evidence #687
base: main
Are you sure you want to change the base?
Conversation
This class is used to submit batch calls to the OpenAI batch API
also added a dependency group in pyproject.toml to install openai and anthropic only if the user wants to use batches, refactored the logic of sumarizing evidences in batch and moved the code to core.py
…ke it compatible with #680
Also bugfix in tests and created Enums to avoid hardcoding the batch status identifiers
The timelimit and the pooling time for the batches are now in the Settings
|
||
for _, llm_result in results: | ||
session.add_tokens(llm_result) | ||
|
||
session.contexts += [r for r, _ in results if r is not None] | ||
session.contexts += [r for r, _ in results] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did we cut the r is not None
filter here? I would think that the results from gather_with_concurrency
could still be None
on failure, but maybe I'm wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This gets the Context
s from gather_with_concurrency
or gather_with_batch
. And both always return list of tuples with (Context
, LLMResult
). What can happen is to have an empty text in Context.text
, but it seems to me that r is always an instance of Context
.
Also, I didn't see any case of map_fxn_summary
returning None
while studying the code, and mypy also complains that r is None
is always a True statement.
Maybe that's an edge case that I didn't see?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we correctly type hinted gather_with_concurrency
then this would be resolved. @maykcaldas can you adjust it to be this?
T = TypeVar("T")
async def gather_with_concurrency(n: int, coros: Iterable[Awaitable[T]]) -> list[T]:
...
```
A more general solution is to include it i the field [dependency-groups] of pyproject.toml
|
||
for _, llm_result in results: | ||
session.add_tokens(llm_result) | ||
|
||
session.contexts += [r for r, _ in results if r is not None] | ||
session.contexts += [r for r, _ in results] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we correctly type hinted gather_with_concurrency
then this would be resolved. @maykcaldas can you adjust it to be this?
T = TypeVar("T")
async def gather_with_concurrency(n: int, coros: Iterable[Awaitable[T]]) -> list[T]:
...
```
This PR implements support to send requests to OpenAI and Anthropic batch APIs. Due to the parallel nature of gathering evidence and summarizing all candidate papers, we plan to use the batch API when possible.
The use of the batch API depends on
Settings.use_batch_in_summary
. Therefore, paperqa workflows would still be unchanged in case this setting is set to False (default). Currently, using a batch keeps the process busy while the batch isn't finished on the LLM provider side, which could take up to 24 hours. This scaling issue will be addressed in another PR.Task list
get_summary_llm
to decide which provider to use given the llm in the configpytest.mark.vcr
in the tests to avoid creating batches for every test