Commit
Dev v0.0.3 (#25)
* debug serper query up to 100 for each call

* Update v0.0.3 doc

* add todo
haonan-li authored May 31, 2024
1 parent a48dfd3 commit 7b72242
Showing 6 changed files with 49 additions and 33 deletions.
7 changes: 2 additions & 5 deletions docs/README.md
@@ -17,11 +17,8 @@ We welcome contributions and feedback from the community and recommend a few bes
* PRs should be titled descriptively, and be opened with a brief description of the scope and intent of the new contribution.
* New features should have appropriate documentation added alongside them.
* Aim for code maintainability, and minimize code copying.
* Minimal tests are required before submitting a PR; run `script/minimal_test.py` and ensure all test cases pass.
* Please make sure the code style is checked and aligned:
```bash
pre-commit run --all-files
```
<!-- * Minimal tests are required before submitting a PR; run `script/minimal_test.py` and ensure all test cases pass. -->
* Please make sure the code style is checked and aligned; see [Code Style](#code-style) for more details.

### For Feature Requests

15 changes: 15 additions & 0 deletions docs/RELEASE_LOG.md
@@ -1,5 +1,20 @@
# Release Log

## v0.0.3

### New Features
1. **Keep Original Text:** Add a mapping from each claim to its position in the original text, and add a `restore_claims` function to the **decomposer** that restores the decomposed claims to the original user input.
2. **Data Structure:** Define the data structures for several intermediate processing functions and the final output in `utils/data_class.py`.
3. **Speed Up:** Parallelize the `restore_claims`, `identify_checkworthiness`, and `query_generation` functions to speed up the pipeline.
4. **Token Count:** Add token counts for all components.
5. **Evidence-wise Verification:** Change the verification logic from passing all evidence together in a single LLM call to verifying the claim against each piece of evidence in a separate LLM call.
6. **Factuality Value:** Remove the deterministic output; factuality is now a number in the range [0,1], calculated from the judgement on each single piece of evidence (see the sketch after this list).
7. **Webpage:** Redesign the webpage.
8. **Default LLM:** Change the default to GPT-4o.
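
To make evidence-wise verification concrete, here is a minimal sketch of how a factuality value in [0,1] can be aggregated from per-evidence judgements. The verdict labels and the function itself are illustrative assumptions, not Loki's exact implementation.

```python
# Illustrative sketch only: verdict labels and the aggregation rule are
# assumptions, not Loki's exact implementation.

def factuality_score(verdicts: list[str]) -> float:
    """Aggregate per-evidence verdicts into a factuality value in [0, 1].

    Each verdict is one LLM judgement of the claim against a single
    piece of evidence: "support", "refute", or "irrelevant".
    """
    relevant = [v for v in verdicts if v != "irrelevant"]
    if not relevant:  # no usable evidence -> undecided midpoint
        return 0.5
    return sum(v == "support" for v in relevant) / len(relevant)

print(factuality_score(["support", "refute", "support", "irrelevant"]))  # 0.666...
```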

### Bug Fixes
1. **Serper Max Queries:** The Serper API allows at most 100 queries per request, so we now split the queries into multiple requests when their number exceeds 100 (a standalone sketch of the batching pattern follows this list).
2. **Evidence and URL:** Link each piece of evidence to its corresponding URL.
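
The actual fix appears in the diff to `factcheck/core/Retriever/serper_retriever.py` below; as a standalone sketch, the batching pattern looks like this (here `request_serper_api` stands in for the retriever's request method, and a `requests`-style response object is assumed):

```python
BATCH_SIZE = 100  # Serper accepts at most 100 queries per request

def batched_serper_search(query_list: list[str], request_serper_api) -> list[dict]:
    """Split queries into chunks of at most 100 and concatenate the JSON results."""
    responses: list[dict] = []
    for i in range(0, len(query_list), BATCH_SIZE):
        batch = query_list[i : i + BATCH_SIZE]
        batch_response = request_serper_api(batch)  # one HTTP request per batch
        if batch_response is None:
            raise RuntimeError("Serper API request error!")
        responses += batch_response.json()  # the batch endpoint returns a list
    return responses
```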

## v0.0.2

14 changes: 7 additions & 7 deletions docs/development_guide.md
@@ -1,6 +1,6 @@
# Development Guide

This documentation page provides a guide for developers who want to contribute to the Loki project, for versions v0.0.2 and later.
This documentation page provides a guide for developers who want to contribute to the Loki project, for versions v0.0.3 and later.

- [Development Guide](#development-guide)
- [Framework Introduction](#framework-introduction)
@@ -11,11 +11,11 @@ This documentation page provides a guide for developers who want to contribute to

Loki leverages state-of-the-art language models to verify the veracity of textual claims. The pipeline is designed to be modular in `factcheck/core/`, which includes the following components:

- **Decomposer:** Breaks down extensive texts into digestible, independent claims, setting the stage for detailed analysis.
- **Checkworthy:** Assesses each claim's potential significance, filtering out vague or ambiguous statements to focus on those that truly matter. For example, vague claims like "MBZUAI has a vast campus" are considered unworthy because of the ambiguous nature of "vast."
- **Query Generator:** Transforms check-worthy claims into precise queries, ready to navigate the vast expanse of the internet in search of truth.
- **Evidence Retriever:** Ventures into the digital realm, retrieving relevant evidence that forms the foundation of informed verification.
- **ClaimVerify:** Examines the gathered evidence, determining the veracity of each claim to uphold the integrity of information.
- **Decomposer:** Breaks down extensive texts into digestible, independent claims, setting the stage for detailed analysis. It also provides the mapping between the original text and the decomposed claims.
- **Checkworthy:** Assesses each claim's checkworthiness, filtering out vague or ambiguous statements as well as statements of opinion. For example, vague claims like "MBZUAI has a vast campus" are considered unworthy because of the ambiguous nature of "vast."
- **Query Generator:** Transforms check-worthy claims into precise queries, ready to navigate the vast expanse of the internet in search of evidence.
- **Evidence Retriever:** Retrieves relevant evidence that forms the foundation of informed verification. Currently, for open-domain questions, we use Google search via the Serper API.
- **ClaimVerify:** Judges each piece of evidence against the claim, determining whether it supports, refutes, or is irrelevant to it (a pipeline sketch follows this list).
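
A minimal sketch of how the five components compose into a pipeline is shown below. The method names are illustrative assumptions (only `identify_checkworthiness` appears in the codebase), not the exact `factcheck/core/` API:

```python
# Hypothetical wiring of the five components; method names are illustrative
# assumptions, except identify_checkworthiness, which the codebase does use.
def fact_check_pipeline(text, decomposer, checkworthy, query_generator, retriever, verifier):
    claims = decomposer.decompose(text)                    # text -> independent claims
    claims = checkworthy.identify_checkworthiness(claims)  # drop vague / opinion claims
    results = {}
    for claim in claims:
        queries = query_generator.generate(claim)          # claim -> search queries
        evidences = retriever.retrieve(queries)            # queries -> evidence snippets
        # judge each piece of evidence: supporting, refuting, or irrelevant
        results[claim] = [verifier.verify(claim, evidence) for evidence in evidences]
    return results
```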

To support each component's functionality, Loki relies on the following utils:
- **Language Model:** Currently, 4 out of 5 components (Decomposer, Checkworthy, Query Generator, and ClaimVerify) use language models (LLMs) to perform their tasks. The supported LLMs are defined in `factcheck/core/utils/llmclient/` and can be easily extended to support more LLMs (see the sketch below).
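
As a hypothetical sketch of that extension point, a new client only needs to expose a chat-style call; the class and method names below are assumptions, not the actual `llmclient` interface:

```python
# Hypothetical sketch: the real base class in factcheck/core/utils/llmclient/
# may differ; all names below are assumptions.
class MyLLMClient:
    """A minimal chat-style client the four LLM-backed components could call."""

    def __init__(self, model: str, api_key: str):
        self.model = model
        self.api_key = api_key

    def chat(self, prompt: str) -> str:
        # Call your provider's completion endpoint here and return the text.
        raise NotImplementedError
```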
@@ -71,7 +71,7 @@ As Loki continues to evolve, our development plan focuses on broadening capabili
- **Dockerization:**
- Packaging Loki into Docker containers to simplify deployment and scale-up operations, ensuring Loki can be easily set up and maintained across different environments.

### 5. Multi-language Support
### 5. Multi-lingual Support
- **Language Expansion:**
- Support for additional languages beyond English, including Chinese, Arabic, etc., to cater to a global user base.

3 changes: 2 additions & 1 deletion factcheck/__init__.py
@@ -30,10 +30,11 @@ def __init__(
checkworthy_model: str = None,
query_generator_model: str = None,
evidence_retrieval_model: str = None,
claim_verify_model: str = None, # "gpt-3.5-turbo",
claim_verify_model: str = "gpt-3.5-turbo",
api_config: dict = None,
num_seed_retries: int = 3,
):
# TODO: better handle raw token count
self.encoding = tiktoken.get_encoding("cl100k_base")

self.prompt = prompt_mapper(prompt_name=prompt)
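
The `cl100k_base` encoding pinned in the constructor above backs the new per-component token counts (release note item 4). Here is a minimal sketch of counting tokens with `tiktoken`; the helper name is an assumption, not Loki's API:

```python
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")  # same encoding the constructor pins

def count_tokens(text: str) -> int:
    """Return the number of cl100k_base tokens in `text` (an illustrative helper)."""
    return len(encoding.encode(text))

print(count_tokens("Loki verifies textual claims."))  # prints the token count
```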
1 change: 0 additions & 1 deletion factcheck/core/CheckWorthy.py
@@ -25,7 +25,6 @@ def identify_checkworthiness(self, texts: list[str], num_retries: int = 3, promp
list[str]: a list of checkworthy claims, pairwise outputs
"""
checkworthy_claims = texts
# TODO: better handle checkworthiness
joint_texts = "\n".join([str(i + 1) + ". " + j for i, j in enumerate(texts)])

if prompt is None:
42 changes: 23 additions & 19 deletions factcheck/core/Retriever/serper_retriever.py
@@ -62,50 +62,54 @@ def _retrieve_evidence_4_all_claim(
evidences = [[] for _ in query_list]

# get the response from serper
# TODO: Can send up to 100 queries once
serper_response = self._request_serper_api(query_list)

if serper_response is None:
logger.error("Serper API request error!")
return evidences
serper_responses = []
for i in range(0, len(query_list), 100):
batch_query_list = query_list[i : i + 100]
batch_response = self._request_serper_api(batch_query_list)
if batch_response is None:
logger.error("Serper API request error!")
return evidences
else:
serper_responses += batch_response.json()

# get the results for queries with an answer box
# get the responses for queries with an answer box
query_url_dict = {}
url_to_date = {} # TODO: decide whether to use date
_snippet_to_check = []
for i, (query, result) in enumerate(zip(query_list, serper_response.json())):
if query != result.get("searchParameters").get("q"):
logger.error("Serper change query from {} TO {}".format(query, result.get("searchParameters").get("q")))
for i, (query, response) in enumerate(zip(query_list, serper_responses)):
if query != response.get("searchParameters").get("q"):
logger.error("Serper change query from {} TO {}".format(query, response.get("searchParameters").get("q")))

if "answerBox" in result:
if "answer" in result["answerBox"]:
# TODO: provide the link for the answer box
if "answerBox" in response:
if "answer" in response["answerBox"]:
evidences[i] = [
{
"text": f"{query}\nAnswer: {result['answerBox']['answer']}",
"text": f"{query}\nAnswer: {response['answerBox']['answer']}",
"url": "Google Answer Box",
}
]
else:
evidences[i] = [
{
"text": f"{query}\nAnswer: {result['answerBox']['snippet']}",
"text": f"{query}\nAnswer: {response['answerBox']['snippet']}",
"url": "Google Answer Box",
}
]
# TODO: currently, if there is a Google answer box we only get 1 piece of evidence; otherwise we get multiple, which diminishes the value of the Google answer.
else:
results = result.get("organic", [])[:top_k] # Choose top 5 result
topk_results = response.get("organic", [])[:top_k]  # Choose top-k organic results

if (len(_snippet_to_check) == 0) or (not snippet_extend_flag):
evidences[i] += [
{"text": re.sub(r"\n+", "\n", _result["snippet"]), "url": _result["link"]} for _result in results
{"text": re.sub(r"\n+", "\n", _result["snippet"]), "url": _result["link"]} for _result in topk_results
]

# Save date for each url
url_to_date.update({result.get("link"): result.get("date") for result in results})
url_to_date.update({_result.get("link"): _result.get("date") for _result in topk_results})
# Save query-url pair, 1 query may have multiple urls
query_url_dict.update({query: [result.get("link") for result in results]})
_snippet_to_check += [result["snippet"] if "snippet" in result else "" for result in results]
query_url_dict.update({query: [_result.get("link") for _result in topk_results]})
_snippet_to_check += [_result["snippet"] if "snippet" in _result else "" for _result in topk_results]

# return if there is no snippet to check or snippet_extend_flag is False
if (len(_snippet_to_check) == 0) or (not snippet_extend_flag):
