penguine-ip committed Nov 19, 2024
1 parent b45718c commit d55d8f2
Showing 2 changed files with 37 additions and 18 deletions.
14 changes: 10 additions & 4 deletions deepeval/dataset/dataset.py
@@ -550,10 +550,16 @@ def push(
        overwrite: Optional[bool] = None,
        auto_convert_test_cases_to_goldens: bool = False,
    ):
        if len(self.test_cases) == 0 and len(self.goldens) == 0:
            raise ValueError(
                "Unable to push empty dataset to Confident AI, there must be at least one test case or golden in dataset"
            )
        if auto_convert_test_cases_to_goldens is False:
            if len(self.goldens) == 0:
                raise ValueError(
                    "Unable to push empty dataset to Confident AI, there must be at least one golden in dataset. To include test cases, set 'auto_convert_test_cases_to_goldens' to True."
                )
        else:
            if len(self.test_cases) == 0 and len(self.goldens) == 0:
                raise ValueError(
                    "Unable to push empty dataset to Confident AI, there must be at least one test case or golden in dataset"
                )
        if is_confident():
            goldens = self.goldens
            if auto_convert_test_cases_to_goldens:
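
The net effect of this change: pushing a dataset that holds only `LLMTestCase`s and no `Golden`s now raises a `ValueError` unless `auto_convert_test_cases_to_goldens=True` is passed. A minimal sketch of the new behavior, assuming you are logged in to Confident AI (the test case contents and alias below are hypothetical):

```python
from deepeval.dataset import EvaluationDataset
from deepeval.test_case import LLMTestCase

# A dataset populated only with test cases and no goldens (hypothetical data)
dataset = EvaluationDataset(
    test_cases=[
        LLMTestCase(
            input="What if these shoes don't fit?",
            actual_output="We offer a 30-day full refund at no extra cost.",
        )
    ]
)

# With the default auto_convert_test_cases_to_goldens=False, push() now raises a
# ValueError because the dataset contains no goldens:
# dataset.push(alias="My Confident Dataset")

# Converting test cases to goldens at push time passes the new validation:
dataset.push(alias="My Confident Dataset", auto_convert_test_cases_to_goldens=True)
```
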
41 changes: 27 additions & 14 deletions docs/docs/confident-ai-evaluation-dataset-management.mdx
@@ -34,16 +34,20 @@ Alternatively, you can also choose to upload entire datasets from CSV files. Sim

Pushing an `EvaluationDataset` to Confident AI using `deepeval` is a three-step process:

1. Create a dataset locally (same as how you would create a dataset as shown in the [datasets section](evaluation-datasets))
2. Push the created dataset to Confident
1. Create a dataset locally (same as how you would create a dataset as shown in the [datasets section](evaluation-datasets)).
2. Populate it with `Golden`s.
3. Push the new dataset to Confident AI.

:::warning
Although you can also populate an `EvaluationDataset` with `LLMTestCase`s, we **HIGHLY** recommend that you use `Golden`s instead, as they are more flexible to work with when dealing with datasets.
:::

### Create A Dataset Locally

Here's a quick example:
Here's a quick example of populating an `EvaluationDataset` with `Golden`s before pushing it to Confident AI:

```python
from deepeval.test_case import LLMTestCase
from deepeval.dataset import EvaluationDataset
from deepeval.dataset import EvaluationDataset, Golden

original_dataset = [
{
@@ -66,44 +70,53 @@ original_dataset = [
},
]

test_cases = []
goldens = []
for datapoint in original_dataset:
    input = datapoint.get("input", None)
    actual_output = datapoint.get("actual_output", None)
    expected_output = datapoint.get("expected_output", None)
    context = datapoint.get("context", None)

    test_case = LLMTestCase(
    golden = Golden(
        input=input,
        actual_output=actual_output,
        expected_output=expected_output,
        context=context
    )
    test_cases.append(test_case)
    goldens.append(golden)

dataset = EvaluationDataset(test_cases=test_cases)
dataset = EvaluationDataset(goldens=goldens)
```

### Push Dataset to Confident AI

After creating your `EvaluationDataset`, all you have to do is push it to Confident by providing an `alias` as an unique identifier:
After creating your `EvaluationDataset`, all you have to do is push it to Confident AI by providing an `alias` as a unique identifier. When you push an `EvaluationDataset`, the data is uploaded as `Golden`s, **NOT** `LLMTestCase`s:

```python
...

# Provide an alias when pushing a dataset
dataset.push(alias="My Confident Dataset")
```

:::tip Did you know?
You can choose to overwrite or append to an existing dataset if an existing dataset with the same alias already exist.
The `push()` method will upload all `Golden`s found in your dataset to Confident AI, ignoring any `LLMTestCase`s. If you also wish to include `LLMTestCase`s in the push, set the `auto_convert_test_cases_to_goldens` parameter to `True`:

```python
...

dataset.push(alias="My Confident Dataset", auto_convert_test_cases_to_goldens=True)
```

You can also choose to overwrite or append to an existing dataset if a dataset with the same alias already exists.

```python
...

dataset.push(alias="My Confident Dataset", overwrite=False)
```

`deepeval` will prompt you in the terminal if no value for `overwrite` is provided.

:::

## What is a Golden?

A "Golden" is what makes up an evaluation dataset and is very similar to a test case in `deepeval`, but they: