Tests Refactoring (#740)
This PR aims to refactor the test suite into a more generic version.

The following points are to be covered (subject to change; will become
more granular):
- [x] Add more granularity to the test viewer for a better test
overview/readability and more convenient testing of subgroups of methods
(e.g. only execute LoRA tests for one model)
- [x] Add documentation about the structure of the test directory and
entry points
- [x] Add pytest markers to allow for more granular testing of subgroups
(e.g. execute only LoRA tests for all models at once, useful when
implementing a new adapter method)
- [x] Refactor test methods to extract similar/duplicate code into a set
of utils methods
- [x] Relocate edge cases caused by model peculiarities out of the shared
tests and into the respective model test classes to keep the tests generic
- [x] Fix config union tests, closes #785

This should make it easier to add new models or methods to the existing
test suite and make the development/testing process more convenient in the
future.
TimoImhof authored Jan 27, 2025
1 parent adef6dc commit 1dcac5c
Showing 119 changed files with 1,655 additions and 2,006 deletions.
19 changes: 15 additions & 4 deletions Makefile
@@ -28,18 +28,29 @@ style:
isort $(check_dirs)
${MAKE} extra_style_checks

# Run tests for the library
# Library Tests

# run all tests in the library
test:
python -m pytest -n auto --dist=loadfile -s -v ./tests/
python -c "import transformers; print(transformers.__version__)"

# run all tests for the adapter methods for all adapter models
test-adapter-methods:
python -m pytest --ignore ./tests/models -n auto --dist=loadfile -s -v ./tests/
python -m pytest -n auto --dist=loadfile -s -v ./tests/test_methods/

# run a subset of the adapter method tests for all adapter models
# list of all subsets: [core, heads, embeddings, composition, prefix_tuning, prompt_tuning, reft, unipelt, compacter, bottleneck, ia3, lora, config_union]
subset ?=
test-adapter-method-subset:
@echo "Running subset $(subset)"
python -m pytest -n auto --dist=loadfile -s -v ./tests/test_methods/ -m $(subset)


# run the Hugging Face test suite for all adapter models
test-adapter-models:
python -m pytest -n auto --dist=loadfile -s -v ./tests/models
python -m pytest -n auto --dist=loadfile -s -v ./tests/test_models/

# Run tests for examples

test-examples:
python -m pytest -n auto --dist=loadfile -s -v ./examples/pytorch/
5 changes: 5 additions & 0 deletions conftest.py
@@ -87,3 +87,8 @@ def check_output(self, want, got, optionflags):


doctest.OutputChecker = CustomOutputChecker


def pytest_collection_modifyitems(items):
# Exclude the 'test_class' group from the test collection since it is not a real test class but a byproduct of the generic test class generation.
items[:] = [item for item in items if 'test_class' not in item.nodeid]
2 changes: 1 addition & 1 deletion examples/pytorch/language-modeling/run_clm.py
@@ -442,7 +442,7 @@ def main():
else:
model = AutoModelForCausalLM.from_config(config, trust_remote_code=model_args.trust_remote_code)
n_params = sum({p.data_ptr(): p.numel() for p in model.parameters()}.values())
logger.info(f"Training new model from scratch - Total size={n_params/2**20:.2f}M params")
logger.info(f"Training new model from scratch - Total size={n_params / 2**20:.2f}M params")

# Convert the model into an adapter model
adapters.init(model)
15 changes: 13 additions & 2 deletions pyproject.toml
@@ -1,10 +1,21 @@
[tool.black]
line-length = 119
target-version = ['py38', 'py39', 'py310']

# copied from HF for testing
[tool.pytest.ini_options]
markers = [
"core: marks tests as core adapter test",
"composition: marks tests as composition adapter test",
"heads: marks tests as heads adapter test",
"embeddings: marks tests as embeddings adapter test",
"class_conversion: marks tests as class conversion adapter test",
"prefix_tuning: marks tests as prefix tuning adapter test",
"prompt_tuning: marks tests as prompt tuning adapter test",
"reft: marks tests as reft adapter test",
"unipelt: marks tests as unipelt adapter test",
"compacter: marks tests as compacter adapter test",
"bottleneck: marks tests as bottleneck adapter test",
"ia3: marks tests as ia3 adapter test",
"lora: marks tests as lora adapter test",
"flash_attn_test: marks tests related to flash attention (deselect with '-m \"not flash_attn_test\"')",
"bitsandbytes: select (or deselect with `not`) bitsandbytes integration tests",
"generate: marks tests that use the GenerationTesterMixin"
150 changes: 150 additions & 0 deletions tests/README.md
@@ -0,0 +1,150 @@
# Testing the Adapters Library

This README provides a comprehensive overview of the test directory organization and explains how to execute different types of tests within the adapters library.

## Test Directory Structure Overview

```
tests/
├── __init__.py
├── fixtures/                 # Datasets, test samples, ...
│   └── ...
├── test_methods/             # Dynamic adapter method tests (all models)
│   ├── __init__.py
│   ├── method_test_impl/     # Implementation of the tests
│   │   ├── __init__.py
│   │   ├── core/
│   │   ├── composition/
│   │   └── ...
│   ├── base.py               # Base from which the model test bases inherit
│   ├── generator.py          # Test case generation and registration
│   ├── test_on_albert.py     # Example model test base for testing adapter methods on the ALBERT adapter model
│   ├── test_on_beit.py
│   └── ...
├── test_misc/                # Miscellaneous adapter method tests (single model)
│   ├── test_adapter_config.py
│   └── ...
└── test_models/              # Adapter model tests with the Hugging Face test suite
    ├── __init__.py
    ├── base.py
    ├── test_albert_model.py
    └── ...
```

## Test Categories

The testing framework encompasses three distinct categories of tests:

1. Dynamic Adapter Method Tests: These tests cover core functionalities of the adapters library, including individual adapter methods (such as LoRA and prompt tuning) and head functionalities. These tests are executed across all supported models.

2. Miscellaneous Adapter Method Tests: These supplementary tests cover scenarios not included in the dynamic tests. To optimize resources, they are executed on a single model, as repeated execution across multiple models would not provide additional value.

3. Adapter Model Tests: These tests verify the implementation of the adapter models themselves using the Hugging Face model test suite.

## Test Generator and Pytest Markers

The `test_methods` directory contains the central component `generator.py`, which generates the appropriate set of adapter method test classes. Each model test base registers these tests using the following pattern:

```python
method_tests = generate_method_tests(AlbertAdapterTestBase)

for test_class_name, test_class in method_tests.items():
globals()[test_class_name] = test_class
```
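Registering the generated classes in the module's `globals()` exposes them as regular top-level test classes, so pytest can collect them from each `test_on_<model>.py` file.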

Each generated test class is decorated with the pytest marker of its method category. For example:

```python
@require_torch
@pytest.mark.lora
class LoRA(
AlbertAdapterTestBase,
LoRATestMixin,
unittest.TestCase,
):
pass
```

These markers enable the execution of specific test types across all models. You can run these tests using either of these methods:

1. Using the make command:
```bash
make test-adapter-method-subset subset=lora
```

2. Directly executing from the test directory:
```bash
cd tests/test_methods
pytest -m lora
```

Both approaches will execute all LoRA tests across every model in the adapters library.
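
Both granularity levels can also be combined: to run, say, only the LoRA tests for a single model, point pytest at that model's test file (example invocation, assuming the file names shown in the directory tree above):

```bash
cd tests/test_methods
pytest -m lora test_on_albert.py
```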

## Adding a New Adapter Method to the Test Suite

The modular design of the test base simplifies the process of adding tests for new adapter methods. To add tests for a new adapter method "X", follow these steps:

1. Create the Test Implementation:
Create a new file `tests/test_methods/method_test_impl/peft/test_X.py` and implement the test mixin class:

```python
@require_torch
class XTestMixin(AdapterMethodBaseTestMixin):

default_config = XConfig()

def test_add_X(self):
model = self.get_model()
self.run_add_test(model, self.default_config, ["adapters.{name}."])

def ...
```
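In this sketch, `run_add_test` is one of the shared helpers inherited from `AdapterMethodBaseTestMixin`; it adds an adapter with the given config to the model and checks that weights matching the `adapters.{name}.` filter key are created. The remaining test methods of the mixin can be built on the other shared helpers in the same way.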

2. Register the Test Mixin:
Add the new test mixin class to `tests/test_methods/generator.py`:

```python
from tests.test_methods.method_test_impl.peft.test_X import XTestMixin

def generate_method_tests(model_test_base, ...):
""" Generate method tests for the given model test base """
test_classes = {}

@require_torch
@pytest.mark.core
class Core(
model_test_base,
CompabilityTestMixin,
AdapterFusionModelTestMixin,
unittest.TestCase,
):
pass

if "Core" not in excluded_tests:
test_classes["Core"] = Core

@require_torch
@pytest.mark.X
class X(
model_test_base,
XTestMixin,
unittest.TestCase,
):
pass

if "X" not in excluded_tests:
test_classes["X"] = X
```

The pytest marker enables execution of the new method's tests across all adapter models using:
```bash
make test-adapter-method-subset subset=X
```
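
Note that the available marker names are registered in the `[tool.pytest.ini_options]` section of `pyproject.toml` (see the diff above); a new marker such as `X` should be added to that `markers` list as well so that pytest does not flag it as unknown.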

If the new method is incompatible with specific adapter models, you can exclude the tests in the respective `test_on_xyz.py` file:

```python
method_tests = generate_method_tests(BartAdapterTestBase, excluded_tests=["PromptTuning", "X"])
```

Note: It is recommended to design new methods to work with the complete library whenever possible. Only exclude tests when there are unavoidable compatibility issues, and state these clearly in the documentation.
42 changes: 0 additions & 42 deletions tests/methods/__init__.py

This file was deleted.

