Tests Refactoring (#740)
This PR aims to refactor the test suite into a more generic version.

The following points are to be covered (subject to change; will become
more granular):
- [x] Add more granularity to the test viewer for a better test
overview/readability and more convenient testing of subgroups of methods
(e.g. only execute LoRA tests for one model)
- [x] Add documentation about the structure of the test directory and
entry points
- [x] Add pytest markers to allow for more granular testing of subgroups
(e.g. execute only LoRA tests for all models at once, useful when
implementing a new adapter method)
- [x] Refactor test methods to extract similar/duplicate code into a set
of utils methods
- [x] Relocate edge cases caused by model peculiarities out of the shared
tests and into the respective model test classes to keep the tests generic
- [x] Fix config union tests, closes #785

This should make it easier to add new models or methods to the existing
test suite and make the development/testing process more convenient in the
future.
TimoImhof authored Jan 27, 2025
1 parent adef6dc commit 1dcac5c
Showing 119 changed files with 1,655 additions and 2,006 deletions.
19 changes: 15 additions & 4 deletions Makefile
@@ -28,18 +28,29 @@ style:
isort $(check_dirs)
${MAKE} extra_style_checks

# Run tests for the library
# Library Tests

# run all tests in the library
test:
python -m pytest -n auto --dist=loadfile -s -v ./tests/
python -c "import transformers; print(transformers.__version__)"

# run all tests for the adapter methods for all adapter models
test-adapter-methods:
python -m pytest --ignore ./tests/models -n auto --dist=loadfile -s -v ./tests/
python -m pytest -n auto --dist=loadfile -s -v ./tests/test_methods/

# run a subset of the adapter method tests for all adapter models
# list of all subsets: [core, heads, embeddings, composition, prefix_tuning, prompt_tuning, reft, unipelt, compacter, bottleneck, ia3, lora, config_union]
subset ?=
test-adapter-method-subset:
@echo "Running subset $(subset)"
python -m pytest -n auto --dist=loadfile -s -v ./tests/test_methods/ -m $(subset)


# run the Hugging Face test suite for all adapter models
test-adapter-models:
python -m pytest -n auto --dist=loadfile -s -v ./tests/models
python -m pytest -n auto --dist=loadfile -s -v ./tests/test_models/

# Run tests for examples

test-examples:
python -m pytest -n auto --dist=loadfile -s -v ./examples/pytorch/
5 changes: 5 additions & 0 deletions conftest.py
@@ -87,3 +87,8 @@ def check_output(self, want, got, optionflags):


doctest.OutputChecker = CustomOutputChecker


def pytest_collection_modifyitems(items):
# Exclude the 'test_class' group from the test collection since it is not a real test class but a byproduct of the generic test class generation.
items[:] = [item for item in items if 'test_class' not in item.nodeid]
2 changes: 1 addition & 1 deletion examples/pytorch/language-modeling/run_clm.py
@@ -442,7 +442,7 @@ def main():
else:
model = AutoModelForCausalLM.from_config(config, trust_remote_code=model_args.trust_remote_code)
n_params = sum({p.data_ptr(): p.numel() for p in model.parameters()}.values())
logger.info(f"Training new model from scratch - Total size={n_params/2**20:.2f}M params")
logger.info(f"Training new model from scratch - Total size={n_params / 2**20:.2f}M params")

# Convert the model into an adapter model
adapters.init(model)
15 changes: 13 additions & 2 deletions pyproject.toml
@@ -1,10 +1,21 @@
[tool.black]
line-length = 119
target-version = ['py38', 'py39', 'py310']

# copied from HF for testing
[tool.pytest.ini_options]
markers = [
"core: marks tests as core adapter test",
"composition: marks tests as composition adapter test",
"heads: marks tests as heads adapter test",
"embeddings: marks tests as embeddings adapter test",
"class_conversion: marks tests as class conversion adapter test",
"prefix_tuning: marks tests as prefix tuning adapter test",
"prompt_tuning: marks tests as prompt tuning adapter test",
"reft: marks tests as reft adapter test",
"unipelt: marks tests as unipelt adapter test",
"compacter: marks tests as compacter adapter test",
"bottleneck: marks tests as bottleneck adapter test",
"ia3: marks tests as ia3 adapter test",
"lora: marks tests as lora adapter test",
"flash_attn_test: marks tests related to flash attention (deselect with '-m \"not flash_attn_test\"')",
"bitsandbytes: select (or deselect with `not`) bitsandbytes integration tests",
"generate: marks tests that use the GenerationTesterMixin"
150 changes: 150 additions & 0 deletions tests/README.md
@@ -0,0 +1,150 @@
# Testing the Adapters Library

This README provides a comprehensive overview of the test directory organization and explains how to execute different types of tests within the adapters library.

## Test Directory Structure Overview

```
tests/
├── __init__.py
├── fixtures/                 # Datasets, test samples, ...
│   └── ...
├── test_methods/             # Dynamic adapter method tests (all models)
│   ├── __init__.py
│   ├── method_test_impl/     # Implementation of the tests
│   │   ├── __init__.py
│   │   ├── core/
│   │   ├── composition/
│   │   └── ...
│   ├── base.py               # Base from which the model test bases inherit
│   ├── generator.py          # Test case generation and registration
│   ├── test_on_albert.py     # Example model test base for testing adapter methods on the ALBERT adapter model
│   ├── test_on_beit.py
│   └── ...
├── test_misc/                # Miscellaneous adapter method tests (single model)
│   ├── test_adapter_config.py
│   └── ...
└── test_models/              # Adapter model tests with the Hugging Face test suite
    ├── __init__.py
    ├── base.py
    ├── test_albert_model.py
    └── ...
```

## Test Categories

The testing framework encompasses three distinct categories of tests:

1. Dynamic Adapter Method Tests: These tests cover core functionalities of the adapters library, including individual adapter methods (such as LoRA and prompt tuning) and head functionalities. These tests are executed across all supported models.

2. Miscellaneous Adapter Method Tests: These supplementary tests cover scenarios not included in the dynamic tests. To optimize resources, they are executed on a single model, as repeated execution across multiple models would not provide additional value.

3. Adapter Model Tests: These tests verify the implementation of the adapter models themselves using the Hugging Face model test suite.

## Test Generator and Pytest Markers

The `test_methods` directory contains the central component `generator.py`, which generates the appropriate set of adapter method test classes. Each model test base registers these tests using the following pattern:

```python
method_tests = generate_method_tests(AlbertAdapterTestBase)

for test_class_name, test_class in method_tests.items():
globals()[test_class_name] = test_class
```
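Registering the generated classes in the module's `globals()` exposes them as regular top-level test classes, so pytest can collect them from each `test_on_<model>.py` file.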

Each generated test class is decorated with the pytest marker of its method category. For example:

```python
@require_torch
@pytest.mark.lora
class LoRA(
AlbertAdapterTestBase,
LoRATestMixin,
unittest.TestCase,
):
pass
```

These markers enable the execution of specific test types across all models. You can run these tests using either of these methods:

1. Using the make command:
```bash
make test-adapter-method-subset subset=lora
```

2. Directly executing from the test directory:
```bash
cd tests/test_methods
pytest -m lora
```

Both approaches will execute all LoRA tests across every model in the adapters library.
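
Both granularity levels can also be combined: to run, say, only the LoRA tests for a single model, point pytest at that model's test file (example invocation, assuming the file names shown in the directory tree above):

```bash
cd tests/test_methods
pytest -m lora test_on_albert.py
```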

## Adding a New Adapter Method to the Test Suite

The modular design of the test base simplifies the process of adding tests for new adapter methods. To add tests for a new adapter method "X", follow these steps:

1. Create the Test Implementation:
Create a new file `tests/test_methods/method_test_impl/peft/test_X.py` and implement the test mixin class:

```python
@require_torch
class XTestMixin(AdapterMethodBaseTestMixin):

default_config = XConfig()

def test_add_X(self):
model = self.get_model()
self.run_add_test(model, self.default_config, ["adapters.{name}."])

def ...
```
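In this sketch, `run_add_test` is one of the shared helpers inherited from `AdapterMethodBaseTestMixin`; it adds an adapter with the given config to the model and checks that weights matching the `adapters.{name}.` filter key are created. The remaining test methods of the mixin can be built on the other shared helpers in the same way.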

2. Register the Test Mixin:
Add the new test mixin class to `tests/test_methods/generator.py`:

```python
from tests.test_methods.method_test_impl.peft.test_X import XTestMixin

def generate_method_tests(model_test_base, ...):
""" Generate method tests for the given model test base """
test_classes = {}

@require_torch
@pytest.mark.core
class Core(
model_test_base,
CompabilityTestMixin,
AdapterFusionModelTestMixin,
unittest.TestCase,
):
pass

if "Core" not in excluded_tests:
test_classes["Core"] = Core

@require_torch
@pytest.mark.X
class X(
model_test_base,
XTestMixin,
unittest.TestCase,
):
pass

if "X" not in excluded_tests:
test_classes["X"] = X
```

The pytest marker enables execution of the new method's tests across all adapter models using:
```bash
make test-adapter-method-subset subset=X
```
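
Note that the available marker names are registered in the `[tool.pytest.ini_options]` section of `pyproject.toml` (see the diff above); a new marker such as `X` should be added to that `markers` list as well so that pytest does not flag it as unknown.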

If the new method is incompatible with specific adapter models, you can exclude the tests in the respective `test_on_xyz.py` file:

```python
method_tests = generate_method_tests(BartAdapterTestBase, excluded_tests=["PromptTuning", "X"])
```

Note: It is recommended to design new methods to work with the complete library whenever possible. Only exclude tests when there are unavoidable compatibility issues, and state these clearly in the documentation.
42 changes: 0 additions & 42 deletions tests/methods/__init__.py

This file was deleted.

