Deep learning models have emerged as a powerful tool in avian bioacoustics to assess environmental health. To maximize the potential of cost-effective and minimally invasive passive acoustic monitoring (PAM), models must analyze bird vocalizations across a wide range of species and environmental conditions. However, data fragmentation challenges the evaluation of generalization performance. Therefore, we introduce the
You can use the devcontainer configured as a git submodule:
git submodule update --init --recursive
Or install with poetry:
poetry install
poetry shell
Foundation models are tested on the Benchmark of Animal Sounds (BEANS), which we host on Huggingface. We focus on the classification datasets (watkins, bats, cbi, dogs & humbugdb). Using the beans.sh script, you can specify one or more experiment paths to execute linear probing on all the BEANS datasets:
$ ./projects/biofoundation/scripts/run_beans_embeddings_experiments.sh embedding/BEANS/perch [additional experiments]
Currently the available embedding experiments are:
They all inherit from the base configuration embedding_config.yaml, where most settings for extracting embeddings are defined. To execute an experiment on a specific dataset, change the following lines in the experiment file:
datamodule:
  dataset:
    dataset_name: beans_watkins # Change
    hf_path: DBD-research-group/beans_watkins # Change
    hf_name: default
    n_classes: 31 # Change
dataset_name | n_classes |
---|---|
beans_watkins | 31 |
beans_bats | 10 |
beans_cbi | 264 |
beans_dogs | 10 |
beans_humbugdb | 14 |
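
The values marked `# Change` always move together, so they can be derived from the table above. The following is a minimal sketch of such a lookup. The helper `beans_dataset_overrides` is hypothetical and not part of the repository, and the `hf_path` pattern is an assumption extrapolated from the single `beans_watkins` example.

```python
# n_classes per BEANS classification dataset, copied from the table above.
BEANS_N_CLASSES = {
    "beans_watkins": 31,
    "beans_bats": 10,
    "beans_cbi": 264,
    "beans_dogs": 10,
    "beans_humbugdb": 14,
}


def beans_dataset_overrides(dataset_name: str) -> dict:
    """Return the config values that change per dataset (hypothetical helper)."""
    if dataset_name not in BEANS_N_CLASSES:
        raise KeyError(f"unknown BEANS dataset: {dataset_name!r}")
    return {
        "dataset_name": dataset_name,
        # Assumption: every dataset follows the hf_path pattern of beans_watkins.
        "hf_path": f"DBD-research-group/{dataset_name}",
        "hf_name": "default",
        "n_classes": BEANS_N_CLASSES[dataset_name],
    }
```

The returned dictionary mirrors the keys under `datamodule.dataset` in the experiment file, so it can be compared against a config before launching a run.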
Several aspects of the embedding extraction can be configured by changing the parameters of embeddings_datamodule.py, for example through the experiment config:
defaults:
  # Inherit from default embedding config
  - biofoundation/embedding/BEANS/embedding_config.yaml
  # Use Hubert for embedding extraction
  - override /datamodule/embedding_model: ../../module/network/hubert.yaml
datamodule:
  # If >0 only X samples per class are used for training; The rest is used for validation and testing
  k_samples: 0
  # If a validation set should be used: Use null to use val set and 0 for no validation at all
  val_batch: null
  # (If 0 and k_samples > 0 then all remaining samples land in test set; If k_samples = 0 val and test split in BEANS are combined in the test set)
  # Test/Validation_ratio if k_samples > 0
  test_ratio: 0.5
  # BEANS provides a low_train split which can be used instead of the default train split
  low_train: False
  # If embeddings should be averaged or if just the first seconds should be used
  average: True
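
To make the interaction of `k_samples` and `test_ratio` concrete, here is a minimal sketch of a k-samples-per-class split written from the comments in the config alone. It is not the code in embeddings_datamodule.py. The function name `k_samples_split`, the `use_val` flag (standing in for the `val_batch: 0` case), and the `seed` argument are assumptions made for illustration.

```python
import random
from collections import defaultdict


def k_samples_split(labels, k_samples, test_ratio=0.5, use_val=True, seed=0):
    """Split sample indices into train/val/test (illustrative sketch).

    labels     -- one class label per sample
    k_samples  -- samples per class kept for training (must be > 0 here)
    test_ratio -- share of the remaining samples that goes to the test set
    use_val    -- if False, every remaining sample goes to the test set
    """
    if k_samples <= 0:
        raise ValueError("this sketch only covers the k_samples > 0 case")
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Keep k samples per class for training; collect everything else.
    train, rest = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        train.extend(indices[:k_samples])
        rest.extend(indices[k_samples:])

    if not use_val:
        return train, [], rest

    # Divide the remaining samples between test and validation.
    rng.shuffle(rest)
    n_test = round(len(rest) * test_ratio)
    return train, rest[n_test:], rest[:n_test]
```

With 10 samples in each of two classes and `k_samples=2`, four samples go to training and the remaining sixteen are divided evenly between validation and test at `test_ratio=0.5`.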
The classifier can also be changed and right now this is used.
The same models can also be finetuned; the experiments can be found in the respective folder (except for Perch). For finetuning, a much lower learning rate is recommended, and the finetune_module is used.
Unlike linear probing, finetuning cannot compute embeddings beforehand, which is why computation times are considerably longer. To reduce them somewhat, a hybrid method can be used that first applies linear probing and then a few epochs of finetuning. The results are usually better than linear probing but worse than finetuning. At the moment the embeddings are not computed beforehand for the linear probing phase, but the hybrid approach is still faster.
For this, the hybrid_module is used, and the experiments can be found in the hybrid folder.
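
The two-phase idea can be sketched as a per-epoch plan: the backbone stays frozen while the classifier is probed, then it is unfrozen for a few finetuning epochs at a lower learning rate. This is only an illustration of the schedule, not the hybrid_module itself. The names `EpochPlan` and `hybrid_schedule` are hypothetical, and the learning-rate defaults are placeholder values rather than the repository's settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpochPlan:
    epoch: int
    phase: str             # "linear_probing" or "finetuning"
    backbone_frozen: bool  # True while only the classifier is trained
    learning_rate: float


def hybrid_schedule(probe_epochs, finetune_epochs, probe_lr=1e-3, finetune_lr=1e-5):
    """Plan a linear probing phase followed by a short finetuning phase.

    The learning rates are placeholders; the only property taken from the
    text is that finetuning uses a much lower rate than probing.
    """
    plan = []
    for epoch in range(probe_epochs + finetune_epochs):
        probing = epoch < probe_epochs
        plan.append(EpochPlan(
            epoch=epoch,
            phase="linear_probing" if probing else "finetuning",
            backbone_frozen=probing,
            learning_rate=probe_lr if probing else finetune_lr,
        ))
    return plan
```

A training loop would consult the plan at the start of each epoch to decide whether backbone parameters receive gradients and which learning rate to apply.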
The results folder contains plots and plotting code that give insight into the performance differences between linear probing (blue), finetuning (orange), and the hybrid method (green).
As a reference the embedding results can be used for future work:
from birdset.datamodule.base_datamodule import DatasetConfig
from birdset.datamodule.birdset_datamodule import BirdSetDataModule

# initiate the data module
dm = BirdSetDataModule(
    dataset=DatasetConfig(
        data_dir='data_birdset/HSN',  # specify your data directory!
        dataset_name='HSN',
        hf_path='DBD-research-group/BirdSet',
        hf_name='HSN',
        n_classes=21,
        n_workers=3,
        val_split=0.2,
        task="multilabel",
        classlimit=500,
        eventlimit=5,
        sampling_rate=32000,
    ),
)

# prepare the data (download dataset, ...)
dm.prepare_data()

# setup the dataloaders
dm.setup(stage="fit")

# get the dataloaders
train_loader = dm.train_dataloader()

from lightning import Trainer

min_epochs = 1
max_epochs = 5
trainer = Trainer(min_epochs=min_epochs, max_epochs=max_epochs, accelerator="gpu", devices=1)

from birdset.modules.multilabel_module import MultilabelModule

model = MultilabelModule(
    len_trainset=dm.len_trainset,
    task=dm.task,
    batch_size=dm.train_batch_size,
    num_epochs=max_epochs)

trainer.fit(model, dm)
Logs will be written to Weights & Biases by default.
To enhance model performance, we mix in additional background noise downloaded from DCASE18. To download the files and convert them to the correct format, run the notebook 'download_background_noise.ipynb' in the 'notebooks' folder.
Our experiments are defined in the configs/experiment folder. To run an experiment, use the following command in the directory of the repository:
python birdset/train.py experiment="EXPERIMENT_PATH"
Replace EXPERIMENT_PATH with the path to the desired experiment YAML config, relative to the experiment directory. For example, here's a command for training an EfficientNet on HSN:
python birdset/train.py experiment="local/HSN/efficientnet.yaml"
Our datasets are shared via HuggingFace Datasets in our BirdSet repository. First log in to HuggingFace with:
huggingface-cli login
For a detailed guide to using the BirdSet data pipeline and its many configuration options, see our comprehensive BirdSet Data Pipeline Tutorial.
The datamodules are defined in birdset/datamodule and configurations are stored under configs/datamodule.
base_datamodule is the main class that can be inherited for specific datasets. It is responsible for preparing the data in the function prepare_data and loading the data in the function setup. prepare_data downloads the dataset, applies preprocessing, creates validation splits and saves the data to disk. setup initiates the dataloaders and configures data transformations.
The following steps are performed in prepare_data:
- Data is downloaded from HuggingFace Datasets with _load_data
- Data gets preprocessed with _preprocess_data
- Data is split into train, validation and test sets with _create_splits
- Length of the dataset gets saved to access later
- Data is saved to disk with _save_dataset_to_disk
The following steps are performed in setup:
- Data is loaded from disk with _get_dataset, in which the transforms are applied
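
The order of the steps above can be sketched with a toy class that records each call. Only the method names come from the text. The class name `SketchDataModule`, all signatures, the split names, and the in-memory stand-in for the on-disk dataset are assumptions for illustration and do not reproduce base_datamodule.

```python
class SketchDataModule:
    """Toy stand-in that mirrors the method order described above.

    Only the method names come from the text; every body is a placeholder.
    """

    def __init__(self):
        self.calls = []           # records the order in which steps run
        self.len_trainset = None
        self._disk = None         # stands in for the dataset saved to disk

    # --- steps run by prepare_data ---
    def _load_data(self):
        self.calls.append("_load_data")
        return [{"audio": [0.0, 0.1], "label": i % 2} for i in range(10)]

    def _preprocess_data(self, data):
        self.calls.append("_preprocess_data")
        return data

    def _create_splits(self, data):
        self.calls.append("_create_splits")
        return {"train": data[:6], "valid": data[6:8], "test": data[8:]}

    def _save_dataset_to_disk(self, splits):
        self.calls.append("_save_dataset_to_disk")
        self._disk = splits

    def prepare_data(self):
        data = self._load_data()
        data = self._preprocess_data(data)
        splits = self._create_splits(data)
        self.len_trainset = len(splits["train"])  # saved for later access
        self._save_dataset_to_disk(splits)

    # --- step run by setup ---
    def _get_dataset(self, split):
        self.calls.append(f"_get_dataset({split})")
        return self._disk[split]

    def setup(self, stage="fit"):
        if stage == "fit":
            self.train_dataset = self._get_dataset("train")
            self.val_dataset = self._get_dataset("valid")
        else:
            self.test_dataset = self._get_dataset("test")
```

Calling `prepare_data()` and then `setup(stage="fit")` leaves the call order in `calls`, which makes the division of work between the two functions easy to inspect.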
Data transformations are transformations applied to the data during training, such as augmentations. They are added to the Hugging Face dataset with set_transform.
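
The key property of this mechanism is that the transform runs when an item is accessed, so the stored data stays unchanged and random augmentations can differ between epochs. The sketch below imitates that behaviour with a small pure-Python stand-in. `OnTheFlyDataset` and `add_gain` are hypothetical names, and the sketch passes single rows to the callable for brevity, whereas the real `datasets` method passes batches.

```python
class OnTheFlyDataset:
    """Tiny stand-in for a dataset whose transform runs at access time."""

    def __init__(self, rows):
        self._rows = rows
        self._transform = None

    def set_transform(self, transform):
        # Nothing is recomputed or stored here; the callable is only remembered.
        self._transform = transform

    def __getitem__(self, idx):
        row = dict(self._rows[idx])  # shallow copy so the stored row is not mutated
        return self._transform(row) if self._transform else row


def add_gain(row, gain=0.5):
    """Example augmentation: scale the waveform by a fixed gain."""
    row["audio"] = [x * gain for x in row["audio"]]
    return row
```

After `ds.set_transform(add_gain)`, indexing `ds[0]` returns the scaled waveform while the underlying rows keep their original values.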