
ACL 2022 (Findings): Striking a Balance: Alleviating Inconsistency in Pre-trained Models for Symmetric Classification Tasks


Striking a Balance: Alleviating Inconsistency in Pre-trained Models for Symmetric Classification Tasks

Source code for ACL 2022 Findings paper: Striking a Balance: Alleviating Inconsistency in Pre-trained Models for Symmetric Classification Tasks


Dependencies

  • Compatible with Python 3.9.
  • Dependencies can be installed using requirements.txt.
  • The model has also been tested in a multi-GPU setup; for example, with 4 GPUs use CUDA_VISIBLE_DEVICES=0,1,2,3 and -n_gpus 4.

Dataset

Download the datasets from the following url:

Datasets

Note: The link should be publicly accessible. If it asks for a login, refresh the page, or press back and then forward in the browser.

Setup

To get the project's source code, clone the GitHub repository:

$ git clone https://github.com/ashutoshml/alleviating-inconsistency.git

Install Conda:

Conda Installation

Create and activate your virtual environment:

$ conda create -n venv python=3.9
$ conda activate venv

Install all the required packages:

$ pip install -r requirements.txt

Training the base models on qqp, paws, mrpc

For Consistency Model

CUDA_VISIBLE_DEVICES=0 TOKENIZERS_PARALLELISM=True python src/training.py -dataset qqp-new -add_ds paws mrpc-new -model roberta -model_type dual -lr 2e-5 -additional_cls -div kl -seed 42 -augment_reverse -tbs 12 -n_gpus 1 -s_off -maxe 2

For Standard Single Model

CUDA_VISIBLE_DEVICES=0 TOKENIZERS_PARALLELISM=True python src/training.py -dataset qqp-new -add_ds paws mrpc-new -model roberta -model_type single -lr 4e-5 -additional_cls -seed 42 -augment_reverse -tbs 12 -n_gpus 1 -s_off -maxe 4

Check that the trained models have been saved in the Models folder. Pick a specific model directory inside Models to use as the checkpoint for fine-tuning; it is referred to below as Models/ckptdir.
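
The consistency-model command above passes -div kl together with -augment_reverse, which suggests a divergence term that compares the model's prediction for an input pair with its prediction for the same pair in reversed order. The following is a minimal, illustrative sketch of such a penalty in plain Python. The function names and the choice to average the KL divergence over both directions are assumptions made for illustration; this is not the repository's implementation, which lives under src/.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    # Terms with p_i == 0 contribute nothing; q_i is clamped to avoid log(0).
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def symmetric_consistency_penalty(p_forward, p_reverse):
    """Hypothetical penalty: mean of KL in both directions.

    p_forward: class probabilities predicted for the pair (a, b)
    p_reverse: class probabilities predicted for the pair (b, a)
    The value is 0 when both predictions agree and grows as they diverge.
    """
    return 0.5 * (kl_divergence(p_forward, p_reverse)
                  + kl_divergence(p_reverse, p_forward))


if __name__ == "__main__":
    # Identical predictions for both input orders: no penalty.
    print(symmetric_consistency_penalty([0.9, 0.1], [0.9, 0.1]))
    # Different confidence for the reversed pair: positive penalty.
    print(symmetric_consistency_penalty([0.9, 0.1], [0.6, 0.4]))
```

In a training loop, a term of this kind would typically be added to the usual classification loss, so that the model is rewarded for giving the same answer regardless of input order.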

Fine-Tuning on datasets

The models can be fine-tuned on any of the following datasets:

  1. paws
  2. mrpc-new
  3. sst2-eq: Train Size: 61393; Validation size: 872; Test Size: 5956
  4. sst2-new: Train Size: 60615; Validation size: 872; Test Size: 6734
  5. rte-eq
  6. qnli-eq

Note: The original sst2-eq data files were corrupted and had to be regenerated. We provide two new versions, sst2-eq and sst2-new, as replacements. The final performance of the model should, ideally, remain unchanged.

Replace <dataset> in the command below with any of the datasets listed above:

CUDA_VISIBLE_DEVICES=0 TOKENIZERS_PARALLELISM=True python src/finetuning.py -ckpt <Models/ckptdir> -lr 2e-5 -dataset <dataset> -tbs 12 -maxe 3 -n_gpus 1

No additional fine-tuning is required for the qqp-new dataset; classification can be performed directly.
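
To fine-tune one checkpoint on several datasets in sequence, the command above can be assembled programmatically. The sketch below only builds and prints the command lines; the helper names (gpu_env, finetune_command, as_shell_line) are hypothetical, while the flags and values are the ones shown in this README. Using GPU ids 0..n-1 is an assumption that follows the multi-GPU example under Dependencies.

```python
import shlex

# Dataset names as listed in this README.
FINETUNE_DATASETS = ["paws", "mrpc-new", "sst2-eq", "sst2-new", "rte-eq", "qnli-eq"]


def gpu_env(n_gpus):
    """Environment variables used by the README commands, for GPUs 0..n_gpus-1."""
    return {
        "CUDA_VISIBLE_DEVICES": ",".join(str(i) for i in range(n_gpus)),
        "TOKENIZERS_PARALLELISM": "True",
    }


def finetune_command(ckpt_dir, dataset, n_gpus=1):
    """Return the fine-tuning command as an argument list."""
    if dataset not in FINETUNE_DATASETS:
        raise ValueError(f"unknown fine-tuning dataset: {dataset}")
    return [
        "python", "src/finetuning.py",
        "-ckpt", ckpt_dir,
        "-lr", "2e-5",
        "-dataset", dataset,
        "-tbs", "12",
        "-maxe", "3",
        "-n_gpus", str(n_gpus),
    ]


def as_shell_line(ckpt_dir, dataset, n_gpus=1):
    """Render the environment prefix and command as a single shell line."""
    env = " ".join(f"{key}={value}" for key, value in gpu_env(n_gpus).items())
    return env + " " + shlex.join(finetune_command(ckpt_dir, dataset, n_gpus))


if __name__ == "__main__":
    # "Models/ckptdir" is the placeholder used throughout this README.
    for name in FINETUNE_DATASETS:
        print(as_shell_line("Models/ckptdir", name))
```

To execute rather than print, the argument list can be passed to subprocess.run together with an environment that merges os.environ and gpu_env(n_gpus).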

Classification

After fine-tuning, the following folder is created: Models/ckptdir/finetune

For sst2-eq, rte-eq, qnli-eq

CUDA_VISIBLE_DEVICES=0 TOKENIZERS_PARALLELISM=True python src/classification.py -dataset <dataset> -ebs 256 -ckpt <Models/ckptdir/finetune> -n_gpus 1

For mrpc-new, paws

CUDA_VISIBLE_DEVICES=0 TOKENIZERS_PARALLELISM=True python src/classification.py -dataset <dataset> -ebs 256 -ckpt <Models/ckptdir/finetune> -econs -n_gpus 1

For qqp-new (since no fine-tuning was done for qqp-new, make sure the checkpoint path points to Models/ckptdir)

CUDA_VISIBLE_DEVICES=0 TOKENIZERS_PARALLELISM=True python src/classification.py -dataset <dataset> -ebs 256 -ckpt <Models/ckptdir> -econs -n_gpus 1

The final prediction scores will be available in the <Models/ckptdir/finetune/<dataset>> file.
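
The three classification variants above differ only in the checkpoint path and in whether -econs is passed. A small helper can encode that rule; the sketch below is illustrative, the function and constant names are hypothetical, and sst2-new is not covered by the commands above, so treating it like sst2-eq is an assumption.

```python
import posixpath

# Datasets for which the README passes -econs.
ECONS_DATASETS = {"mrpc-new", "paws", "qqp-new"}
# Datasets classified directly from the base checkpoint (no fine-tuning step).
NO_FINETUNE_DATASETS = {"qqp-new"}


def classification_command(ckpt_dir, dataset, n_gpus=1, ebs=256):
    """Return the classification command as an argument list."""
    if dataset in NO_FINETUNE_DATASETS:
        ckpt = ckpt_dir
    else:
        ckpt = posixpath.join(ckpt_dir, "finetune")
    cmd = [
        "python", "src/classification.py",
        "-dataset", dataset,
        "-ebs", str(ebs),
        "-ckpt", ckpt,
    ]
    if dataset in ECONS_DATASETS:
        cmd.append("-econs")
    cmd += ["-n_gpus", str(n_gpus)]
    return cmd


if __name__ == "__main__":
    # "Models/ckptdir" is the placeholder used throughout this README.
    for name in ["sst2-eq", "paws", "qqp-new"]:
        print(" ".join(classification_command("Models/ckptdir", name)))
```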

See src/args.py for additional arguments to experiment with. For symmetric classification datasets, evaluation on reversed candidates can be enabled with the -erev argument.
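
Evaluating reversed candidates makes it possible to check whether a model gives the same label to a pair regardless of input order. As a rough illustration, and not the metric computed by the repository's scripts, the sketch below measures the fraction of pairs whose predicted label is unchanged after reversal. The function name and the list-of-labels input format are assumptions.

```python
def consistency_rate(forward_labels, reverse_labels):
    """Fraction of examples whose label is the same for (a, b) and (b, a).

    forward_labels: predicted labels for the pairs in their original order
    reverse_labels: predicted labels for the same pairs with inputs swapped
    """
    if len(forward_labels) != len(reverse_labels):
        raise ValueError("both prediction lists must have the same length")
    if not forward_labels:
        raise ValueError("prediction lists must not be empty")
    agreements = sum(1 for f, r in zip(forward_labels, reverse_labels) if f == r)
    return agreements / len(forward_labels)


if __name__ == "__main__":
    # Toy labels: the second and fourth pairs flip when the inputs are swapped.
    print(consistency_rate([1, 0, 1, 1], [1, 1, 1, 0]))
```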

Evaluation

Reference Generation

First, generate the reference files used for comparison for each dataset and model with the following command:

python src/create_references.py -ckpt <Models/ckptdir> -dataset <dataset> -ebs 512

To generate all reference files at once, use:

python create_reference_json.py -modeldir Models

This generates a file called precommands.json in the main directory. Then use the following command to generate all relevant references:

python create_all_references.py -i precommands.json

Final Evaluation

For the final evaluation, run:

python src/evaluation.py -pretrain_path Models

The final evaluation results are written to the generated FinalResults.csv file.
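
For a quick look at the results, the file can be read with Python's standard csv module. This sketch assumes FinalResults.csv is a regular comma-separated file with a header row; its column names are not documented here, so the code makes no assumption about them. The helper name load_results is hypothetical.

```python
import csv


def load_results(path="FinalResults.csv"):
    """Read the results file into a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    import os

    # Only print if the evaluation step has already produced the file.
    if os.path.exists("FinalResults.csv"):
        for row in load_results():
            print(row)
```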

Citation

Please cite the following paper if you find this work relevant to your application:

@inproceedings{kumar-joshi-2022-striking,
    title = "Striking a Balance: Alleviating Inconsistency in Pre-trained Models for Symmetric Classification Tasks",
    author = "Kumar, Ashutosh  and
      Joshi, Aditya",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2022",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.findings-acl.148",
    pages = "1887--1895",
}

For any clarification, comments, or suggestions please create an issue or contact [email protected]
