
Evaluating Linguistic Style Adversarial Paraphrase Robustness of LLMs in Social and Commonsense QA

Large language models (LLMs) have made significant strides in natural language processing (NLP) tasks, but their performance needs to be assessed under linguistic variation and adversarial paraphrasing. Previous work such as HELM has laid the foundation for this research, but it primarily focuses on local robustness (fixed transformations, spelling impurities, etc.), where only a few tokens of the input are perturbed. We take a broader approach and introduce global adversarial paraphrasing, which rewrites the style and structure of the entire input sentence while preserving its meaning and context.

How we do it

Given that LLMs are effective at capturing context, we use them to paraphrase the input sentences in different styles, based on demographic groups (age and gender) and temporal changes. We then evaluate the performance of LLMs on social and commonsense question answering (QA) tasks, using both the original and the paraphrased sentences as inputs, and compare the results against baselines.
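
The procedure can be sketched in a few lines of Python. This is a minimal illustration rather than the code in the notebooks: the style prompts, the query_llm() helper, and the robustness_gap() metric below are placeholders we introduce here for clarity, not part of this repository.

# Sketch of the style-paraphrase-and-evaluate loop (illustrative only).
# The prompt wording and query_llm() are placeholders, not the notebook code.

STYLE_PROMPTS = {
    "age":      "Rewrite the question in the style of an older speaker, keeping its meaning:",
    "gender":   "Rewrite the question in a different gendered register, keeping its meaning:",
    "temporal": "Rewrite the question in dated (historical) English, keeping its meaning:",
}

def query_llm(prompt: str) -> str:
    """Placeholder for a call to the paraphrasing LLM."""
    raise NotImplementedError

def paraphrase(question: str, style: str) -> str:
    """Produce a global, style-level paraphrase of a QA question."""
    return query_llm(f"{STYLE_PROMPTS[style]}\n{question}")

def accuracy(model, examples):
    """Fraction of QA examples the model answers correctly."""
    correct = sum(model(ex["question"]) == ex["answer"] for ex in examples)
    return correct / len(examples)

def robustness_gap(model, examples, style):
    """Accuracy drop between original and style-paraphrased questions."""
    paraphrased = [{**ex, "question": paraphrase(ex["question"], style)}
                   for ex in examples]
    return accuracy(model, examples) - accuracy(model, paraphrased)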

What this repository contains

This repository contains the following files and folders:

  • datasets
  • llm-benchmark-notebooks
  • llm-paraphrase-notebooks
  • baselines
  • tempobert
  • BIG-bench

Running the code

git clone https://github.com/caisa-lab/llm-QA-robustness.git
cd llm-QA-robustness
pip install -r requirements.txt
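
The experiments live in the notebook folders. Assuming a standard Jupyter setup (not pinned by the requirements file), they can be opened with, for example:

pip install jupyter
jupyter notebook llm-paraphrase-notebooks/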
