This code accompanies the paper: Sentence Embeddings using Supervised Contrastive Learning
Danqi Liao.
This is a final project for COS 584.
To train the model (here with `--pos_num -1 --neg_num -1`, i.e., using all available positives and negatives, and the supervised contrastive loss enabled via `--use_SCL`):

```
python bert_sent_embed.py --pos_num -1 --neg_num -1 --use_SCL
python bert_sent_embed.py --load_data_from_disk --pos_num -1 --neg_num -1 --use_SCL
```

The second variant adds `--load_data_from_disk` to reuse data that has already been preprocessed and saved to disk.
To evaluate a trained model on downstream sentence tasks through SentEval:

- `cd SentEval-master/examples/`
- Modify `sentbert_eval.py` to set `MODEL_PATH` to the path of your model
- Modify `sentbert_eval.py` to set `transfer_tasks` to the tasks you want to evaluate (see the example below this list)
- Run the evaluation script: `python sentbert_eval.py`
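For reference, the two edits in `sentbert_eval.py` could look like the following sketch; the path is a placeholder and the task list is just one common SentEval selection, so adjust both to your setup:

```python
# Illustrative edits in SentEval-master/examples/sentbert_eval.py
MODEL_PATH = 'path/to/your/trained_model.pt'  # placeholder: your checkpoint
transfer_tasks = ['STS12', 'STS13', 'STS14', 'STS15', 'STS16',        # STS benchmarks
                  'MR', 'CR', 'SUBJ', 'MPQA', 'SST2', 'TREC', 'MRPC']  # transfer tasks
```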
To embed sentences with a trained model (the `SentBert` import location below is assumed from this repo's layout):

```python
import torch
from transformers import BertTokenizer
from bert_sent_embed import SentBert  # assumed: SentBert is defined in bert_sent_embed.py

model_path = 'path/to/your/trained_model.pt'  # placeholder: your checkpoint
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = SentBert(512 * 3, 3, tokenizer)
model.load_state_dict(torch.load(model_path, map_location='cpu'))
embedding = model.encode("hello world.")
```
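The embeddings are meant for semantic similarity, so a typical follow-up is a cosine-similarity comparison. A minimal sketch, assuming `model.encode` returns a 1-D `torch.Tensor`:

```python
import torch.nn.functional as F

emb1 = model.encode("A man is playing a guitar.")
emb2 = model.encode("Someone is playing an instrument.")
# Cosine similarity between the two sentence vectors (higher = more similar)
similarity = F.cosine_similarity(emb1.unsqueeze(0), emb2.unsqueeze(0)).item()
print(similarity)
```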
Evaluation results for Avg. GloVe embeddings, our SBERT baseline, and the all-positives/all-negatives SCL model with lambda = 0.3:
| Model | STS (12-16) Avg. | Sentence Transfer Tasks Avg. |
| --- | --- | --- |
| Avg. GloVe embeddings | 44.98 | 74.27 |
| Our SBERT baseline | 67.61 | 75.56 |
| allpalln-lambda0.3-SCL | 70.44 | 76.16 |
Note: our SBERT baseline is not the full-scale model from the original SBERT work, but our own reimplementation trained only on SNLI data with a medium-sized BERT (8 layers, hidden size 512) and the same hyperparameters as the SCL models. We used the smaller BERT and only SNLI data purely due to computational constraints.
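For readers unfamiliar with the objective, here is a minimal sketch of a supervised contrastive loss of the kind the run names suggest, mixed with cross-entropy via a weight lambda (`lam = 0.3` matches `allpalln-lambda0.3-SCL`). The function name, temperature, and exact mixing form are illustrative assumptions, not necessarily the paper's formulation:

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Illustrative SCL (cf. Khosla et al., 2020): for each anchor, treat
    same-label examples in the batch as positives and the rest as negatives."""
    feats = F.normalize(features, dim=1)                    # (N, d) unit vectors
    logits = feats @ feats.T / temperature                  # pairwise similarities
    n = labels.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=labels.device)
    logits = logits.masked_fill(self_mask, float('-inf'))   # exclude self-pairs
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # Mean log-probability over each anchor's positives, averaged over anchors
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    return -(log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_counts).mean()

# Assumed combined objective for a classifier producing class_logits:
#   lam = 0.3
#   loss = lam * supervised_contrastive_loss(embeddings, labels) \
#          + (1 - lam) * F.cross_entropy(class_logits, labels)
```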