This repository contains a implementation for the paper A Structured Self-Attentive Sentence Embedding, which was published in ICLR 2071.
Here is the environment we need:
- Python(3.8)
- Pytorch(1.12.1+cuda11.3)
- torchtext(0.6.0) (very important)
- spacy(3.4.1)
- en-core-web-sm(3.4.0)
We train the model on two open datasets: PAN15 Author Profiling for the task of Author Profiling and Yelp for the task of Sentiment Analysis, but only achieved close to the performance as the paper said on the Yelp dataset.
Models | Yelp | Age |
---|---|---|
BiLSTM + Max Pooling + MLP | 61.75% | 60.35% |
CNN + Max Pooling + MLP | 58.13% | 60.69% |
Our Model | 63.00% | 61.90% |
For author profiling task:
python train.py --model_type SelfAttention --dataset Age --Age_train_path 'your trainset path' --Age_test_path 'your testset path'
And for sentiment analysis task:
python train.py --model_type SelfAttention --dataset Yelp --Yelp_train_path 'your trainset path' --Yelp_test_path 'your testset path'
Here are some useful options for the train script.
Options | Optional values | Meanings |
---|---|---|
--model_type | SelfAttention | BiLSTM | CNN | Which model we use for training |
--dataset | Age | Yelp | Which task(dataset) we use for training |
--seed | The random seed. | |
--LSTM_hidden | The out dimension of LSTM for model SelfAttention or BiLSTM | |
--MLP_hidden | The hidden units of the classifier for the downstream application. | |
--aspects | The hyperparameter |
|
... | ... | ... |
Here is a pretrained weight(code: 8smh) for the Yelp dataset.
For one single sentence's visualization, execute the following command:
python visualization.py --sentence 'your sentence' --muti_sentence False
And for a text file including multiple sentences as input, execute:
python visualization.py --muti_sentence True --path 'your text file path'
Some of the visualization results are as follows. The reason why the visualization is not as significant as in the paper might be that the algorithms are different. At least the visualization algorithm used in this repository is not likely to have so many dark red areas. (Do not take this seriously:-D