Segatron

License: MIT

This repo contains code and pre-trained models for our paper

Segatron: Segment-aware Transformer for Language Modeling and Understanding

He Bai, Peng Shi, Jimmy Lin, Yuqing Xie, Luchen Tan, Kun Xiong, Wen Gao, Ming Li

AAAI 2021

Setup

To use this repo, please install NVIDIA APEX. We recommend using this Docker image or building your own environment with NGC's PyTorch container nvcr.io/nvidia/pytorch:20.03-py3.
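
Before launching anything, it can help to verify that Apex is importable. A minimal sanity check (our sketch, not a script shipped with this repo):

```python
# Sanity check: confirm NVIDIA Apex's mixed-precision utilities are installed.
try:
    from apex import amp  # noqa: F401 -- Apex mixed-precision API
    print("NVIDIA Apex found.")
except ImportError as exc:
    raise SystemExit(
        "NVIDIA Apex not found; see https://github.com/NVIDIA/apex for installation."
    ) from exc
```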

Download Checkpoints

We have uploaded the following checkpoints to the Hugging Face model hub:
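
Once a checkpoint id is known, it can be fetched with the huggingface_hub client. A minimal sketch; the repo id below is a hypothetical placeholder, not a name confirmed by this README:

```python
# Sketch: download a Segatron checkpoint from the Hugging Face Hub.
from huggingface_hub import snapshot_download

# "rsvp-ai/segabert-large" is a hypothetical repo id used for illustration.
local_dir = snapshot_download(repo_id="rsvp-ai/segabert-large")
print(f"Checkpoint files downloaded to: {local_dir}")
```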

Pre-training

Evaluation

1. WikiText-103


2. GLUE and Machine Reading Comprehension

  • The source code is in the transformers folder, which is based on Hugging Face's Transformers repository. Note that Segatron needs a paragraph position index, a sentence position index, and a token position index in its input features; we therefore changed the input feature extraction and model forward functions of Transformers, so our code is not compatible with upstream Hugging Face Transformers. A minimal sketch of how these indices are laid out follows this list.

  • Please refer to transformers/README.md for details.
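
To make the three indices concrete, here is a minimal sketch (ours, not the repo's actual feature extractor) of how paragraph, sentence, and token position indices could be built from a document already split into paragraphs and sentences. Resetting token positions at each sentence boundary is an assumption of this sketch:

```python
from typing import List, Tuple

def build_segment_positions(
    doc: List[List[List[int]]],  # paragraphs -> sentences -> token ids
) -> Tuple[List[int], List[int], List[int], List[int]]:
    """Flatten a document and emit Segatron-style position indices."""
    token_ids: List[int] = []
    para_pos: List[int] = []
    sent_pos: List[int] = []
    tok_pos: List[int] = []
    for p_idx, paragraph in enumerate(doc):
        for s_idx, sentence in enumerate(paragraph):
            for t_idx, token_id in enumerate(sentence):
                token_ids.append(token_id)
                para_pos.append(p_idx)  # index of the enclosing paragraph
                sent_pos.append(s_idx)  # index of the sentence within its paragraph
                tok_pos.append(t_idx)   # token offset within its sentence
    return token_ids, para_pos, sent_pos, tok_pos

# Toy usage: two paragraphs, the first containing two sentences.
doc = [[[101, 2023], [2003, 102]], [[7592, 2088]]]
_, para, sent, tok = build_segment_positions(doc)
print(para)  # [0, 0, 0, 0, 1, 1]
print(sent)  # [0, 0, 1, 1, 0, 0]
print(tok)   # [0, 1, 0, 1, 0, 1]
```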


3. SST


Citation

Please cite the AAAI 2021 paper:

@inproceedings{bai2021segatron,
  title={Segatron: Segment-Aware Transformer for Language Modeling and Understanding},
  author={Bai, He and Shi, Peng and Lin, Jimmy and Xie, Yuqing and Tan, Luchen and Xiong, Kun and Gao, Wen and Li, Ming},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={35},
  number={14},
  pages={12526--12534},
  year={2021}
}
