Home
Welcome to the hierarchical-attention-model wiki!
This repository implements the hierarchical attention networks proposed by Zichao Yang et al. (paper: Hierarchical Attention Networks for Document Classification).
The following figure, taken from their paper, shows the hierarchical attention network architecture:
It mainly consists of four parts (from bottom to top): a word sequence encoder, a word-level attention layer, a sentence encoder, and a sentence-level attention layer.
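A minimal PyTorch sketch of these four layers is shown below. Note this is an illustration, not the repository's actual code; all class and parameter names (`AttentionPool`, `HAN`, `hidden`, etc.) are hypothetical, and the attention pooling follows the formulation in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionPool(nn.Module):
    """Attention pooling over a sequence with a learned context vector
    (u_w at the word level, u_s at the sentence level)."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)         # W, b in u = tanh(Wh + b)
        self.context = nn.Parameter(torch.randn(hidden_dim))  # context vector, jointly learned

    def forward(self, h):                      # h: (batch, steps, hidden_dim)
        u = torch.tanh(self.proj(h))           # projected annotations
        alpha = F.softmax(u @ self.context, dim=1)   # attention weights over steps
        return (alpha.unsqueeze(-1) * h).sum(dim=1)  # weighted sum -> (batch, hidden_dim)

class HAN(nn.Module):
    """Word encoder -> word attention -> sentence encoder -> sentence attention -> classifier."""
    def __init__(self, vocab_size, embed_dim, hidden, num_classes):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.word_gru = nn.GRU(embed_dim, hidden, bidirectional=True, batch_first=True)
        self.word_attn = AttentionPool(2 * hidden)
        self.sent_gru = nn.GRU(2 * hidden, hidden, bidirectional=True, batch_first=True)
        self.sent_attn = AttentionPool(2 * hidden)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, docs):                   # docs: (batch, n_sents, n_words) token ids
        b, n_sents, n_words = docs.shape
        words = self.embed(docs.view(b * n_sents, n_words))
        h_words, _ = self.word_gru(words)      # word sequence encoder
        sents = self.word_attn(h_words)        # word-level attention -> sentence vectors
        h_sents, _ = self.sent_gru(sents.view(b, n_sents, -1))  # sentence encoder
        return self.classifier(self.sent_attn(h_sents))  # sentence-level attention -> class logits
```

For example, `HAN(vocab_size=30000, embed_dim=200, hidden=50, num_classes=5)` applied to a `(2, 4, 12)` batch of token ids returns a `(2, 5)` tensor of class logits.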
Since the task (document classification) does not involve sentence/document pairs the way machine translation does, there is no natural way to form inter-attention. Instead, the paper proposes an innovative intra-attention mechanism. As the figure shows, two context vectors are introduced, a word-level context vector u_w and a sentence-level context vector u_s; both are randomly initialized and learned jointly with the rest of the network, and they are used to score how informative each word and each sentence is.
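Concretely, the word-level attention in the paper scores each word annotation $h_{it}$ against the context vector $u_w$ (the sentence level is analogous, with $u_s$):

$$
u_{it} = \tanh(W_w h_{it} + b_w), \qquad
\alpha_{it} = \frac{\exp(u_{it}^{\top} u_w)}{\sum_{t} \exp(u_{it}^{\top} u_w)}, \qquad
s_i = \sum_{t} \alpha_{it} h_{it}.
$$

Because $u_w$ is learned rather than derived from a paired input, this is intra-attention: the model itself learns what an informative word looks like across the corpus.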