Dilated, Residual, Gated CNN

This is a PyTorch implementation of the network presented in Chang et al., "Temporal Modeling Using Dilated Convolution and Gating for Voice-Activity-Detection" (2018).

In the paper, the network is used for voice activity detection (VAD).

Network Architecture

The core network architecture is shown in the drawing below.

[Figure: Architecture]

The original paper does not state how the dimension matching is done, or how the feature maps are flattened before the fully connected layer at the end of the network. In this implementation, dimension matching uses simple 2D convolutions, and two consecutive 1x1 convolutions are applied before flattening into the fully connected layer.
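
The two choices described above can be illustrated with a minimal sketch. It is not the code from this repository. The class names (`GatedResidualBlock`, `Head`), the tanh/sigmoid gate, the 1x1 kernel for dimension matching, the ReLU between the two 1x1 convolutions, and all channel counts and input sizes are assumptions made for the example.

```python
import torch
import torch.nn as nn


class GatedResidualBlock(nn.Module):
    """Hypothetical dilated, gated 2D conv block with a residual connection."""

    def __init__(self, in_ch, out_ch, kernel_size=3, dilation=1):
        super().__init__()
        # "Same" padding for an odd kernel, so the residual sum lines up.
        pad = dilation * (kernel_size - 1) // 2
        self.filter_conv = nn.Conv2d(in_ch, out_ch, kernel_size,
                                     padding=pad, dilation=dilation)
        self.gate_conv = nn.Conv2d(in_ch, out_ch, kernel_size,
                                   padding=pad, dilation=dilation)
        # Dimension matching (assumed 1x1 kernel): only needed when the
        # channel count changes between input and output.
        if in_ch != out_ch:
            self.match = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        else:
            self.match = nn.Identity()

    def forward(self, x):
        # Assumed gate: tanh "filter" branch scaled by a sigmoid "gate" branch.
        gated = torch.tanh(self.filter_conv(x)) * torch.sigmoid(self.gate_conv(x))
        return gated + self.match(x)


class Head(nn.Module):
    """Hypothetical head: two 1x1 convs, flatten, then a linear layer."""

    def __init__(self, in_ch, mid_ch, n_freq, n_frames, n_classes=2):
        super().__init__()
        self.reduce = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size=1),
            nn.ReLU(),
            nn.Conv2d(mid_ch, 1, kernel_size=1),  # collapse to one channel
        )
        self.fc = nn.Linear(n_freq * n_frames, n_classes)

    def forward(self, x):
        x = self.reduce(x)       # (batch, 1, n_freq, n_frames)
        x = torch.flatten(x, 1)  # (batch, n_freq * n_frames)
        return self.fc(x)


# Illustrative sizes only: 40 frequency bins, 20 frames, batch of 8.
model = nn.Sequential(
    GatedResidualBlock(1, 16, dilation=1),   # channels change: 1x1 match used
    GatedResidualBlock(16, 16, dilation=2),  # channels equal: identity skip
    Head(16, 4, n_freq=40, n_frames=20),
)
x = torch.randn(8, 1, 40, 20)
out = model(x)
print(out.shape)  # torch.Size([8, 2])
```

Reducing the channels with 1x1 convolutions before flattening keeps the fully connected layer small: in this sketch it sees 40 * 20 = 800 inputs instead of 16 * 40 * 20 = 12800.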