Convolutional Neural Networks are used to capture low level features of each frame following that Recurrent Neural Networks are used to capture higher level relationships
Temporal Convolutional Networks is a unified approach that hierarchically captures information at low, intermediate, and high level time scales.