On the Interpretability of Attention Networks
Attention mechanisms form a core component of several successful deep learning architectures, and are based on one key idea: “The output depends only on a small (but unknown) segment of the input.” In several practical …
