Description
Attention mechanisms have become a de facto standard in sequence-based tasks such as image captioning (image to sequence of words), sentiment classification (sequence of words to label), and machine translation (sequence to sequence). One benefit of attention mechanisms is that they can handle variable-sized inputs by focusing on the most relevant parts of the input when making decisions. In this presentation, we will introduce attention in the context of natural language processing, and then move toward a broader range of application domains where attention-based models have been leveraged to address the shortcomings of prior methods. In particular, we will focus on improving graph convolutional networks with (self-)attention, and on set prediction.
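To make the variable-input-size claim concrete, below is a minimal sketch of scaled dot-product self-attention in NumPy. It is an illustration only, not the speakers' model: for simplicity the queries, keys, and values are all the input itself (no learned projection matrices), and the function name and shapes are my own choices.

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over a variable-length input.

    X: (n, d) array of n token embeddings of dimension d.
    Returns an (n, d) array where each output row is a weighted
    average of all input rows, weighted by pairwise similarity.
    Q = K = V = X here for simplicity (no learned projections).
    """
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)  # (n, n) pairwise similarities
    # Row-wise softmax: each row of weights sums to 1.
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w = w / w.sum(axis=1, keepdims=True)
    return w @ X  # (n, d) attended outputs

# The same function handles any sequence length n unchanged:
out3 = self_attention(np.random.randn(3, 4))
out7 = self_attention(np.random.randn(7, 4))
print(out3.shape, out7.shape)  # (3, 4) (7, 4)
```

Because the attention weights are computed from the input itself, nothing in the function depends on a fixed sequence length, which is exactly the property the abstract highlights.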