End-to-end Crosslingual Image Description

Course Description

In this project, you’ll create models that generate descriptions of images given two sources of context: the image itself and a description of the image in another language.

Day 1: Lecture on Introduction to Describing Images in Natural Language
Practical Session on Implementing Image Description Models in Keras

Day 2: Lecture on Visual Attention and Crosslingual Models
Practical Session on Crosslingual Image Description Models

Day 3: Other Topics in Multimodal Natural Language Processing
Practical Session on Crosslingual Image Description Models


Prerequisites

Python programming, neural networks, and some exposure to Natural Language Processing and Computer Vision.


Desmond Elliott
Postdoctoral Researcher, Institute for Logic, Language and Computation, University of Amsterdam

My main research interests are models and evaluation methods for automatic image description generation. In 2016, I co-organised the first shared task on Multimodal Machine Translation. I have also delivered tutorials on Automatic Image Description at the 1st Integrating Vision and Language Summer School, and on Multimodal Learning and Reasoning at the 54th Annual Meeting of the Association for Computational Linguistics.