Khalil Sima'an: a translation machine that really understands language
How do you translate the Dutch word gezellig into English? Cosy is probably the first thing that comes to mind but it doesn’t really convey the full meaning. And take the Italian word culaccino, for example, which refers to the mark that a cold glass leaves on a table. Or the Portuguese word saudade, which can best be described as a combination of melancholy and yearning.
These are words which only exist in a specific language and which, as a result, are extremely hard to translate. People who have these words in their mother tongue understand what they mean because of the experiences they have had in that language. If you are Dutch you will, for example, have come across countless situations that were described as gezellig.
‘Understanding a language is a process that is embedded in a whole host of other experiences that we as humans experience,’ says Khalil Sima'an, leader of the Statistical Language Processing and Learning lab.
The answer lies in our experiences. Sima'an gives an example. ‘Take a bilingual four-year-old, for example. They have no problem translating for their grandma even though they haven’t specifically learnt to do so. This is because the child relates language to the experiences they have had in their life and has learnt the words that relate to them in two languages.’ And that is precisely where translation machines like Google Translate still fall short.
A translation machine that really understands language
The reason Google Translate sometimes misses the mark completely is that it doesn’t really understand what it is translating. ‘Google Translate isn’t a translator, but it’s pretty good at imitating one.’ For example, at the moment Google Translate doesn’t take into account the non-textual context in which a sentence will be used. And that’s what Sima'an and his colleagues want to work on over the next few years.
Sima'an and his colleagues want the translation machines of the future not only to imitate translators but also to have the translation capabilities of a bilingual child. ‘In other words, I’m working on the basic principles for a future machine which can translate without having seen millions of examples of translated texts.’
But how do you ensure that a robot really understands what certain linguistic expressions mean? You have to expose it to as many human experiences as possible.
‘Essentially, machines have to become a bit like a person and operate as much as possible among people. Because language is an expression of who we are as people.’
You can expose machines to one form of human experience with the help of images and movies. ‘Take romantic love, for example. You can let a robot experience this by showing it a film or stage production of Shakespeare’s Romeo and Juliet. That way the robot will learn that the concept of romantic love is often associated with things like blushing, certain glances or certain expressions.’
To enable further research in this field in the years ahead, Sima'an has joined forces with his colleagues in Computer Vision.