Similarity-Augmented Prediction Methods for Neural Machine Translation
Speaker: Julius Cheng, University of Cambridge.
In natural language, there are usually many ways to say the same thing: the answer to a question can be phrased in multiple ways, and there are many good translations of the same sentence. Training language models (LMs) with maximum likelihood estimation on large and diverse corpora therefore leads to issues such as high entropy and poorly calibrated probabilities.
There is a growing body of work that addresses this by analyzing distributions in semantic space rather than token space, measuring similarities between possible outputs. In this talk, I make the case for and present current progress on these "similarity-augmented methods", including my own work on 1) minimum Bayes risk (MBR) decoding for prediction, 2) similarity-sensitive entropy for uncertainty quantification, and 3) Bayesian optimization with Gaussian process regression for reranking. Brief illustrative sketches of the three ideas follow.
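As a concrete illustration of the first idea, here is a minimal sketch of MBR decoding: among candidates sampled from the model, it selects the one with the highest average similarity to all the others. The Jaccard token overlap used as the utility below is a hypothetical stand-in for metrics such as BLEU, chrF, or COMET, not the utility from the talk.

```python
# Minimal MBR decoding sketch: pick the candidate with the highest
# expected similarity to the other sampled candidates,
# i.e. argmax_y (1/N) * sum_{y'} u(y, y').
# The Jaccard utility is an illustrative assumption.

def jaccard(a: str, b: str) -> float:
    """Toy pairwise similarity: Jaccard overlap of token sets."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def mbr_decode(candidates: list[str]) -> str:
    """Return the candidate maximizing average utility against all candidates."""
    def expected_utility(y: str) -> float:
        return sum(jaccard(y, other) for other in candidates) / len(candidates)
    return max(candidates, key=expected_utility)

samples = [
    "the cat sat on the mat",
    "a cat sat on the mat",
    "the cat is sitting on the mat",
    "dogs chase cars",
]
print(mbr_decode(samples))  # picks a consensus-like output, not an outlier
```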
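For the second idea, one established formulation of similarity-sensitive entropy is the Leinster-Cobbold form H_Z(p) = -sum_i p_i log((Zp)_i), where Z holds pairwise similarities between outputs and Z = I recovers Shannon entropy; whether the talk uses exactly this form is an assumption, and the toy distribution and similarity matrix below are illustrative.

```python
# Similarity-sensitive entropy sketch (Leinster-Cobbold form, assumed):
# H_Z(p) = -sum_i p_i * log((Z p)_i). With Z = I this is Shannon entropy.

import math

def similarity_entropy(p: list[float], Z: list[list[float]]) -> float:
    """Entropy of p under similarity matrix Z (Z[i][i] == 1)."""
    zp = [sum(Z[i][j] * p[j] for j in range(len(p))) for i in range(len(p))]
    return -sum(pi * math.log(zi) for pi, zi in zip(p, zp) if pi > 0)

# Two near-paraphrases plus one unrelated output: probability mass split
# between similar outputs contributes less uncertainty than Shannon
# entropy reports.
p = [0.4, 0.4, 0.2]
Z_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
Z_similar = [[1.0, 0.9, 0.0], [0.9, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(similarity_entropy(p, Z_identity))  # Shannon entropy, ~1.05 nats
print(similarity_entropy(p, Z_similar))   # lower: paraphrases count as one
```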
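For the third idea, the sketch below shows a generic Bayesian optimization loop over a Gaussian process surrogate: rather than scoring every candidate with an expensive metric, it scores a small budget of candidates chosen by an upper-confidence-bound acquisition and returns the best one found. The RBF kernel, UCB acquisition, and toy features are illustrative assumptions, not the speaker's method.

```python
# Generic sketch of reranking with Bayesian optimization over a Gaussian
# process surrogate, assuming each candidate has a feature vector and the
# true quality metric is expensive. Kernel, acquisition, and data are
# illustrative choices.

import numpy as np

def rbf(A, B, ls=1.0):
    """RBF kernel matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ls ** 2))

def gp_posterior(X, y, X_star, noise=1e-4):
    """GP posterior mean and variance at X_star given observations (X, y)."""
    K_inv = np.linalg.solve(rbf(X, X) + noise * np.eye(len(X)), np.eye(len(X)))
    K_s = rbf(X_star, X)
    mu = K_s @ K_inv @ y
    var = np.ones(len(X_star)) - np.einsum("ij,jk,ik->i", K_s, K_inv, K_s)
    return mu, np.maximum(var, 1e-12)

def bo_rerank(feats, score_fn, budget=5, beta=2.0):
    """Score only `budget` candidates, each chosen by the UCB acquisition,
    and return the index of the best candidate actually scored."""
    scored = {0: score_fn(0)}  # seed with an arbitrary first evaluation
    while len(scored) < budget:
        idx = sorted(scored)
        X, y = feats[idx], np.array([scored[i] for i in idx])
        rest = [i for i in range(len(feats)) if i not in scored]
        mu, var = gp_posterior(X, y, feats[rest])
        nxt = rest[int(np.argmax(mu + beta * np.sqrt(var)))]
        scored[nxt] = score_fn(nxt)
    return max(scored, key=scored.get)

rng = np.random.default_rng(0)
feats = rng.normal(size=(20, 4))                 # stand-in candidate embeddings
true = feats @ np.array([1.0, -0.5, 0.2, 0.0])   # hidden "expensive" scores
best = bo_rerank(feats, lambda i: true[i], budget=6)
print(best, true[best], true.max())              # near-best with 6 evaluations
```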
This event is organised by the Computational Linguistics Seminar. Attend in person or online via Zoom.