Dialog | Wadhwani School of Data Science and Artificial Intelligence

Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining

There is an increasing focus on model-based dialog evaluation metrics such as ADEM, RUBER, and the more recent BERT-based metrics. These models aim to assign a high score to all relevant responses and a low score to all …

Tags: NLP, dialog, pretraining, BERT