TransQuest: Translation Quality Estimation with Cross-lingual Transformers
The goal of quality estimation (QE) is to evaluate the quality of a translation without having access to a reference translation. High-accuracy QE that can be easily deployed for a number of language pairs is the missing piece in many commercial translation workflows, where it has numerous potential uses. It can be employed to select the best translation when several translation engines are available, or to inform the end user about the reliability of automatically translated content. In addition, QE systems can be used to decide whether a translation can be published as it is in a given context, whether it requires human post-editing before publishing, or whether it should be translated from scratch by a human. Quality estimation can be done at different levels: document level, sentence level and word level.
With TransQuest, we have open-sourced our research in translation quality estimation, which also won the sentence-level direct assessment quality estimation shared task at WMT 2020. TransQuest outperforms current open-source quality estimation frameworks such as OpenKiwi and DeepQuest.
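The two uses described above, picking the best output among several engines and deciding between publishing, post-editing and retranslating, can be sketched in a few lines. This is a minimal illustration and not TransQuest's API: `select_best`, `triage`, `toy_score` and the thresholds `publish_at` and `post_edit_at` are hypothetical names and values chosen for the example, and `toy_score` is only a stand-in for a real sentence-level QE model.

```python
from typing import Callable, Sequence, Tuple

# A scorer maps (source, translation) to an estimated quality; higher is better.
Scorer = Callable[[str, str], float]


def select_best(source: str, candidates: Sequence[str], score: Scorer) -> Tuple[str, float]:
    """Return the candidate translation with the highest estimated quality."""
    if not candidates:
        raise ValueError("need at least one candidate translation")
    scored = [(candidate, score(source, candidate)) for candidate in candidates]
    return max(scored, key=lambda pair: pair[1])


def triage(quality: float, publish_at: float = 0.8, post_edit_at: float = 0.4) -> str:
    """Map a quality score to an action. The thresholds are illustrative assumptions."""
    if quality >= publish_at:
        return "publish"
    if quality >= post_edit_at:
        return "post-edit"
    return "retranslate"


def toy_score(source: str, translation: str) -> float:
    """Hypothetical stand-in for a QE model: character-length ratio in [0, 1]."""
    longer = max(len(source), len(translation))
    return min(len(source), len(translation)) / longer if longer else 0.0


best, quality = select_best("abcd", ["ab", "abcd", "a"], toy_score)
action = triage(quality)
```

In practice the scorer would wrap a trained QE model, and the thresholds would be calibrated per language pair and per publishing context.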
- Sentence-level translation quality estimation for both aspects: predicting post-editing effort and direct assessment.
- Word-level translation quality estimation capable of predicting the quality of source words, target words and target gaps.
- Performs significantly better than current state-of-the-art quality estimation methods such as DeepQuest and OpenKiwi in all the languages tested.
- Pre-trained quality estimation models for fifteen language pairs.
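To make the word-level output concrete: a target sentence with N tokens has N + 1 gaps (before the first token, between tokens, and after the last), so word and gap predictions together form 2N + 1 labels. The sketch below assumes the interleaved gap/word layout used in WMT word-level QE data, with `OK`/`BAD` labels; whether a given TransQuest model returns its predictions in exactly this layout is an assumption, and `split_target_tags` is a hypothetical helper, not part of the library.

```python
from typing import List, Sequence, Tuple


def split_target_tags(
    target_tokens: Sequence[str], tags: Sequence[str]
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split an interleaved tag sequence (gap, word, gap, ..., word, gap).

    Returns (token, tag) pairs for the words and a separate list of gap tags.
    """
    n = len(target_tokens)
    if len(tags) != 2 * n + 1:
        raise ValueError(f"expected {2 * n + 1} tags for {n} tokens, got {len(tags)}")
    gap_tags = list(tags[0::2])   # even positions: the N + 1 gaps
    word_tags = list(tags[1::2])  # odd positions: the N target words
    return list(zip(target_tokens, word_tags)), gap_tags


words, gaps = split_target_tags(
    ["das", "Haus"],
    ["OK", "BAD", "OK", "OK", "BAD"],
)
```

Here a `BAD` word tag marks a token judged to be wrongly translated, and a `BAD` gap tag marks a position where one or more words appear to be missing.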
Table of Contents
- Installation - Install TransQuest locally using pip.
- Architectures - Check out the architectures implemented in TransQuest.
- Examples - We have provided several examples of how to use TransQuest in recent WMT quality estimation shared tasks.
- Pre-trained Models - We have provided pre-trained quality estimation models for fifteen language pairs, covering both sentence-level and word-level quality estimation.
- Contact - Contact us about any issues with TransQuest.
- COLING presentation given in December 2020.
- Research seminar given on 1 October 2020 at RGCL, and the accompanying slides.
If you are using the word-level architecture, please consider citing this paper, which was accepted to ACL 2021.