Neural machine translation for low-resource conditions
OA Version
Citation
Abstract
Neural Machine Translation (NMT) has seen significant advances in recent years, and many efforts have succeeded in creating efficient and trustworthy NMT models that perform remarkably well. Yet issues such as the lack of monolingual or parallel data for certain languages and language pairs, together with constraints on compute resources, call for further analysis of the NMT pipeline, to understand model behavior and how different methods affect NMT results, and for a focus on the development of bilingual and multilingual models and data augmentation techniques. Our research aims to enhance the performance of NMT models in low-resource conditions by unifying multiple strategies to address these challenges comprehensively. To this end, we present a series of approaches that attempt to improve NMT results in a wide range of low-resource scenarios:
1. First, we develop a low-resource NMT pipeline that leverages code-switching and comparable data extraction. Using unsupervised, semi-supervised, and supervised training methods, we substantially improve translations for under-represented languages such as Gujarati, Somali, and Kazakh when paired with English.
2. Building on these technical advances, we conduct an empirical analysis focused on French and Gujarati translation to and from English. This investigation not only benchmarks the performance of unsupervised and supervised NMT models but also examines model behavior, output quality, and robustness.
3. The insights gained previously inform our third approach, in which we introduce an explainability-based method specifically tailored to low-resource NMT settings.
4. Finally, we extend the low-resource paradigm from a bilingual to a multilingual setup, using a Transformer-based multilingual model inspired by conditional computation, namely a Task-level Mixture of Experts model, to boost results in direct (non-English) NMT across a large number of language pairs.
Our work provides a valuable understanding of NMT and lays the groundwork for extending the proposed methods to other languages and low-resource conditions.
Description
2023
License
Attribution 4.0 International