NURS – Neural Machine Translation for Under-Resourced Scenarios
The project focuses on neural machine translation (NMT) for under-resourced scenarios, i.e. under-resourced languages or technical domains, using sequence-to-sequence models. To improve translation quality in these scenarios, the work involves curating the training data to handle code-mixing, identifying terminology, and injecting linguistic information into the neural models.
This project is co-funded by the European Regional Development Fund (ERDF) under Ireland’s European Structural and Investment Fund Programmes 2014-2020.
- D1.1 Data discovery
- D1.2 Data Curation
- D1.3 Reutilisation
- D1.4 Synthetic Data
- Deliverables D2 and D3
- D2.1 Linguistic input feature
- D2.2 Named Entity identification
- D3.1 Terminology identification for NMT
- D3.2 Terminology injection into NMT
- D4.1 Multi-way models for under-resourced languages