The TAUS EPIC API was first released in October 2022. Since then, users of the EPIC API have reported savings of between 25% and 60% on human post-editing effort and cost, and say it helps mitigate the risk of bad translation output for high-volume users in an MT-only setting. A generic model is available off the shelf, and via the API it can be easily integrated into existing platforms and content and translation workflows. This generic model has now undergone comprehensive further training for the release of V2 of the TAUS EPIC API.
The TAUS Data repository (consisting of 7+ billion words in about 600 language pairs and domains) has been instrumental in training the TAUS Estimate API models. To improve the generic model, the NLP team at TAUS has pulled millions of sentences from the repository in the IT, Healthcare, Commerce, Legal and Business domains for English into French, Italian, German and Spanish and curated high-quality training sets. After extensive training and analysis, the team is happy to report considerable improvements for these languages.
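So what improvements can you expect when you start using V2 of the TAUS EPIC API? Below are some of the important updates:
Version 2.0 catches mistranslations in less obvious cases. Here is a case of polysemy: English "table" maps to different Spanish words depending on context ("tabla" for a data table, "mesa" for a piece of furniture), and the translation that picks the wrong sense receives a markedly lower score.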
| Source | Target | Score |
| --- | --- | --- |
| To add a cell to a table row, you use the <td> tag. | Para agregar una celda a una fila de la tabla, usas la etiqueta <td>. | 0.92 |
| To add a cell to a table row, you use the <td> tag. | Para agregar una celda a una fila de la mesa, usas la etiqueta <td>. | 0.67 |
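Quality estimation often fails to react to errors of degree. Version 2.0 picks up errors that are subtle relative to sentence length and word type: in the example below, rendering "medium-sized businesses" as "grandes entreprises" (large businesses) lowers the score.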
| Source | Target | Score |
| --- | --- | --- |
| The Thundercloud Solution for E-Mail includes every option necessary for e-mail storage management designed for medium-sized businesses. | Thundercloud Solution for E-Mail comprend toutes les options nécessaires à la gestion du stockage des e-mails et est conçu pour les grandes entreprises. | 0.82 |
| The Thundercloud Solution for E-Mail includes every option necessary for e-mail storage management designed for medium-sized businesses. | Thundercloud Solution for E-Mail comprend toutes les options nécessaires à la gestion du stockage des e-mails et est conçu pour les moyennes entreprises. | 0.91 |
Close-enough translations often go unnoticed by many types of quality estimation, which are also generally known to struggle with short sentences. Version 2.0 makes a clear distinction between 'software' and 'operating system'.
| Source | Target | Score |
| --- | --- | --- |
| hardware and software requirements | Hardware- und Softwareanforderungen | 0.94 |
| hardware and software requirements | Hardware- und Betriebssystemanforderungen | 0.85 |
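To make the score differences in the three examples concrete, here is a minimal sketch of how scored segments could be routed to human review. It uses only the scores quoted in the tables above. The `ScoredSegment` class, the `needs_review` helper and the 0.90 cut-off are hypothetical names and values chosen for illustration; they are not part of the TAUS EPIC API and not a TAUS recommendation.

```python
from dataclasses import dataclass

# Hypothetical cut-off, chosen only so the example is concrete.
# It is an assumption, not a threshold published by TAUS.
REVIEW_THRESHOLD = 0.90


@dataclass(frozen=True)
class ScoredSegment:
    """A translated segment with its quality estimation score."""
    target: str
    score: float


def needs_review(segment: ScoredSegment, threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True when the score falls below the chosen cut-off."""
    return segment.score < threshold


# Targets and scores copied from the three example tables.
EXAMPLES = [
    ScoredSegment("Para agregar una celda a una fila de la tabla, usas la etiqueta <td>.", 0.92),
    ScoredSegment("Para agregar una celda a una fila de la mesa, usas la etiqueta <td>.", 0.67),
    ScoredSegment("... et est conçu pour les grandes entreprises.", 0.82),
    ScoredSegment("... et est conçu pour les moyennes entreprises.", 0.91),
    ScoredSegment("Hardware- und Softwareanforderungen", 0.94),
    ScoredSegment("Hardware- und Betriebssystemanforderungen", 0.85),
]

# Segments that would be sent to a post-editor under this cut-off.
flagged = [segment for segment in EXAMPLES if needs_review(segment)]

for segment in flagged:
    print(f"{segment.score:.2f}  {segment.target}")
```

With this particular cut-off, the three mistranslated targets are flagged and the three correct ones pass, but a different cut-off may suit a different use case.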
Quality standards and expectations are of course subjective, so it is up to you and your use case to decide where to draw the line between good and bad quality. However, here are some guidelines from the NLP team to interpret the scores and make decisions when using V2 of the generic model:
Version 2 of the TAUS EPIC API was released on 8 July. To switch to V2, please follow the instructions here.
Interested in trying out the improved generic model? Sign up for a free trial and get access to 500,000 characters in Sandbox mode.