Posts

Showing posts from January, 2017

The Driving Forces Behind MT Technology

This is a modified version of a post that was originally published on Caroline Alberoni's blog.

Machine translation (MT) today is as pervasive and ubiquitous as mobile phone technology. While some translators still feel threatened by the technology or feel the need to disparage it for its less-than-perfect translations, it is useful to understand why it is so widely used. At its annual developer conference in April 2016, Google announced that it was translating over 140 billion words a day across 100 languages. Baidu Translate supports 27 languages, a number that is growing, and processes around 100 million requests every day. Most of this use comes from casual internet users who may be interested in translating a news story or some simple phrases. However, there is a growing impact on the professional translation business as well. If you add the translation volume of Microsoft, Baidu, Yandex and other MT providers, we can certainly expect that more th

An Examination of the Strengths and Weaknesses of Neural Machine Translation

As Neural MT gains momentum, we see more studies that explain why it is being seen as a major step forward, and we are now beginning to understand some of the very specific reasons for this momentum. This summary by Antonio Toral and Víctor M. Sánchez-Cartagena highlights how NMT provides some specific advantages, using well-understood data and comparative systems. Their main findings are presented below, but I saw an additional comment in the paper that I am also including here. The paper also provides BLEU scores for all the systems that were used, which are consistent with the scores shown here, and it is interesting that Russian is a language where rule-based systems still produce the highest scores in tests like this. The fact that NMT systems perform so well on translations going out of English should be especially interesting to the localization industry. Now we need some evidence of how NMT systems can be domain-adapted, and SYSTRAN will soon provide some details. The fact

Finding the Needle in the Digital Multilingual Haystack

There are some kinds of translation applications where MT just makes sense. Usually, this is because these applications have some combination of the following factors:

- Very large volume of source content that could NOT be translated without MT
- Rapid turnaround requirement (days)
- Tolerance for lower-quality translations, at least in the early stages of information review
- A need for triage, to help identify the highest-priority content in a large mass of undifferentiated content
- Prohibitive cost (usually related to volume)

This is a guest post by Pete Afrasiabi of iQwest Information Technologies that goes into some detail on the strategies employed to use MT effectively in a business application area that is sometimes called eDiscovery (often litigation related), but in a broader sense could be any application where it is useful to sort through a large amount of multilingual content to find high-value content. In today's world, we are seeing a lot more litigation invo