Posts

Showing posts from October, 2016

10 Myths About Computer-Assisted Translation

This is a guest post by "Vova" from SmartCAT. I connected with him on Twitter and learned about the SmartCAT product, which I would describe as a TMS + CAT tool with a much more substantial collaboration framework than most translation tools I know of. It is a next-generation translation management tool that enables multi-user collaboration at a large project level, but also allows individual freelancers to use it as a simple CAT tool. It has a new and non-standard approach to pricing which I am still trying to unravel. I have talked to one LSP customer, who was very enthusiastic about his user experience and stressed the QA and productivity benefits, especially for large projects. I am still researching the product and company (which has several ex-ABBYY people, but they seem eager to develop a separate identity) and will share more as I learn more. But at first glance, this product looks very interesting and even compelling, even though they, like Lilt, have a hard time des

SYSTRAN Releases Their Pure Neural MT Technology

SYSTRAN announced earlier this week that they are doing a “first release” of their Pure Neural™ MT technology for 30 language pairs. Given how good the Korean samples that I saw were, I am curious why Korean is not one of the languages that they chose to release. "Let’s be clear, this innovative technology will not replace human translators. Nor does it produce translation which is almost indistinguishable from human translation" ... SYSTRAN BLOG. The language pairs being initially released are 18 in and out of English, specifically EN<>AR, PT-BR, NL, DE, FR, IT, RU, ZH, ES, and 12 in and out of French, FR<>AR, PT-BR, DE, IT, ES, NL. They claim these systems are the culmination of over 50,000 hours of GPU training but are very careful to say that they are still experimenting and tuning these systems and that they will adjust them as they find ways to make them better. They have also enrolled ten major customers in a beta program to validate the technology at

The Importance & Difficulty of Measuring Translation Quality

This is another timely post describing the challenges of human quality assessment by Luigi Muzii. As we saw from the recent deceptive Google NMT announcements, while there is a lot of focus on new machine learning approaches, we are still using the same quality assessment approach of yesteryear: BLEU. Not much has changed. It is well understood that this metric is flawed, but there seems to be no useful replacement coming forward. This means that some kind of human assessment also has to be made, and invariably this human review is also problematic. The best practices for these human assessments that I have seen are at Microsoft and eBay. The worst are at many LSPs and Google. The key to effective procedures seems to be the presence of invested and objective linguists on the team, and a culture that has integrity and rigor without the cumbersome and excessively detailed criteria that the "Translation Industr

Real and Honest Quality Evaluation Data on Neural Machine Translation

I just saw a Facebook discussion on the Google NMT announcements that explores some of the human evaluation issues, and thought I would add one more observation to this charade before I highlight a completely overlooked study that does provide some valuable insight into the possibilities of NMT (which I actually believe are real and substantial), even though it is done in a "small-scale" university setting. Does anybody else think it is strange that none of the press and the journalists who are gushing about the "indistinguishable from human translation" quality claimed by Google attempted to run even a single Chinese page through the new super-duper GNMT Chinese engine? Like this post, for example, where the author seems to have swallowed the Google story hook, line, and sinker. There are of course 50 more like this. It took me translating just one page to realize that we are really knee deep in bullshit, as I had difficulty getting even a gist

Feedback on the Google Neural MT Deception Post

There was an interesting discussion thread on Reddit about the Google deception post with somebody with the alias oneasasum that I thought was worth highlighting here, since it was the most coherent criticism of my original post. Google makes MASSIVE progress on Machine Translation -- "We show that our GNMT system approaches the accuracy achieved by average bilingual human translators on some of our test sets." This is a slightly cleaned up version of just our banter from the whole thread that you can see at the link above, which also has other fun comments: KV: Seriously exaggerated -- take a look at this for more accurate overview The Google Neural Machine Translation Marketing Deception ow.ly/Ii57304JV6S HE: You should have also posted this article, as you did on another Reddit forum: https://slator.com/technology/hyperbolic-experts-weigh-in-on-google-neural-translate/ That's a much better take, in my opinion. .... I saw the blog posting myself the other day.