SYSTRAN’s Continuing Neural MT Evolution

Recently, I was kindly invited to attend the SYSTRAN community day event, where many members of their product development, marketing, and management teams gathered with major customers and partners.

The objective was to share information about the continuing evolution of their new Pure Neural MT (PNMT) technology, present detailed PNMT output quality evaluation results, and provide initial customer user experience data with the new technology. Naturally, such an event also creates a more active and intense dialogue between company employees, customers, and partners. This, I think, has substantial value for a company that seeks to align product offerings with its customers' actual needs.

Ongoing Enhancements of the PNMT Product Offering

The event made it clear that SYSTRAN is well down the NMT path, possibly years ahead of other MT vendors, and provided a review of the current status of their rapidly evolving PNMT technology.

Some highlights from my perspective:

Alex Rush from the Harvard NLP Department made an enthusiastic and valiant attempt to explain how NMT works, but I think he lost most of the audience by the time he got to how LM/Softmax works and how dense features enable discrete predictions. Not surprisingly, the crowd was pretty glassy-eyed and stupefied by the time he got to attention mechanisms, LSTMs, and the thousands of hidden dimensions underlying each machine translation. However, for those who seek these kinds of details, Harvard and SYSTRAN are making much of this information available as open source via the OpenNMT project at Harvard University.

Jean Senellart, SYSTRAN CEO and CTO, explained that they had already moved to second-generation production versions of the PNMT engines that were released in October, and that they continue to see meaningful improvements in output quality, even though they are not using Internet-scale training data volumes.

He also announced that 15 new language pairs have been completed, for a total of more than 45 language pairs. These new languages include French <> Chinese, English <> Russian, English <> Hungarian, and English <> Hebrew, among others.

He pointed out how they had overcome one of the major hurdles of deploying NMT technology: the real-time translation throughput issue. SYSTRAN has found a way to deliver a production PNMT engine that runs only slightly less than 20% slower than their current hybrid (PB-SMT) systems.

Expert Human Evaluation Of PNMT Output

There was a very interesting presentation by Heidi Depraetere of CrossLang, who is running a comprehensive human and automatic evaluation of the output of all the PNMT systems. She is, in my opinion, one of the few who can do a meaningful and accurate evaluation that will stand up to serious scrutiny and provide true insight into MT output quality from a professional translator's perspective. Her personal enthusiasm about the PNMT technology was quite revealing for me, as she has been around and has an “MT Reality Meter” that I trust. Among other things, Heidi reported that:

There is a global improvement on all standard evaluation metrics like BLEU, TER, RIBES for a variety of languages with some examples shown below.

For English to French, evaluators clearly preferred SYSTRAN NMT when compared with the output from online web-scale SMT engines (Google, Bing).
She presented evidence of a strong preference among evaluators for PNMT output from an IT domain-adapted EN > KO engine. When comparing PNMT and professional translation:
- For 38% of the sentences, PNMT is preferred
- For 44% of the sentences, Human Translation is preferred
- For 18% of the sentences, the PNMT translation is judged equivalent to Human Translation

This means that 1 in 2 sentences produced by the PNMT engine was either preferred or seen to be as good as professional human translation.
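For anyone who wants to check the arithmetic behind that "1 in 2" statement, here is a minimal Python sketch. It uses only the two percentages quoted above (PNMT preferred, and PNMT judged equivalent); the variable and function names are my own and do not come from the CrossLang evaluation.

```python
from fractions import Fraction

# Percent of sentences, as quoted above for the IT-domain EN > KO comparison.
pnmt_preferred = 38
judged_equivalent = 18


def at_least_as_good_share(preferred: int, equivalent: int) -> int:
    """Percent of sentences where PNMT was preferred or judged equal to human."""
    return preferred + equivalent


share = at_least_as_good_share(pnmt_preferred, judged_equivalent)

print(f"PNMT preferred or equivalent: {share}% of sentences")
print(f"As a fraction of all sentences: {Fraction(share, 100)}")
print(f"At least 1 in 2 sentences: {share >= 50}")
```

The combined share comes out a little above one half, which is what the "1 in 2" summary is rounding to.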

She also provided some very interesting data on the types of errors by language, which is too detailed to go into here, but was quite revealing of data bias and other data-related problems. Given the scale of the evaluation, more tests are still underway, and SYSTRAN will share this information as it becomes available.

Customer User Experience Findings

Several customers also presented their user experience, with several LSPs vouching for the PNMT improvements over previous systems, e.g. see the Lionbridge comments below, or here is a gushing report from Lexcelera. The most interesting use case study (for me) was presented by the travel guide publisher Petit Futé, who can now publish heavily personalized, one-of-a-kind travel guides on demand. These guides may draw data from several source languages and deliver it in a selected target language, at a customer-accepted quality level, driven by PNMT. This is something they call Augmented Tourist 2.0, which allows a customer to create a unique travel plan and then print a custom travel guide book with specific information just for that trip, aggregating both user reviews and generic tourist information. This is an approach that could also be used by other kinds of popular specialized domain publishers, such as those producing Romance Novellas, Hatha Yoga Manuals, Multi User Online Gaming Guides, etc.

Near Term Improvements Coming in 1H2017

Jean later showed a brief demo of how PNMT also has some of the Interactive MT/AutoSuggest capabilities that competitive products have, which he called Predictive Translation. He also described additional capabilities to handle unknown words, which are a major concern for many in their initial explorations of Neural MT.

In the first quarter of 2017, the SYSTRAN PNMT engines will incorporate the broad complementary infrastructure functionality that is available to all SYSTRAN MT engines. Customers will benefit from the full power of this new engine in their current solution platform with all the same functionality, such as processing many file formats, customization with user dictionaries and translation memories, real-time translation, named-entity recognition, and integration of the engine into the Microsoft Office tools.

SYSTRAN will also work to transfer the compute-intensive cloud PNMT system onto a somewhat scaled-down translation server, to enable on-premise server installation and even mobile phone implementations, hopefully without compromising translation quality too much.

Domain Adaptation and Specialization

One of the major criticisms of NMT is that it is not ready for professional business use because it cannot be customized, or domain-adapted, for each enterprise customer like the most successful PB-SMT systems are today. Critics say that NMT is only a technology for generic system use. The training process is so expensive and slow that it is not feasible to use NMT for enterprise systems today, say the critics. However, SYSTRAN plans to do exactly that over the next few months, and has already begun beta testing as the EN>KO IT system tests described above suggest. NMT offers new approaches to customization that can be quick and not require the slow and expensive initial training period. This is not PB-SMT and new things are possible.

With its new PNMT engine, SYSTRAN can optimize neural networks in a new post-training process called “specialization”. Think of this as fast engine customization for unique customer or project needs, almost like Adaptive MT. This method significantly improves the quality of translation in record time.

Jean Senellart, SYSTRAN CEO, explains: “Adaptation of translation to a specific domain such as legal, marketing, technical, pharmaceutical, is an absolute necessity for global companies and organizations. Offering professionals specialized translation solutions in their trade terminology has been part of SYSTRAN's DNA for many years. This new generation of neural engines offers new capabilities in domain adaptation. PNMT is able to adapt a generic model to new data and even to each translator. Generic NMT is undoubtedly a great improvement in translation technology, but “Specialized NMT” is the technology that will truly help companies meet their global challenges.”

Impact on MT Market Dynamics

The growing feedback on the significantly better Google NMT engines also suggests that a change is coming. While SYSTRAN is easily 12 to 18 months ahead of most other MT vendors that have any possibility of delivering a market-ready Neural MT product, it is now becoming increasingly clear that the whole Expert MT Vendor market will start to move towards Neural MT. But the shift to NMT is expensive and complex, and most other MT vendors do not have the funding, manpower, and expertise to jump into NMT implementation easily. Having an alliance with an academic partner and funding an occasional experiment is not enough.

Open source will ease the hurdles and interestingly SYSTRAN is aiding this through the Harvard OpenNMT project, but the funding needs to properly undertake NMT development will keep this a game only for the big boys and the really smart small ones. Hopefully we do not see the same DIY foolishness we saw with SMT and Moses, as this is not a proper realm for LSPs to play in, even for very large ones. In my opinion the only one who could do this competently is SDL. Ask any one of the real NMT experts to explain how NMT works, if you have any doubt about my skepticism.

While I still believe that Adaptive MT will produce the highest quality MT output in the near term (2017), I think the Adaptive MT developers too are heading towards Neural MT. Unless SYSTRAN can surprise us by providing the market with a rapid, robust, high-quality specialization capability, I would still expect that a properly engaged Adaptive MT system will produce the best professional MT engines in terms of output quality and translation productivity in the near term. 2017 will probably be the last year that phrase-based SMT systems dominate in the professional and enterprise use arena. The Google, Microsoft, FB, Naver, and Baidu evidence is clear: NMT is the way forward for generic systems, with improvements that are good enough to justify huge increases in deployment hardware costs. However, if SYSTRAN solves the domain adaptation and specialization challenge, and the large vocabulary issues, at enterprise scale, sooner rather than later, I think we will see a more rapid transition even in the smaller but usually higher-quality professional translation MT world.

These are exciting times again for the MT world, as once again we are seeing people take big strides forward. SYSTRAN also briefly presented some new product concepts that look interesting, and I hope to cover them in more detail in the future. BTW, I am also talking to SDL about their Adaptive MT technology and will report on that shortly.

I wish you all a happy, healthy and joyous holiday season. And a prosperous and happy new year.

TechGist05