The Continuing Saga & Evolution of Machine Translation

I recently attended the 7th IMTT Conference in Cordoba, Argentina. I especially enjoy the IMTT events because somehow they have found a content formula that works for both translators and LSPs. You get to see the translation supply chain communicate in real time. The overall culture of their events is also usually very collaborative, and to my mind they are the place to see the most open and constructive dialogue between translators and agencies. Some may not be aware that Argentina has a particularly strong concentration of skilled people who understand the mechanics of localization (especially in FIGS and BrPt), and many of the agencies, even small ones, are able to work with pretty much every TM tool and TMS (Translation Management System) on the market with more than a basic level of competence. Because of historical decisions made by @RenatoBeninatto many years ago and a great university system, Argentina has become a place with a comprehensive and sizable professional translation ecosystem.

There were a few presentations that I found especially interesting, including a plenary presentation by Suzanne de Santamarina on the use of quality metrics. You can see some of the Twitter trail here and here, but basically Suzanne described her very active use of J2450 measurements to improve the dialogue on quality with customers and with her translators. While there clearly is effort and expense involved in implementing this as actively as she has, I think it dramatically improves the conversation regarding translation quality between all the participants, because it is specific, impersonal and clear about what quality means. It is also a means to build what she called “customer delight”, which of course also includes a major service component.

Quality in a product or service is not what the supplier puts in. It is what the customer gets out (of the product/service) and is willing to pay for. A product is not quality because it is hard to make and costs a lot of money, as manufacturers typically believe. This is incompetence. Customers pay only for what is of use to them and gives them value. Nothing else constitutes quality…
~ Peter Drucker

Asia Online makes a software tool available to enable customers to calculate J2450 scores for this very reason. It helps to move the discussion from unactionable complaints like “I don’t like the quality” or “the quality is not good” to practical error identification and resolution steps like “Is there a way to reduce the frequency of wrong terminology errors in the system?” Just as proper use of BLEU scores requires care and some expertise, so does the use of J2450. Suzanne’s company uses J2450 so regularly and in such a structured way that they can assess linguistic quality from project to project with a precision that few have. Her approach is refreshing in its clarity and precision, and quite a contrast from the meandering, inconclusive discussions on “quality” that you see on LinkedIn. Tools like BLEU and J2450 depend on the skill level of the user, and require an investment of time, effort and repeated use to develop real competence before one understands the insights that they can provide.
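To make the idea concrete, here is a minimal sketch of how a J2450-style score can be computed. This is not the Asia Online tool or Suzanne's process. The function names are hypothetical, and the category weights are the ones commonly cited for SAE J2450, which I am assuming here, so check them against the standard before relying on them. The score is simply the sum of the weighted errors divided by the number of source words, so a lower score is better.

```python
from collections import Counter

# (serious, minor) weights per error category.
# ASSUMPTION: these follow the commonly cited SAE J2450 table;
# verify against the standard before using them in production.
WEIGHTS = {
    "wrong_term": (5, 2),
    "syntactic_error": (4, 2),
    "omission": (4, 2),
    "word_structure_or_agreement": (4, 2),
    "misspelling": (3, 1),
    "punctuation": (2, 1),
    "miscellaneous": (3, 1),
}


def j2450_score(errors, source_word_count):
    """Return the weighted error total divided by the source word count.

    `errors` is a list of (category, severity) tuples, where severity
    is either "serious" or "minor".
    """
    if source_word_count <= 0:
        raise ValueError("source_word_count must be positive")
    total = 0
    for category, severity in errors:
        if severity not in ("serious", "minor"):
            raise ValueError(f"unknown severity: {severity}")
        serious, minor = WEIGHTS[category]
        total += serious if severity == "serious" else minor
    return total / source_word_count


def errors_by_category(errors):
    """Count errors per category, which is what makes feedback actionable."""
    return Counter(category for category, _ in errors)


if __name__ == "__main__":
    # Hypothetical review of a 400-word sample.
    sample = [
        ("wrong_term", "serious"),
        ("wrong_term", "minor"),
        ("punctuation", "minor"),
    ]
    print(j2450_score(sample, 400))
    print(errors_by_category(sample).most_common(1))
```

The per-category count is the useful part for the conversation described above: it turns "the quality is not good" into "two of the three errors are terminology errors".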

I also enjoyed the presentations by master translators like João Roque Dias and Danilo Nogueira on the craft and art of translation, and enjoyed talking to them about MT and the life of the translator in general. (Yes, MT is sometimes useful even for some of them.) There were several skill-focused presentations on language QA tools, CAT and collaborative tools that were also very interesting. I heard great things about Val Ivonica’s presentation (in Brazilian Portuguese) on translation productivity tools, which I was unable to attend as it coincided with mine. It is interesting that Patricia Bown positioned MemoQ as collaboration software that enables the linear TEP model to evolve, enabling faster turnaround and higher volume. There were many Brazilians present (though some said not enough) and they lived up to their reputation for revelry, but unfortunately were thwarted in their (our) attempts to find a karaoke place one evening. Nevertheless, they shared their linguistically oriented humor with me, and I had no difficulty finding a willing interpreter even though I was often the only person who did not speak the language.

I delivered a presentation on the emerging role of MT as a means to deal with the translation challenges created by the content explosion and new kinds of dynamic product and business related content. The feedback I received was mostly positive and constructive, even though there were several very skeptical translators in the crowd. Some in the audience had already experienced MT that works, and even those who had not worked with customized systems admitted that MT was sometimes useful. I was also on a panel on “The Future of the Industry”, which got mixed reviews: some translators felt it was not relevant and others felt it was a tired topic that nobody had any real clarity on. Many feel change is coming but are not clear what this really means, and unfortunately for many the end result of these changes is that customers expect more work for less money. This does mean that there is a certain amount of apprehension amongst the attendees, as the future is not quite predictable.

A blog entry by Emily Stewart, written a few days later, pondered the theme of technology-driven change at the conference and triggered an interesting and ongoing discussion on LinkedIn. Her post, which is about the advent of technology in a variety of different markets, is thoughtful and worth reading. I also think her conclusion (shown below) is good advice for us all.

Instead of denouncing machine translation as the end of the translation world as we know it, it may be time to take a step back and see what happens. The discussion shouldn’t stop, but perhaps it could become less polemic and instead convert into a deeper conversation on and reflection of what may or may not lie ahead.

While initially there is a lot of focus on the perceived threat (there are some who think that I, together with other overzealous MT developers, am responsible for some of this fear and FUD), I am hoping that the dialogue moves beyond this point. Some MT systems have indeed been used to push rates down unfairly, but as we all begin to better understand these early mishaps, this can and must change. As George W. Bush was trying to say when he misspoke: "Fool me once, shame on you; fool me twice, shame on me." (I hope you click on the Bush video, it is toooo funny.) If it becomes clearer to everybody what it actually takes to “finish off” MT output to required target quality levels, this kind of abuse cannot continue. We need better quality assessment so that this gap can be more clearly defined.

Not all MT systems are equal, and a global post-editing pricing policy is guaranteed to create disenchantment. We all need to better understand where to use MT and where to avoid it. MT cannot easily, if ever, replace humans on the same projects that were previously done through a careful human TEP process. If the quality expectations are high, it has to be MT and human. MT makes most sense where there is ongoing volume and information volatility. We also need to better understand how to quickly assess the output quality of different MT systems so that post-editors are compensated fairly. The best MT systems are yet to come, and they will be better because they are the product of informed linguistic steering in addition to standard data and MT techniques. We have yet to see useful compensation systems develop for these linguists, and this will probably be needed before some of the uneasiness dissipates. But the forces driving this expanding need are strong, and hopefully we will recognize the value of these key individuals at some point in the future. This is already true at Asia Online, so I imagine it can be done elsewhere.

In terms of disintermediation, I think MT will be only part of the whole picture, as we see more people learn to use motivated communities to get work done. Adobe and others have learnt to use “the crowd” to get traditional localization work done using translation platforms like Lingotek and newcomer Smartling (which might also have obtained the biggest startup investment made by a VC in the translation industry). Much of the coming change will also come from collaboration software platforms like Lingotek, Smartling and others yet to come, which change how translation projects get done in terms of process flow and have a different modus operandi from traditional localization tools born in the TEP world. Translators are required to spend too much time working with data in different formats and too little time on the actual act of translation. New collaboration platforms and real data interchange standards will hopefully enable translators to focus mostly on real linguistic problem solving, and not on managing archaic and arcane format interchange issues.

From my vantage point, I see that:
1) Translation is increasingly done outside the sphere of the localization world, and community-based translation initiatives around the world are gathering momentum in both the non-profit and corporate worlds.
2) The volume of translation that can help drive international business initiatives forward is increasing at a substantial rate (5X to perhaps as much as 100X). Interestingly, there are still some who think that this content explosion is a myth.
3) Social network conversations matter and are often more important to translate than having user documentation that is "perfect" and “error-free”. Company-to-customer communications have also become much more interactive, real-time and urgent, and go way beyond the scope of most user documentation.

Thus, to approach every translation task with the TEP mindset that made great sense in the 1X or 2X volume days is not useful today. New approaches and new models of automation and collaboration are necessary, and are perhaps the only way that all the changing momentum can be handled effectively. MT is simply one part of the equation and is far from being the whole solution. The need to solve this overall translation challenge is linked to the customer’s business survival, so it gains a kind of momentum of inevitability. Businesses need to translate way more content to remain competitive in global marketplaces that move at internet speeds, so automation and better collaboration are critical to success and even survival.

We have seen in the last 5 years that many of the largest global organizations have launched MT initiatives on their own, because their LSP vendors were (and are) stuck in the TEP mindset and did not realize that their customers had to learn to do dramatically more translation without much more money. This is perhaps a clue that in certain volume and time constraint scenarios, MT is necessary. We have seen that global enterprises need to solve this problem with or without the vendors who historically managed the bulk of their localization translation. My sense is that this trend is likely to build momentum if LSPs do not learn to offer real MT competence. Real MT competence comes from building custom systems and seeing what works and what does not. Global enterprises will increasingly take this task upon themselves if they cannot find LSPs who can help them solve this problem. For example, TAUS is mostly a buyer-driven organization with a key focus on sharing TM and facilitating large-scale MT initiatives. The greatest successes presented at TAUS are all in-house initiatives with little LSP involvement. Surely this is because there is a real need, and we see that competitors are willing to share linguistic data and resources to handle this problem. I suspect that the buyer’s motivation is less about saving cents per word on translation costs, and much more about keeping and building customer loyalty and satisfaction in a world with growing global online commerce and information access needs.

My guess is that some of the anxiety about the coming change comes not so much from raw technology like MT; perhaps its real origin is the growing awareness among translators and LSPs that some of the work they are involved with is growing less valuable to the customer’s real mission, which is to build and develop international markets. Perhaps the anxiety is really rooted in the fact that they sense that they are not involved in high-value work. The real threat is not MT per se, but the growing awareness amongst international marketing executives (in global enterprises) that they need to focus on what their customers really care about. More and more often this is something other than getting a really great user manual out. Have you noticed that many leading-edge companies like Apple and Sony have dramatically reduced their investment in user manuals? The iPhone simply does not have one in the box (though there is one on the web). I am not suggesting that manuals are going away, but it is already clear that their relative value is diminishing. The content that drives global customer adoption and loyalty is changing, and thus the relative value of traditional localization (software and documentation) work also changes.

I expect that new translation production models to build success in international markets will involve MT (and other translation automation), crowdsourcing, and professional oversight and management. It is very likely that old production models like TEP will be increasingly less important, or just one of several approaches to translation projects, as new collaboration models gain momentum. I think that the most successful approaches to solving these "new" translation problems will involve a close and constructive collaboration between traditional localization professionals, linguists, MT developers, end customers and probably others in global enterprise organizations who have never worked in "localization" but are more directly concerned about the quality of the relationship with the final customer across the world. At the end of the day, our value as an industry is determined by how useful our input is to the process of building international markets, and the requirements for success are changing as we speak.

The conversations at IMTT and the ensuing discussions suggest that while progress is being made in the understanding of translation technology, there is still a long way to go. I hope that at future IMTT conferences we see more discussion of approaches to translation projects where TEP may not make sense, and where automation and collaboration approaches can help solve different kinds of problems that also further international business initiatives. I expect that IMTT will be a leader in changing the current polemic and in expanding the conversation to new stakeholders. This conversation is likely to require much more direct contact with product management, international sales and support teams, and the final end customer. Hopefully some of us in the industry get to lead or participate in driving this change through these new conversations.

While change can be difficult, it can also be a time of opportunity and a time when leadership changes. Very few try to understand the forces of change better. People often go through a sequential emotional cycle before they learn to cope, and eventually even thrive, when facing disruptive change. Those who get stuck at fear and despair often end up as victims.

This little video shows that effective and heartfelt communication across cultures need not be heavily planned, ponderous or calculated. Sometimes simple and real is enough to create the change and build a connection to your customers.

Where the Hell is Matt? (2008) from Matthew Harding on Vimeo.

TechGist05
