Linguistic Quality Assurance in Localization – An Overview
This is a post by Vassilis Korkas on the quality assurance and quality checking processes used in the professional translation industry. (I still find it hard to say localization, since that term is ambiguous to me: I spent many years trying to figure out how to deliver properly localized sound through digital audio platforms. To me, localized sound = cellos from the right and violins from the left of the sound stage, and I have a strong preference for instruments that stay in place on the sound stage for the duration of the piece.)
As the volumes of translated content increase, the need for automated production lines also grows. The industry is still laden with products that don't play well with each other, and buyers should insist that the vendors of the various tools they use enable easy transport and downstream processing of any translation-related content. From my perspective, automation in the industry is also very limited, and there is a huge need for human project management because tools and processes don't connect well. Hopefully, we will start to see this change. I also hope that the database engines behind these new processes become much smarter about NLP and much more ready to integrate machine learning elements, as this too will allow the development of much more powerful, automated, and self-correcting tools.
As an aside, I thought this chart was very interesting (assuming it is actually based on some real research): it shows why it is much more worthwhile to blog than to share content on LinkedIn, Facebook or Twitter. However, the quality of the content does matter, and other sources say that high-quality content has an even longer life than shown here. (Chart source: @com_unit_inside)
Finally, CNBC had this little clip describing employment growth in the translation sector, in which they state: "The number of people employed in the translation and interpretation industry has doubled in the past seven years." Interestingly, this is exactly the period in which the use of MT has also dramatically increased. Apparently, they conclude that technology has also helped to drive this growth.
The emphasis in the post below is mine.
==========
In pretty much any industry these days, the notion of quality is one that seems to crop up all the time. Sometimes it feels like it’s used merely as a buzzword, but more often than not quality is a real concern, both for the seller of a product or service and for the consumer or customer. In the same way, quality appears to be omnipresent in the language services industry as well. Obviously, when it comes to translation and localization, the subject of quality has rather unique characteristics compared to other services; ultimately, however, it is the expected goal in any project.
In this article, we will review the established practices for monitoring and achieving linguistic quality in translation and localization, examine the challenges facing linguistic quality assurance (LQA), and attempt to make some predictions about the future of LQA in the localization industry.
Quality assessment and quality assurance: same book, different pages
Although industry standards have been around for quite some time, in practice terms such as ‘quality assessment’ and ‘quality assurance’, and sometimes even ‘quality evaluation’, are often used interchangeably. This may be due to a misunderstanding of what each process involves but, whatever the reason, the practice leads to confusion and can create misleading expectations. So, let us take this opportunity to clarify:
- [Translation] Quality Assessment (TQA) is the process of evaluating the overall quality of a completed translation by using a model with pre-determined values assigned to a number of parameters for scoring purposes. Examples of such models are the LISA QA Model, MQM and the DQF. (A simplified scoring sketch follows this list.)
- Quality Assurance “[QA] refers to systems put in place to pre-empt and avoid errors or quality problems at any stage of a translation job”. (Drugan, 2013: 76)
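To make the idea of model-based scoring more concrete, here is a minimal sketch of how an error-typology score might be computed. The categories, severity weights and pass threshold are hypothetical and far coarser than the actual LISA, MQM or DQF definitions.

```python
# Minimal sketch of an error-typology TQA score. The categories, severity
# weights and pass threshold below are hypothetical and far coarser than
# the actual LISA, MQM or DQF definitions.

SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}

def tqa_score(errors, word_count, passing_threshold=0.4):
    """errors: list of (category, severity) tuples logged by the reviewer."""
    penalty = sum(SEVERITY_WEIGHTS[severity] for _category, severity in errors)
    # Normalise the penalty per 1,000 words so that texts of different
    # sizes can be compared against the same threshold.
    weighted_errors_per_1000 = penalty / word_count * 1000
    return weighted_errors_per_1000, weighted_errors_per_1000 <= passing_threshold

score, passed = tqa_score(
    errors=[("terminology", "minor"), ("accuracy", "major")],
    word_count=12000,
)
print(f"{score:.2f} weighted errors per 1,000 words; pass: {passed}")
```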
Given the volume of translated words in most localization projects these days, it is practically prohibitive in terms of time and cost to have in place a comprehensive QA process, one that would safeguard certain expectations of quality both during and after translation. Therefore, it is very common that QA, much like TQA, is reserved for the post-translation stage. A human reviewer, with or without the help of technology, is brought in when the translation is done and is asked to review/revise the final product. The obvious drawback of this process is that significant time and effort could be saved if revision could somehow occur in parallel with the translation, perhaps by involving the translator herself in the process of tracking errors and making corrections along the way.
The fact that QA only seems to take place ‘after the fact’ is not the only problem, however. Volume is another challenge – too many words to revise, too little time, and too expensive to do it all. To address this challenge, Language Service Providers (LSPs) use sampling (the partial revision of an agreed small portion of the translation) and spot-checking (the partial revision of random excerpts of the translation). In both cases, the proportion of the translation that is checked is about 10% of the total volume of translated text, and that is generally considered sufficient to judge whether the whole translation is good or not. This is an established and accepted industry practice that was created out of necessity. However, one doesn’t need a degree in statistics to appreciate that such a small sample, whether defined or random, is hardly big enough to reflect the quality of the overall project.
The progressive increase in the volume of text translated every year (also reflected in the growth of the total value of the language services industry, as seen below) and the increasing demand for faster turnaround times make it even harder for QA-focused technology to catch up. The need for automation is greater than ever before.
Source: Common Sense Advisory (2017)
Today we could classify QA technologies into three broad groups:
- built-in QA functionality in CAT tools (offline and online),
- stand-alone QA tools (offline),
- custom QA tools developed by LSPs and translation buyers (mainly offline).
Consistency is king – but is it enough?
Terminology and glossary/wordlist compliance, empty target segments, untranslated target segments, segment length, segment-level inconsistency, different or missing punctuation, different or missing tags/placeholders/symbols, different or missing numeric or alphanumeric structures – these are the most common checks that one can find in a QA tool. On the surface at least, this looks like a very diverse range that should cover the needs of most users. All these are effectively consistency checks. If a certain element is present in the source segment, then it should also exist in the target segment. It is easy to see why this kind of “pattern matching” can be easily automated and translators/reviewers certainly appreciate a tool that can do this for them a lot more quickly and accurately than they can.
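To illustrate what these consistency checks amount to in practice, here is a minimal, tool-agnostic sketch: any number, tag or placeholder found in the source segment is expected to reappear in the target segment. The regular expressions and function names are illustrative only, not taken from any particular QA tool.

```python
import re

# Minimal sketch of locale-independent consistency checks: numbers, tags and
# placeholders found in the source segment are expected to reappear in the
# target segment. Patterns are illustrative, not any specific tool's rules.

NUMBER = re.compile(r"\d+(?:[.,]\d+)?")
PLACEHOLDER = re.compile(r"</?\w+[^>]*>|\{\d+\}|%[sd]")

def consistency_issues(source: str, target: str) -> list[str]:
    issues = []
    if source.strip() and not target.strip():
        issues.append("empty target segment")
    for pattern, label in ((NUMBER, "number"), (PLACEHOLDER, "tag/placeholder")):
        missing = set(pattern.findall(source)) - set(pattern.findall(target))
        for item in sorted(missing):
            issues.append(f"{label} '{item}' missing from target")
    return issues

# Flags the two missing <b> tags but raises no issue for the number 30:
print(consistency_issues("Press <b>OK</b> within 30 seconds.",
                         "Drücken Sie OK innerhalb von 30 Sekunden."))
```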
Despite the obvious benefits of these checks, the methodology on which they run has significant drawbacks. Consistency checks are effectively locale-independent and that creates false positives (the tool detects an error when there is none), also known as “noise”, and false negatives (the tool doesn’t detect an error when there is one). Noise is one of the biggest shortcomings of QA tools currently available and that is because of the lack of locale specificity in the checks provided. It is in fact rather ironic that the benchmark for QA in localization doesn’t involve locale-specific checks. To be fair, in some cases users are allowed to configure the tool in greater depth and define such focused checks on their own (either through existing options in the tools or with regular expressions).
Source: XKCD
But this makes the process more labour-intensive for the user, and it comes as no surprise that the majority of QA tool users never bother to do it. Instead, they perform their QA duties relying on the sub-optimal consistency checks that are available by default.
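As a rough illustration of what such a user-defined, locale-aware check could look like, the sketch below applies a hypothetical number-formatting rule for German: the decimal comma is accepted silently (no noise), while source-style formatting left in the target is flagged. The rule and examples are simplified for illustration and are not an exhaustive locale definition.

```python
import re

# Sketch of a user-defined, locale-aware number check for an en-US -> de-DE
# project. A purely locale-independent check would flag "1.5" vs "1,5" as a
# mismatch (noise); this rule accepts the localized form and instead flags
# numbers that kept the source formatting. Illustrative only.

def expected_de_number(source_number: str) -> str:
    # en-US "1,234.5" -> de-DE "1.234,5": swap thousands and decimal separators.
    return source_number.translate(str.maketrans(",.", ".,"))

def number_format_issues(source: str, target: str) -> list[str]:
    issues = []
    for num in re.findall(r"\d+(?:[.,]\d+)*", source):
        localized = expected_de_number(num)
        if localized in target:
            continue  # correctly localized: no false positive
        if num in target:
            issues.append(f"'{num}' kept source formatting; expected '{localized}'")
        else:
            issues.append(f"number '{num}' missing from target")
    return issues

print(number_format_issues("The file is 1.5 MB.", "Die Datei ist 1,5 MB groß."))  # []
print(number_format_issues("The file is 1.5 MB.", "Die Datei ist 1.5 MB groß."))
# -> ["'1.5' kept source formatting; expected '1,5'"]
```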
Linguistic quality assurance is (not) a holistic approach
In practice, for the majority of large-scale localization projects, only post-translation LQA takes place, mainly due to time pressure and associated costs – an issue we touched on earlier in connection with the practice of sampling. The larger implication of this reality is that:
- a) effectively we should be talking about quality control rather than quality assurance, as everything takes place after the fact; and
- b) quality assurance becomes a second-class citizen in the world of localization. This contradicts everything we see and hear about the importance of quality in the industry, where both buyers and providers of language services prioritise quality as a prime directive.
The continuously growing demand in the localization industry to manage increasing volumes of multilingual content on pressing timelines, while complying with quality guidelines, means that the challenges described above will have to be addressed soon. As online technologies in translation and localization gain strength, there is an implicit understanding that existing workflows will have to be simplified in order to accommodate the industry's future needs. This can indeed be achieved with the adoption of bolder QA strategies and more extensive automation. The need in the industry for a more efficient and effective QA process is here now, and it is pressing. Is there a new workflow model which can produce tangible benefits both in terms of time and resources? I believe there is, but it will take some faith and boldness to apply it.
Get ahead of the curve
In the last few years, the translation technology market has been marked by substantial shifts in the market shares occupied by offline and online CAT tools respectively, with the online tools rapidly gaining ground. This trend is unlikely to change. At the same time, the age-old problems of connectivity and compatibility between different platforms will have to be addressed one way or another. For example, slowly transitioning to an online CAT tool while still using the same offline QA tool from your old workflow is as inefficient as it is irrational, especially in the long run.
A deeper integration between CAT and QA tools also has other benefits. QA can move up a step in the translation workflow: why have QA only post-translation when you can also have it in-translation? (It goes without saying that pre-translation QA is also vital, but it applies to the source content only, so it’s a different topic altogether.) This shift is made possible by API-enabled applications – which are in fact already standard practice for the majority of online CAT tools. There was a time when each CAT tool had its own proprietary file formats (as they still do), and then the TMX and TBX standards were introduced and the industry changed forever, as it became possible for different CAT tools to “communicate” with each other. The same will happen again, only this time APIs will be the agent of change.
Source: API Academy
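To give a feel for what in-translation QA over an API could look like, here is a deliberately simple sketch in which a CAT tool posts each confirmed segment pair to a QA service and surfaces the returned issues immediately. The endpoint URL, payload schema and field names are invented for illustration and do not describe any real vendor's API.

```python
import json
from urllib import request

# Hypothetical sketch of in-translation QA over an API: the CAT tool posts
# each confirmed segment pair to a QA service and shows the returned issues
# to the translator immediately. The endpoint and payload schema are
# invented for illustration; they do not describe any real product's API.

QA_ENDPOINT = "https://qa.example.com/v1/check"  # placeholder URL

def check_segment(source, target, source_locale, target_locale):
    payload = json.dumps({
        "source": source,
        "target": target,
        "sourceLocale": source_locale,
        "targetLocale": target_locale,
    }).encode("utf-8")
    req = request.Request(QA_ENDPOINT, data=payload,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)  # e.g. {"issues": [...]}

# In a CAT-tool plugin this would run each time the translator confirms
# a segment, e.g.:
# issues = check_segment("Save file", "Datei speichern", "en-US", "de-DE")
```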
Looking further ahead, there are also some other exciting ideas which could bring about truly innovative changes to the quality assurance process. The first is the idea of automated corrections. Much in the same way that a text can be pre-translated in a CAT tool when a translation memory or a machine translation system is available, a QA tool that has been pre-configured with granular settings could “pre-correct” certain errors in the translation before a human reviewer even starts working on the text. In a deeper integration scenario with a CAT tool, an error could be corrected in a live QA environment the moment the translator makes it.

This kind of advanced automation in LQA could be taken a step further if we consider the principles of machine learning. Access to big data in the form of bilingual corpora which have been checked and confirmed by human reviewers makes the potential of this approach even more likely. Imagine a QA tool that collects all the corrections a reviewer has made and all the false positives the reviewer has ignored, and then processes all that information and learns from it. With every new text processed, the machine learning algorithms make the tool more accurate about what it should and should not consider to be an error. The possibilities are endless.
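One simple, concrete way to act on that kind of reviewer feedback is sketched below: the tool tracks, per check and language pair, how often flagged issues are accepted versus dismissed, and stops surfacing checks with a consistently high false-positive rate. Real machine-learning approaches trained on bilingual corpora would go much further; the class, check identifiers and thresholds here are purely illustrative.

```python
from collections import defaultdict

# Sketch of a feedback loop: track, per check and language pair, how often
# flagged issues are accepted versus dismissed by reviewers, and stop
# surfacing checks with a consistently high false-positive rate. The check
# identifiers and thresholds are illustrative only.

class FeedbackModel:
    def __init__(self, min_samples=50, max_false_positive_rate=0.9):
        self.counts = defaultdict(lambda: {"accepted": 0, "dismissed": 0})
        self.min_samples = min_samples
        self.max_fp_rate = max_false_positive_rate

    def record(self, check_id, language_pair, accepted):
        key = (check_id, language_pair)
        self.counts[key]["accepted" if accepted else "dismissed"] += 1

    def should_flag(self, check_id, language_pair):
        counts = self.counts[(check_id, language_pair)]
        total = counts["accepted"] + counts["dismissed"]
        if total < self.min_samples:
            return True  # not enough evidence yet: keep flagging
        return counts["dismissed"] / total < self.max_fp_rate

model = FeedbackModel(min_samples=3)
for _ in range(3):
    model.record("punctuation.trailing_space", "en>el", accepted=False)
print(model.should_flag("punctuation.trailing_space", "en>el"))  # False
```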
Despite the various shortcomings of current practices in LQA, the potential is there to streamline and improve processes and workflows alike, so much so that quality assurance will not be seen as a “burden” anymore, but rather as an inextricable component of localization, both in theory and in practice. It is up to us to embrace the change and move forward.
Reference
Drugan, J. (2013) Quality in Professional Translation: Assessment and Improvement. London: Bloomsbury.
Vassilis Korkas is the COO and a co-founder of lexiQA. Following a 15-year academic career in the UK, in 2015 he decided to channel his expertise in translation technologies, technical translation and reviewing into a new tech company. At lexiQA he is now involved with content development, product management, and business operations.
---------------
Note
This is the abridged version of a four-part article series published by the author on lexiQA’s blog: Part 1 – Part 2 – Part 3 – Part 4
This link will also provide specific details on the lexiQA product capabilities.