Two Different Perspectives On Post-Editing

This is part II of a mini series about quality assurance in the localization industry. Read part I here and part III here (live on 23 Sep).

Back in the days when I was a machine translation specialist, it was part of my job to make sure that the machine translation output we used met a certain quality level. I was positioned between the Sales and Production departments of the company, because that quality level mattered to both: As the content usually got post-edited, I had to check whether the post-editors would actually be able to work with the output. And as machine translation and post-editing (MTPE) was a cheaper product than good old translation, the Sales guys wanted to know how far they could lower our rates. The question that I usually got from both sides was: Is the output good enough? To read about why this question is pretty hard to answer on the spot, check part I of this mini series.

Typical Errors In Machine-Translated Output

I am repeating myself, but that’s okay, because it is really that important: It always depends on the context. Machine-translated sentences usually have some characteristics that, depending on the context, may or may not slow down the post-editing process. Among those characteristics, you can usually find:

  • Inconsistently translated words or concepts. Neural machine translation is quite good at representing sentence structures, but text coherence is still a domain that it needs to conquer. It may happen that the topic of your original text gets translated one way in the first sentence and another way in the fifth sentence, because the machine simply ‘forgot’ that it used a different word a few sentences ago.
  • Strange sentence structures. Germans like to write very long and very complex sentences. That’s one of the perks if you speak a language with a very flexible syntax! English, however, sounds weird if the machine translation applies the same complicated structure as the German source sentence. Sentences might even lose their meaning if they are too long.
  • Changes of register or a wrong register. If your machine translation model was trained on data that mainly contained formal language, and you use it to translate a more informal text, it will still try to produce formal language – because that’s what it has learned.
  • Missing nuances. If your text relies on very delicate formulations and even the slightest nuances are important to get the message across, machine translation might miss them or, even worse, change the nuances ever so slightly.
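
The first point in the list, inconsistent terminology, is the one that lends itself most easily to an automatic check. What follows is only a minimal sketch, not a description of any tool we actually used: the function names, the sample sentences and the list of candidate translations are all made up for illustration, and the naive substring matching ignores inflection and compounds.

```python
from collections import Counter


def find_term_variants(segments, source_term, candidate_translations):
    """Count which candidate translations appear in target sentences
    whose source sentence contains source_term.

    segments is a list of (source_sentence, target_sentence) pairs.
    Matching is a plain case-insensitive substring search (an assumption
    made for brevity; real text would need tokenization or lemmatization).
    """
    counts = Counter()
    for source, target in segments:
        if source_term.lower() not in source.lower():
            continue
        for candidate in candidate_translations:
            if candidate.lower() in target.lower():
                counts[candidate] += 1
    return counts


def is_inconsistent(counts):
    """A term counts as inconsistent if more than one variant was used."""
    return len(counts) > 1


# Hypothetical German-to-English sample: "Schraube" is rendered two ways.
sample_segments = [
    ("Ziehen Sie die Schraube fest.", "Tighten the screw."),
    ("Die Schraube darf nicht rosten.", "The bolt must not rust."),
    ("Prüfen Sie das Gehäuse.", "Check the housing."),
]

variants = find_term_variants(sample_segments, "Schraube", ["screw", "bolt"])
print(dict(variants), is_inconsistent(variants))
```

A check like this only flags candidates for a human to look at. Whether two different renderings are a real problem still depends on the context.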

I didn’t list mistranslations or gibberish output. Yes, that still happens and most probably will keep happening for years to come. Completely unusable sentences will always slow down any post-editing process, so they do not really add to the discussion.

Please note that I am speaking about neural machine translation. Statistical engines, which are still used in some cases nowadays, have other characteristics. If you want to learn about the difference, check out this blog post!

The Difference Between Errors And Inappropriate Choices

Not all of the issues listed above necessarily result in errors. The better neural machine translation becomes, the fewer real errors you will find in the output. However, the number of inappropriate choices (for that specific context) might not go down at all! It is important to distinguish between those two concepts. While a typo will always be an error, missing a nuance might be inappropriate in one text and completely fine in another. You might hear people speaking of errors in this context, but what they usually mean are inappropriate choices, not universally recognized errors. This doesn’t mean that one is worse than the other: both need to be corrected. The number of corrections in the second category will differ strongly depending on the context, and it will be one of the main factors in how well a text can be post-edited.

The Difficulties On The Production Side

Now, if your text is a manual for assembling a machine, with mostly short sentences, and you really don’t care whether it’s translated with formal or informal language – congratulations, most of the above-mentioned characteristics will not slow down the post-editing too much. I have added ‘too much’ here because this is something you should keep in mind: If you ask a post-editor to go through the machine-translated text, they will apply some changes. Asking them to ignore whole categories of issues might not make them faster, it might even slow them down! However, if something is not in the scope of the project, the post-editors also know that they don’t need to spend extra time on issues like those.

The more the original text relies on its fluent and pleasant style, the more difficult it gets to use machine translation in a way that makes sense. If you want to have some blog articles translated, one of their main features may be that they catch the reader’s attention and motivate them to read more. Machine translation will most probably not be able to do the same in the target language. You can still use a machine-translated base and ask someone to post-edit it, but be aware that depending on your expectations, they might feel the need to rewrite a large part of the base completely. In this case, the machine-translated output will not help them to get through the text faster. In some cases, a well-trained translator can even be faster translating from scratch than when using the machine translation output.

The Difficulties On The Sales Side

My colleagues in Sales had quite different problems, and those had to be taken into account as well when we put together a quote for a new client. Here’s what I learned about their pain points:

  • Explaining the difference between localization, internationalization and translation. If your goal is to translate a catalogue into a different language, it might not be sufficient to translate all the words (that’s the part that machine translation can do for you). The customers you are targeting in the new country may not be familiar with units like Fahrenheit, meters or kilograms. Imagine that you are the king of screws and dowels and all your products are named after their measurements. The hex bolt 4/20 is 20 mm long and has a diameter of 4 mm. You might want to clarify that this is measured in millimeters, or you might even consider renaming your hex bolts accordingly. That is something that machine translation will not do for you, and it is a tedious job for a translator, too. Tedious in Sales terms always means: more expensive.
  • Dealing with clients who don’t know exactly what they need. Someone might only have a very rough understanding of localization, but will still compare one quote to another. Ours might be higher than the quote of a competitor, simply because we have an extra step or perform more tasks during the post-editing phase. If the client contact doesn’t know what exactly post-editing is, they will not know that there is no universal definition of what needs to be performed during this step (even with ISO certification and according to industry standards, there is still a lot of wiggle room).
  • Pitching our evaluation against the client’s evaluation. If we evaluate whether something can be machine-translated, we use our broad knowledge and past experience. We have gone through the phase where we thought that you could estimate the feasibility of post-editing just from looking at a few sentences and trusting your gut feeling. Spoiler alert: You can’t. However, our client contact might not know that. They might look at the output for a few minutes and decide for themselves that there’s not much to be done. And it’s the task of our Sales guys to kindly let them know that they might be wrong…
  • Insisting on the necessity of post-editing while the client has read on the internet that neural machine translation is as good as real translators. Check out my post on the human quality paradox to understand why it’s perfectly correct that Google, Microsoft and other big players have claimed in the past that their machine translation output is as good as human translations – and why, at the same time, post-editing is still necessary in most cases. This also relates to the concept explained above: There might not be real errors in the machine translation output, but there may still be inappropriate choices.
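
To make the hex bolt example from the first point more concrete: spelling out the units hidden in a product name is a small, mechanical step once somebody has decided it should happen. The snippet below is only a sketch under the assumption that every product name ends in a diameter/length pair in millimeters, as in the example. The pattern and the function name are invented for illustration. The only hard fact in it is that one inch equals 25.4 mm.

```python
import re

MM_PER_INCH = 25.4  # exact by definition

# Assumed naming scheme: "<name> <diameter>/<length>", both in millimeters.
PRODUCT_PATTERN = re.compile(r"^(?P<name>.+?)\s+(?P<diameter>\d+)/(?P<length>\d+)$")


def describe_product(product_name, use_inches=False):
    """Append explicit measurements to a product name like 'hex bolt 4/20'.

    Names that do not follow the assumed scheme are returned unchanged.
    """
    match = PRODUCT_PATTERN.match(product_name.strip())
    if match is None:
        return product_name
    diameter_mm = int(match.group("diameter"))
    length_mm = int(match.group("length"))
    if use_inches:
        diameter = f"{diameter_mm / MM_PER_INCH:.2f} in"
        length = f"{length_mm / MM_PER_INCH:.2f} in"
    else:
        diameter = f"{diameter_mm} mm"
        length = f"{length_mm} mm"
    name = match.group("name")
    return f"{name} {diameter_mm}/{length_mm} (diameter {diameter}, length {length})"


print(describe_product("hex bolt 4/20"))
# hex bolt 4/20 (diameter 4 mm, length 20 mm)
print(describe_product("hex bolt 4/20", use_inches=True))
# hex bolt 4/20 (diameter 0.16 in, length 0.79 in)
```

The conversion itself is the easy part. Deciding whether the target market expects millimeters, inches or a renamed product is the localization work that ends up in the quote.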

As you can see, Sales’ problems are quite different from Production’s problems. And still, they all need to be resolved… The art of machine translation evaluation and quality assurance is to stay on that thin line between overly complicating it (for Sales) and brushing existing problems away (for Production). At the same time, a good evaluation and quality assurance measures should always be explained, so that both sides know the reasoning behind a yes, no or maybe. Within your company, it’s great if you can establish certain standards. They might follow industry standards, but might also differ from them, depending on who your clients are and what they really need. Always applying universal quality measurements sounds nice and clean, but does not take into account how individually the quality of machine translation should be evaluated, because – you shouldn’t be surprised to hear it again – context is what matters most in every evaluation of machine translation quality, as well as in the post-editing process.