Post-edit distance (PED) measures the amount of editing that a machine-translated text requires in order to meet quality expectations. The main difference from BLEU is that the human reference translation is produced by post-editing the MT output itself, which makes it more likely that the machine translation and the human translation are similar or identical. This is because translators with a solid post-editing background will not introduce unnecessary changes to the MT. Therefore, assuming that the translators did their job correctly, PED reflects MT suitability for post-editing much better than BLEU does.
So, can any linguist with post-editing experience do the post-editing for a PED analysis? Not quite. The important factor here is that the translator actually understands the customer’s quality expectations for the text. A machine translation can sound fluent, with no apparent errors of meaning, and still not meet quality requirements. For instance, customer-specific terminology or style might not have been applied, texts might exceed length limitations, or formatting information might have been lost. In short, you’ll want a linguist with both post-editing experience and customer know-how.
With PED, real-world conditions are required to obtain reliable figures, and post-edit distance can only be calculated from post-editing that meets quality expectations. An algorithm calculates the difference between the raw MT and the post-edited translation and issues a value per segment and per text sample. This value indicates the percentage of raw MT that was reused by the translator, starting from 100% (the translator made no changes to the segment or text) and decreasing as the number of edits grows. High PED scores indicate a real gain in efficiency for the translator.
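The per-segment and per-sample calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say which algorithm is used, so the sketch assumes a character-level Levenshtein distance normalized by the length of the longer string. The function names `levenshtein`, `ped_score`, and `sample_ped_score` are hypothetical, and a production tool may tokenize or weight edits differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def ped_score(raw_mt: str, post_edited: str) -> float:
    """Assumed PED for one segment: percentage of raw MT kept (100 = no edits)."""
    longest = max(len(raw_mt), len(post_edited))
    if longest == 0:
        return 100.0  # two empty segments are trivially identical
    return 100.0 * (1 - levenshtein(raw_mt, post_edited) / longest)


def sample_ped_score(pairs: list[tuple[str, str]]) -> float:
    """Assumed PED for a text sample: length-weighted over all segment pairs."""
    total_length = sum(max(len(mt), len(pe)) for mt, pe in pairs)
    if total_length == 0:
        return 100.0
    total_distance = sum(levenshtein(mt, pe) for mt, pe in pairs)
    return 100.0 * (1 - total_distance / total_length)
```

Weighting the sample score by segment length, rather than averaging the per-segment percentages, keeps very short segments from dominating the overall figure. That is a design choice of this sketch, not something the article prescribes.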
How do PED scores relate to post-editing effort?
The rule of thumb here is that the higher the PED score, the lower the effort. However, as with translation memory matches, a certain percentage threshold must be reached before the score represents a real gain in efficiency. If the overall PED value for a given text type is consistently below this threshold, MT doesn’t save time.
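As a rough illustration of the threshold check, the sketch below compares the PED scores of several text samples of one text type against a threshold. The article does not state a threshold value, so `threshold` is a required parameter, and the function name `summarize_against_threshold` and the returned keys are hypothetical.

```python
from statistics import mean


def summarize_against_threshold(sample_scores: list[float], threshold: float) -> dict:
    """Compare per-sample PED scores (0-100) for one text type with a threshold.

    The threshold is supplied by the caller; no value is given in the article.
    """
    if not sample_scores:
        raise ValueError("need at least one PED score")
    overall = mean(sample_scores)
    # Share of samples that individually reach the threshold, to show
    # whether the text type is consistently above or below it.
    share_at_or_above = sum(s >= threshold for s in sample_scores) / len(sample_scores)
    return {
        "overall": overall,
        "share_at_or_above": share_at_or_above,
        "meets_threshold": overall >= threshold,
    }
```

Reporting the share of samples at or above the threshold alongside the overall value helps distinguish a text type that is consistently below the threshold from one that is pulled down by a few outliers.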
So, does a high PED value mean that the translator expended no effort, and do you have to pay for post-editing if PED is close to 100%? The answer is: If you want post-editing, it will have a cost. Even with a very high PED value, the translator has performed a full review of the target text and compared it to the source text, validated that the terminology applied by the MT system is the right one, potentially performed additional research or obtained clarification, and so on. Therefore, the effort of post-editing is never zero, even when there are almost no edits. This is comparable to a second opinion by a physician: The fact that both doctors come to the same conclusion doesn’t mean the second one didn’t have to examine the patient thoroughly.
Reliable post-editing effort predictions
By assessing PED values across large enough volumes of similar text, you can get a reliable indication of the effort involved and quantify efficiency gains. Small anecdotal samples are not a suitable basis for this kind of analysis, as they might produce PED figures that are too optimistic or too pessimistic and ultimately not representative of average real-world results. Thankfully, testing with suitable volumes does not mean adding cost to your normal translation process. We know our stuff on this one, so don’t hesitate to ask your contact at Amplexor for a Machine Translation Pilot and learn how to calculate your savings potential.