My 2017 trend is the impact of machine translation on content strategy.
What is machine translation?
The basic definition from Wikipedia is a good start. Machine translation uses “software to translate text […] from one language to another.” Behind this simple definition lies a world of hurt because languages are complex and don’t map neatly from one to another. An idiom like “once in a blue moon,” if translated literally, becomes meaningless, because other languages don’t use that expression. You need a translation machine that understands that “once in a blue moon” means “very rarely” and can then capture that meaning in other languages.
So machine translation of William Shakespeare or Willa Cather is just not an option. The richer and more complex the writing, the more challenging the translation.
A disruptive innovation?
Machine translation does have some winning qualities:
- Speed. Machine translation is nearly instantaneous. You can get a quick-and-dirty translation of a newspaper article and find out roughly what Norwegians think about Brexit. Producing this kind of imperfect translation is called “gisting”: you get the rough gist of the text.
- Cost. Machine translation is much less expensive than high-quality translation by a professional translator. Online services, such as Google Translate, are free.
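A technology that is faster and cheaper than the established alternative, but starts out at lower quality, fits the classic profile of a disruptive innovation. Machine translation qualifies.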
On its own, machine translation is currently suitable only for low-quality use. But you can have linguists edit the raw machine output. This approach is called machine translation with post-editing, and it moves the results up into medium-quality use. For the right kind of content, machine translation with post-editing is also more efficient than a linguist working without machine translation.
Machine translation (MT) versus computer-assisted translation (CAT)
Computer-assisted translation (CAT) is different from machine translation (MT). CAT refers to a linguist who uses software to support translation work. Typically, a CAT workflow includes the following:
- A translation memory (TM) database, which stores previous translation work.
- A process in which TM content is matched against new assignments. Matching segments are pre-translated using the TM database.
- An interface that shows the linguist the source segment (usually a sentence) and provides a place to insert the translated segment.
- Additional workflow and QA support features.
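The TM matching step in this list can be sketched in a few lines of Python. This is a minimal illustration, not how any particular CAT tool works: the `TRANSLATION_MEMORY` entries, the `pretranslate` function, and the 0.85 similarity threshold are all assumptions made up for the example, and the similarity score comes from Python's standard `difflib` module rather than a real TM engine.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source segment -> stored translation.
TRANSLATION_MEMORY = {
    "Click Save.": "Klicken Sie auf Speichern.",
    "Restart the system.": "Starten Sie das System neu.",
}


def pretranslate(segments, memory, threshold=0.85):
    """Return a (source, suggestion, score) tuple for each new segment.

    suggestion is None when no stored segment is similar enough, which
    leaves that segment for the linguist to translate from scratch.
    The threshold is an assumed value, not an industry standard.
    """
    results = []
    for segment in segments:
        best_source, best_score = None, 0.0
        # Compare the new segment against every stored source segment.
        for source in memory:
            score = SequenceMatcher(None, segment, source).ratio()
            if score > best_score:
                best_source, best_score = source, score
        if best_source is not None and best_score >= threshold:
            results.append((segment, memory[best_source], best_score))
        else:
            results.append((segment, None, best_score))
    return results


new_assignment = [
    "Restart the system.",           # exact match in the TM
    "Restart the system now.",       # close ("fuzzy") match
    "Contact support for pricing.",  # nothing similar in the TM
]
for source, suggestion, score in pretranslate(new_assignment, TRANSLATION_MEMORY):
    print(f"{score:.2f}  {source!r} -> {suggestion!r}")
```

In this sketch, a fuzzy match still hands the linguist the stored translation as a starting point; the linguist decides whether it needs editing.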
Any professional translation effort should use CAT and TM. The use of MT is much more controversial.
What does machine translation mean for content strategy?
In 2011, Don DePalma of Common Sense Advisory spelled out the implications of machine translation for localization. The report’s title summarizes the premise nicely: “As Content Volume Explodes, Machine Translation Becomes an Inevitable Part of Global Content Strategy.” The report focuses on the localization process and how machine translation must be integrated into it.
Setting up a localization workflow that includes machine translation is challenging. Among other things, you have to choose software, choose MT databases, work out a strategy for updating and configuring the databases, and figure out where to use linguists. Furthermore, this process may be different for different language pairs.
Your content strategy must account for localization and the possibility of machine translation. There’s nothing new here. Best practices include:
- Writing simple, short sentences.
- Avoiding jargon, idioms, and rhetorical flourishes.
- Using consistent terminology.
- Structuring content consistently.
- Avoiding culture-specific references.
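Some of these practices can be checked automatically before content goes to translation. The following Python sketch flags long sentences and inconsistent terminology. Everything specific in it is an assumption for illustration: the 25-word limit, the `PREFERRED_TERMS` list, and the `check_source` function are hypothetical, and a real style checker would need a much more careful sentence splitter.

```python
import re

MAX_WORDS = 25  # assumed sentence-length limit; pick your own

# Hypothetical terminology list: discouraged variant -> preferred term.
PREFERRED_TERMS = {"log-in": "login", "e-mail": "email"}


def check_source(text):
    """Return a list of warnings for one piece of source text."""
    warnings = []

    # Naive sentence split: break after ., !, or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    for sentence in sentences:
        word_count = len(sentence.split())
        if word_count > MAX_WORDS:
            warnings.append(f"Long sentence ({word_count} words): {sentence[:40]}...")

    # Flag any discouraged variant that appears as a whole word.
    lowered = text.lower()
    for variant, preferred in PREFERRED_TERMS.items():
        if re.search(rf"\b{re.escape(variant)}\b", lowered):
            warnings.append(f"Use '{preferred}' instead of '{variant}'.")

    return warnings


for warning in check_source("Send an e-mail to the administrator."):
    print(warning)
```

Checks like these only catch mechanical problems. Idioms and culture-specific references still need a human reviewer.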
In marketing and advertising content, you may need to break all of these rules to produce effective content. The implication is, of course, that you shouldn’t apply MT to that content.
Localization strategy with MT
A reasonable localization strategy is going to be tiered, as shown in this example:
- Low quality: rough translation for understanding. Use machine translation only and include a disclaimer. Suitable for “gisting” posts in user forums or knowledge base articles that are rarely read.
- Medium quality: reasonable translation with correct grammar and mechanics. Use MT with post-editing to accelerate the translation process. Suitable for critical knowledge base articles and reference documents.
- High quality: excellent translation that sounds as though it was written in the target language. Use professional translators. Suitable for conceptual product documents, legal documents, and other content where accuracy is critical.
- Most demanding use: The finished product in the target language must connect with readers. Instead of translating the original document, consider using transcreation (starting with the concept and re-creating the document in the target language). Suitable for advertising, marketing, and other persuasive content.
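A tiered strategy like this one can be written down as a simple lookup table, which makes the routing decision explicit and easy to review. The Python sketch below is one possible encoding. The content-type labels, the `Workflow` class, and the `choose_workflow` function are hypothetical names, and the fallback to professional translation for unlisted content is an assumption, not part of the tiers above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    quality: str
    method: str
    disclaimer: bool = False


MT_ONLY = Workflow("low", "machine translation only", disclaimer=True)
MT_POST_EDIT = Workflow("medium", "machine translation with post-editing")
PROFESSIONAL = Workflow("high", "professional translation")
TRANSCREATION = Workflow("most demanding", "transcreation")

# Content types follow the example tiers; the keys are hypothetical labels.
WORKFLOWS = {
    "forum_post": MT_ONLY,
    "rarely_read_kb_article": MT_ONLY,
    "critical_kb_article": MT_POST_EDIT,
    "reference_document": MT_POST_EDIT,
    "conceptual_document": PROFESSIONAL,
    "legal_document": PROFESSIONAL,
    "advertising": TRANSCREATION,
    "marketing": TRANSCREATION,
}


def choose_workflow(content_type):
    """Return the workflow for a content type.

    Unlisted content falls back to professional translation. That
    default is an assumption: it favors quality over cost when the
    content has not been assessed yet.
    """
    return WORKFLOWS.get(content_type, PROFESSIONAL)


for content_type in ("forum_post", "legal_document", "marketing"):
    workflow = choose_workflow(content_type)
    print(f"{content_type}: {workflow.method} ({workflow.quality} quality)")
```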
Some organizations may also divide their global markets into tiers. For example:
- Tier 1: High-priority markets. Full localization of all product content.
- Tier 2: Medium-priority markets. Some localization of product content.
- Tier 3: Low-priority markets. Minimal localization to meet regulatory requirements.
In a high-priority market, an organization might increase budget to move more content up the quality scale.
Best practices for content strategy and machine translation
We know that content volume and localization requirements are increasing. We need to look for ways to reduce friction in our content strategy. Where can we add automation? Where can we eliminate manual processes? It’s much cheaper to invest in source content development than it is to clean up translations in multiple languages downstream.
So, once again, we come back to the basics:
- Ensure that the source content development process produces high-quality content.
- Look for ways to automate content deployment and delivery.
- Do a risk assessment to figure out appropriate uses for any technology, including machine translation.
Because of globalization and machine translation, our content strategy work must include source content management: terminology management, reuse, and automated formatting.
What do you think? Is machine translation going to affect your content strategy work? Is there another, more important trend?
Recommended additional reading:
- The Great AI Awakening, an epic-length discussion of neural networks in machine translation, New York Times, December 2016
- Hyperbolic? Experts Weigh In on Google Neural Translate, slator.com, September 2016