Five gotchas that will affect your translation turnaround time
Having worked at two translation companies and on many projects requiring localization, I appreciate just how nimble LSPs (language service providers) can be. Their ability to track down translators with the necessary subject matter expertise and handle a vast array of file formats is truly remarkable. That said, localization efficiency is dependent on you, the content provider.
Although a good LSP can work with any type of content you might throw at them, their efficiency (and therefore your costs) depends on the source files. From the simplest-looking Word file to the most robust XML solution, what’s lurking beneath the surface of these files could make or break your translation deadlines.
1. Frankenfiles
“Frankenfile” is a term of endearment for a file with absolutely no rhyme or reason behind its formatting (often, it was quite literally hacked and stitched together). When a file is full of style overrides, inconsistent style use, unconventional or inconsistent spacing, hard returns for line wrapping, and other visual formatting “hacks,” it is incredibly difficult and time-consuming to properly translate the content and deliver it with the same look and feel. As a Frankenfile is updated, new hacks (or fixes) are likely to be introduced that reduce translation memory leverage. (Leverage is a measurement of how much previous translation can be reused in a translation update. It is far cheaper to use translation memory than it is to retranslate content.)
The best advice, regardless of what tool you are using, is to strictly adhere to a template or standardized style. Do not use formatting overrides to achieve a specific visual result, and do not create stylistic exceptions for “unique” content unless absolutely necessary AND unless you are committing those styles to the core template used by all files. Consistency is extremely important, particularly if the LSP needs to create custom filters to handle your style conventions in their tools. They can then create the filters once and reuse them as needed, saving precious turnaround time.
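A few of these hacks can be spotted before the files ever reach the LSP. The following is a minimal sketch, assuming the content has been exported to plain text. The function name `find_formatting_hacks` and the two heuristics are my own illustration, not a feature of any particular authoring or translation tool.

```python
import re

def find_formatting_hacks(text):
    """Return (line_number, issue) pairs for two common plain-text hacks."""
    issues = []
    lines = text.split("\n")
    for i, line in enumerate(lines, start=1):
        # A run of two or more spaces between words usually means
        # manual alignment or inconsistent spacing.
        if re.search(r"\S {2,}\S", line):
            issues.append((i, "run of multiple spaces"))
        # A line that stops mid-sentence and continues in lowercase on the
        # next line suggests a hard return used for line wrapping.
        stripped = line.rstrip()
        nxt = lines[i] if i < len(lines) else ""
        if stripped and nxt and stripped[-1] not in ".!?:;" and nxt[:1].islower():
            issues.append((i, "hard return inside a sentence"))
    return issues

sample = "Press the power button to start\nthe device.  Wait for the light.\nDone."
for line_number, issue in find_formatting_hacks(sample):
    print(line_number, issue)
```

A check like this only catches what survives in plain text. Style overrides in Word or a desktop publishing tool need to be reviewed in the tool itself.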
2. CMS output
A content management system is a wonderful thing. It keeps your content neat and tidy in a centralized location, allows you to reuse portions as needed, and likely supports a plethora of workflow automation out of the box. One of these workflows may be translation, but it may not be the workflow that your LSP had in mind. Some CMSs have built-in translation UIs, some provide XML output, and others even go so far as to supply XLIFF output (often considered a translation-friendly format).
All of these options are valid for a translation workflow, but do not assume that your LSP is ready and able to work in the manner your CMS dictates. You may find that what sounded good on paper doesn’t in fact work very efficiently in practice. Before approaching a localization project, and ideally before selecting a CMS, talk with your LSP to see which workflows would work best. Conduct a trial/pilot translation using your chosen workflow to ensure that content can not only be exported, translated, and imported, but that the cycle can be repeated several times as revisions are introduced into the source after each translation cycle.
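One simple check to run during a pilot is whether every segment that went out came back translated. Here is a minimal sketch, assuming your CMS exports XLIFF 1.2. The file name `guide.xml`, the unit ids, and the function name `untranslated_units` are hypothetical, and other XLIFF versions use a different namespace and structure.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the XLIFF 1.2 specification.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

def untranslated_units(xliff_text):
    """Return the ids of trans-units whose <target> is missing or empty."""
    root = ET.fromstring(xliff_text)
    missing = []
    for unit in root.iterfind(".//x:trans-unit", XLIFF_NS):
        target = unit.find("x:target", XLIFF_NS)
        if target is None or not "".join(target.itertext()).strip():
            missing.append(unit.get("id"))
    return missing

# A made-up file as it might come back from the LSP.
returned_file = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="guide.xml" source-language="en" target-language="fr" datatype="xml">
    <body>
      <trans-unit id="t1"><source>Save</source><target>Enregistrer</target></trans-unit>
      <trans-unit id="t2"><source>Cancel</source></trans-unit>
    </body>
  </file>
</xliff>"""

print(untranslated_units(returned_file))  # ['t2']
```

Running the same check after each revision cycle is a cheap way to confirm that the import side of the workflow keeps up with changes in the source.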
3. Excessive wordsmithing
Your content will most likely need to be updated over time. Sometimes new information is added, outdated information is removed, and incorrect information is corrected. When you edit content that has previously been translated, every edit comes at a cost; rewritten text, changes in punctuation, and even changes in spacing require a translator’s eye, and will lessen your translation memory leverage.
A best practice within a translation workflow is to modify only what absolutely needs to be updated from release to release. If you missed an Oxford comma or an a/an distinction, stop and consider whether there is value in correcting it, as these seemingly innocent edits can add up over time. Peppering these changes through a document can add an hour or more to turnaround time; I’ve seen cases where an entire day or more was lost to vetting fuzzy matches in a translation memory.
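To see why a single comma matters, consider how a translation memory compares the new source segment against the one it has stored. The sketch below uses Python's `difflib` as a stand-in for that comparison. Real translation memory tools use their own scoring, so treat the number as illustrative only; the function name `match_percent` and the sample sentences are mine.

```python
from difflib import SequenceMatcher

def match_percent(old_segment, new_segment):
    """Rough similarity between two source segments, as a percentage."""
    ratio = SequenceMatcher(None, old_segment, new_segment).ratio()
    return round(ratio * 100)

previous = "Connect the red, green and blue cables."
revised = "Connect the red, green, and blue cables."

print(match_percent(previous, previous))  # unchanged text is an exact match
print(match_percent(previous, revised))   # the added comma makes it a fuzzy match
```

The revised sentence no longer scores as an exact match, so a translator has to open the segment, compare it with the stored translation, and decide whether the target needs to change. Multiply that by every "harmless" edit in a release.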
4. Graphics
Graphics are a tricky beast in the localization world. Two key issues with regard to graphics are 1) translatable text within the graphics, and 2) cultural appropriateness of the images themselves.
The argument against using text in graphics has pretty much been beaten to death, but I mention it now because, believe it or not, it’s still common practice. It could be out of convenience (for either the illustrator or the reader), habit, or simply not knowing better. If you choose to use text in your graphics, plan to keep the editable, unflattened source image (the raw Photoshop, Illustrator, or other image application source file), all fonts used, and a text, Word, or Excel file that contains all of the text used in the graphic. This way the translator has everything needed to produce the translation. Gathering these takes some time, depending on how intricate your images are, but far less time than trying to hack translated text into a flattened JPG.
Cultural appropriateness of imagery is something that is less often considered. Sometimes the most innocent of images can come across as offensive in other cultures. Little things like gestures, fine details, and colors can make the difference between a non-issue and a dead stop when preparing content for varied cultural markets. The time spent researching and proofing for these issues before sending the files for translation pales in comparison to the time and effort spent searching for a replacement at the 11th hour, particularly when the replacement influences design changes.
5. Translating deliverable files
Often, we use content either from a variety of sources or from a greater pool of information when creating documentation. When readying for translation, it’s common to send out only the documents that are being delivered in English. This can complicate the translation process. Specifications used for products shipped to one region may not be compatible with the needs of another region. Wording or specific types of information might also not be appropriate in another region. Finally, any changes influenced by related products, which may or may not have been previously available in other regions, may also impact translation.
It is important to consider translation at the source level to avoid these risks and ensure that your information is ready to be deployed anywhere in any language at any time. This will decrease any rework from last minute issues found by translators, allow you to build consistency into your wording choices up front, and help you realize a greater return from translation memory leverage over time.
Do you have other suggestions for producing localization-friendly files? Please share them in the comments!
And one last note about those “Frankenfiles”… Beware!
Marion
You forgot a big one: leaving room in your layouts for text expansion. With the exception of the Asian languages, most languages need 20-35% more space than English does. In today’s age of cramming five pounds of everything into a one pound container, translation and localization are often seriously compromised both in form and function by the need to “make it fit”.
Bill Swallow
Thanks Marion. Yes, that certainly is a big issue. It mainly affects busy, compact visual layouts (many text boxes, etc.) and not so much flowing text manuals with automatic page flow. But yes, leaving enough space in confined layouts for text expansion (in textual files and graphics) is another big issue to look out for. Thanks for noting that!
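As a rough sanity check for confined layouts, the expansion range Marion mentions can be turned into a quick calculation. This is only a sketch: it assumes character count is an acceptable proxy for space, which ignores font, character width, and line breaking, and the function names and the sample label are hypothetical.

```python
def expanded_length(english_chars, expansion_percent):
    """Estimate translated length in characters, rounded up."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-english_chars * (100 + expansion_percent) // 100)

def fits(english_text, box_capacity, expansion_percent=35):
    """True if the text still fits the box at the given expansion rate."""
    return expanded_length(len(english_text), expansion_percent) <= box_capacity

label = "Save and continue"  # 17 characters in English

print(expanded_length(len(label), 20))  # 21
print(expanded_length(len(label), 35))  # 23
print(fits(label, box_capacity=20))     # False
print(fits(label, box_capacity=24))     # True
```

A box sized to hold exactly the English label fails at either end of the 20-35% range, which is the "make it fit" problem in miniature.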
Anthony Liu
Great article! One thing we’ve been doing as an LSP is to integrate our production system with our clients’ CMS, so that the localization process from project initiation to delivery can be automated, saving a huge amount of copy-and-pasting and email exchanges.
Another thing I can think of is to create and maintain a set of file templates for each locale. Different locales have different preferences for typefaces, paper sizes (US Letter for North America vs. A4 for the rest of the world, for example), and page layout. Having the templates ready when launching a localization project can significantly reduce turnaround time.
Bill Swallow
Thanks Anthony. Integrating with the CMS is a great way of reducing the human error around file transfer, and can potentially avoid the kitchen sink XML or XLIFF transfers.
Maintaining templates for each locale is critical and something I’ve been advocating for the past decade or so. I’ve taken it one step further and used a hybrid of the US Letter and A4 page sizes for PDF (US Letter height, A4 width). This way no matter where the document goes, the document can be printed out with the same content on the same page. And yes, all variables, cross-reference formats, and page layout boilerplate content should be pre-translated for every target language so translated content can be easily flowed into the templates.
Justin Qualler
I’m always careful to avoid idioms and to stick to a smaller vocabulary, using some of the rules of Simplified Technical English, for example. Also, I’ve heard it’s important not to eliminate articles from the writing, as some try to do to achieve brevity. Finally, I try to be consistent with phrases. If I have phrases duplicated throughout (e.g., login instructions), I try to make sure they are exactly the same.
Rahul Malik
Wow….Nice Article…
Machine translation cannot take the place of human translation.
Integrating with the CMS is a great way of reducing the human error around file transfer