Content management and localization – finding the right fit
A CMS can be a powerful addition to your content authoring and delivery workflow or your worst enemy in translation. Or both.
Although most content management systems do support localization, they do so in many different ways. No one approach is inherently better than the others, but each can have serious consequences when the scope of your localization needs differs from what the technology offers.
Some systems have built-in translation UIs, where you can log in and translate the content right there in the CMS. This works very well in cases where you crowdsource your translations for internal use. Your translators would merely log in, type in the translations as needed, and log out.
However, if you’re using your CMS to manage customer- or public-facing content, or if you want to leverage translation memory for future translation work, you may be in for a rude awakening. To leverage your translation memory, you would need to export or otherwise copy/paste your content into an external file for translation. This adds significant overhead (people needed to copy/paste the source out and translations back in), introduces substantial risk of human error, and ultimately will delay the overall effort.
Some systems provide a raw export of content in XML, which theoretically can be used anywhere. It’s XML, therefore universal, right? Well, not exactly. Chances are that your CMS uses a “special blend” of XML to manage your content that it alone understands. A raw export will indeed give you everything, and it will be up to your translators to figure out exactly what requires translation and what should not be touched.
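To make the "what requires translation" problem concrete, here is a minimal Python sketch. The export format, the element names (`item`, `title`, `slug`, `body`), and the allowlist are all invented for illustration; a real CMS export will look different, and someone still has to work out which elements belong on the list.

```python
import xml.etree.ElementTree as ET

# Hypothetical raw export: every element and attribute name here is an
# assumption made for this example, not the format of any real CMS.
RAW_EXPORT = """<export>
  <item id="42" template="article">
    <title>Getting started</title>
    <slug>getting-started</slug>
    <body>Install the software, then restart.</body>
  </item>
</export>"""

# Someone has to decide which elements hold translatable text.
# Slugs, IDs, and template names must not be touched.
TRANSLATABLE = {"title", "body"}


def translatable_strings(xml_text):
    """Return (item id, element name, text) for each translatable element."""
    root = ET.fromstring(xml_text)
    found = []
    for item in root.iter("item"):
        for child in item:
            text = (child.text or "").strip()
            if child.tag in TRANSLATABLE and text:
                found.append((item.get("id"), child.tag, text))
    return found


for item_id, tag, text in translatable_strings(RAW_EXPORT):
    print(item_id, tag, text)
```

The allowlist is the part that costs time: it has to be agreed with whoever knows the CMS schema and kept current as the schema changes.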
Computer-assisted translation (CAT) tools that translators use can be configured to properly handle the markup (and not all CAT tools are created equal in this regard, either). This takes time to configure and requires a dedicated technician on the translation side to maintain these filters for you over time.
Many CMSes allow for what we’ll call “creative formatting”, storing hand-formatted content in CDATA sections in the XML. CDATA sections instruct XML parsers to treat everything inside them as plain character data rather than markup, which lets the local formatting in those sections (usually HTML) pass through untouched. The tagging within those sections will be visible to the translators, requiring them to either hand-code the markup into the translations or spend considerable time and effort filtering the files to handle the local formatting. Either way, this adds a considerable amount of time to the translation effort. Your best bet is to consult with your translation vendor ahead of time to determine the best course of action.
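A short Python sketch shows why CDATA content needs a second filtering pass. The sample element and its content are made up for illustration. The XML parser hands back the HTML as one opaque string, and only a separate HTML pass separates the words from the tags.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical export fragment with hand-formatted HTML inside CDATA.
SAMPLE = "<body><![CDATA[<p>Click <b>Save</b> to continue.</p>]]></body>"


class TextCollector(HTMLParser):
    """Collects the text runs found between HTML tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def split_cdata_html(xml_text):
    """Return the raw CDATA string and the text runs inside its HTML."""
    element = ET.fromstring(xml_text)
    raw = element.text  # the XML parser sees plain text, tags and all
    collector = TextCollector()
    collector.feed(raw)
    collector.close()
    return raw, collector.chunks


raw, chunks = split_cdata_html(SAMPLE)
print(raw)     # what an unconfigured tool would show the translator
print(chunks)  # the text runs once the HTML is filtered
```

Even after filtering, the sentence arrives in pieces around the inline `<b>` tag, which is why a CAT tool filter that protects inline tags within a whole sentence is usually preferable to naive splitting like this.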
Some content management systems do try to fully support a translation workflow by offering an export in XLIFF format, because that’s what many translation tools work with. The problem with this solution is that, in practice, XLIFF tends to function less as a hand-off format for translation than as an internal string management format for CAT tools. It stores language pairs (source and target) for every string needing translation, and is usually highly extended/customized by the tools that employ it.
What this means is that the translators will need to spend time hacking the XLIFF to remove the target language portions every time they receive an XLIFF file in order to get at just the source content. They then need to marry up their final translations to the source in the original XLIFF file so your CMS can import it.
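The extract-and-marry-up round trip described above can be sketched in Python against a plain XLIFF 1.2 file. The sample content and the function names are assumptions for illustration, and the sketch deliberately ignores the tool-specific extensions that make real files harder to handle.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
NS = {"x": XLIFF_NS}

# Minimal, made-up XLIFF 1.2 sample. Note that t1 arrives with a target
# that merely repeats the source, and t2 has no target at all.
SAMPLE_XLIFF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="page.html" source-language="en" target-language="fr" datatype="html">
    <body>
      <trans-unit id="t1">
        <source>Save</source>
        <target>Save</target>
      </trans-unit>
      <trans-unit id="t2">
        <source>Cancel</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def extract_sources(root):
    """Map each trans-unit id to its source text, ignoring any targets."""
    return {
        unit.get("id"): unit.find("x:source", NS).text
        for unit in root.iterfind(".//x:trans-unit", NS)
    }


def merge_targets(root, translations):
    """Write finished translations back into the matching trans-units."""
    for unit in root.iterfind(".//x:trans-unit", NS):
        unit_id = unit.get("id")
        if unit_id not in translations:
            continue
        target = unit.find("x:target", NS)
        if target is None:
            # Simplification: appends at the end of the trans-unit, which is
            # only correct when nothing else follows <source>.
            target = ET.SubElement(unit, f"{{{XLIFF_NS}}}target")
        target.text = translations[unit_id]


root = ET.fromstring(SAMPLE_XLIFF)
print(extract_sources(root))
merge_targets(root, {"t1": "Enregistrer", "t2": "Annuler"})
```

Matching on the `id` attribute is what keeps the merge reliable. If the CMS regenerates ids between export and import, the translations can no longer be matched to their sources.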
Now, this post is not all doom and gloom. In each of the scenarios mentioned, there are ways to best handle the translation process. Just as not all CMSes are created equal, the same holds true for localization tools and workflows. In order to circumvent many of these pitfalls, involve your CMS vendor and your localization vendor at the beginning or as soon as possible. You should iron out your localization workflow with both parties, ensuring that the CMS meets your localization needs and that your localization vendor can efficiently handle what the CMS requires.
Choose wisely, and consider your localization workflow up-front. Your choices will affect costs, quality, and time to deliver your localized content. We are available to help you find the best fit for your content and avoid unnecessary pitfalls.
Thanks, Bill. There’s much wisdom here. With translation, it can be hard even to know what questions to ask. Often, each side of the equation — writers and information architects on one side, translators on the other — doesn’t understand the other side’s tools and processes. Do you know of any resources that can break down the communication barriers and help increase the effectiveness of translation planning?
Our CMS has a built-in interface for translation and an integrated translation memory with terminology management. Our in-house and outside translators can log into the system, use workflow tasks to locate their assignments, run reports to show obsolete and missing translations, and translate just the segments in the identified topics that require their attention. There is no export/import of XLIFF files required, though our system does offer that function, if desired. Our output is for our external customers and posted to our website, not just for internal use, so there is hope out there, if your content will work with this type of workflow.
What CMS do you use?
One really helpful metadata tag on a CMS is a “has this item been translated yet” tag. And like all metadata, that should then expand to show who translated it, and when.
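As a rough sketch of what that metadata might look like, here is a small Python record. The class and field names are hypothetical and not taken from any particular CMS. It tracks who translated an item and when, and it can also flag a translation as obsolete once the source has changed.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TranslationStatus:
    """Per-language translation metadata for one content item (hypothetical)."""

    language: str
    translated_by: Optional[str] = None
    translated_at: Optional[datetime] = None

    @property
    def is_translated(self) -> bool:
        return self.translated_at is not None

    def is_stale(self, source_modified_at: datetime) -> bool:
        """True when the source changed after the translation was made."""
        return self.is_translated and self.translated_at < source_modified_at


french = TranslationStatus("fr", "translator-a", datetime(2020, 1, 10))
german = TranslationStatus("de")

print(french.is_translated, french.is_stale(datetime(2020, 2, 1)))
print(german.is_translated)
```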