Scriptorium Publishing

content strategy consulting

Localization best practices (premium)

December 15, 2014 by

Localization—the process of adapting content to a specific locale—is a critical requirement for global companies. It’s often treated as a necessary evil, but this is shortsighted. The quality of localization efforts affects the company’s bottom line.

More than ever, products and services are sought, purchased, and consumed in multiple language markets. Proper localization practices are critical to drive sales, and they can save you time and money in production.

This article describes best practices for efficient, effective localization.

Your content is a business asset. It promotes your company and its offerings, drives sales, and supports customers. To ensure excellent localization, you must consider your global audiences and all aspects of content production—content development and management tools, content creation practices, and your content partners.

Use your tools wisely

Your content authoring tools and how you use them affect the efficiency and cost of the localization workflow. When choosing an authoring tool, evaluate how you will use it in localization. Every feature you use in your content will have a localization impact.

floating bird's nest

A very resourceful nest – Pixabay: Hans

Reuse carefully

Content reuse helps reduce the number of words requiring translation, but be careful. Every block of reusable content needs to be complete. Paragraph-level reuse (including notes, cautions, and warnings) is ideal.

Resist the temptation to maximize reuse with minuscule chunks (sentences, phrases, or even characters). These tiny chunks do not offer the translator enough context to understand and properly translate the text. (Think of a chunk that says “green.” Should this be translated as an adjective? What noun does the adjective modify? This matters in many languages. Is it perhaps a verb or even a noun–slang for money? The translator cannot necessarily tell from the isolated chunk.)

With larger blocks of text (entire topics or groups of topics), be mindful of the amount of conditional text needed to accommodate reuse.

Conditional approval

Conditional text helps you reuse content that is similar but not quite identical in multiple locations. Use conditional text sparingly.

Heavy use of conditional text is a sign that you should reconsider your reuse strategy. Restraint will save your sanity in authoring the content and avoid localization problems. Conditionalize whole sentences, not phrases, terms, or single words. A small tweak in English wording often leads to a major change in other languages. Your word count will be higher if you conditionalize at the sentence level (rather than inside sentences), but the translation process will be smoother.


Regardless of your choice of authoring tool, use well-designed templates with explicit paragraph and character styles. (In Word, a document that uses only the Normal style is a very, very bad sign.) A cluttered, unmanaged document will require significant reformatting after translation. Reformatting lengthens delivery time and increases overall translation effort.

Consistent template use reduces post-translation reformatting effort, as style names are retained in the translated version. You should also create copies of your templates in all target languages with appropriate customizations (for example, replacing “Chapter” in English with “Chapitre” in French). When you apply the templates to translated content, formatting changes automatically.

XML and the evils of CDATA sections

XML separates formatting from content, which can alleviate translation formatting issues. XML imposes a consistent structure, and styles are applied at the time of publishing rather than during authoring. XML can also contain translation instructions, such as identifying text that should not be translated. However, the type of XML you use and how it’s implemented in your authoring tools both matter.

In particular, beware of character data (CDATA) sections. Many web content management systems use CDATA within the XML. CDATA allows the author to embed HTML code inside the content. This breaks the separation of formatting and content introduced by XML, eliminates XML’s ability to enforce structure, and allows the author to inject whatever code they want. Many translation tools have trouble working with CDATA sections. The translators either need to type the markup in their translations—reducing translation memory leverage—or develop custom filters to hide this markup, which will increase your translation cost and require ongoing maintenance.

Before implementing any XML-based authoring tool, understand how it fits within a localization workflow and what the XML looks like when exported for translation. Do not use CDATA sections.
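To see why CDATA causes trouble, it can help to look at what a parser actually receives. The sketch below uses Python's standard xml.etree.ElementTree module on two hypothetical exports of the same sentence. The element names (topic, body, uicontrol, ph), the translate="no" attribute, and the product name AcmeSync are illustrative assumptions, loosely modeled on common XML vocabularies, not the output of any particular authoring tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical export: formatting embedded as HTML inside a CDATA section.
cdata_xml = """<topic>
  <body><![CDATA[Click <b>Save</b> to store the <i>AcmeSync</i> profile.]]></body>
</topic>"""

# Hypothetical export: the same sentence with structured inline elements.
structured_xml = """<topic>
  <body>Click <uicontrol>Save</uicontrol> to store the <ph translate="no">AcmeSync</ph> profile.</body>
</topic>"""


def describe(xml_text):
    """Return the leading text of <body> and the tags of its child elements."""
    body = ET.fromstring(xml_text).find("body")
    return body.text, [child.tag for child in body]


def do_not_translate(xml_text):
    """Collect the text of every element flagged translate="no"."""
    body = ET.fromstring(xml_text).find("body")
    return [el.text for el in body.iter() if el.get("translate") == "no"]


cdata_text, cdata_children = describe(cdata_xml)
structured_text, structured_children = describe(structured_xml)

# The CDATA content arrives as one opaque string with the HTML tags inside it.
print(repr(cdata_text), cdata_children)

# Structured markup exposes inline elements that tools can inspect or protect.
print(repr(structured_text), structured_children)
print(do_not_translate(structured_xml))
```

In the CDATA version, the parser reports no child elements at all, so a translation tool has nothing to filter on and the product name cannot be flagged as untranslatable. In the structured version, the same information is available as markup.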

Write for translation

During translation, a translation memory file is created. This file contains all of the text strings (usually stored as complete sentences) and their translations for each target language. When you send updated content for translation, that content is compared against the translation memory to determine what has changed and how much of the existing translations can be reused.
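The comparison against translation memory can be sketched roughly as follows. This is a simplified illustration, not how any specific translation tool works: the memory is a plain dictionary, the sample segments and French translations are made up for the example, and Python's difflib.SequenceMatcher stands in for whatever fuzzy-matching algorithm a real tool uses.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: English source segments mapped to French.
translation_memory = {
    "Click OK.": "Cliquez sur OK.",
    "Save your changes before closing the file.":
        "Enregistrez vos modifications avant de fermer le fichier.",
}


def lookup(segment, memory):
    """Return (best_score, translation_or_None) for one source segment.

    A score of 1.0 is an exact match, so the stored translation is reused.
    Anything lower is only a fuzzy match that a translator has to review.
    """
    if segment in memory:
        return 1.0, memory[segment]
    best = max(
        (SequenceMatcher(None, segment, source).ratio() for source in memory),
        default=0.0,
    )
    return best, None


updated_content = [
    "Click OK.",                                    # unchanged
    "Click the OK button.",                         # reworded
    "Save your changes, before closing the file.",  # comma added
]

results = [lookup(segment, translation_memory) for segment in updated_content]
for segment, (score, translation) in zip(updated_content, results):
    status = "reuse" if translation else "needs translation"
    print(f"{score:.2f}  {status:17}  {segment}")
```

Only the unchanged segment is reused as is. The reworded sentence and the sentence with an added comma both fall short of an exact match, even though a reader would consider them nearly identical.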

Many translation best practices are simply good writing rules:

dove on a city railing

City dwelling birds are modifying their songs to compete with city noise – Pixabay: LollemyArtPhotography

  • Follow the style guide. Be consistent and follow your writing conventions. Small variations in the text may seem harmless (for example, “click OK” versus “click the OK button”), but they require different translations. Using a consistent writing style will increase your translation reuse, saving you time and money.
  • Avoid rewriting existing content. Avoid editing content that has already been translated unless absolutely necessary. Every change you make decreases translation reuse–and increases cost. Just adding or removing a comma reduces your translation memory leverage.

One of the best ways to avoid these and similar issues is to document your writing guidelines. Style guides help with phrasing and other writing conventions. But there are other forms of content documentation that will help you improve your content and translations over time:

  • A glossary can help you use the correct term or phrase in the correct context. With a bit of modification or enhancement, you can convert the glossary to a full terminology set for translation. It should include all of the correct terms to use (and avoid), along with their definitions and examples of use. Translate your terminology into all of your target languages to ensure that the same translation is used by all translators.
  • If you are tailoring content for a specific cultural impact, capture all custom phrases (idioms, colloquialisms, and so forth) in a document along with their meaning and intended use. Your translators may be fluent in the target languages and cultural nuances, but they may not understand your intent. This document clarifies what you are trying to communicate, and facilitates development of custom messages with the same impact in the target languages.

Share these and other supporting documents with everyone involved with your content. You can use basic spreadsheets or sophisticated applications, but in either case, you will have better control over your content and how it engages your target audiences.

It’s all about images

The adage “a picture is worth a thousand words” describes the efficiency of graphic communication, but take this expression as a warning: Every graphic you translate can cost about the same as 1,000 words. Seems high? Consider the work involved in producing these graphics, and the impact they have on the audience and your company’s image if they are misunderstood. As with written content, graphics need to be handled with care and used strategically in order to be effective.

Screen shots

Screen shots are common in software documentation and often contain the text from the user interface. Here, you face a dilemma:

  • If the interface is not translated, the translated text will be interrupted by a graphic in a language the audience may not understand. This can detract from the usefulness of your content and color the reader’s perception of your company.
  • If the interface is translated, you must recapture the screen shots. This can be a significant effort.

If you use localized screen shots, here are some best practices:

  • Use a specific project and script for capturing all of your screens. This script should be appropriate for all of your intended audiences, and be followed in every language implementation.
  • Only include a screen shot in your content when it is absolutely necessary for communicating a concept or action, and share these graphics with your translators for their review.
  • Never use a screen shot to take up space or add visual appeal to content unless you are willing to invest the time and capital to reproduce (with modification, when necessary) this look and feel in all localized versions.

Other graphics

sign "do not feed the birds" with flying bird in a null symbol

No flying? – Flickr: Roland Tanglao

Other graphic assets come with their own localization considerations. If a graphic is paired with text (for example, diagrams or pictures with call-outs, or graphics with text flowing around them), consider how these elements should be positioned. Is it more culturally appropriate or expected that the text appear to the left or the right of the graphic? Should text be used within the graphic or not? Should the same graphic be used for every audience?

There are many other considerations when using graphics, such as:

  • Color use—red could have a positive or negative connotation depending on the audience
  • Depictions of people, including their physical appearance, situational context, and gestures
  • Symbols—a mailbox looks very different from country to country, and a flag does not correspond to a language or a specific culture

Before committing to any one graphic design, make sure it is culturally appropriate for all audiences and adapt the design to best fit what your audiences expect.

(For more details on images, check out three cost-saving tips for localizing images.)

Choose a partner, not a vendor

Your localization vendors and translators can help you make important content decisions, but some are better than others. Engage them as you would any prospective business partner. After all, you are entrusting them with one of your most important business assets.

geese preening each other

Cooperation has its benefits – Flickr: Susanne Nilsson

It may seem appealing to partner with the vendors who offer the lowest translation rates, or ones who promise the quickest turnaround times. Before making a commitment to using them, make sure that they understand your industry, the terminology and expectations of your target markets, and the types of content you plan to distribute.

Many people can translate from one language into another, but it takes special skills and knowledge to appropriately translate complex, targeted content into a specific language for people working within a specific cultural environment (which may not be their native country). Cost of translation is an important factor to consider, but a more critical factor with a greater cost is the correctness and appropriateness of the translation.

Just as you want to engage with vendors who understand your business and your audience, you want to ensure that they also understand your content infrastructure. Can they correctly and efficiently work with your source files? Does their internal workflow integrate well with your own? Are they able to work in tandem with you to develop meaningful, effective content, or do they prefer to receive files once the source development is complete? And most importantly, will they help you to identify problems early on and help make improvements, or will they simply translate your content for you?

Localizing content is something that needs to be built into your content strategy up front. It is not a tangential concern or an end-of-process consideration. If you are producing content that needs to be consumed in multiple languages or in different cultures, every aspect of content development should consider that requirement.


If you need help with a global content strategy, contact us today.

Content strategy for tech comm and beyond

December 9, 2014 by

Having trouble with your technical content process? Need a strategy that can help you improve and scale? Before you make a change, talk to the other content-producing groups in your company—marketing, training, sales, support—to develop a content strategy that works across the entire organization.

Work together to create a better strategy. flickr: CiRC


The purpose of content strategy is to support your business goals. A good content strategy will save your company money by increasing efficiency in your department. A better content strategy will increase efficiency in all departments, as well as unify your organization’s content to provide a positive customer experience and strengthen your brand.

Silos form when different departments create content in a vacuum without collaborating with each other. This is a common problem at many organizations and can lead to content issues, including:

  • Two or more groups creating slightly different versions of the same content
  • A lack of branding or design consistency across company content
  • Varying levels of content quality from one group to another

Bad or inconsistent content can hurt your business. Customers drive your business’ bottom line, so focusing on the way they consume your content can help point you in the right direction as you develop a content strategy. Some ways that you can do this include:

  • Remembering that customers don’t know or care about silos—when they’re using your content, they just want answers
  • Testing your company’s content (technical, marketing, and more) as if you are a customer
  • Using other companies’ content from a customer point of view as examples for how you can improve
  • Going beyond the content that customers need and delivering the content that customers want

By working together with the content-producing groups outside of tech comm, you can create a global, company-wide content strategy that improves your customer experience and helps your business grow.

For a more in-depth look at extending your content strategy beyond tech comm, check out the recording of my webcast for the Content Wrangler Virtual Summit.

Content strategy and relocation: the trauma is the same

December 1, 2014 by

We moved into a new office at the end of October. The new space is bigger and nicer than the old space, but nonetheless, the moving process was painful. As a child, I moved several times and changed schools every two or three years. I then landed in North Carolina for college and stayed put. It occurs to me that a new content strategy introduces much of the same pain as relocation.

Motive matters

Pickford's moving van

Content strategy and relocation: Not very much fun // flickr: markhillary

When you relocate or change your approach to content, the reason matters. Did you choose to move for an amazing job opportunity or spiffy new features? Were you forced to abandon your old content creation system by factors beyond your control? Did you seize the opportunity to change things? Were you involved in the decision, or was it imposed by others? Did you carefully select your new residence, or did you have to move to an undesirable location because of factors beyond your control? Did you plan your move carefully, or did you have to move on short notice? Do you consider yourself a starry-eyed immigrant to a new system or a refugee who would like someday to return to your true home?

Your opinion will be affected by your motive.

Learning a new culture

Moving, especially across national boundaries, causes culture shock. You expect big changes, such as different languages, customs, and food. But culture shock is usually caused by small things–the complete unavailability of a specific favorite food, the slight differences in how traffic lights work, the presence of near-ubiquitous connectivity (points to US), or the presence of useful public transit (all the points to Europe).

On the content side, we find similar culture shock. Typically, it falls in these categories:

  • Easy things that are hard or impossible in the new world.
  • New features that go unused because they were hard or impossible in the old world (so avoiding them is ingrained behavior).
  • Difficulty understanding the basic premise of how things work. For example, spending lots of time tracking content status in a spreadsheet instead of letting the shiny new content management system do the reporting for you.
  • Content development problems; for example, a shift from writing exclusively for print to writing for print and online media. This is a tough transition.

Learning a new culture is hard work, and it takes time. Training and education help, but exploring something in a controlled setting is quite different from living it. People need time to live into their new content systems.

Learning what’s really important

To make a successful transition, you need to understand what’s really important. Delivering great content is more important than getting to use Your Favorite Familiar Tool. Great writing skills will transcend the environment.

Before the office move, we had certain expectations for how the space would be used. In particular, we have both a conference room and an open meeting area. We expected to conduct most meetings in the conference room. But instead, the open meeting area is getting all the usage. As a result, we are rethinking our furniture in that space. Is it bad that we have a huge conference room that’s barely getting used?

When you change content processes, people will surprise you with creative solutions that are not part of the plan. It’s quite likely that some of their ideas will be better than what you had in mind, so figure out what matters (productive meetings) and what doesn’t matter (the location).


When we moved, we allowed ourselves at least six months to feel comfortable in the new location. For content strategy changes, expect a similar transition period.

Content strategy: first steps (premium)

November 24, 2014 by

Content: You’re doing it wrong. That’s easy for me to say—we rarely hear from people who are happy with their content. But are you ready for a major transformation effort? Our approach is to assess the overall content lifecycle, meet with all the stakeholders, identify needs, develop a strategy, and then execute the strategy. If you want a more incremental approach, consider these inexpensive first steps.

Delivery formats

Newborn fawn

Baby steps // flickr: slopjop

Are you delivering content in the format that your readers want and need? Are there delivery mechanisms that would better meet their needs? Can you implement new formats in your existing toolchain?

Many of our clients are delivering content in PDF only. A move toward HTML and especially mobile-friendly HTML can be a good first step in improving the situation for readers.

Streamlining print publishing processes

Most print-heavy environments can benefit from improvement in the print production processes. Look for opportunities in the following main areas:

  • Templates.
    Are you using templates efficiently? Your print production tool should, at a bare minimum, provide page templates and paragraph templates. Many tools go further with inline styles, table styles, object styles, and more. Building out a template that supplies correct formatting and teaching everyone to use the template can save hours and hours of production time.
  • Appearance versus reusability.
    Be careful about print-specific tweaks that damage the usability of your content in other formats. The canonical example of this is hyphenation. When reading a book on my e-reader, I often encounter words that have hyphens in the middle of a line, like this:

    Why is there a hy-phen here??

    The hyphen was almost certainly inserted into the original document to improve the line breaks in the printed document. A better solution is to use a discretionary hyphen (better print publishing programs support them). Discretionary hyphens are displayed only when a word needs to be hyphenated (occurs near the end of a line). Random hyphens scattered through the ebook are artifacts of a print-centric process.

    The content creation process needs to address appearance requirements and reusability across different formats.
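The difference between a hard-coded hyphen and a discretionary hyphen can be sketched in a few lines. The example assumes that the discretionary hyphen is stored as the Unicode soft hyphen character (U+00AD); how a given publishing tool stores or exports discretionary hyphens varies, so treat this as an illustration rather than a description of any particular product.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode SOFT HYPHEN, one way to encode a discretionary hyphen

# Hard-coded hyphen: typed to fix one print line break, so it shows up everywhere.
hard_coded = "Why is there a hy-phen here??"

# Discretionary hyphen: only marks a place where the word may break.
discretionary = f"Why is there a hy{SOFT_HYPHEN}phen here??"


def for_plain_text(text):
    """Drop discretionary hyphens for outputs that cannot honor them."""
    return text.replace(SOFT_HYPHEN, "")


print(for_plain_text(hard_coded))     # the stray hyphen survives
print(for_plain_text(discretionary))  # the word is whole again
```

Because the discretionary hyphen is a distinct character, a conversion step can recognize it and remove it. The hard-coded hyphen looks like any other hyphen, so it cannot be cleaned up automatically without risking real hyphenated words.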

Appropriate content

Are you providing the content that your readers need? You can explore this question by reviewing web analytics (if you have web content) and by examining the technical support situation. Does the tech support organization have a list of frequent problem topics? Is tech support creating additional content to address deficiencies in the technical content?

Accommodating translation requirements

If your content is translated, you can greatly improve the translation/localization process with some simple fixes to the source content. (Bill Swallow has a great article on five localization problems.) Start thinking about the following:

  • Consistent wording
  • Template-based formatting
  • Use of culturally neutral graphics
  • Technical quality of files (How are files assembled? Are language layers separate from images?)

All of these steps will improve your content without a requirement for a major strategic initiative.

Here are some things you probably cannot do without a big project (and maybe help from Scriptorium):

  • Implement intelligent content
  • Build out sophisticated reuse with metadata and a formal taxonomy
  • Reassess your tools/technologies and overall workflow
  • Get buy-in across the organization for a major content initiative

Adapting content for the U.S. market (presentation summary)

November 17, 2014 by

In this presentation delivered at tcworld 2014 in Stuttgart, Alan Pringle and Sarah O’Keefe discuss several factors to consider when adapting content for the U.S. market. This presentation is especially relevant for European companies that want to enter the U.S. market.


The primary language of the United States is English. For business-to-business sales, use of British English might be good enough, but consumer products typically need U.S. English. The more personal the product, the more important it is to get the nuances of culture and language exactly right. Cell phones, for example, are very personal whereas accounting software used in an office is less personal.

In addition to English, it’s important to take into account the other languages spoken in the U.S. Approximately 60 million people in the U.S. speak Spanish at home, and half of them don’t speak much English. (Source: article with lots of fascinating language maps)

Culture references

Be very careful with culture references. The people and concepts that are immediately familiar in one culture are often unknown in a different culture. Even within a single country, there can be vast cultural differences–New York City residents have very little in common with Flagstaff, Arizona residents.

Regulatory requirements and legal issues

The U.S. regulates content for a few industries, such as aerospace, nuclear power, and medical devices. The regulatory framework in the European Union is much stronger. In the U.S., product defects and product liability are mainly handled through the legal system. Providing content with extensive warnings and cautions is often a defensive legal strategy rather than an attempt to deliver useful information.

The content standards that are commonly used in Germany are unknown in the U.S.


In an industrial setting in Germany, content providers can assume a certain level of training and/or certification. Germany has a strong apprenticeship program and vocational training. In the U.S., it is very common to have only minimal training in an industrial setting. It may be necessary to provide basic information in the U.S. content that is omitted for the better-trained German audience.

The audience for a U.S. product is likely to be more diverse than a European audience. Expect much wider variance in experience levels, language skills, literacy, education, and training.

Customer experience

A renewed focus on customer experience in the U.S. has led to the following assumptions:

  • Technical content is not just post-sales content. Around 80% of U.S. customers research products before buying them, and their research often includes technical information. Therefore, technical content can drive (or hinder) sales.
  • Repeat business is contingent on customer satisfaction. If the technical content delivered with the product is not of high quality, customers may think twice before buying again.
  • The line between marketing content and technical content is blurring.

Customer support

Technical content is often used in customer support. Consider the needs of the support organization in building out the technical content.

Risky business: The challenge of content silos

November 10, 2014 by

At Information Development World, I delivered a keynote on the challenges of content silos. The silo problem emerged as a major theme of the conference.

Presenters such as Janice Zdankus of Hewlett-Packard provided data that explains how silos affect customer experience and how those negative experiences in turn result in lost revenue. My job was to tell the story of bad customer experience.

Buying a roof rack


Yakima has a wonderful product configurator. You tell Yakima the make, model, and year of your car, along with the type of item you want to convey (bikes, kayaks, skis?), and it tells you exactly what components you need for your car.

And of course there is a big shiny Buy button at the end of the process.

After you buy the rack, the pieces show up at your door, and it appears that Yakima’s interest in a delightful customer experience ends. After all, they have my money. The product installation is ugly and requires a tedious lookup in other documents.

The contrast with the sales content is quite marked.

Yakima does provide a web-based lookup tool for the needed measurements, but I did not see a reference to this utility in any of the documents I received with the roof rack parts.

Obvious bad service

Another customer experience example comes from Comcast (of course). In a lengthy and entertaining rant, Staci Huckeba has this to say:

Nobody in the “the customer has a problem department” can do everything like they can in the “the customer wants to buy something department.”

Broken web sites

Recently, the Internet at our office stopped working. We tried the usual, obvious stuff (reboot the router), but that didn’t work. So the next step was to see whether the provider (Windstream) was having a wider outage. It turns out that finding the network status/outage reports on Windstream’s site was nearly impossible. A simple Google search eventually revealed that Windstream does in fact have an outage page, but it was thoroughly hidden from the normal site navigation.

The problem here is that the web site is intended as a sales tool and not as a support tool. Even though Windstream has a “support” area on its web site, it does not provide easy access to the outage information.

In the recording, available in a link at the end of this article, I provide some more details and examples of cases where content silos result in inconsistent and generally infuriating customer experience.

What to do

I have three major recommendations to address the content silo problem:

  • Integrate content development and delivery
  • Break down the organizational barriers
  • Break down presentational barriers

These are simple concepts, but executing them well is of course challenging. If you need help, contact us.

The recording of the session is available on the Content Wrangler’s BrightTALK channel. Janice Zdankus opens the session, followed by Lee LeFever of Common Craft, and then my part starts at around 72 minutes.

Content strategy vs. the undead

November 4, 2014 by

Implementing a content strategy can involve overcoming many challenges. Some of these challenges can be quite scary and hazardous to your strategy. In fact, overcoming these challenges is a lot like battling the undead.

Folklore and Hollywood portray the undead as hideous, evil creatures that we should fear and avoid. Unfortunately for content strategists, the undead that we encounter are unavoidable and need to be dealt with. Ignoring them or working around them can be detrimental if not fatal to your strategy. As silly as it may seem to compare workplace challenges to the likes of zombies and mummies, the similarities in their basic traits are uncanny.

Lego zombie hunter; image via Kenny Louie on Flickr (Creative Commons)

When battling the undead, make sure you have the right tools for the job.

I recently spoke at LavaCon about these ghoulish nasties, and afterward several people approached me to relay their own stories. They found the undead comparisons amusing, and the comparisons also helped them identify and categorize issues they had faced or are facing now.

So what are these so-called undead creatures in the content strategy world? I’ve broken them down into five categories.

Zombies represent a persistent resistance to change and comfort in monotony. Their will is strong and their mindset is contagious. Rather than spending your energy battling them, find out what drives them and harness that in your content strategy. They may have good intentions that you can build upon.

Vampires are very charismatic and feed off others for personal gain. They are often influential and will make promises so long as it benefits them to do so. As with the traditional monsters, the vampires we face in the content strategy world can be kept at bay if not defeated by shedding a bit of light on the situation. Connect your strategy to your company’s core mission and bottom line, and redirect the conversation back to this core intent.

Mummies are sleeping guardians that awaken when they perceive a threat to their charge. If you plan on replacing existing technology or significantly changing workflows, you may find yourself faced with opposition. Rather than confronting these mummies directly, acknowledge the successes that the outdated tools or workflows have brought. Ask the mummies to assist you in furthering that success.

Frankenstein’s creature can best be described as a patched-together mess of a solution that can become extremely unwieldy without constant attention and care. The creature could manifest as tools or files that are being used (or misused) beyond their intended capabilities, or as a workflow that is more convoluted than beneficial. Whatever they may be, they have one key thing in common: someone at some point thought it was a good idea and implemented it. As a content strategist, you need to undo the harm that these implementations have caused. Perhaps even more important, you need to prevent yourself from being a mad scientist like Victor Frankenstein when implementing your own strategy.

Ghosts are the fears and regrets that haunt us. We all have them in some form. It could be a past mistake that you hope not to make again, the fear of change or how it might be received, or the realization that the strategy you are implementing needs to change. Remind yourself that your strategy will certainly change over time as new tools emerge and as new information is discovered. As for the ghosts from past experiences, set them free by acknowledging and learning from them.

Have you grappled with the undead? Please share your stories in the comments.

XML product literature

October 27, 2014 by

Your industrial products become part of well-oiled machines. Unfortunately, your workflow for developing product literature may not be as well-oiled.

Using desktop publishing tools (such as InDesign) to develop product literature means you spend a lot of time applying formatting, designing complex tables, and so on. These time-consuming, manual chores:

  • Lengthen the amount of time it takes to get information to customers
  • Make it difficult to update information quickly
  • Provide many opportunities for introducing errors into content

XML-based workflows can solve these challenges in the development of product literature for machinery and industrial components. This post provides three examples of how XML can improve your processes for developing product literature:

  • Creating specification sheets
  • Managing common content across models, product lines, and departments
  • Handling OEM rebranding

Creating specification sheets

Putting together spec sheets and datasheets in a traditional desktop publishing environment is just painful. It’s easy to introduce significant errors by merely transposing digits in a part number, for example, and let’s not get into the horror of composing tables in a DTP tool. By the time you finish the layout, the information may be outdated—and customers haven’t even seen it yet!

“Machine Human” Maria in Metropolis (1927)

Part numbers, product dimensions, and the like often exist in a database (or multiple databases). It’s better to extract that content from the database as some kind of markup language and then insert it into the source files for product literature.

The exact process will vary depending on the database and other tools involved, but generally, you want a workflow that extracts the information from the database and formats it automatically. By eliminating the need for human intervention (typing information, applying formatting, and so on), you reduce the possibility of introducing errors and shorten the amount of time it takes to get content to customers.
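As a rough illustration of that kind of workflow, here is a minimal Python sketch that pulls part data from a database and writes it out as XML with no retyping. The table layout, the element names (specsheet, spec, and so on), and the sample part GB-4410 are all made up for this example; your database and markup will look different.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_spec_sheet(conn, part_number):
    """Return a <specsheet> element for one part, built from database rows."""
    row = conn.execute(
        "SELECT part_number, name FROM parts WHERE part_number = ?",
        (part_number,),
    ).fetchone()
    if row is None:
        raise KeyError(part_number)

    sheet = ET.Element("specsheet", {"part": row[0]})
    ET.SubElement(sheet, "title").text = row[1]
    specs = ET.SubElement(sheet, "specs")
    for label, value, unit in conn.execute(
        "SELECT label, value, unit FROM specs "
        "WHERE part_number = ? ORDER BY label",
        (part_number,),
    ):
        spec = ET.SubElement(specs, "spec", {"unit": unit})
        ET.SubElement(spec, "label").text = label
        ET.SubElement(spec, "value").text = value
    return sheet


# Hypothetical sample data standing in for the real product database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parts (part_number TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE specs (part_number TEXT, label TEXT, value TEXT, unit TEXT);
    INSERT INTO parts VALUES ('GB-4410', 'Gearbox');
    INSERT INTO specs VALUES ('GB-4410', 'Length', '320', 'mm');
    INSERT INTO specs VALUES ('GB-4410', 'Weight', '18.5', 'kg');
""")

sheet = build_spec_sheet(conn, "GB-4410")
print(ET.tostring(sheet, encoding="unicode"))
```

Because the values travel straight from the database into the markup, nobody has a chance to transpose digits along the way.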

Because the workflow is automated, you can also release updates more frequently. If you release specs or datasheets in electronic format (web pages, for example), you could set up nightly updates to distribute the latest information.

Managing common content across models, product lines, and departments

The different models of a product usually have shared features, and those common features can stretch across product lines that contain the same parts.

In a traditional desktop publishing environment, it’s very easy to end up with multiple versions of content about a particular part or feature because there is no “single source of truth.”

A modular content workflow eliminates this problem: you develop chunks of content and mix and match them for a particular information product (a user manual or web page, for example) according to product features. Generally, a component content management system (CCMS) manages the chunks, and authors can search the CCMS to find the modules of content they need.

Sharing content modules has two big benefits: the reuse means you’re reducing the amount of time it takes to develop content, and you present customers with consistent information within and across product lines.

Content chunks can also be shared across departments. For example, a table with specs for a part can appear in a user guide, a trade show handout, and on the web site. Even though that table may be presented with different formatting in those information products, the XML source is still the same for all. That’s the great benefit of XML-based content: formatting (usually applied through automated processes) is completely separate from the content itself.
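To make the idea of one source with several presentations concrete, here is a small Python sketch. It renders the same XML spec data once as an HTML table and once as plain handout text. The markup and the two rendering functions are invented for illustration; in practice this job usually falls to XSLT or a publishing pipeline rather than hand-written Python.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML source shared by every deliverable.
SOURCE = """<specs part="GB-4410">
  <spec><label>Length</label><value>320</value><unit>mm</unit></spec>
  <spec><label>Weight</label><value>18.5</value><unit>kg</unit></spec>
</specs>"""


def rows(specs):
    """Yield (label, value, unit) for each spec in the source."""
    for spec in specs.findall("spec"):
        yield (spec.findtext("label"),
               spec.findtext("value"),
               spec.findtext("unit"))


def to_html_table(specs):
    """Format the specs for the web site."""
    body = "".join(
        f"<tr><th>{escape(label)}</th>"
        f"<td>{escape(value)} {escape(unit)}</td></tr>"
        for label, value, unit in rows(specs)
    )
    return f"<table>{body}</table>"


def to_handout_text(specs):
    """Format the same specs for a plain-text handout."""
    return "\n".join(
        f"{label}: {value} {unit}" for label, value, unit in rows(specs)
    )


specs = ET.fromstring(SOURCE)
print(to_html_table(specs))
print(to_handout_text(specs))
```

When a spec changes, you update the source once and regenerate both outputs.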

You really need XML at the core of your content to implement industrial strength (ahem) modular processes. One XML content standard, the Darwin Information Typing Architecture (DITA), is specifically for developing modular technical content. Even if an XML standard isn’t an exact fit for your requirements, you can adapt and modify it. After all, the X in XML stands for “extensible.”

Handling OEM rebranding

If your company provides components to other companies in an OEM relationship, an XML workflow streamlines the rebranding of content.

The separation of content and formatting inherent in XML workflows means you don’t have to open up and modify multiple source files to change logos, corporate fonts and colors, and so on. Instead, you create a new automated formatting process (possibly using your company’s transformation as a starting point), or you apply the other company’s existing formatting transformation if they are already in XML. The correct formatting is applied automatically, which saves both companies a great deal of time and dramatically shortens the time to market for OEM equipment.

The separation of content and formatting has another huge benefit: decreased localization costs and faster release of localized content, because you eliminate the manual reformatting work associated with translating content.

XML workflows also provide mechanisms for quickly switching out company and product names, addresses, and so on. The modular nature of many XML workflows also enables a partner company to select just the chunks of information they need about an OEM component.
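One way to picture the name-switching mechanism is a set of placeholders that get filled in at publishing time. The Python sketch below assumes a made-up ph element with a key attribute and invented company details; real XML vocabularies have their own equivalents, so treat this only as an outline of the idea.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical topic: names and addresses are placeholders, not literal text.
TOPIC = """<topic>
  <p>The <ph key="product"/> is manufactured by <ph key="company"/>.</p>
  <p>Contact <ph key="company"/> at <ph key="address"/>.</p>
</topic>"""


def apply_brand(topic, brand):
    """Return a copy of the topic with every placeholder filled in."""
    branded = copy.deepcopy(topic)  # leave the shared source untouched
    for ph in branded.iter("ph"):
        key = ph.get("key")
        if key not in brand:
            raise KeyError(f"no value for placeholder {key!r}")
        ph.text = brand[key]
    return branded


def plain_text(elem):
    """Collapse an element to its text content for a quick check."""
    return " ".join("".join(elem.itertext()).split())


topic = ET.fromstring(TOPIC)
ours = apply_brand(topic, {
    "product": "X200 pump",
    "company": "Acme Hydraulics",
    "address": "1 Main Street",
})
partner = apply_brand(topic, {
    "product": "FlowMaster 9",
    "company": "Partner Industries",
    "address": "22 Harbor Road",
})
print(plain_text(ours.find("p")))
print(plain_text(partner.find("p")))
```

The source topic never changes; each company simply supplies its own set of values.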

Even if two companies are using two different “flavors” of XML, scripting can automate conversion. It is much easier to convert XML to XML than to convert content in one desktop publishing program to another.


Desktop publishing tools are wonderful for creating visually rich information. But for product literature, you need a system that produces attractive content, speeds up content production, eliminates tedious reformatting work, and streamlines translation.

XML is a better fit for product literature.

Need more information about how XML product literature can help your company? Contact us.

DITA localization for output (premium)

October 20, 2014 by

The first step in DITA localization is to translate the actual content of your DITA files. The second step is to address DITA localization requirements for your output. This article provides an in-depth explanation of the localization support in the DITA Open Toolkit.

The DITA Open Toolkit (DITA OT) includes several DITA localization features. When you set up your publishing system (and whenever you add new languages), you need to do the following:

  • Check the language-specific strings files
  • Ensure that language- or locale-specific images are accessible
  • Select typefaces for the target language

(Most of the information in this post applies to all versions of the DITA Open Toolkit. Information about specific file paths applies to the DITA OT version 1.8.)

Not all translatable strings are found in your content [Flickr: othree]

Check the language-specific strings files

When generating output, the DITA OT inserts text strings, such as “Chapter” or “Appendix”, types of admonitions (“Note”, “Warning”, “Caution”), text and slogans on the cover pages, and copyright messages. When the output is intended for a specific language, these pieces of text must match the output language. You want “Chapter 4” to render as “Capítulo 4” in Spanish, as “Chapitre 4” in French, or as “第4章” in Japanese.

To handle this, the strings used by the DITA OT are externalized, that is, they are stored in language-specific files that are separate from the rest of the XSL transforms. Each language (or language and locale) has one or more separate files. Usually, a core plugin provides a base set of strings, then plugins that are built on that core plugin can add their own strings. Within these files, each string has an identifier, which is not translated, and the string itself.

A large number of these strings are provided by the core DITA OT. For HTML-based transforms, the DITA OT supplies strings files for over 50 languages and locales; for PDF, support for 14 languages is included.

The default translated strings may not meet your needs. The words used in the strings may not align with the word choice, tone, emphasis, or punctuation your organization requires. Also, the PDF strings files are not consistently populated; not every string in the English strings files is translated in the strings files for other languages.

Additionally, there may be some strings for which there are no definitions in the core plugin strings files.

Work with your localization team to check the locale-specific strings files provided by the DITA OT. You may have to do this for strings used with core HTML and PDF plugins. If the editor or language checker recommends a change, you (or the localizer) should:

  • Identify the strings in the core strings files that you need to change.
  • Copy the elements that define those strings to the corresponding plugin strings file.
  • Change the string definition in the copied element to the new string.
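The effect of those three steps can be sketched in a few lines of Python: the plugin strings file holds only the strings you changed, and those definitions take precedence over the core ones. This is not the DITA OT’s own lookup code, and the Spanish sample strings are invented for the example; it only illustrates the override idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of a core strings file.
CORE_ES = """<strings xml:lang="es-es">
  <str name="Note">Nota</str>
  <str name="Caution">Atención</str>
</strings>"""

# Hypothetical plugin strings file: only the copied-and-changed element.
PLUGIN_ES = """<strings xml:lang="es-es">
  <str name="Caution">Precaución</str>
</strings>"""


def load_strings(xml_text):
    """Map each string identifier (name attribute) to its text."""
    root = ET.fromstring(xml_text)
    return {s.get("name"): (s.text or "") for s in root.findall("str")}


def effective_strings(core_xml, plugin_xml):
    """Start with the core strings, then let plugin definitions win."""
    merged = load_strings(core_xml)
    merged.update(load_strings(plugin_xml))
    return merged


strings = effective_strings(CORE_ES, PLUGIN_ES)
print(strings["Caution"])
print(strings["Note"])
```

Keeping your changes in the plugin file means the core files stay untouched, which makes upgrading the toolkit less painful.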

When generating output for new localizations, check the DITA OT log file for missing string errors. These will be in the target “transform.topic2fo.main” with the task identifier “[xslt]”. If you find that there are missing strings, you’ll need to add them to the plugin strings file, using the English definitions as a basis for the translation.

File structure for HTML strings files

As of DITA OT version 1.8, the language-specific strings files for the core HTML-based transforms are stored in %DITA-OT%/xsl/common. The file names are in the form strings-xx-yy.xml, where xx-yy is the language identifier as defined by IETF RFC 4646 and implemented by the ISO 639-1 language codes (this is the same language code as used in the xml:lang attribute). An additional file strings.xml (in the same folder) lists the language files that are currently in use.

Each HTML strings file has the form:

<?xml version="1.0" encoding="utf-8"?>
<strings xml:lang="xx-yy">
   <str name="identifier">String</str>
</strings>

Note that the file’s root element (<strings>) contains the xml:lang attribute, which specifies the language (as does the name of the file). Within the root element are one or more <str> elements. Each <str> element has a unique identifier (name attribute); contained in the <str> element is the text that is pushed into your output. The contents of the name attribute should NEVER be translated.

The file strings.xml has the form:

<?xml version="1.0" encoding="utf-8"?>
<langlist>
   <lang xml:lang="xx-yy" filename="strings-xx-yy.xml"/>
</langlist>

The strings.xml file contains one lang element for each supported language.

File structure for PDF string files

As of DITA OT version 1.8, the strings files for the core PDF-based transforms are stored in %DITA-OT%/plugins/org.dita.pdf2/cfg/common/vars. The file names are in the form xx.xml, where xx is the language identifier as defined by IETF RFC 4646 and implemented by the ISO 639-1 language codes.

Each PDF strings file has the form:

<?xml version="1.0" encoding="UTF-8"?>
<vars xmlns="">
   <variable id="identifier">String</variable>
</vars>

Each file contains one or more <variable> elements. Each <variable> element has a unique identifier (id attribute); contained in the <variable> element is the actual string. Some PDF strings may include one or more parameters which allow the transform to insert text into the strings. For example, the Italian strings file contains this entry for a figure title:

<variable id="Figure"> Figura <param ref-name="number"/>: <param ref-name="title"/></variable>

Note that the variable id attribute and the param element’s ref-name attribute should NEVER be translated.
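To see why the parameters must survive translation untouched, consider this Python sketch of how a string with parameters might be filled in. The DITA OT does this work in its XSL transforms, so the function below is only a stand-in written for this article, and the sample figure title is invented.

```python
import xml.etree.ElementTree as ET


def render_variable(variable, params):
    """Replace each <param> with the value named by its ref-name."""
    parts = [variable.text or ""]
    for child in variable:
        if child.tag == "param":
            parts.append(str(params[child.get("ref-name")]))
        parts.append(child.tail or "")
    # Trimming the outer whitespace is a choice made for this sketch.
    return "".join(parts).strip()


# The Italian entry quoted above.
italian = ET.fromstring(
    '<variable id="Figure"> Figura <param ref-name="number"/>: '
    '<param ref-name="title"/></variable>'
)

print(render_variable(italian, {"number": 3, "title": "Schema elettrico"}))
```

If a translator renamed number to numero, the lookup would fail because the transform still supplies a value called number.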

Make sure the translator understands that their job is only to translate the contents of the <str> or <variable> elements. They should not translate the attributes (apart from modifying contents of the xml:lang attribute), nor should they translate the comments (any text surrounded by “<!--” and “-->”).

Additionally, the strings may contain spaces or non-breaking spaces (usually represented with the character reference “&#160;”). These should remain just as they are in the original (as much as possible).

Most strings files contain comments and notes to the translator. In particular, some strings files contain paths to images; most of these are accompanied by a note NOT to translate the paths.

Additionally, the strings files may contain URLs for partner organizations or language- or locale-specific web sites. You may want to examine the contents of the strings files and determine which URLs should be made locale-specific and which should be left untouched.

When the strings files are returned from the translator, add the translated (and renamed) strings file to the plugin folders as described.

For HTML-based plugins you must also:

  • Ensure that the translator correctly modified the xml:lang attribute on the <strings> element in the file containing the translated strings.
  • Update the plugin-specific strings.xml file so that it contains a reference to the translated strings file. (You should run the integrator after updating this file.)
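Both HTML checks are easy to script. The Python sketch below is a hypothetical helper written for this article, not part of the DITA OT. It reports a wrong xml:lang value and a strings file that is not listed in strings.xml. The root element name in the sample list and the French sample string are assumptions, and the comparison is case-sensitive for simplicity.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def check_html_strings(strings_xml, langlist_xml, filename, expected_lang):
    """Return a list of problems found in a translated HTML strings file."""
    problems = []
    root = ET.fromstring(strings_xml)
    actual = root.get(XML_LANG)
    if actual != expected_lang:
        problems.append(f"xml:lang is {actual!r}, expected {expected_lang!r}")

    registered = {
        (lang.get(XML_LANG), lang.get("filename"))
        for lang in ET.fromstring(langlist_xml).iter("lang")
    }
    if (expected_lang, filename) not in registered:
        problems.append(
            f"{filename} is not listed in strings.xml for {expected_lang}"
        )
    return problems


# Hypothetical files: the translator forgot to update xml:lang, and
# strings.xml has not been updated yet.
TRANSLATED = """<strings xml:lang="en-us">
  <str name="Note">Remarque</str>
</strings>"""
LANGLIST = """<langlist>
  <lang xml:lang="en-us" filename="strings-en-us.xml"/>
</langlist>"""

problems = check_html_strings(TRANSLATED, LANGLIST, "strings-fr-fr.xml", "fr-fr")
for problem in problems:
    print(problem)
```

An empty list means both checks passed.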

For PDF-based plugins you must also:

  • Ensure that all strings in the English strings file exist in the strings file for your localization. If they don’t, you’ll need to provide these strings in your plugin’s strings files.
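Comparing two PDF strings files by hand is tedious, so a script can help. This Python sketch is a hypothetical helper with invented sample data. It lists the identifiers that exist in the English file but not in the localized one, and it flags any entry whose param ref-name values were altered in translation.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip a namespace prefix such as '{uri}variable' down to 'variable'."""
    return tag.rsplit("}", 1)[-1]


def variables(vars_xml):
    """Map each variable id to the sorted ref-names of its params."""
    result = {}
    for elem in ET.fromstring(vars_xml).iter():
        if local(elem.tag) == "variable":
            refs = sorted(
                p.get("ref-name") for p in elem.iter() if local(p.tag) == "param"
            )
            result[elem.get("id")] = refs
    return result


def compare(english_xml, localized_xml):
    """Return (missing ids, ids whose params differ from English)."""
    en, loc = variables(english_xml), variables(localized_xml)
    missing = sorted(set(en) - set(loc))
    changed = sorted(k for k in en.keys() & loc.keys() if en[k] != loc[k])
    return missing, changed


# Hypothetical excerpts of an English and an Italian strings file.
ENGLISH = """<vars>
  <variable id="Figure">Figure <param ref-name="number"/>: <param ref-name="title"/></variable>
  <variable id="Note">Note</variable>
  <variable id="Warning">Warning</variable>
</vars>"""
ITALIAN = """<vars>
  <variable id="Figure">Figura <param ref-name="numero"/>: <param ref-name="title"/></variable>
  <variable id="Note">Nota</variable>
</vars>"""

missing, changed = compare(ENGLISH, ITALIAN)
print("Missing:", missing)
print("Params changed:", changed)
```

Anything in the first list needs a translation in your plugin’s strings file; anything in the second needs its parameters restored.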

Ensure that locale- or language-specific images are available

Just as the DITA OT inserts strings into output when necessary, it can also insert icons and other images as required; for example, icons for admonitions (notes and hazard statements) and company logos in page headers or footers.

Most icons and images are intended for use in all languages. But sometimes, specific icons are required for a locale or language. These reasons may include:

  • Icons or images that include language-specific text
  • Icons or images that are culturally sensitive

What do you have to do?

If you need to substitute images based on the output language, do the following:

  • Ensure that locale- or language-specific image files are available in the appropriate artwork folder
  • Ensure that the paths to the output location of these image files are saved as strings in the language-specific strings files. Generally, the path to each image will be the same except for the file name.

Select typefaces for the target language

To generate PDF files, the transforms need typeface specifications. The DITA OT allows us to define classes of typefaces (“logical fonts”) that are associated with specific types of text. For instance, you might define that your body text uses a serif font, titles use a heavy-weight sans serif font, and that running heads use a lighter form of that same sans serif font.

Each of the logical fonts is associated with a physical font. The physical fonts are often determined by the style guidelines for your company or organization; they ensure that your information products project a consistent look and feel.

The fonts you select must support all characters used by the target localization.

If you are creating a localization for a language that requires extensive use of a non-Western character set, you may need to:

  • Identify typefaces that are associated with your organization’s look and feel in specific locales.
  • Specify how those typefaces are to be associated with specific text applications. That is, the fonts that will be used for body text, titles, heads, and so on.
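The relationship between logical and physical fonts can be pictured as a lookup with per-language overrides. The Python sketch below is only a model of that idea, not the DITA OT’s font configuration format, and every font name in it is a placeholder.

```python
# Default physical font for each logical font (all names are placeholders).
DEFAULT_FONTS = {
    "body": "Example Serif",
    "title": "Example Sans Bold",
    "running-head": "Example Sans Light",
}

# Overrides for languages whose characters the defaults do not cover.
LOCALE_FONTS = {
    "ja": {
        "body": "Example Mincho",
        "title": "Example Gothic Bold",
    },
}


def physical_font(logical, lang):
    """Pick the physical font for a logical font in a given language.

    Tries the full code (ja-jp), then the primary language (ja),
    then falls back to the default mapping.
    """
    primary = lang.split("-")[0].lower()
    for key in (lang.lower(), primary):
        fonts = LOCALE_FONTS.get(key, {})
        if logical in fonts:
            return fonts[logical]
    return DEFAULT_FONTS[logical]


print(physical_font("body", "ja-JP"))
print(physical_font("running-head", "ja-JP"))
print(physical_font("title", "fr-FR"))
```

Note that the Japanese running head falls back to the default font here. A gap like that is worth catching during setup, because the default typeface may not contain the characters the language needs.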


When localizing your DITA content, remember that DITA OT plugins do contain localized information. The strings, images, icons, and fonts that are a part of your final work products must be translated or localized with the same care and cultural sensitivity as your content.

For more information about localizing your plugins, localizing your content, or developing a content strategy to facilitate localization, contact us at

The case for XML marketing content

October 14, 2014 by

What’s the first thing that comes to your mind when I say “XML and content”? If large technical documents and back-end databases pop into your mind, you’re in good company. But many content-heavy groups can benefit from adopting XML. Marketing is one of these groups.

If you’ve ever worked in or with a marketing department, you are painfully aware of the vast amount of content that’s produced. From web and social media to brochures, catalogs, and product sheets, marketing content comes in all shapes and sizes.

To be effective, marketing content needs to echo similar information yet cater to varying audiences or suit different purposes (product sheets, web content, and other promotional material, for example). The design of the finished content may widely vary, but the information needs to be current and accurate.

This can be quite a lot to manage—regardless of how large or small the team—and takes time and diligence to ensure that the correct information is used at all times. Even with centralized project folders and shared information sources, the chance of human error is high. One change, such as a version number or a small product update, may need to be made in dozens of places. Chasing all of these uses down is time consuming and inefficient.

One benefit of XML marketing content is the separation of form and content. This allows you to focus on the message and not the look and feel of one particular final product. Through semantic tagging, you can mark up your content according to what it is (“product tag line”) rather than how it looks (“14pt italic”) and then render it in a variety of ways. The focus is on the content itself, leaving the look and feel to templates and transforms. Once you flow the content into the template, you can still modify the formatting while reaping the benefit of managed, centralized content.

Batman slapping Robin; XML marketing content can help you break the copy/paste habit

Another benefit is content reuse and the conditional inclusion and exclusion of content. Not all content is created equal; sometimes you need to omit some information in favor of other content. You could certainly manage this with cut and paste and a bit of editing, but then you’d be managing multiple sets of content. With all content in one place and tagged for specific uses, you can assemble what you need through content references and conditionally exclude portions that you don’t need.
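Here is a minimal Python sketch of conditional exclusion, assuming a made-up audience attribute and invented product copy. Content tagged for another deliverable is dropped, and untagged content appears everywhere. Real XML vocabularies define their own filtering attributes, so this only shows the principle.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical single source for several marketing deliverables.
SOURCE = """<product>
  <tagline>Built to last.</tagline>
  <p audience="brochure">Ask your dealer about spring pricing.</p>
  <p audience="web">Order online today.</p>
  <p>Version 4.2 ships with a five-year warranty.</p>
</product>"""


def build_deliverable(source, audience):
    """Return a copy of the source without content tagged for others."""
    result = copy.deepcopy(source)  # the shared source stays intact
    for parent in list(result.iter()):
        for child in list(parent):
            tagged = child.get("audience")
            if tagged is not None and audience not in tagged.split():
                parent.remove(child)
    return result


source = ET.fromstring(SOURCE)
web = build_deliverable(source, "web")
for p in web.findall("p"):
    print(p.text)
```

The version number lives in exactly one place, so a product update means one change instead of dozens.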

If you localize your content, you’ll benefit from significant cost savings. Content reuse not only reduces the number of words requiring translation; it can reduce the chance of fuzzy matches against translation memory that are usually introduced by formatting inconsistencies and manual line breaks. And, since XML is raw text, there are no DTP-associated costs or delays accompanying your translation.

An XML workflow can benefit any group with many content deliverables, hefty translation requirements, and the need to reuse information in multiple places. If you’re feeling overwhelmed with your existing workflow, contact us to see if XML might be a good fit for you.