Scriptorium Publishing

content strategy for technical communication

Strange bedfellows: InDesign and DITA

January 27, 2014 by

or, What you need to know before you start working on a DITA to InDesign project.

There are a lot of ways to get your DITA content rendered into print/PDF. Most of them are notoriously difficult; DITA to InDesign, though, may have the distinction of the Greatest Level of Suck™.

InDesign XML formats

InDesign provides several XML formats. InDesign Markup Language (IDML) is the most robust. An IDML file is a zip container (similar to an EPUB). If you open up an IDML archive, you’ll find files that define InDesign components, such as pages, spreads, and stories. If you save a regular InDesign file to IDML, you can reopen the IDML file and get back your InDesign file, complete with layouts, graphics, formatting, customizations, and so on.
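Because an IDML file is just a zip container, you can look inside one with any zip library. Here is a minimal Python sketch (not part of our converter) that counts the entries in a package by top-level folder. The demo archive built in memory is only a stand-in: its entry names follow the usual IDML layout as I understand it, and its contents are placeholders, not a valid InDesign document.

```python
import io
import zipfile
from collections import Counter


def summarize_idml(package):
    """Count the entries in an IDML package by top-level folder."""
    counts = Counter()
    with zipfile.ZipFile(package) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            top = name.split("/", 1)[0] if "/" in name else "(root)"
            counts[top] += 1
    return dict(counts)


def build_demo_package():
    """Build a tiny stand-in archive in memory (placeholder content only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "placeholder")
        archive.writestr("designmap.xml", "<Document/>")
        archive.writestr("Stories/Story_u1.xml", "<Story/>")
        archive.writestr("Stories/Story_u2.xml", "<Story/>")
        archive.writestr("Spreads/Spread_u3.xml", "<Spread/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(summarize_idml(build_demo_package()))
```

Pass the path of a real .idml file to summarize_idml to see how many stories and spreads your own document produces.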

IDML is both a file format and a markup language. The IDML language is used inside the IDML file. In addition, a subset of IDML markup is used in InCopy files (ICML). Where IDML can specify the entire InDesign file, ICML just describes a single text flow.

(There is also INX, but that format is for older versions of InDesign and has now been deprecated.)

If you are planning to output from DITA to InDesign, you probably want ICML. The IDML language is used in both IDML and ICML files. The IDML specification is available as a very user-friendly PDF on Adobe’s site. I spent many not-glorious hours plowing through that document.

My best tip: If you need to understand how a particular InDesign component is set up in IDML, create a small test file and then save the file out to InCopy (ICML) format. This will give you an almost manageable snippet to review. You’ll find that InDesign includes all possible settings in the exported file. When you create your DITA-to-ICML converter, you can probably create a snippet that is 90 percent smaller (and includes much less stuff). The challenge is figuring out which 10 percent you must keep.

Understanding the role of InDesign templates

Use an InDesign template to specify page masters, paragraph styles, character styles, table styles, and more. This template becomes your formatting specification document.

To import XML content, do the following:

  1. Create an ICML/IDML file that contains references to paragraphs and other styles (more on this later).
  2. In InDesign, open a copy of the template file.
  3. Place the ICML file in your template copy. The style specifications in the template are then applied to the content in the ICML and you get a formatted InDesign file.

Of course, this nifty three-step procedure elides many months of heartbreak.

The mapping challenge

A basic paragraph, in DITA, looks like this:

<p>Paragraph text goes here.</p>

The equivalent output in IDML is this:
<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/body">
   <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
     <Content>Paragraph text goes here.</Content>
     <Br/>
   </CharacterStyleRange>
</ParagraphStyleRange>

Some things to notice:

  • The inline formatting (CharacterStyleRange) is specified even when there is no special formatting.
  • The content is enclosed in a <Content> tag.
  • The <Br/> tag toward the end is required. Without it, the paragraphs are run together. In other words, if you do not specify a line break, InDesign assumes that you do not want line breaks between paragraphs.
  • Extra whitespace inside the <Content> tag (such as tabs or spaces) will show up in your output. You do not want this.
  • Managing space between paragraph and character tags is highly problematic.
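To make the points above concrete, here is a minimal sketch of the paragraph mapping in Python. Our actual converter is XSLT inside a DITA Open Toolkit plugin, so treat this only as an illustration. The function names are hypothetical, the style names come from the sample above, and the sketch handles plain text only (no inline elements).

```python
import xml.etree.ElementTree as ET

NO_CHAR_STYLE = "CharacterStyle/$ID/[No character style]"


def paragraph_to_icml(text, paragraph_style="body"):
    """Return a ParagraphStyleRange element for one paragraph of plain text."""
    para = ET.Element(
        "ParagraphStyleRange",
        AppliedParagraphStyle="ParagraphStyle/" + paragraph_style,
    )
    # The character range is written even though no inline formatting applies.
    run = ET.SubElement(para, "CharacterStyleRange",
                        AppliedCharacterStyle=NO_CHAR_STYLE)
    content = ET.SubElement(run, "Content")
    # Collapse tabs, newlines, and repeated spaces so that pretty-printing
    # in the source does not show up in the output.
    content.text = " ".join(text.split())
    # Explicit paragraph break; without it the paragraphs run together.
    ET.SubElement(run, "Br")
    return para


def dita_p_to_icml(dita_xml):
    """Convert a DITA <p> that contains only text."""
    p = ET.fromstring(dita_xml)
    return paragraph_to_icml("".join(p.itertext()))


if __name__ == "__main__":
    element = dita_p_to_icml("<p>\n\tParagraph   text goes here.\n</p>")
    print(ET.tostring(element, encoding="unicode"))
```

Collapsing whitespace this aggressively is only safe for a paragraph with no inline children. Once character ranges sit next to each other, the spaces between them matter, which is exactly why the last bullet calls that area problematic.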

Other important information:

  • You must declare the paragraph and character tags you are using at the top of the IDML file in the RootParagraphStyleGroup and RootCharacterStyleGroup elements, respectively.

    <RootCharacterStyleGroup>
       <CharacterStyle Self="CharacterStyle/$ID/[No character style]" Name="$ID/[No character style]"/>
    </RootCharacterStyleGroup>
    <RootParagraphStyleGroup>
       <ParagraphStyle Self="ParagraphStyle/body" Name="body"/>
    </RootParagraphStyleGroup>

  • You cannot nest character tags in InDesign. Therefore, if you have nested inline elements in DITA, you must figure out how to flatten them:

    <b><i>This is a problem in InDesign</i></b>

    <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/BoldItalic">
          <Content>You have to create combination styles in InDesign.</Content>
    </CharacterStyleRange>

  • Generally, you will have more InDesign paragraph styles than DITA elements because DITA (and XML) have hierarchical structure. For example, a paragraph p tag might be equivalent to a regular body paragraph, an indented paragraph (inside a list), a table body paragraph, and more. You have to use the element’s context as a starting point for mapping it to InDesign equivalents.
  • In addition to using hierarchical tags, if you want to maintain compatibility with specializations, you must match on class attributes rather than element names. That leads to some highly awkward XSLT template match statements, such as:
    <xsl:template match="*[contains(@class,' topic/ul ')]/*[contains(@class,' topic/li ')]">

  • In addition to paragraph and character styles, you need to declare graphics, cell styles, table styles, object styles, and colors. (There may be more. That’s what I found.)
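Three of the points above (class-attribute matching, context-dependent paragraph styles, and flattening nested inline elements) can be sketched together. This is a hedged Python illustration, not our XSLT. The style names other than body and BoldItalic are hypothetical, and the sample assumes class attributes are already present on the elements, as they are after DITA processing.

```python
import xml.etree.ElementTree as ET

# Hypothetical style names; a real project takes these from its template.
PARAGRAPH_STYLES = {
    ("topic/li", "topic/p"): "list_body",
    ("topic/entry", "topic/p"): "cell_body",
}
CHARACTER_STYLES = {
    frozenset(["hi-d/b"]): "Bold",
    frozenset(["hi-d/i"]): "Italic",
    frozenset(["hi-d/b", "hi-d/i"]): "BoldItalic",
}
INLINE_TOKENS = ("hi-d/b", "hi-d/i")
NO_CHAR_STYLE = "$ID/[No character style]"


def has_class(element, token):
    """Same idea as contains(@class, ' topic/p ') in XSLT."""
    return (" " + token + " ") in element.get("class", "")


def paragraph_style(parent, element):
    """Choose a paragraph style from the element and its parent context."""
    for (parent_token, child_token), style in PARAGRAPH_STYLES.items():
        if (parent is not None and has_class(parent, parent_token)
                and has_class(element, child_token)):
            return "ParagraphStyle/" + style
    return "ParagraphStyle/body"


def flatten_inline(element, active=frozenset()):
    """Yield (style, text) runs; nested inline tags become one combined style."""
    here = active | {t for t in INLINE_TOKENS if has_class(element, t)}
    style = "CharacterStyle/" + CHARACTER_STYLES.get(here, NO_CHAR_STYLE)
    if element.text:
        yield style, element.text
    for child in element:
        yield from flatten_inline(child, here)
        if child.tail:
            yield style, child.tail  # text after a child belongs to this level


SAMPLE = (
    '<li class="- topic/li ">'
    '<p class="- topic/p ">Plain <b class="+ topic/ph hi-d/b ">bold '
    '<i class="+ topic/ph hi-d/i ">both</i></b> tail</p></li>'
)

if __name__ == "__main__":
    item = ET.fromstring(SAMPLE)
    para = item[0]
    print(paragraph_style(item, para))
    for run in flatten_inline(para):
        print(run)
```

Note the spaces around the token in has_class: without them, a test for topic/p would also match topic/ph. Each run that flatten_inline yields corresponds to one CharacterStyleRange in the output, so the InDesign template needs a combination style for every nesting that can occur in your content.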


Tables are not your friend. InDesign uses a particularly…unique table structure, in which it first declares the table grid and then just lists off all the cells. The grid coordinates start at 0:0. (Most “normal” table structures group the cells into rows explicitly.)

<Table TableDirection="LeftToRightDirection" Self="aaa" ColumnCount="2">
         <Row Self="bbb" Name="0"/>
         <Row Self="ccc" Name="1"/>
         <Column Self="ddd" Name="0" SingleColumnWidth="100"/>
         <Column Self="eee" Name="1" SingleColumnWidth="100"/>

         <Cell Self="fff" RowSpan="1" ColumnSpan="1">
            <ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/cell_center">
               <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
                  <Content>first cell content goes here</Content>
          [...much more here...]

As you can see, this gets complicated fast.
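The grid-then-cells structure is easier to see when you generate it. The following Python sketch (again, an illustration rather than our XSLT) turns a row-oriented table into the flat form: rows and columns declared first, then every cell listed with a grid coordinate. Two assumptions to check against the IDML specification: that the cell coordinate goes in the Name attribute, and that its order is column:row. The sketch also leaves out the Self identifiers and most of the attributes that a real file carries.

```python
import xml.etree.ElementTree as ET

NO_CHAR_STYLE = "CharacterStyle/$ID/[No character style]"


def rows_to_idml_table(rows, column_width="100", cell_style="cell_center"):
    """Flatten a list of rows (each a list of cell strings) into a Table."""
    column_count = max(len(row) for row in rows)
    table = ET.Element("Table", ColumnCount=str(column_count))

    # 1. Declare the grid: one Row and one Column element per index.
    for r in range(len(rows)):
        ET.SubElement(table, "Row", Name=str(r))
    for c in range(column_count):
        ET.SubElement(table, "Column", Name=str(c),
                      SingleColumnWidth=column_width)

    # 2. List the cells; nothing groups them into rows.
    for r, row in enumerate(rows):
        for c, text in enumerate(row):
            cell = ET.SubElement(
                table, "Cell",
                Name=f"{c}:{r}",  # assumed column:row order
                RowSpan="1", ColumnSpan="1",
            )
            para = ET.SubElement(
                cell, "ParagraphStyleRange",
                AppliedParagraphStyle="ParagraphStyle/" + cell_style,
            )
            run = ET.SubElement(para, "CharacterStyleRange",
                                AppliedCharacterStyle=NO_CHAR_STYLE)
            ET.SubElement(run, "Content").text = text
    return table


if __name__ == "__main__":
    table = rows_to_idml_table([["first", "second"], ["third", "fourth"]])
    for cell in table.findall("Cell"):
        print(cell.get("Name"), cell.find(".//Content").text)
```

Spanned cells are where this gets harder: a DITA table row that contains a spanning entry has fewer entries than columns, so the converter has to track which grid positions are already occupied before it assigns the next coordinate.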

There is so much more, but I think you get the idea. It is definitely possible to create a DITA to InDesign pipeline, but it is challenging. If you are looking at a project like this, you will need the following skills:

  • Solid knowledge of InDesign
  • Solid knowledge of the DITA tag set
  • Ability to build DITA Open Toolkit plugins, which means knowledge of Ant and XSLT at a minimum

The open source DITA4Publishers project provides a pipeline for output from DITA to InDesign. We looked at using it as a starting point in mid-2013. At the time, we found that it would be too difficult to modify DITA4Publishers to support the extensive customization layers required by our client.

Our DITA to InDesign converter is a DITA Open Toolkit plugin (built on version 1.8). It supports multiple unrelated templates for different outputs and specialized content models. It also includes support for index markers, graphics, and other items not addressed in this document. Scriptorium is available for custom plugin work on InDesign and other output types. For more information, contact us.


Trends in technical communication, 2014 edition

January 21, 2014 by

Our annual prognostication, along with an assessment of our predictions from last year.

2013 in review

Our predictions from 2013:

  • Velocity: the requirement for faster authoring, formatting, publishing, delivery, and updates is forcing tech comm into significant changes. This was a theme in our presentations and consulting projects this year.
  • Mobile requirements change tech comm. This is happening, but more is needed.
  • Rethinking content delivery. Again, I’d like to see more.
  • Bill had PDF continuing to thrive, and continued growth of localization requirements, along with another trend related to mobile.

Looking back, these predictions seem quite cautious, but I think they are largely accurate as trends.

On to 2014…


Trend 1: People like their silos.

four silos in a row

flickr: docsearls

Despite increased talk about breaking down silos, people still like them. Working in silos gives those working within them and managers overseeing them a sense of control over their content, whether real or perceived. Technology investments within individual silos will continue for the foreseeable future. This technology will need to “talk” to the technology used within other silos—using a common format—to efficiently share content. While people may like their silos, executive management is growing less fond of them, regarding them as roadblocks to collaboration and contributing factors to excess overhead costs. Looking forward, we will start to see technology investments across or outside of silos that further centralize content management and ease the burden of reusing content across groups.


Trend 2: Reorienting toward a customer perspective of content

looking down into a silo

flickr: neurollero

Organizations are beginning to look at content from the customer’s point of view. Customers just want information; they don’t care about the internal differences between a knowledge base article and a topic from tech comm. As a result, the pressure to integrate these diverse roles and deliver unified information (usually to the corporate web site) is increasing. Executives are aware of these issues, and they do not want to hear about how the internal structure of an organization makes it problematic to deliver what they want.


Trend 3: Blurring of tech comm, marcom, and content strategy

blurred silos at night

flickr: mahalie

The term “content strategy” has been heavily used by those working in tech comm and marcom over the past few years, though the focus has differed. In the scope of tech comm, content strategy primarily involves the process of creating, managing and producing content. The marcom focus of content strategy has traditionally been on audience engagement. The line between these two camps is now blurring. Tech comm is increasingly interested in and responsible for tracking the effectiveness of content, and marcom has an increased awareness of the content management lifecycle and of localization requirements. We’ll see an increase in collaboration between tech comm and marcom as the lines continue to blur, and perhaps a new discipline will emerge as the trend continues.


Trend 4: Apps are winning over HTML5

brick silo

flickr: tinfoilraccoon

This one requires context. A lot of our customers are building apps for their content. In particular, they are using apps for information that is delivered to staff technical support or field services people—employees who go to customer sites and install or fix things. For that specific use case, most companies are opting for an app because:

  • The company provides the employee with a device (usually a tablet) with the technical content. As a result, the target device is known—the company can create a single app on a single operating system.
  • They want functionality, especially integration with other systems, that is easier to achieve with an app than with HTML5.
  • They like the control provided by the app update process.

Does this fly in the face of the bring your own device (BYOD) trend? Yes, it does.


Trend 5: Lots of creativity in output—not source

Silos with art and a QR code

flickr: cstreetus

Business and consumer demands drive content requirements, and the rate at which these demands change is increasing. New apps, new delivery formats, and new use scenarios are constantly introducing new requirements. To respond to these evolving requirements, content developers will need to get creative on the output side while standardizing on a simpler, cleaner source. This will ensure that content is poised and available for transformation and production to required formats.


Trend 6: Content can be an asset—or a liability

Looking at sky from inside silo

flickr: alisonpostma

Like other content strategists, we have been arguing for some time that content is a corporate asset and should be managed properly. But “content is an asset” doesn’t tell the whole story because the corollary is that:

Bad content is a liability.

Organizations are beginning to recognize that their content can help or hurt them. They ignore content at their peril.

What do you think of our trends? Did we miss one?

Webcast: Trends in technical communication 2014

January 17, 2014 by

In this webcast recording, Sarah O’Keefe and Bill Swallow of Scriptorium Publishing discuss what’s new in technical communication. Alan Pringle moderates.

Trend 1: People like their silos.
Trend 2: Reorienting toward a customer perspective of content
Trend 3: Blurring of tech comm, marcom, and content strategy
Trend 4: Apps are winning over HTML5
Trend 5: Lots of creativity in output—not source
Trend 6: Content can be an asset—or a liability

“You don’t own me—or your content!”: the motivated consumer’s mantra

January 13, 2014 by

I love Downton Abbey. I love my Honda Fit.

And I will consume content about those things—even when their creators would prefer I not.

Today’s digital world means that information is global. Sure, ITV can release Downton Abbey in the UK during the last quarter of a year, and PBS can rebroadcast it in the US just a few weeks later. But if ITV and PBS think they can keep the show from American audiences during the initial UK broadcasts, they are very mistaken.

I saw information about the fourth season of Downton Abbey in my Facebook news feed while it was airing in the UK, and headlines about the show from UK-based sites made their way into my RSS feeds. Also, I’ve heard it’s not too hard to find the latest episodes on various not-entirely-legal file sharing sites. Die-hard fans in the US aren’t going to let intellectual property concerns block their immediate access to the Dowager Countess’s sharp tongue and poor Edith’s latest jilting. Many folks over here watch the show pretty much in real-time as it airs in the UK, even though I’m sure PBS would strongly prefer that Americans watch the official US broadcasts, which seem to correspond with pledge drive season (ahem).

Meanwhile, in Japan, Honda released the third generation of its compact hatchback, the Fit (known as the Jazz in some markets), in September. That new model won’t be available in the US until later this year, and Honda just launched a site to give US residents a glimpse of the new model. If I kindly give Honda my email address and phone number, they will send me more information on the new model. Oh really, Honda?

I don’t need to cough up my personal information to see photos and specs for the new Fit. I can check out non-Honda sites and blogs (such as this post from July 2013) to get information on the car and what I can expect in the version released in the US.

As content creators and managers, we have to realize that motivated consumers are going to ferret out information through unofficial (and even legally dubious) channels—particularly in situations where releases are staggered across different geographical markets. While there are legitimate reasons for releasing products at different times across world markets (and localization lag time is not a good reason, BTW), our product and content strategies must account for those who really want information—and don’t give one whit if it comes first from an unofficial source.

You cannot control the flow of content once it’s out there or pretend like it doesn’t exist because you didn’t release it. Copyright issues aside, you can’t hold on to antiquated notions about “owning” information in the Internet age. Consumers will speed right by you if you’re not giving them the information they want when they want it—and they may never turn back to your official content again.

P.S. My apologies to Lesley Gore for appropriating the title of her hit song.

Will it blend? Content management software and localization companies

January 6, 2014 by

Vasont, TransPerfect, and Astoria. Really??

Disclaimer: This post is complete speculation. I have no useful inside information to work with regarding the merger.

As you may have heard, TransPerfect recently acquired Vasont. (The press release uses words like “merge” and “integrate” and carefully avoids the A-word.) This is the second component CMS that TransPerfect has acq…merged with in the past few years. The first one was Astoria in 2010.

Thus, TransPerfect now has a lineup of localization services, translation management software, and two component CMSes.

Localization service providers (LSPs) are facing a generally difficult market—there’s a ton of demand for localization, but vendors are squeezed because of the following factors:

  • Most customers focus on price (pennies per word) and not quality.
  • Increased use of machine translation (sometimes on the client side, sometimes on the vendor side).
  • Increased use of automated formatting (based on XML), which greatly reduces the revenue stream from desktop publishing.
  • Use of technologies that support incremental translation and better pre-translation matching (thus reducing the total number of words to be translated by the vendor).

Given these challenges, it seems logical to extend the LSP’s revenue stream with any or all of the following:

  • Localization software, such as translation management and terminology management systems
  • Content creation software, such as content management systems
  • Professional services, such as systems integration, content strategy consulting, and so on

Actually making this happen is challenging for the LSP because:

  • Selling enterprise software is different from selling localization services.
  • The LSP must become a trusted partner rather than a commodity supplier.
  • To sell software or services related to content development, the LSP must be involved at the beginning of the content lifecycle. Most localization services are sold at the end of the content lifecycle.
  • Most clients have separate content and localization roles, which makes it difficult for the LSP to cross the gap from the (usually late in the cycle) localization manager to the (usually early in the cycle) tech comm or marcom manager.

If the general strategy is “move upstream in the content lifecycle,” then acquisition of content-development technologies makes a whole lot of sense. What seems weird to me is the acquisition of two component CMS companies. Why would TransPerfect do this?

Disclaimer #2: Transitioning from “pure speculation” to “magical thinking.” Consider yourself warned.

Revenue? I think not.

TransPerfect has been on the Inc. 5000 list as a rapidly growing company for several years running. The company is privately held, but Inc. has their 2012 revenue as $341.3M, up from $220M in 2009. That works out to around 15% annual growth. To keep that growth rate going, TransPerfect would be looking for roughly $50M in new revenue in 2013 and $60M in 2014. It’s extremely difficult to find revenue information about Vasont (and for bonus points, the company has both a sister company and a parent company), but it looks as though revenues are somewhere in the $6M to $7M range based on a couple of moderately sketchy sources. The extreme best case scenario is that Vasont/Progressive/Magnus/Whatever contributes around $10M in new revenue.
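For anyone who wants to check the arithmetic, here is a small Python sketch. It uses only the revenue figures quoted above; the function names are mine. Compounded over the three years from 2009 to 2012, the growth rate comes out just under 16 percent, and applying a flat 15 percent to the 2012 figure gives the new revenue needed in each of the next two years.

```python
def compound_annual_growth(start, end, years):
    """Compound annual growth rate between two revenue figures."""
    return (end / start) ** (1 / years) - 1


def new_revenue_needed(base, rate, years):
    """New revenue (same units as base) needed each year to sustain a rate."""
    needed = []
    for _ in range(years):
        increase = base * rate
        needed.append(round(increase, 1))
        base += increase
    return needed


if __name__ == "__main__":
    # Revenue in millions of dollars, as reported by Inc.
    print(f"{compound_annual_growth(220.0, 341.3, 3):.1%}")
    print(new_revenue_needed(341.3, 0.15, 2))
```

Against those targets, even the best-case $10M from Vasont covers only a fraction of a single year’s required growth.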

Could it be the technology?

After a deeply nonrigorous search, I could not locate any patents for Vasont, Progressive Information Technologies, or Magnus Group. It’s possible that Vasont has developed technology in the CCMS space that is interesting but unpatented.

What about Astoria?

Will TransPerfect maintain two separate CCMSes? This seems thoroughly inefficient, but it’s not clear to me that it’s even possible to combine the two systems into a single one.

TransPerfect could possibly market the two systems differently to appeal to different customers. For example, Astoria might be the SaaS solution and Vasont the on-premises solution. Or maybe one system would be positioned as the “enterprise” system and one as the “small-to-medium business” system.

From a software product management point of view, none of these options makes a whole lot of sense. Even if TransPerfect intends to keep developing both systems separately, they face an uphill battle in convincing potential buyers of their plans. A few years back, SDL acquired two CCMS systems. They repositioned Contenta for “non-DITA” and S-1000D solutions and left Trisoft (LiveContent Architect) in the DITA space, thus separating the two systems by content model and industry vertical. The transition was difficult for some customers.

It must be about localization sales.

I’ve reached the conclusion that this acquisition is about sales. Specifically, it’s about localization sales. If TransPerfect is selling CCMSes to various companies, that provides them with a logical pipeline of prospects for translation management systems and localization services. The $10M or so that each CCMS might produce in annual revenue is simply the entry fee for access to potential new customers for bigger and better things.

In this context, buying up direct competitors and leaving them more or less “as is” makes some amount of sense.

But will it blend??

I’m not sure this is going to work. Even if TransPerfect intends to keep both systems under development, Vasont and Astoria’s competitors will certainly highlight the risk of buying a CCMS that has an in-house competitor.

Combining the two systems and creating something that provides the best of both systems—call it “Vastoria” or “Assont”—would make more sense in the long term. Perhaps they could organize an in-house death match? (“And may the odds be ever in your favor…”)


The temperature check

December 24, 2013 by

This anonymous guest post is part of the Blog Secret Santa project. There’s a list of all Secret Santa posts, including one written by Sarah O’Keefe, on Santa’s list of 2013 gift posts.

I recently took a trip to the emergency room, and there it was: The How Are You Feeling chart. Ten yellow faces, ranging from terrified shriek to cheerful giggle. In case you’re wondering, I picked #7, but that’s neither here nor there.

People don’t always have words for what they’re feeling, but the feelings are still there. And just like they matter when you’re huddled on a plastic chair in the ER, they matter when you’re working on content, too.

We have so much to keep track of in our content strategy projects. Things like: *Who do we want to reach?* and *What kinds of content do we need to reach them?* and *Which resources do we have at our disposal?*

We want our content to be relevant, up to date, aligned with business objectives, and so much more. ALL of those things are important. So how can we integrate feelings? I’d like to suggest something I call the Temperature Check. It’s not quite the same as the How Are You Feeling chart, but it’s close.

I started using the Temperature Check during a recent client workshop, when I noticed people’s expressions changing as we talked through our content plan. Sometimes they smiled and nodded. Sometimes they looked irritated. Sometimes their eyes glazed over.

In a brief burst of inspiration, I pulled up our draft plan and asked them to describe how they felt about each part of it: Love it, Hate it, or Meh.

Hello, feelings. It was like a key that unlocked important new insights into what we were doing. Though we had already put our plan through several other filters, this exercise helped us to see nuances we had missed before. We took a closer look at the things marked “Hate it” and asked why. Some of them we hated because we’d failed at them before, or our process for getting them done was painful. We talked about how we could change that. Some of them were things we had convinced ourselves we “should” do, which led to an excellent discussion about whether we were being lazy, or if there was a good reason to eliminate them from our plan. Negative emotions have a whole lot of wisdom in them, if only we’ll listen.

Next, we explored how we could make the Meh stuff better. Here again, we discovered some things we thought we should do, but when we were honest, didn’t want to. We wondered out loud why this was such a pervasive issue. Were we trying too hard to imitate strategies others had found successful, but that didn’t seem quite right for our culture? What were we trying to prove, anyway?

Content ownership issues came up as well. Not everyone had the same feelings about the same things. For instance, someone who felt ambivalent about a particular part of the plan found that someone else was feeling the love. Bingo, new owner.

Finally, we used the things tagged “Love it” to understand what the team was truly jazzed about. It wasn’t just the sparkly stuff – some of us loved the nerdy bits, data and spreadsheets and the like, while others gravitated to the more social elements. And surprise, surprise – that led us back to the beginning, talking about what it would take to move more of the “Hate it” stuff to the “Love it” category.

Did the Temperature Check change our strategy? Well, duh.

People are the ones who make content, and manage it, and share it – and those people have feelings. Feelings that impact the quality of the content they make and maintain, and the effectiveness of all of our strategizing and plotting and planning. What’s happening on your teams? Do you see frowny faces or excited ones? The difference matters.

Give the Temperature Check a try and see what happens.

Marginalizing tech comm with four little words

December 16, 2013 by

“I’m just the writer.”

For your 2014 New Year’s resolution, please stop yourself from verbalizing those words if they pop into your head. I ask this as someone who has both thought and said (doh!) those very words, especially during my early tech writing career.

Many companies are grappling with a common problem in their technical content. The content covers the what but it is light on the why and how. As a result, frustrated customers call support and create a financial double-whammy: a great deal of money is wasted on shallow, unhelpful technical content, and that costly failure is compounded by the additional support costs.

One way to deepen the context in your content is by providing rich, real-world examples that illustrate the why and how. Unfortunately, the only way you can develop those examples is by truly understanding the products you’re documenting.

Does this mean you need to become a product developer? Nope. But you do need to communicate and collaborate with product developers/SMEs more on their level. These days, a lot of SMEs are being asked to contribute content, so it’s not unreasonable for you to contribute your brainpower to a deeper understanding of your employer’s products.

So, next time you feel yourself flailing in a mire of technical detail about the product you’re documenting, take a deep breath and fight the temptation to fall into the trap of “I’m just the writer.” You must be more than a stenographer and desktop publisher for product developers. Don’t let anyone, particularly yourself, marginalize tech writing in such a manner. If we act (or are merely viewed) as stenographers, there is little value placed on the content we create.

And that is when tech comm becomes expendable.

Cover from record for practicing stenography. Shows record player, typewriter, and rotary phone.

Stenography in tech comm is as forward-thinking as record players, typewriters, and rotary phones (flickr: epiclectic).

Has your localization service provider (LSP) been naughty or nice?

December 11, 2013 by

Last month I posted about the five gotchas that will affect your translation turnaround time. That post focused on content quality, but I’d also mentioned how “a good LSP” would handle things. This month, let’s take a step back and look at five things that separate the nice LSPs from the naughty ones.

Santa Claus

Santa Claus by Jonathan G Meath [CC-BY-SA-2.5], via Wikimedia Commons

1. Questions

Naughty: A naughty LSP will more often than not skip over all of the pesky project details to get to the word count and source file formats so they can turn around a quick quote. They’ll likely respond to your questions with hollow assurances that everything will be fine, and that they’ll follow up with any questions, if they have them, after receiving your files. The false confidence they exhibit usually masks problems until it’s too late in the project to properly correct them.

Nice: When first engaging a client or beginning a new project, a nice LSP will ask you informed, pertinent questions about your content and the purpose of the project. They will ask about technology, workflow, review, and other aspects of the project. Their goal is to understand all aspects of the project so they can deliver the best translations possible.

2. Advice

Naughty: You might not think it’s naughty behavior, but a naughty LSP will translate your files just as they are, whether it’s an easy or complex process for them. Why is this naughty? If they are encountering issues, they are not alerting you and explaining better ways of doing things, which could mean that you are paying more in project setup or DTP fees than you ideally should be.

Nice: Nice LSPs are open, honest, and helpful. They will point out issues in files, offer solutions, and often offer several options for their clients to choose from in correcting issues or in avoiding potential problems. Nice LSPs understand that helpful advice fosters collaboration and constructively builds long-term partnerships.

3. Workflow

Naughty: A naughty LSP has their own way of doing things, and will often strongly suggest you conform to their preferred workflow (or face possible additional charges). This may be because of their own technology investments (or lack thereof), or because they have no control over how their translators work. Worse, they may mandate that you conform to their workflow if you want to use them at all.

Nice: A nice LSP realizes that their clients may have spent considerable time and money on implementing technologies and training staff to create content in a certain way. They work to integrate into your workflow, or at least work to meet you part of the way so you can realize the ROI from your chosen workflow.

4. Pricing

Naughty: Naughty LSPs can be sneaky about their service pricing. Their terms may be unclear, and what you may save from a low price per word might be lost many times over via pricey DTP, project management, or file prep charges. Always make sure you have a complete pricing breakdown for your project and be aware of inflated pricing or effort estimates.

Nice: Nice LSPs are completely up front with their pricing models and explain every line item in their quotes. They explain how various approaches to a project, and how potential issues, will affect the pricing. Nice LSPs understand the value of transparency in process and in pricing, and they also understand that clear and honest pricing helps establish long-term business partnerships.

5. Translation Memory Ownership

Naughty: A naughty LSP may have no up front policy about translation memory (TM) ownership, or might either avoid or downplay any conversation about TM ownership. Worse, they might claim outright ownership of it or withhold it! In this case, forget the coal in the stocking and call the Krampus!

Nice: A nice LSP acknowledges up front and in writing that you own your TM and are entitled to receive a copy of it whenever you request one. They understand that they are providing you a service involving your intellectual property.

Minimum viable content

December 2, 2013 by

…in which we explore the idea of minimum viable product as applied to technical content.

You’ve probably heard of minimum viable product, which has “just enough” features. In technical communication, minimum viable content isn’t a new idea—it’s a common survival strategy—although I think the more accurate label would be minimum defensible content.

But minimum viable content, like its product counterpart, should not be a hastily assembled scrapple (NSFVKHG: not safe for vegetarian, kosher, halal, or gourmet eaters). Instead, minimum viable content should be a strategic decision based on the organization’s overall content strategy and questions such as these:

  1. What are the regulatory requirements for this content?
  2. How does this content help meet the organization’s business goals? What is the purpose of this content?
  3. In what formats is this content needed? In which languages?
  4. Who must create the content?
  5. What is the content velocity? How quickly must it be delivered and how often will it change?

With these and other questions, you can determine your true minimum viable content.

I believe that, for many organizations, delivering minimum viable content would be a long step up from the status quo. I’ll have a lot more on this topic at tcworld India in February.

What do you think? Do you deliver minimum viable content? Or desperately triaged content?


Five gotchas that will affect your translation turnaround time

November 25, 2013 by

Having worked at two translation companies and on many projects requiring localization, I appreciate just how nimble LSPs (language service providers) can be. Their ability to track down translators with the necessary subject matter expertise and handle a vast array of file formats is truly remarkable. That said, localization efficiency is dependent on you, the content provider.

[Image: train wreck. Credit: flickr, Robyn Jay]

Although a good LSP can work with any type of content you might throw at them, their efficiency (and therefore your costs) depends on the source files. From the simplest-looking Word file to the most robust XML solution, what’s lurking beneath the surface of these files could make or break your translation deadlines.

1. Frankenfiles

A “Frankenfile” is a term of endearment used to describe a file with absolutely no rhyme or reason behind its formatting (often, it was quite literally hacked and stitched together). When a file is full of style overrides, inconsistent style use, unconventional or inconsistent spacing, hard returns for line wrapping, and other visual formatting “hacks,” it is incredibly difficult and time-consuming to properly translate the content and deliver it with the same look and feel. As a Frankenfile is updated, new hacks (or fixes) are likely to be introduced that reduce translation memory leverage. (Leverage is a measurement of how much previous translation can be reused in a translation update. It is far cheaper to use translation memory than it is to retranslate content.)

The best advice, regardless of what tool you are using, is to strictly adhere to a template or standardized style. Do not use formatting overrides to achieve a specific visual result, and do not create stylistic exceptions for “unique” content unless absolutely necessary AND unless you are committing those styles to the core template used by all files. Consistency is extremely important, particularly if the LSP needs to create custom filters to handle your style conventions in their tools. With consistent files, they can create the filters once and reuse them as needed, saving precious turnaround time.

2. CMS output

A content management system is a wonderful thing. It keeps your content neat and tidy in a centralized location, allows you to reuse portions as needed, and likely supports a plethora of workflow automation out of the box. One of these workflows may be translation, but it may not be the workflow that your LSP had in mind. Some CMSs have built-in translation UIs, some provide XML output, and others even go so far as to supply XLIFF output (often considered a translation-friendly format).

All of these options are valid for a translation workflow, but do not assume that your LSP is ready and able to work in the manner your CMS dictates. You may find that what sounded good on paper doesn’t in fact work very efficiently in practice. Before approaching a localization project, and ideally before selecting a CMS, talk with your LSP to see which workflows suit them best. Conduct a trial/pilot translation using your chosen workflow to ensure that content can not only be exported, translated, and imported, but that the round trip can be repeated as revisions are introduced into the source after each translation cycle.

3. Excessive wordsmithing

Your content will most likely need to be updated over time. Sometimes new information is added, outdated information is removed, and incorrect information is corrected. When you edit content that has previously been translated, every edit comes at a cost; rewritten text, changes in punctuation, and even changes in spacing require a translator’s eye, and will lessen your translation memory leverage.

A best practice within a translation workflow is to only modify what absolutely needs to be updated from release to release. If you missed an Oxford comma or an a/an distinction, stop and consider whether there is value in correcting it, as these seemingly innocent edits can add up over time. Peppering these changes through a document can add an hour or more to turnaround time; I’ve seen cases where an entire day or more was lost to vetting fuzzy matches in a translation memory.
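To make the cost of a “harmless” edit concrete, here is a rough sketch of how one added comma turns an exact translation memory match into a fuzzy one. Everything in it is an assumption for illustration: the `classify` and `leverage` helpers, the sample sentences, and the 0.75 fuzzy threshold are all hypothetical, and Python’s `difflib` merely stands in for whatever matching algorithm a real TM tool uses.

```python
from collections import Counter
from difflib import SequenceMatcher


def classify(segment, memory, fuzzy_threshold=0.75):
    """Label one source segment as 'exact', 'fuzzy', or 'new' against the TM.

    The threshold is an illustrative assumption, not an industry constant.
    """
    if segment in memory:
        return "exact", 1.0
    # Best character-level similarity against any previously translated segment.
    best = max(
        (SequenceMatcher(None, segment, old).ratio() for old in memory),
        default=0.0,
    )
    return ("fuzzy" if best >= fuzzy_threshold else "new"), best


def leverage(segments, memory):
    """Count how many segments fall into each match category."""
    return Counter(classify(s, memory)[0] for s in segments)


# Hypothetical translation memory from the previous release.
memory = [
    "Press the red, green and blue buttons.",
    "Close the cover.",
]

# Hypothetical updated source: one Oxford comma added, one new sentence.
updated = [
    "Press the red, green, and blue buttons.",
    "Close the cover.",
    "Dispose of the battery according to local regulations.",
]

for seg in updated:
    label, score = classify(seg, memory)
    print(f"{label:5} {score:.2f}  {seg}")

print(dict(leverage(updated, memory)))
```

The first sentence says exactly what it said before, but it is no longer an exact match, so someone has to review it. Multiply that by every comma “fixed” across a document set and the lost hours start to make sense.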

4. Graphics

Graphics are a tricky beast in the localization world. Two key issues with regard to graphics are 1) translatable text within the graphics, and 2) cultural appropriateness of the images themselves.

The argument against using text in graphics has pretty much been beaten to death, but I mention it now because, believe it or not, it’s still common practice. It could be out of convenience (for either the illustrator or the reader), or habit, or not knowing. If you choose to use text in your graphics, you should plan to keep three things together: the editable, unflattened source image (the raw Photoshop, Illustrator, or other image application source file), all fonts used, and a text/Word/Excel file that contains all of the text used in the graphic. This way the translator has everything needed to produce the translation. This will take some time to do, depending on how intricate your images are, but will take much less time than trying to hack translated text into a flattened JPG.

Cultural appropriateness of imagery is something that is less often considered. Sometimes the most innocent of images can come across as offensive in other cultures. Little things like gestures, fine details, and colors can make the difference between a non-issue and a dead stop when preparing content for varied cultural markets. The time spent researching and proofing for these issues before sending the files for translation pales in comparison to the time and effort spent searching for a replacement at the 11th hour, particularly when the replacement influences design changes.

5. Translating deliverable files

Often, we use content either from a variety of sources or from a greater pool of information when creating documentation. When readying for translation, it’s common to send out only the documents that are being delivered in English. This can complicate the translation process. Specifications used for products shipped to one region may not be compatible with the needs of another region. Wording or specific types of information might also not be appropriate in another region. Finally, any changes influenced by related products, which may or may not have been previously available in other regions, may also impact translation.

It is important to consider translation at the source level to avoid these risks and ensure that your information is ready to be deployed anywhere in any language at any time. This will decrease rework from last-minute issues found by translators, allow you to build consistency into your wording choices up front, and help you realize a greater return from translation memory leverage over time.

Do you have other suggestions for producing localization-friendly files? Please share them in the comments!

And one last note about those “Frankenfiles”… Beware!