Scriptorium Publishing

content strategy consulting

The commodity trap

October 13, 2015

In a recent post on lean content strategy, I wrote about the risk of focusing only on waste reduction:

After creating a nice automated XML-based process, waste in formatting is eliminated, and we declare victory and go home. Unfortunately, the organization is now producing irrelevant content faster, and the content organization is now positioned as only a cost center.

Is your content perceived as a commodity?

Given a critical mass of content, a move away from desktop publishing (DTP) and into XML publishing offers compelling benefits—faster publishing, more efficient translation, efficient reuse, and so on. (Try the XML business case calculator to see whether it makes sense for your situation.)

Over the past decade or so, many organizations have moved into XML. For the most part, they have implemented what we might call XML Workflow Version 1, which has the following characteristics:

  • Focus on automation, especially in translation, as the justification for the change.
  • Refactoring content to improve consistency, which improves efficiency for authors and translators.
  • Reducing formatting edge cases that are difficult to automate.

All of these improvements are useful and necessary, but they focus on how information is encoded. Many organizations are now facing cost pressure from management: because content creators have shown that they can be more efficient, management assumes that more efficiency gains must be available.

Because the justification for XML Workflow Version 1 positioned content as a commodity, management now assumes that content is a commodity.

If you are in the commodity trap, you will experience the following:

  • Pressure to lower content creator costs via staff reductions, outsourcing, and so on
  • A lack of interest in content initiatives other than cost reduction
  • A flat or declining budget
  • A focus on lowest-cost suppliers across all aspects of content and localization and on commodity metrics, such as price per word
  • No budget for staff development

So how do you avoid the commodity trap?

First, it is a fact that XML Workflow Version 1 is mostly about efficiency—and many content groups need to be more efficient. When negotiating for a shift to XML, however, make sure that your argument includes XML Workflow Version 2, in which you can begin to use XML in more sophisticated ways. For instance:

  • Integrate XML-based content with information derived from business systems (such as SAP)
  • Deliver content to other business systems (such as software source code) in a compatible format to provide for better integration and collaboration across the organization
  • Improve the semantics of content (for example, embed an ISBN with a book reference or a part number with a part reference) and provide for useful cross-linking (see the sketch after this list)
  • Focus on touchpoints in the customer journey and how to deliver information that supports the journey
  • Improve the localization and globalization process to deliver information that meshes with each locale, rather than just a somewhat awkward translation
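
As a sketch of what richer semantics look like in practice, DITA's bookmap metadata already treats identifiers such as part numbers and ISBNs as data rather than formatted text (the values below are placeholders):

<bookmeta>
  <bookid>
    <!-- Identifiers are discrete elements, so downstream systems can
         extract them for cross-linking instead of parsing running text -->
    <bookpartno>DOC-0000-PLACEHOLDER</bookpartno>
    <isbn>0000000000</isbn>
  </bookid>
</bookmeta>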

Content creator efficiency is a means to an end. By freeing content creators from formatting responsibilities and from copying, pasting, and verifying repeated information, you can make them available for higher-value tasks. Avoid the commodity trap by ensuring that your content vision goes beyond efficiency and automation.

Unsung heroes of DITA (premium)

October 6, 2015

For some content developers—especially those using DITA for the first time—any features of DITA that go beyond the basics can seem intimidating. But sometimes, a special feature of DITA might be exactly what an organization needs to solve one of its content problems and save money. Features like conref push, subject scheme, and the learning and training specialization could play a powerful role in your content strategy—and they’re not as difficult to use as you might think.

Conref push

Conref push has the power to:

  • Push content into a topic. With conrefs, you can pull content into a topic by reference. However, with conref push, you can insert or “push” content from a source topic into another topic. That way, when you open your published topic, you can actually see the content from the source topic. You might use conref push to add a step to a task or replace an item in an unordered list.
  • Push content into a map. Conref push isn’t limited to adding or replacing elements in a topic—you can also use it to insert content into a map. This might include adding or replacing a topicref, changing a navtitle, or updating the map metadata. Conref push would be especially useful for organizations that have maps updated by people in multiple departments.
  • Facilitate ultimate reuse! If you have reusable content, you can store it in a source topic and use conref push to insert it into the relevant destination topics. With conref push, you can modify topics by adding or replacing elements in a way that the original author never conceived.

Conref push works by looking for IDs in your destination topic specifying where the pushed content should go:

<task id="contentstrategy">
  <title>Developing and implementing a content strategy</title>
  <taskbody><steps>
    <step id="establish-goals"><cmd>Establish implementation goals and metrics.</cmd></step>
  </steps></taskbody>
</task>

Then, in your source topic, conref push uses the conaction attribute to replace an existing element:

<step conref="contentstrategy.dita#contentstrategy/establish-goals" conaction="pushreplace"><cmd>Define implementation goals and metrics.</cmd></step>

Conref push also allows you to push content before or after an existing element. If you use the conaction attribute with a value of pushbefore or pushafter, you must do so in conjunction with another conaction attribute with a value of mark to specify the location of the pushed content:

<step conaction="pushbefore"><cmd>Identify and interview stakeholders.</cmd></step>
<step conref="contentstrategy.dita#contentstrategy/establish-goals" conaction="mark"><cmd/></step>

<step conref="contentstrategy.dita#contentstrategy/establish-goals" conaction="mark"><cmd/></step>
<step conaction="pushafter"><cmd>Define roles and responsibilities.</cmd></step>

Once you’ve set up your IDs in your destination topics and your conaction attributes in your source topic, you’ll need to include the source topic in the same map as the destination topics to see conref push in action.
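
A minimal map sketch (the file names are hypothetical): the destination topic is published normally, while the source topic carrying the push actions is marked resource-only so it is processed but never appears in the output.

<map>
  <title>Content strategy tasks</title>
  <!-- Destination topic that receives the pushed content -->
  <topicref href="contentstrategy.dita"/>
  <!-- Source topic containing the conaction elements; resource-only keeps
       it out of the navigation and output while its pushes are applied -->
  <topicref href="training-additions.dita" processing-role="resource-only"/>
</map>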

If you need to reuse content across multiple departments in your organization, conref push is the perfect hero for the job. Suppose you have two teams—tech comm and training—that share a lot of instructional content. The training team needs to use the instructions in the tech comm team’s topics, but they also need to add some information that is specific to the training department.

The tech comm team doesn’t want to be saddled with adding training-specific content to their topics (or setting up the conditional processing that doing so would require). With conref push, the training team can add their own content to the topics instead—problem solved!

Subject scheme

Subject scheme has the power to:

  • Define custom controlled values in a map. Much like a standard DITA map uses topicrefs to define a collection of topics, a subject scheme map uses key definitions to define a collection of controlled values. This means you can create a taxonomy of custom values without having to write a specialization. To use a subject scheme map, you must reference it in the highest-level DITA map that needs to use the controlled values within it.
  • Manage relationships between controlled values. A subject scheme map sets up a hierarchy of controlled values and allows you to divide these values into categories. You can also bind the custom controlled values you create to metadata attributes.
  • Build the framework for ultimate faceted search! A subject scheme map allows you to classify large amounts of information. The values you define can be used as facets to set up a controlled, sophisticated search of your content. To take advantage of this power, you’ll need content management and authoring tools that support faceted search.

A subject scheme map defines controlled values, or keywords that serve as the allowed values for metadata attributes, using the subjectdef element. The highest-level subjectdef elements define categories of metadata, and their child subjectdef elements define the values. By adding further levels of child subjectdef elements, you can divide these values into sub-categories:

  <subjectdef keys="vehicles">
    <subjectdef keys="car"/>
    <subjectdef keys="motorcycle"/>
    <subjectdef keys="boat">
      <subjectdef keys="racing"/>
      <subjectdef keys="fishing"/>
    </subjectdef>
  </subjectdef>

Once you’ve defined your custom controlled values, you can use the enumerationdef element to bind them to metadata attributes:

  <enumerationdef>
    <attributedef name="audience"/>
    <subjectdef keyref="vehicles"/>
  </enumerationdef>

With a subject scheme map, you don’t have to store attribute values in each content contributor’s DITA editor. As long as the DITA editor understands the subject scheme, the attribute values will be available to all who edit the maps or topics.
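
One common way to make the scheme available is to reference it from the root map; a minimal sketch with a hypothetical file name:

<map>
  <title>Product documentation</title>
  <!-- The subject scheme's controlled values apply to everything in this map -->
  <topicref href="vehicle-scheme.ditamap" format="ditamap" type="subjectScheme"/>
  <topicref href="overview.dita"/>
</map>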

Your company might benefit from using a subject scheme map if you distribute a large number of products and need a better way to categorize them so that they are easier to find. For example, if your company sells engines, your customers should be able to search for the documentation on the engine they need according to criteria such as the relevant type of vehicle (car, motorcycle, boat), release version (1.0, 1.1, 2.0), and locations sold (United States, India, Japan).

By defining this information in a subject scheme, your customers will be able to find your content in a more targeted way than they can using a table of contents, an index, or full text search—the usual methods available with basic DITA.

Learning and training

The learning and training specialization has the power to:

  • Structure a curriculum using specialized DITA. Just as standard DITA can be used to structure technical content in a manual with sections, the learning and training specialization can structure learning content in a course with lessons. Each lesson (or the whole course) can have a plan and an overview before it, and a summary and an assessment after it. Learning content topics can also contain standard concept, task, and reference topics.
  • Create and manage tests and quizzes. The learning and training specialization includes the learning assessment topic type, which contains elements for different types of test questions. In a learning assessment topic, you can store a question, its answer, the number of points it’s worth, and some feedback for the user. This puts the test and the answer key in a single source, which makes it easier to update the assessment material. (See the sketch after this list.)
  • Develop amazing e-learning! With the learning and training specialization, you can instruct students over the web with interactive courses and grade their assessments automatically.
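
For instance, a single-select question in a learning assessment topic might look like the following sketch (the question and feedback text are invented for illustration):

<lcSingleSelect id="q1">
  <lcQuestion>Which DITA feature inserts content into another topic?</lcQuestion>
  <lcAnswerOptionGroup>
    <lcAnswerOption>
      <lcAnswerContent>Conref push</lcAnswerContent>
      <!-- Marks this option as the right answer for automatic grading -->
      <lcCorrectResponse/>
      <lcFeedback>Correct!</lcFeedback>
    </lcAnswerOption>
    <lcAnswerOption>
      <lcAnswerContent>Subject scheme</lcAnswerContent>
      <lcFeedback>Not quite; subject scheme defines controlled values.</lcFeedback>
    </lcAnswerOption>
  </lcAnswerOptionGroup>
</lcSingleSelect>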

With the learning and training specialization, you can structure a course to follow this ideal framework: a learning plan, an overview, the learning content itself, a summary, and an assessment.
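
In map form, the learning domain provides a reference element for each piece of the framework; a minimal sketch with hypothetical file names:

<learningObject>
  <learningPlanRef href="course-plan.dita"/>
  <learningOverviewRef href="lesson1-overview.dita"/>
  <learningContentRef href="lesson1-content.dita"/>
  <learningSummaryRef href="lesson1-summary.dita"/>
  <learningAssessmentRef href="lesson1-quiz.dita"/>
</learningObject>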


But what happens when you need learning content in both the virtual and the physical classroom? Because the learning and training specialization was designed for e-learning, it only provides types of assessment questions that could be answered electronically—such as true/false, matching, multiple choice, sequencing, and open-ended questions—by default.

However, the learning and training specialization can be further specialized and customized to suit your needs. For example, you could specialize the open question type to include blank lines for a student’s handwritten answer. You could also use specialization to create new question types intended for use in a physical classroom, such as filling in a blank or circling a word or phrase.



Conref push, subject scheme, and the learning and training specialization are all part of DITA 1.2, which means that they can be used together. You might…

  • use conref push to keep your subject scheme map up-to-date,
  • reuse content between lessons or tests with the help of conref push, or
  • keep track of controlled values related to an e-learning course using a subject scheme.

You might even solve your content problems by using all three. Your strategy doesn’t have to be limited to just one of these features—if you have a strong business case for it, feel free to call in the whole team!

These unsung heroes of DITA may not have a place in the spotlight (yet), but the more they’re used, the more they’ll catch on. As DITA 1.3 becomes more established and more widely supported, its new features will become the new unsung heroes. That will give DITA 1.2 features such as conref push, subject scheme, and the learning and training specialization the chance to become leaders of a bigger and better team.

If you’re having issues with your content and you think these heroes could help, don’t be afraid to call them in—they’re not as intimidating as they look, and they just might be able to save the day!

Lean content strategy

September 28, 2015

Lean manufacturing begat lean software development which in turn begat lean content strategy.

What does lean content strategy look like?

Here are the seven key principles of lean software development.

  1. Eliminate waste
  2. Build quality in
  3. Create knowledge
  4. Defer commitment
  5. Deliver fast
  6. Respect people
  7. Optimize the whole

How well do they map over to content strategy?

1. Eliminate waste

Interestingly, many content strategy efforts focus only on eliminating waste.

Here are some common types of waste in content:

  • Waste in formatting (formatting and reformatting and re-reformatting)
  • Waste in information development (end users do not want or need what’s being produced)
  • Waste in delivery—information cannot be used by the end user because it’s not in the right language or the right format
  • Waste in review—oh, so much waste in the review cycles

Too often, strategy projects end with waste reduction. After creating a nice automated XML-based process, waste in formatting is eliminated, and we declare victory and go home. Unfortunately, the organization is now producing irrelevant content faster, and the content organization is now positioned as only a cost center. Typically, the next step is that executive management demands additional, ongoing cost reductions rather than looking at possible quality improvements. Eliminating waste cannot be the only priority. (I expanded on this theme in The commodity trap.)

Ellis Pratt has a great lightning talk overview of types of waste in lean content strategy. I believe that he is the first person to combine the concept of lean manufacturing/lean software development with content strategy.

2. Build quality in

How do you measure quality in content? “I know it when I see it” is really not a good answer. Some content quality factors include:

  • Writing quality—mechanics and grammar
  • Usability—the ease of access to information
  • Technical accuracy
  • Completeness
  • Conciseness

Scriptorium notoriously rolled all of these factors into the QUACK quality model.

Building quality in means that the process of creating content supports a high-quality end result. Accountability in content reviews is one technique; content validation to ensure that it conforms with required structures is another. Authoring assistance software can help with writing quality.

The process of creating and managing content should assist the content creator in producing high-quality information.

3. Create knowledge

The fundamental purpose of content is of course to create and disseminate knowledge. As an aspect of lean content strategy, we can identify several groups that need knowledge:

  • End users need information to use products successfully.
  • Content creators need to accumulate domain knowledge, process knowledge, and tools knowledge to become better at their jobs.
  • The user community needs a way to share knowledge.

Any content strategy must include ways to support knowledge creation inside and outside the organization.

4. Defer commitment

Our basic process for content strategy is to first identify key business requirements, and then build out an appropriate solution. The temptation, however, is to make critical decisions first, especially in tool and technology selection. Defer commitment means that you should:

  • Store content in a flexible format that allows for multiple types of output.
  • Keep your options open on deliverable formats.
  • Be open to adding new content based on user feedback or other new information.
  • Assess localization requirements regularly as business conditions change. Treat the list of supported languages as an evolving set, not as one that is set in stone forever.

Also identify areas where commitment is required. If your content must meet specific regulatory requirements, remember that those requirements change very slowly. Don’t defer a commitment to a legal requirement.

5. Deliver fast

Fast delivery applies across the entire effort: content creation, management, review, delivery, and governance. Reexamine those six-month production cycles and lengthy review cycles, and find ways to shorten them.

Keep up with new products and new output requirements. Don’t let the market pass you by.

6. Respect people

Lots to think about in this area, but here are some basics:

  • Content creators: Respect their hard-won product and domain expertise.
  • End users: Respect their time and provide efficient ways to get information. Do not insult end users with useless information, like “In the Name field, type your name.”
  • Reviewers: Respect their limited time and help to focus reviews on adding value.

7. Optimize the whole

Optimizing inside a content team will only take you so far. The content team must reach into other parts of the organization, where they can:

  • Identify the origin of information and use it. For example, if product specifications are stored in a product database, then product datasheets should pull information directly from the database. Here’s what they should not do: Export from the product database to an Excel file, send the Excel file via email to the content creator, have the content creator copy and paste from the Excel file to the product data sheet file.
  • Identify content reuse across the organization and eliminate redundant copies.
  • Understand silos and why they occur. Find ways to eliminate or align silos.
  • Reduce the number of content processes in the organization.


Lean content strategy. What do you think?

Roles and responsibilities in XML publishing

September 14, 2015

The roles and responsibilities in an XML (and/or DITA) environment are a little different than in a traditional page layout environment. Figuring out where to move people is a key part of your implementation strategy.

In an unstructured (desktop publishing) workflow, content creators need a variety of skills. The three most important are:

  1. Domain knowledge (expertise about the product being documented)
  2. Writing ability (duh)
  3. Knowledge of the template and formatting expertise in the tool being used

For a structured workflow, the first two stay the same, but paragraph and character styles are replaced by elements. Formatting expertise is less critical—the formatting is embedded in a stylesheet, which is applied to content when it is time to create output. Knowledge of copyfitting and production tricks is no longer relevant and can even be detrimental if the content creator insists on trying to control the output by overriding default settings.

The content creator needs less template and formatting expertise, especially if the content is highly structured and provides guidance on what goes where. Generally, content creators need to focus more on how to organize their information and less on how to format it.

The role of the technical editor (assuming you are lucky enough to have one) also changes. Document structure is enforced by the software, so technical editors can focus on overall organization, word choice, and grammar. Technical editors are often responsible for reviewing large amounts of content. This perspective can be helpful in establishing an information architecture.

Speaking of information, we have the information architect, who is responsible for determining how information should be organized and tagged. Typical tasks for the information architect are:

  • Developing guidelines for topic-based authoring (for example, how big should a topic be?).
  • Establishing rules for tagging. For example, when should an author use the <cite> tag and when the <i> tag?
  • Organizing shared content and establishing guidelines for reuse.

The equivalent responsibilities were typically handled by the technical editor and the production editor in an unstructured workflow.

In an unstructured workflow, production editors are responsible for finalizing the layout/composition of unstructured content. They typically have deep expertise in the publishing tool and know all of the tricks to make output look good. Very often, production editors are permitted to override templates to copyfit pages and make the final result look better.

The role of the stylesheet programmer is new in an XML workflow and replaces the production editor. The stylesheet programmer creates a script that transforms XML directly into output (such as PDF or HTML). In effect, the handiwork of the production editor is replaced by a script. Stylesheet programmers need a thorough understanding of XML and especially of publishing scripts, such as XSLT, but they need almost no domain knowledge.
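
For instance, a fragment of an HTML transform might render task steps as an ordered list. This is a minimal XSLT sketch, not a full DITA Open Toolkit plugin:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Render a DITA steps block as an HTML ordered list -->
  <xsl:template match="steps">
    <ol>
      <xsl:apply-templates select="step"/>
    </ol>
  </xsl:template>
  <!-- Each step becomes a list item containing its command text -->
  <xsl:template match="step">
    <li>
      <xsl:apply-templates select="cmd/node()"/>
    </li>
  </xsl:template>
</xsl:stylesheet>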

Here are the typical roles in a structured workflow:

Role                    Tags    Publishing  Domain
Content creator         User    User        Advanced
Information architect   Expert  User        Basic
Stylesheet programmer   User    Expert      Basic
Reviewer                None    None        Expert

Did we miss any? What do you think?

Portions excerpted from our Structured authoring and XML white paper.

Plumber’s guide to content workflows

September 8, 2015

Last week I was working in my home office when I heard an odd hissing sound. Upon investigation, I found that my hot water heater had decided to empty itself onto the basement floor.

Fortunately I had some failsafes in place: the heater’s pressure release valve was doing its job by routing scalding water onto the floor, and the floor slopes slightly toward a drain. This got me thinking (because my brain is oddly wired this way) about failsafes in content workflows.

I spent a good deal of time in panic mode, frantically moving boxes out of the way of the growing puddle as the water made its way toward the drain. I soon managed to turn off both the water supply and the gas, but not before losing many gallons of water to the floor, and was left to enjoy a makeshift sauna while the release valve slowly began to close.



Enter the plumber (aka water tank strategist).

After looking my tank over and testing the valve, the plumber’s prognosis was not good. Mineral deposits had built up within the tank and were gumming up the works (I think that’s an accurate technical description). The plumber could clean or replace the valve, but the failure would just happen again, and without warning.

Long story short, I have a new water heater.

I couldn’t help but tie this mess to situations where impurities in content workflows cause their own brand of chaos. Where my tank failed due to mineral deposits, many content workflows can fail or produce poor content deliverables due to deposits of a different nature.

Content workflow impurities

Whether structured or unstructured, unchecked workflows and workarounds can cause problems over time, and can be costly to correct. One practice in particular can completely pollute your content: copy/paste. Even when the content is completely accurate, copying and pasting content creates multiple stand-alone instances of the same information. Updating that content everywhere it’s used becomes very tedious and time consuming, with considerable risk of missing some instances.
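
In a structured workflow, a content reference avoids those stand-alone copies; here is a DITA conref sketch with hypothetical file and ID names:

<!-- In shared.dita (topic id "shared"), the single source of truth: -->
<!--   <note id="safety-warning">Disconnect power before servicing.</note> -->

<!-- Everywhere else, reference the note instead of pasting a copy: -->
<note conref="shared.dita#shared/safety-warning"/>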

A lack of proper review and approval can also cause problems over time. As misinformation and grammatical errors are introduced, they can have a negative impact on readers’ trust in the information and in your company. A single-source publishing solution can compound this problem by propagating those errors into every deliverable at once.

Finally, not having a failsafe in place to catch these issues can lead to catastrophe. You may lose customers or reach a critical mass where your content is no longer manageable. At that point your only option is a complete workflow replacement. As with my water heater, this is both inconvenient and expensive, and will require additional frantic busy-work and workarounds in the interim to prevent the situation from getting worse.

The best way to fix a problem is to avoid it in the first place. Assess your workflows to see if you are at risk for unwanted deposits in your content. As for failsafes, make sure you have proper reviews in place and proper reuse strategies. Also, have the means necessary (expedited updates, ability to “unpublish” bad content) to quickly address issues before they build up and cause an even bigger problem.

Tech comm skills: writing ability, technical aptitude, tool proficiency, and business sense

August 17, 2015

Technical Writing is only about what software you know! Is that why every where I read any type of document, web page, or article it is FULL of misspellings, incorrect punctuation, and horrible formatting?!!

That’s what started a thread on LinkedIn that encapsulates long-running debates on the skill sets technical writers need. (The thread was removed from LinkedIn sometime after Friday, unfortunately.)

From my point of view, a good technical communicator possesses a balance of writing ability, technical aptitude, and software skills. Problems arise when that mix of skills is off-kilter:

  • Grammatically pristine content that just scratches the surface of a product reflects a lack of technical understanding and reduces tech comm to stenography.
  • Overly technical content that catalogs every feature of a product demonstrates technical depth but no writing ability. Such content is usually badly organized (writing about every menu choice in order is not good organization) and littered with grammatical and spelling mistakes.
  • Proficiency in the tools for creating content means information development is more efficient, but blind devotion to a tool is a big (and unprofessional) mistake.

A lot of commenters in the thread touch on these aspects, but at the time I wrote this post, there was a glaring omission among the discussed skill sets: an understanding of business.

Business requirements should drive all content-related efforts at a company, so it’s vital that content creators—technical writers included—understand how their content supports company goals (or not, as the case may be). Changes to content (new tools, new publishing formats, and so on) must be carefully vetted to determine whether there is a solid business case to make such changes. For example, you propose implementing an XML-based workflow because you have numbers showing cost savings. “Other companies are doing it” and “a software vendor told me we need it” are not business cases.

Writing ability, technical aptitude, and dexterity with software are important skills for technical writers to have. But understanding how your efforts connect to the company’s business requirements is what gives you the edge in making your tech comm work indispensable.

Design versus automation: a strategic approach to content

August 10, 2015

Design and automation are often positioned as mutually exclusive–you have to choose one or the other. But in fact, it’s possible to deliver content in an automated workflow that uses a stellar design. To succeed, you need a designer who can work with styles, templates, and other building blocks instead of ad hoc formatting.

More content across more devices requires scalability–and that means more automation. A strategic approach to content needs to incorporate both design and automation as constraints and find the right balance between the two.


First, a few definitions.

Design–specifically graphic design–is the process of deciding how information is presented to the person who needs it. The design effort may include text, graphics, sound, video, tactile feedback, and more, along with the interaction among the various types of information delivery. In addition to the content itself, design also includes navigational elements, such as page numbers, headers and footers, breadcrumbs, and audio or video “bumpers” (to indicate the start or end of a segment). Sometimes, the designer knows the final delivery device, such as a kiosk in a train station or a huge video board in a sports stadium. In other cases, the delivery is controlled by the end user–their phone, desktop browser, or screen reader.

Automation is a workflow in which information is translated from its raw markup to a final packaged presentation without human intervention.


Design and automation are not mutually exclusive.


Instead, think of design and automation as different facets of your content. Each quadrant of the design-automation relationship results in different types of documents. High design and low automation is where you find coffee table books. High automation and low design encompasses invoices, bad WordPress sites, and 30 percent of the world’s data sheets. Low design/low automation is just crap–web sites hand-coded by people with no skills, anything written in Comic Sans, and the other 70 percent of data sheets. (Seriously, where did people get the idea that using InDesign without any sort of styles was appropriate for a collection of technical documents? But I digress…)

The interesting quadrant is the last one: high design and high automation. In this area, you find larger web sites, most fiction books, and, increasingly, marketing content (moving out of lovingly handcrafted as automation increases) and technical content (moving out of “ugly templates” and up the design scale).

Design and automation are different facets of content creation: automation is the X axis; design is the Y axis.

The world of structured content inhabits a narrow slice on the extreme right.

Structured content occupies a band in the high-automation area of the grid.

Design gets a similar swath of the top.

Design-centered content forms a band in the high-design area, at the top of the grid.

When you combine a requirement for high design with a requirement for high automation, you get The Region of Doom.

Same grid as before: the Region of Doom is the corner where extreme design and extreme automation requirements meet.

You can accommodate 90% design and 100% automation or 90% automation and 100% design, but if you are unwilling to compromise on either axis, expect to spend a lot of money.

A better strategy is to focus on the 90% area. By eliminating 5–10% of the most challenging design requirements, or by allowing for a small amount of post-processing after automated production, you can get an excellent result at a much lower cost that what the Region of Doom requires.

The intersection of the design and automation bands in the upper right is the best value: small compromises in design and/or automation result in big cost savings.


When we discuss design versus automation, we are really arguing about when to implement a particular design. An automated workflow requires a designer to plan the look and feel of the document and provide templates for the documents. The publishing process is then a matter of applying the predefined design to new content.

The traditional design process ingests content and then people apply design elements to it manually.

In other words, automation requires design first, and this approach disrupts the traditional approach to design. Like any disruptive innovation, this new approach is inferior at first to the “old way” (hand-crafting design). As the technology improves, it takes over more and more use cases.

Evolution of disruptive technology over time: it first takes over the low end of the market, and then gradually moves up to more demanding users (public domain image from Wikimedia).

Travis Gertz writes an impassioned defense of editorial design in Design Machines: How to survive the digital apocalypse:

Editorial designers know that the secret isn’t content first or content last… it’s content and design at the same time.

[…] When we design with content the way we should, design augments the message of the content.

[…] None of these concepts would exist if designed by a content-first or content-last approach. It’s not enough. This level of conceptual interpretation requires a deep understanding of and connection to the content. A level we best achieve when we work with our editors and content creators as we design. This requirement doesn’t just apply to text. Notice how every single photo and illustration intertwines with the writing. There are no unmodified stock photos and no generic shots that could be casually slipped into other stories. Every element has a purpose and a place. A destiny in the piece it sits inside of.

He provides wonderful examples of visuals entwined with content, mostly from magazine covers. And here is the problem:

  • As Gertz acknowledges earlier in the article, for many small businesses, basic templates and easy web publishing are a step up from what they are otherwise able to do. Their choice is between a hand-coded, ugly web page (or no web presence at all), and a somewhat generic design via SquareSpace or a somewhat customized WordPress site. Magazine-level design is not an option. In other words, automation gives small business the option of moving out of the dreaded Meh quadrant.
  • What is the pinnacle of good design? Is it a design in which graphics and text form a seamless whole that is better than the individual components? Many designers forget that not everyone can handle all the components. A fun audio overlay is lost on a person who is hard of hearing. Without proper alternate text, a complex infographic or chart is not usable by someone who relies on a screen reader.
  • The vast majority of content does not need or deserve the high-design treatment.
  • An advantage of visual monoculture is that readers know what to expect and where.
  • All of these examples are for relatively low volumes of content. I don’t think these approaches scale.


What do you think? Scriptorium builds systems that automate as much as possible, and then use additional resources as necessary for the final fit and finish. Only some limited subset of content is worth this investment. I know that I have a bias toward efficient content production so that companies can focus on better information across more languages.

For more on this topic, come see my session at Big Design Dallas in mid-September.

Making metadata in DITA work for you

August 6, 2015

Metadata is one of the most important factors in making the most of your DITA XML-based content environment. Whether you’re converting legacy content into DITA or creating new structured content, it’s important to know what metadata (or data about data) your files will keep track of and why. Coming up with a plan for using metadata can be tricky, so here are some tips to make the process easier.

Using metadata to track content

Metadata can be a powerful aid in organizing your content. It can help speed up your production workflow by allowing you to find all content by a certain author or all content that needs to be reviewed. It can help with distribution by filtering content according to intended audience or location. It can also make content searches easier for both internal and external users. Because metadata exists at both the topic level and the element level, it offers lots of flexibility in content filtering and search.

Before you begin converting or authoring content, think about what metadata you’ll need to track in your content and why. You may need to distinguish between different types of content (informational, instructional, legal), different audiences (internal, external), or different products. Depending on the type and volume of content you create, your metadata needs may be very specific and complex. Making a list of required metadata and how you plan to use it will make it easier to implement your new structured content workflow.

Standard metadata

Learn as much as you can about the DITA standard and what it offers when it comes to metadata. In DITA, you will have certain metadata attributes and elements available by default:

  • author
  • audience
  • product
  • and many others
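
For example, a topic’s prolog can carry author, audience, and product metadata; a minimal sketch (the names and values are invented):

<concept id="engine-overview">
  <title>Engine overview</title>
  <prolog>
    <author>A. Writer</author>
    <metadata>
      <audience type="user"/>
      <prodinfo>
        <prodname>Example Engine</prodname>
        <vrmlist>
          <vrm version="2.0"/>
        </vrmlist>
      </prodinfo>
    </metadata>
  </prolog>
  <conbody>
    <p>Overview content goes here.</p>
  </conbody>
</concept>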

To determine how well the DITA standard metadata supports your needs, compare it with your metadata requirements and see if you can find an existing attribute or element that matches each item on your list. You may have some requirements with close but not exact matches, or others that are too specific to your company for a match. In that case, you may want to consider metadata specialization.


Specialized metadata

Specializing metadata attributes and elements can help you organize and filter your content in ways that are tailored to your company’s unique needs. You may need to capture much more information about your product than the DITA standard metadata allows. You may also need to filter content in a hierarchical or multi-faceted way – for example, distributing content to certain locations, and within those locations, only to employees with certain job titles. DITA allows you to specialize metadata elements to include multiple values, which makes this kind of filtering possible.
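
As a sketch of that kind of filtering (the attribute choices and values here are illustrative, not prescriptive), content flagged with multiple values can be filtered per deliverable with a DITAVAL file:

<!-- In a topic: a paragraph that applies to two locations -->
<!--   <p otherprops="india japan">Local service contacts...</p> -->

<!-- india.ditaval: exclude content that applies only elsewhere; the
     paragraph above survives because "india" is not excluded -->
<val>
  <prop att="otherprops" val="japan" action="exclude"/>
  <prop att="otherprops" val="us" action="exclude"/>
</val>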

Although specialization can be highly useful, it can also present challenges to implementation. If you do specialize, be sure to choose a content management system and other tools that can support your changes to the standard metadata attributes and elements. It’s always better to stick to the standard and only specialize if you absolutely must—that way, you’ll be able to choose from a wider selection of tools that will support your content.

Content management

A component content management system (or CCMS) can use metadata to enhance your content development workflow. A CCMS may be equipped to filter on metadata, which helps authors and reviewers find the content they need and tells the publication engines what output to produce. It may also connect with an authoring tool to populate certain metadata values automatically, such as author or status, when a content creator logs into the system.

When you evaluate CCMS options, you should not only ask whether each CCMS you’re considering supports your metadata needs, but also how it manages metadata. How does the CCMS use the metadata in your content to help with workflow? How flexible is the CCMS if you start with standard metadata now but need to specialize it later? What happens to metadata that is created and managed by the CCMS if you ever need to move your content into a different system? Having a solid plan in place for metadata use before you choose a CCMS and other tools will help you ask the right questions and make the best decision.

Structured authoring: breaking the WYSIWYG habit

July 28, 2015

There are many challenges involved when moving to structured authoring (XML), but perhaps the most personal challenge is breaking out of the WYSIWYG (what you see is what you get) authoring mode.

The rise of desktop publishing has merged the roles of writer, editor, designer, and publisher into one. With many modern authoring tools, what you see in the authoring environment and what you get as a finished product is nearly identical. It’s easy to settle into a WYSIWYG mindset, as there’s comfort in knowing what the content will look like. However, that trust can be misplaced.

WYSIWYG authoring works best under one condition: producing one type of document with a very specific design. Once you add another format into the scenario, the model begins to fall apart. If you’re supporting two or more delivery formats, you end up designing for one and hoping that it looks good in the others. Juggling several different formats, perhaps with content reuse between them, can quickly become tedious.

A move to XML involves removing the physical formatting of content from the words themselves. Instead of manually formatting text as “Times 18pt Bold” or applying a specific “Heading1” style to text, you encapsulate the text within a tag (<title>, for example). This tag can then be rendered however you prefer for any output format, at any heading level. This has many advantages in multi-output scenarios, but because the formatting is detached from the content itself, it can lead to frustration among visual authors.

Some XML authoring tools provide both a text view and a visual markup mode as you write. The markup mode does provide some level of visual formatting, but it’s anything but WYSIWYG. Be careful not to mistake the visual authoring mode as an indication of what the final output will look like. It’s best to divorce yourself from the WYSIWYG mindset when transitioning to XML authoring.

Here are some tips that can help you break bad WYSIWYG authoring habits.

Remind yourself of the big picture

Always remember that the content you are writing could be used anywhere. It could be in a printed book or PDF, or on a web page, or on a phone. Because these are very different delivery mediums, you cannot visually format text for all of them at once (at least not well).

Instead, trust not in what you see but in your publishing process. The transformations that produce your finished products from the XML will format the content appropriately, provided the XML has been correctly tagged and structured. In the case of the <title> example mentioned earlier, it can be formatted in a different way in each target output—using different fonts, weights, colors, or other stylistic treatments.
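
As a sketch of that separation, one stylesheet could render the same <title> element differently per output; this is simplified, and real DITA Open Toolkit stylesheets are more involved:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- Web output: the title becomes an HTML heading -->
  <xsl:template match="title" mode="html">
    <h1 class="topic-title">
      <xsl:apply-templates/>
    </h1>
  </xsl:template>
  <!-- PDF output: the same title becomes a formatted block -->
  <xsl:template match="title" mode="pdf">
    <fo:block font-size="18pt" font-weight="bold" space-after="12pt">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>
</xsl:stylesheet>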

Try working in text mode

Switching from WYSIWYG authoring to text-based authoring (think Notepad) can be a very difficult transition for some people (if not downright scary), but there is value in seeing and understanding what is going on “under the hood” with your content. Seeing that something looks bold doesn’t necessarily mean it’s arbitrarily formatted that way. It could mean that it’s a UI label, or a special term, or some other specific type of content. As an author, it’s very important to know the difference and to use the correct tag.

When you first begin working with XML, take some time to try writing in both interfaces (text mode and visual mode). Switch back and forth, applying tags in the visual editor for a bit, and then hand-typing them in text mode. Do this until you become comfortable with the correlation of what you see in the visual editor and the tags (in text mode) that they represent. This will not only break you of the WYSIWYG authoring mindset; it will prepare you for any structural troubleshooting you may need to perform on content down the road.

Get uncomfortable

Developing content, like exercise, is habit-forming. The more you do it, the easier it gets, but the more you conform to a particular technique. In the case of exercise, it could mean that you begin to overstimulate some muscles and underutilize others, despite enjoying the activity and its benefits. When developing content, the more you approach it using the same tools in the same manner, the easier it is to form habits along the way. These habits may get the job done well, and you may enjoy the way you work, but WYSIWYG habits won’t transition well into structured authoring.


If you’re coming from a WYSIWYG background, you may have some habits to break. While the tips mentioned earlier will help, the key is to allow yourself to get uncomfortable and stop authoring the way you used to.

XML authoring is not at all like WYSIWYG authoring, and no authoring UI will change that. As close to WYSIWYG as some visual authoring tools may get, the underlying content is still XML, and it needs to be semantically and structurally correct.

If you find yourself struggling to change, push yourself further. Work only in text mode for a while, or switch between text and visual authoring modes more frequently. Just as it is difficult to train a different set of muscles when exercising, making an authoring transition can be difficult. But with time and dedication, you’ll reap the benefits.

Localization: are you the weakest link?

July 13, 2015

There is an old proverb that says, “a chain is only as strong as its weakest link.” While many of the links in the chain could be quite strong, it only takes one weak link to break the chain. There is one process chain in particular where this proverb rings true: localization. However, more often than not, little or nothing is done to identify and strengthen the weakest link in that process.

When it comes to localization, there’s a common misperception that if there is a problem with the translation then the vendor is to blame. In some cases this may be true, but in every localization effort (no matter how large or small) there is a shared responsibility between content authors and translators to ensure that the translation is accurate and is completed both on time and within budget.

If any link in the localization chain breaks, the chain itself fails. Half of these links are owned by content development.

If we look at every link in the localization chain, we can start to see where the possible failure points may be.

What content creators control

Style guide: A corporate style guide defines how all content should be developed and presented, from writing style to publishing standards. If the rules in your style guide are ambiguous (or worse, if you don’t have one at all), your entire content development effort is left to chance. A good style guide should inform both the source language content development effort as well as the localization effort. In short, it’s the single source of truth for developing all of your content, in all languages, for all audiences.

Terminology: Just as a style guide informs how to develop content, a terminology reference (repository, term bank, master glossary, etc.) informs which words are correct to use, in which contexts they can be used, what they mean, and (perhaps most important) which words never to use in their place. A good terminology reference can prevent unintentional misinformation, and can and should be localized on its own so all terms are properly defined for each target language. Leaving terminology to chance can lead to perfectly cromulent results.

Quality of writing: Writing quality plays a very large role in translation quality. If the wording is unclear, or if the same concepts are written in different ways, then translators may misinterpret how to translate the content. Consistency in writing is critical; a formal editorial review can enforce consistency. Even with a style guide in place, it’s easy for writers to become out of sync without an editorial review.

Consistent formatting: Regardless of what tools you use to author your content, consistent formatting can significantly reduce translation turnaround time and cost. If your tools use templates, adhere to them 100% of the time, and provide the templates to your localization vendors (they should be reviewed for localization appropriateness and modified as needed for target languages). Whoever formats the translated content can then apply the localized template and avoid hand-formatting, provided authors haven’t used formatting overrides.

What localization vendors control

Translation workflow: The translation vendor should have a system in place to manage the overall translation effort. These systems track which translators are being used, which content they are responsible for, and all of the checks and balances between starting and finishing the translation work. Lack of such a system, or a breakdown in this communication chain, can cause deadlines to slip and costs to rise.

Translator qualifications: Translating content requires more than just a strong command of the source and target languages. A good translator must also be familiar with the subject matter, and be very familiar with the regional variations of the target language. The translation vendor should select the appropriate translators for the job and consistently evaluate their work.

Translation memory: It goes without saying that a good translation vendor uses translation memory (TM) in their workflow. Leveraging the TM can ensure consistency in translation and reduce overall translation costs and turnaround times, provided the source content is also consistent. If inconsistencies are found, the translation vendor should work with the content creators to correct them so the inconsistencies aren’t added to the TM. Further, the TM itself should be audited on a regular basis to clean out any inconsistencies that may have crept in over time.

In-country review: An in-country (or local) review of translations is critical for tailoring translated content to the target audience. Language varies by location (would you like a pop/tonic/soda/coke to wash down that sandwich/hoagie/grinder/sub?), and the local review ensures that the translation is appropriate for the target audience. Failure to conduct these reviews can result in misunderstandings between the audience and the content, causing a scramble to correct the issues and a negative perception of your company in that location.

Any one of these links could be a point of failure in your localization chain. By identifying and strengthening that link, you will not only improve the quality of your localized content, but will likely save time and money in the process. Evaluate your localization chain often; strengthening one link may help identify another that also needs attention.