Scriptorium Publishing

content strategy consulting

Unsung heroes of DITA

October 6, 2015

For some content developers—especially those using DITA for the first time—any features of DITA that go beyond the basics can seem intimidating. But sometimes, a special feature of DITA might be exactly what an organization needs to solve one of its content problems and save money. Features like conref push, subject scheme, and the learning and training specialization could play a powerful role in your content strategy—and they’re not as difficult to use as you might think.

Conref push

Conref push has the power to:

  • Push content into a topic. With conrefs, you can pull content into a topic by reference. However, with conref push, you can insert or “push” content from a source topic into another topic. That way, when you open your published topic, you can actually see the content from the source topic. You might use conref push to add a step to a task or replace an item in an unordered list.
  • Push content into a map. Conref push isn’t limited to adding or replacing elements in a topic—you can also use it to insert content into a map. This might include adding or replacing a topicref, changing a navtitle, or updating the map metadata. Conref push would be especially useful for organizations that have maps updated by people in multiple departments.
  • Facilitate ultimate reuse! If you have reusable content, you can store it in a source topic and use conref push to insert it into the relevant destination topics. With conref push, you can modify topics by adding or replacing elements in a way that the original author never conceived.

Conref push works by looking for IDs in your destination topic that specify where the pushed content should go:

<task id="contentstrategy">
  <title>Developing and implementing a content strategy</title>
  <taskbody><steps>
    <step id="establish-goals"><cmd>Establish implementation goals and metrics.</cmd></step>
  </steps></taskbody>
</task>

Then, in your source topic, conref push uses the conaction attribute to replace an existing element:

<step conref="contentstrategy.dita#contentstrategy/establish-goals" conaction="pushreplace"><cmd>Define implementation goals and metrics.</cmd></step>

Conref push also allows you to push content before or after an existing element. If you use the conaction attribute with a value of pushbefore or pushafter, you must do so in conjunction with another conaction attribute with a value of mark to specify the location of the pushed content:

<step conaction="pushbefore"><cmd>Identify and interview stakeholders.</cmd></step>
<step conref="contentstrategy.dita#contentstrategy/establish-goals" conaction="mark"><cmd/></step>

<step conref="contentstrategy.dita#contentstrategy/establish-goals" conaction="mark"><cmd/></step>
<step conaction="pushafter"><cmd>Define roles and responsibilities.</cmd></step>

Once you’ve set up the IDs in your destination topics and the conaction attributes in your source topic, you’ll need to include both the source topic and the destination topics in a map to see conref push in action.
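For example, the map might pull in both topics like this (the file names are hypothetical; setting processing-role to resource-only keeps the source topic itself out of the published output):

```xml
<map>
  <title>Content strategy guide</title>
  <!-- destination topic containing the id that the push targets -->
  <topicref href="contentstrategy.dita"/>
  <!-- source topic containing the conaction attributes -->
  <topicref href="training-additions.dita" processing-role="resource-only"/>
</map>
```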

If you need to reuse content across multiple departments in your organization, conref push is the perfect hero for the job. Suppose you have two teams—tech comm and training—that share a lot of instructional content. The training team needs to use the instructions in the tech comm team’s topics, but they also need to add some information that is specific to the training department.

The tech comm team doesn’t want to be saddled with adding training-specific content to their topics (or setting up the conditions that would require). With conref push, the training team can add their own content to the topics instead—problem solved!

Subject scheme

Subject scheme has the power to:

  • Define custom controlled values in a map. Much like a standard DITA map uses topicrefs to define a collection of topics, a subject scheme map uses key definitions to define a collection of controlled values. This means you can create a taxonomy of custom values without having to write a specialization. To use a subject scheme map, you must reference it in the highest-level DITA map that needs to use the controlled values within it.
  • Manage relationships between controlled values. A subject scheme map sets up a hierarchy of controlled values and allows you to divide these values into categories. You can also bind the custom controlled values you create to metadata attributes.
  • Build the framework for ultimate faceted search! A subject scheme map allows you to classify large amounts of information. The values you define can be used as facets to set up a controlled, sophisticated search of your content. To take advantage of this power, you’ll need content management and authoring tools that support faceted search.
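The reference from the highest-level map mentioned above might be sketched like this (the file names are hypothetical):

```xml
<map>
  <title>Engine documentation</title>
  <topicref href="engines.dita"/>
  <!-- makes the controlled values available to this map and its topics -->
  <mapref href="vehicles.ditamap" type="subjectScheme" format="ditamap"/>
</map>
```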

A subject scheme map defines controlled values, or keywords that identify metadata attributes, using the subjectdef element. The highest-level subjectdef elements define categories of metadata, and their child subjectdef elements define the values. By adding further levels of child subjectdef elements, you can divide these values into sub-categories:

  <subjectdef keys="vehicles">
    <subjectdef keys="car"/>
    <subjectdef keys="motorcycle"/>
    <subjectdef keys="boat">
      <subjectdef keys="racing"/>
      <subjectdef keys="fishing"/>
    </subjectdef>
  </subjectdef>

Once you’ve defined your custom controlled values, you can use the enumerationdef element to bind them to metadata attributes:

  <enumerationdef>
    <attributedef name="audience"/>
    <subjectdef keyref="vehicles"/>
  </enumerationdef>

With a subject scheme map, you don’t have to store attribute values in each content contributor’s DITA editor. As long as the DITA editor understands the subject scheme, the attribute values will be available to all who edit the maps or topics.
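For example, assuming the vehicle values have been bound to the audience attribute, an author can flag content with any value defined in the scheme, and a validating editor or processor can reject values outside it:

```xml
<!-- valid: "motorcycle" is a controlled value defined in the subject scheme -->
<p audience="motorcycle">Check the chain tension before every ride.</p>
```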

Your company might benefit from using a subject scheme map if you distribute a large number of products and need a better way to categorize them so that they are easier to find. For example, if your company sells engines, your customers should be able to search for the documentation on the engine they need according to criteria such as the relevant type of vehicle (car, motorcycle, boat), release version (1.0, 1.1, 2.0), and locations sold (United States, India, Japan).

By defining this information in a subject scheme, your customers will be able to find your content in a more targeted way than they can using a table of contents, an index, or full text search—the usual methods available with basic DITA.

Learning and training

The learning and training specialization has the power to:

  • Structure a curriculum using specialized DITA. Just as standard DITA can be used to structure technical content in a manual with sections, the learning and training specialization can structure learning content in a course with lessons. Each lesson (or the whole course) can have a plan and an overview before it, and a summary and an assessment after it. Learning content topics can also contain standard concept, task, and reference topics.
  • Create and manage tests and quizzes. The learning and training specialization includes the learning assessment topic type, which contains elements for different types of test questions. In a learning assessment topic, you can store a question, its answer, the number of points it’s worth, and some feedback for the user. This puts the test and the answer key in a single source, which makes it easier to update the assessment material.
  • Develop amazing e-learning! With the learning and training specialization, you can instruct students over the web with interactive courses and grade their assessments automatically.
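Putting the second power above into markup, a true/false question in a learning assessment topic might look like the following sketch (element names are from the DITA 1.2 learning and training specialization; the question and feedback text are invented):

```xml
<learningAssessment id="strategy-quiz">
  <title>Content strategy quiz</title>
  <learningAssessmentbody>
    <lcInteraction>
      <lcTrueFalse id="q1">
        <lcQuestion>Conref push pulls content into a topic by reference.</lcQuestion>
        <lcAnswerOptionGroup>
          <lcAnswerOption>
            <lcAnswerContent>True</lcAnswerContent>
          </lcAnswerOption>
          <lcAnswerOption>
            <lcAnswerContent>False</lcAnswerContent>
            <!-- marks this option as the correct answer -->
            <lcCorrectResponse/>
            <lcFeedback>Correct: conrefs pull; conref push inserts.</lcFeedback>
          </lcAnswerOption>
        </lcAnswerOptionGroup>
      </lcTrueFalse>
    </lcInteraction>
  </learningAssessmentbody>
</learningAssessment>
```

Because the question and its answer key live in a single topic, updating the assessment means editing one source.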

With the learning and training specialization, you can structure a course to follow an ideal framework: a plan and an overview before the learning content, and a summary and an assessment after it.

But what happens when you need learning content in both the virtual and the physical classroom? Because the learning and training specialization was designed for e-learning, it only provides types of assessment questions that could be answered electronically—such as true/false, matching, multiple choice, sequencing, and open-ended questions—by default.

However, the learning and training specialization can be further specialized and customized to suit your needs. For example, you could specialize the open question type to include blank lines for a student’s handwritten answer. You could also use specialization to create new question types intended for use in a physical classroom, such as filling in a blank or circling a word or phrase.



Conref push, subject scheme, and the learning and training specialization are all part of DITA 1.2, which means that they can be used together. You might…

  • use conref push to keep your subject scheme map up-to-date,
  • reuse content between lessons or tests with the help of conref push, or
  • keep track of controlled values related to an e-learning course using a subject scheme.

You might even solve your content problems by using all three. Your strategy doesn’t have to be limited to just one of these features—if you have a strong business case for it, feel free to call in the whole team!

These unsung heroes of DITA may not have a place in the spotlight (yet), but the more they’re used, the more they’ll catch on. As DITA 1.3 becomes more established and more widely supported, its new features will become the next unsung heroes. That will give DITA 1.2 features such as conref push, subject scheme, and the learning and training specialization the chance to become leaders of a bigger and better team.

If you’re having issues with your content and you think these heroes could help, don’t be afraid to call them in—they’re not as intimidating as they look, and they just might be able to save the day!

Lean content strategy

September 28, 2015

Lean manufacturing begat lean software development which in turn begat lean content strategy.

What does lean content strategy look like?

Here are the seven key principles of lean software development.

  1. Eliminate waste
  2. Build quality in
  3. Create knowledge
  4. Defer commitment
  5. Deliver fast
  6. Respect people
  7. Optimize the whole

How well do they map over to content strategy?

1. Eliminate waste

Waste bin labeled TEAM WASTE // flickr: jeffdjevdet

Interestingly, many content strategy efforts focus only on eliminating waste.

Here are some common types of waste in content:

  • Waste in formatting (formatting and reformatting and re-reformatting)
  • Waste in information development (end users do not want or need what’s being produced)
  • Waste in delivery—information cannot be used by end user because it’s not in the right language or the right format
  • Waste in review—oh, so much waste in the review cycles

Too often, strategy projects end with waste reduction. After creating a nice automated XML-based process, waste in formatting is eliminated, and we declare victory and go home. Unfortunately, the organization is now producing irrelevant content faster, and the content organization is now positioned as only a cost center. Typically, the next step is that executive management demands additional, ongoing cost reductions rather than looking at possible quality improvements. Eliminating waste cannot be the only priority.

Ellis Pratt has a great lightning talk overview of types of waste in lean content strategy. I believe that he is the first person to combine the concept of lean manufacturing/lean software development with content strategy.

2. Build quality in

How do you measure quality in content? “I know it when I see it” is really not a good answer. Some content quality factors include:

  • Writing quality—mechanics and grammar
  • Usability—the ease of access to information
  • Technical accuracy
  • Completeness
  • Conciseness

Scriptorium combined all of these factors into the QUACK quality model.

Building quality in means that the process of creating content supports a high-quality end result. Accountability in content reviews is one technique; content validation to ensure that content conforms to required structures is another. Software authoring assistance can help with writing quality.

The process of creating and managing content should assist the content creator in producing high-quality information.

3. Create knowledge

The fundamental purpose of content is, of course, to create and disseminate knowledge. As an aspect of lean content strategy, we can identify several groups that need knowledge:

  • End users need information to use products successfully.
  • Content creators need to accumulate domain knowledge, process knowledge, and tools knowledge to become better at their jobs.
  • The user community needs a way to share knowledge.

Any content strategy must include ways to support knowledge creation inside and outside the organization.

4. Defer commitment

Our basic process for content strategy is to first identify key business requirements, and then build out an appropriate solution. The temptation, however, is to make critical decisions first, especially in tool and technology selection. Defer commitment means that you should:

  • Store content in a flexible format that allows for multiple types of output.
  • Keep your options open on deliverable formats.
  • Be open to adding new content based on user feedback or other new information.
  • Assess localization requirements regularly as business conditions change. Look at a list of supported languages as an evolving set, not as set in stone forever.

Also identify areas where commitment is required. If your content needs to meet specific regulatory requirements, these requirements change very slowly. Don’t defer a commitment to a legal requirement.

5. Deliver fast

This is true across the entire effort: content creation, management, review, delivery, and governance. Reexamine those six-month production cycles and lengthy review cycles, and find ways to shorten them.

Keep up with new products and new output requirements. Don’t let the market pass you by.

6. Respect people

Lots to think about in this area, but here are some basics:

  • Content creators: Respect their hard-won product and domain expertise.
  • End user: Respect the end user’s time and provide efficient ways to get information. Do not insult end users with useless information, like “In the Name field, type your name.”
  • Reviewer: Respect their limited time and help to focus reviews on adding value.

7. Optimize the whole

Optimizing inside a content team will only take you so far. The content team must reach into other parts of the organization, where they can:

  • Identify the origin of information and use it. For example, if product specifications are stored in a product database, then product datasheets should pull information directly from the database. Here’s what they should not do: Export from the product database to an Excel file, send the Excel file via email to the content creator, have the content creator copy and paste from the Excel file to the product data sheet file.
  • Identify content reuse across the organization and eliminate redundant copies.
  • Understand silos and why they occur. Find ways to eliminate or align silos.
  • Reduce the number of content processes in the organization.


Lean content strategy. What do you think?

If it’s the last quarter, this must be conference season! Our event schedule

September 23, 2015

We’re about to begin the last quarter of 2015, and that means CONFERENCES. Scriptorium is attending many tech comm and content strategy events.

Will we see you at these conferences?

Big Design

If you missed Sarah O’Keefe’s presentation at Big Design last week, check out her blog post on the same topic: Design versus automation: a strategic approach to content.

Information Development World
September 30–October 2
San Jose

Next week at Information Development World, Bill Swallow is presenting on Localization Planning and The Content Strategy of Things, and Gretyl Kinsey is presenting on the Unsung Heroes of DITA.

Please drop by our booth in the exhibition hall to chat with Bill and Gretyl—and to get some chocolate!

LocWorld 29
October 14–16
Silicon Valley

Bill Swallow is presenting on Content Strategy: Disrupting the Traditional LSP at LocWorld 29. He is also on a panel, Smart Products and Connected Devices Require Intelligent Localized Content.

LavaCon
October 18–21
New Orleans

At LavaCon, Sarah O’Keefe is presenting on Content Strategy Triage. Visit with her and me at Scriptorium’s booth, where we can talk about content strategy. And eat chocolate, of course.

tcworld
November 10–12

Sarah O’Keefe is offering a tutorial on Unified Content Development: Marketing, Technical, and Support Communication at tcworld. I’m presenting on Balancing Standardization Against the Need for Creativity.


If you’d like to schedule a meeting with us during these conferences, please contact us—and safe travels!

Roles and responsibilities in XML publishing

September 14, 2015

The roles and responsibilities in an XML (and/or DITA) environment are a little different than in a traditional page layout environment. Figuring out where to move people is a key part of your implementation strategy.

In an unstructured (desktop publishing) workflow, content creators need a variety of skills. The three most important are:

  1. Domain knowledge (expertise about the product being documented)
  2. Writing ability (duh)
  3. Knowledge of the template and formatting expertise in the tool being used

For a structured workflow, the first two stay the same, but paragraph and character styles are replaced by elements. Formatting expertise is less critical—the formatting is embedded in a stylesheet, which is applied to content when it is time to create output. Knowledge of copyfitting and production tricks is no longer relevant and can even be detrimental if the content creator insists on trying to control the output by overriding default settings.

The content creator needs less template and formatting expertise, especially if the content is highly structured and provides guidance on what goes where. Generally, content creators need to focus more on how to organize their information and less on how to format it.

The role of the technical editor (assuming you are lucky enough to have one) also changes. Document structure is enforced by the software, so technical editors can focus on overall organization, word choice, and grammar. Technical editors are often responsible for reviewing large amounts of content. This perspective can be helpful in establishing an information architecture.

Speaking of information, we have the information architect, who is responsible for determining how information should be organized and tagged. Typical tasks for the information architect are:

  • Developing guidelines for topic-based authoring (for example, how big should a topic be?).
  • Establishing rules for tagging. For example, when should an author use the <cite> tag and when the <i> tag?
  • Organizing shared content and establishing guidelines for reuse.
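For example, a tagging rule like the one above might boil down to a guideline such as this (the sentence is invented for illustration):

```xml
<!-- <cite> for the title of a cited work; <i> for other italics -->
<p>See <cite>The Chicago Manual of Style</cite> for details.
The term <i>conref</i> is short for <i>content reference</i>.</p>
```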

The equivalent responsibilities were typically handled by the technical editor and the production editor in an unstructured workflow.

In an unstructured workflow, production editors are responsible for finalizing the layout/composition of unstructured content. They typically have deep expertise in the publishing tool and know all of the tricks to make output look good. Very often, production editors are permitted to override templates to copyfit pages and make the final result look better.

The role of the stylesheet programmer is new in an XML workflow and replaces the production editor. The stylesheet programmer creates a script that transforms XML directly into output (such as PDF or HTML). In effect, the handiwork of the production editor is replaced by a script. Stylesheet programmers need a thorough understanding of XML and especially of publishing scripts, such as XSLT, but they need almost no domain knowledge.

Here are the typical roles in a structured workflow:

Role                    Tags    Publishing  Domain
Content creator         User    User        Advanced
Information architect   Expert  User        Basic
Stylesheet programmer   User    Expert      Basic
Reviewer                None    None        Expert

Did we miss any? What do you think?

Portions excerpted from our Structured authoring and XML white paper.

Plumber’s guide to content workflows

September 8, 2015

Last week I was working in my home office when I heard an odd hissing sound. Upon investigation, I found that my hot water heater had decided to empty itself onto the basement floor.

Fortunately I had some failsafes in place; the heater’s pressure release valve was doing its job by routing scalding hot water onto the floor, and my floor is slightly slanted toward a drain in the floor. This got me thinking (because my brain is oddly wired this way) about failsafes in content workflows.

I spent a good deal of time in panic mode, frantically moving boxes out of the way of the growing puddle as the water made its way toward the drain. I soon managed to turn off both the water supply and the gas, but not before losing many gallons of water to the floor, and was left to enjoy a makeshift sauna while the release valve slowly began to close.


image: Pixabay/ClkerFreeVectorImages

Enter the plumber (aka water tank strategist).

After the plumber looked over my tank and tested the valve, the prognosis was not good. There was a buildup of mineral deposits within the tank gumming up the works (I think that’s an accurate technical description). The plumber could clean or replace the valve, but the failure would just happen again, and without warning.

Long story short, I have a new water heater.

I couldn’t help but tie this mess to situations where impurities in content workflows cause their own brand of chaos. Where my tank failed due to mineral deposits, many content workflows can fail or produce poor content deliverables due to deposits of a different nature.

Content workflow impurities

Whether structured or unstructured, unchecked workflows and workarounds can cause problems over time, and can be costly to correct. One practice in particular can completely pollute your content: copy/paste. Even when the content is completely accurate, copying and pasting content creates multiple stand-alone instances of the same information. Updating that content everywhere it’s used becomes very tedious and time consuming, with considerable risk of missing some instances.

A lack of proper review and approval can also cause problems over time. As misinformation and grammatical errors are introduced, they can have a negative impact on readers’ trust in the information and in your company. A single-source publishing solution can compound this problem.

Finally, not having a failsafe in place to catch these issues can lead to catastrophe. You may lose customers or reach a critical mass where your content is no longer manageable. At that point your only option is a complete workflow replacement. As with my water heater, this is both inconvenient and expensive, and will require additional frantic busy-work and workarounds in the interim to prevent the situation from getting worse.

The best way to fix a problem is to avoid it in the first place. Assess your workflows to see if you are at risk for unwanted deposits in your content. As for failsafes, make sure you have proper reviews in place and proper reuse strategies. Also, have the means necessary (expedited updates, ability to “unpublish” bad content) to quickly address issues before they build up and cause an even bigger problem.

More free DITA training: the concept topic

August 31, 2015

Thanks to everyone who has signed up for and taken the free Introduction to DITA course. The introductory course offers a high-level overview of DITA.

Want a deeper dive into the DITA information types (concept, task, reference, and glossary)? Today, we are releasing our second course on the DITA concept topic. The course and supporting videos were created by a Scriptorium team led by Gretyl Kinsey (with help from Simon Bate, Jake Campbell, and me). Here’s the course outline:

  • Lesson 1: Creating a concept topic
  • Lesson 2: Images and tables
  • Lesson 3: More elements
  • Lesson 4: Advanced elements
  • Lesson 5: XML overview and best practices

Don’t worry: we aren’t neglecting other topic types! Two more courses are coming your way in October and November:

The DITA task topic (scheduled for October 2015)

  • Lesson 1: Creating a task topic
  • Lesson 2: Creating steps
  • Lesson 3: Finishing up the task
  • Lesson 4: Best practices for tasks

The DITA glossary entry and reference topics (November 2015)

  • Lesson 1: Creating a glossentry
  • Lesson 2: Creating a reference topic
  • Lesson 3: Best practices for glossaries and references

Ready for some free DITA training? Set up your account today.

Are you already a DITA expert? We could use your help in building the course content. Join the open-source ditatraining project on GitHub.

Special thanks to the Learning DITA sponsors: oXygen XML Editor, The Content Wrangler, easyDITA, and Information Development World.


The friendly guide to the scope, format, and type attributes in DITA

August 24, 2015

In testing one day, I was running a set of sample content through the DITA-OT, and much to my consternation, the build was succeeding, but generating no content. The error log helped to ferret out the source of the problem; the preprocessor was attempting to extract a linktext and navtitle from an image file that could not be found.

The image in question was a keyref pulled in from a map referenced in the main map file. Everything validated, previews showed the images resolving correctly, yet the images steadfastly refused to be pulled in during preprocessing—so what was wrong?

The issue lay in the keydef itself:

<keydef keys="test_image" href="media/image.png" type="png"/>

At a glance, this looks fine, but this keydef needs the format attribute, not the type attribute.

In working with clients who are new to DITA, we have noticed that the format attribute and its similar siblings, type and scope, can cause a great deal of confusion.


Scope

The scope attribute tells the processor whether to validate the associated object, based on its relationship to the current map. The default value is ‘local.’ The values it can take are:

  • local: This object is part of my dataset, so validate it at runtime.
  • peer: This object is not part of my dataset, but is maintained by another group, so don’t validate it.
  • external: This object is not part of my dataset, so I can’t validate it.
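A quick sketch of the three values in use (the hrefs are invented):

```xml
<!-- local (the default): part of this dataset; validated at build time -->
<topicref href="setup.dita"/>
<!-- peer: maintained by another group; not validated -->
<topicref href="../install-guide/install.dita" scope="peer"/>
<!-- external: outside the dataset entirely -->
<topicref href="http://example.com/support.html" scope="external" format="html"/>
```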


Format

The format attribute tells the processor how to validate the associated object. The default value is ‘dita.’ Typical values include ‘dita,’ ‘html,’ ‘pdf,’ and ‘ditamap,’ but a good guideline is to have it match the file extension of the referenced object, like this:

<keydef keys="test_image" href="content/test_image.svg" format="svg"/>

Here, the processor will see that the referenced file is an svg, and attempt to validate it as such. In the anecdote earlier, since the format attribute was not explicitly declared, the processor attempted to validate the referenced image as if it were a DITA file, causing a cascade of validation errors that led to no content being output.


Type

In the context of linking elements, the type attribute tells the processor what kind of object you’re referencing. The default value is the value of the closest ancestor or, failing that, topic. You generally set it to either the element type being referenced (table, fig, note) or the topic type (task, reference), like this:

<xref href="topics/duck_species.dita#ducks/mallard" type="fig"/>

Here, the processor will see that you’re referencing a figure, then perform any additional processing dictated by the transform. You can also supply an arbitrary value and use that when you process the content to create output.

Putting it all together

Taken together, these attributes describe how linked content is related to your current content, what is being linked, and how it will be presented to the user.

<xref href="" scope="external" format="html" type="media_video">Classic Music</xref>

Understanding what these attributes represent, how they are evaluated, and how they function when you don’t explicitly declare them is key to writing content that not only works, but works the way you want it to.

Tech comm skills: writing ability, technical aptitude, tool proficiency, and business sense

August 17, 2015

“Technical Writing is only about what software you know! Is that why every where I read any type of document, web page, or article it is FULL of misspellings, incorrect punctuation, and horrible formatting?!!”

That’s what started a thread on LinkedIn that encapsulates long-running debates on the skill sets technical writers need. (The thread was removed from LinkedIn sometime after Friday, unfortunately.)

From my point of view, a good technical communicator possesses a balance of writing ability, technical aptitude, and software skills. Problems arise when that mix of skills is off-kilter:

  • Grammatically pristine content that just scratches the surface of a product reflects a lack of technical understanding and reduces tech comm to stenography.
  • Overly technical content that catalogs every feature of a product demonstrates technical depth but no writing ability. Such content is usually badly organized (writing about every menu choice in order is not good organization) and littered with grammatical and spelling mistakes.
  • Proficiency in the tools for creating content means information development is more efficient, but blind devotion to a tool is a big (and unprofessional) mistake.

A lot of commenters in the thread touch on these aspects, but at the time I wrote this post, there was a glaring omission among the discussed skill sets: an understanding of business.

Business requirements should drive all content-related efforts at a company, so it’s vital that content creators—technical writers included—understand how their content supports company goals (or not, as the case may be). Changes to content (new tools, new publishing formats, and so on) must be carefully vetted to determine whether there is a solid business case to make such changes. For example, you propose implementing an XML-based workflow because you have numbers showing cost savings. “Other companies are doing it” and “a software vendor told me we need it” are not business cases.

Writing ability, technical aptitude, and dexterity with software are important skills for technical writers to have. But understanding how your efforts connect to the company’s business requirements is what gives you the edge in making your tech comm work indispensable.

Design versus automation: a strategic approach to content

August 10, 2015

Design and automation are often positioned as mutually exclusive–you have to choose one or the other. But in fact, it’s possible to deliver content in an automated workflow that uses a stellar design. To succeed, you need a designer who can work with styles, templates, and other building blocks instead of ad hoc formatting.

More content across more devices requires scalability–and that means more automation. A strategic approach to content needs to incorporate both design and automation as constraints and find the right balance between the two.


First, a few definitions.

Design–specifically graphic design–is the process of deciding how information is presented to the person who needs it. The design effort may include text, graphics, sound, video, tactile feedback, and more, along with the interaction among the various types of information delivery. In addition to the content itself, design also includes navigational elements, such as page numbers, headers and footers, breadcrumbs, and audio or video “bumpers” (to indicate the start or end of a segment). Sometimes, the designer knows the final delivery device, such as a kiosk in a train station or a huge video board in a sports stadium. In other cases, the delivery is controlled by the end user–their phone, desktop browser, or screen reader.

Automation is a workflow in which information is translated from its raw markup to a final packaged presentation without human intervention.


Design and automation are not mutually exclusive.


Instead, think of design and automation as different facets of your content. Each quadrant of the design-automation relationship results in different types of documents. High design and low automation is where you find coffee table books. High automation and low design encompasses invoices, bad WordPress sites, and 30 percent of the world’s data sheets. Low design/low automation is just crap–web sites hand-coded by people with no skills, anything written in Comic Sans, and the other 70 percent of data sheets. (Seriously, where did people get the idea that using InDesign without any sort of styles was appropriate for a collection of technical documents? But I digress…)

The interesting quadrant is the last one: high design and high automation. In this area, you find larger web sites, most fiction books, and, increasingly, marketing content (moving out of lovingly handcrafted as automation increases) and technical content (moving out of “ugly templates” and up the design scale).

Design/automation on X/Y coordinates. Automation is the X axis; design is the Y axis.

Design and automation are different facets of content creation.

The world of structured content inhabits a narrow slice on the extreme right.

Automation on the X axis; design on the Y axis. Structured content is a band in the high-automation area.

Structured content goes with high automation.

Design gets a similar swath of the top.

Automation on the X axis; design on the Y axis. Design-centered content is a band in the high-design area.

Design-centered content at the top of the design region.

When you combine a requirement for high design with a requirement for high automation, you get The Region of Doom.

Same grid as before. The region of doom is the top right corner, where you have extreme design and extreme automation requirements.

You can accommodate 90% design and 100% automation or 90% automation and 100% design, but if you are unwilling to compromise on either axis, expect to spend a lot of money.

A better strategy is to focus on the 90% area. By eliminating 5–10% of the most challenging design requirements, or by allowing for a small amount of post-processing after automated production, you can get an excellent result at a much lower cost than what the Region of Doom requires.

The intersection of design and automation bars in the upper right is the best value.

Small compromises in design and/or automation result in big cost savings.


When we discuss design versus automation, we are really arguing about when to implement a particular design. An automated workflow requires a designer to plan the look and feel of the document and provide templates for the documents. The publishing process is then a matter of applying the predefined design to new content.

The traditional design process ingests content and then people apply design elements to it manually.

In other words, automation requires design first, and this approach disrupts the traditional approach to design. Like any disruptive innovation, this new approach is inferior at first to the “old way” (hand-crafting design). As the technology improves, it takes over more and more use cases.

Disruptive technology first takes over the low end of the market, and then gradually moves up to more demanding users.

Evolution of disruptive technology over time, public domain image found at Wikimedia

Travis Gertz writes an impassioned defense of editorial design in Design Machines: How to survive the digital apocalypse:

Editorial designers know that the secret isn’t content first or content last… it’s content and design at the same time.

[…] When we design with content the way we should, design augments the message of the content.

[…] None of these concepts would exist if designed by a content-first or content-last approach. It’s not enough. This level of conceptual interpretation requires a deep understanding of and connection to the content. A level we best achieve when we work with our editors and content creators as we design. This requirement doesn’t just apply to text. Notice how every single photo and illustration intertwines with the writing. There are no unmodified stock photos and no generic shots that could be casually slipped into other stories. Every element has a purpose and a place. A destiny in the piece it sits inside of.

He provides wonderful examples of visuals entwined with content, mostly from magazine covers. And here is the problem:

  • As Gertz acknowledges earlier in the article, for many small businesses, basic templates and easy web publishing are a step up from what they are otherwise able to do. Their choice is between a hand-coded, ugly web page (or no web presence at all), and a somewhat generic design via SquareSpace or a somewhat customized WordPress site. Magazine-level design is not an option. In other words, automation gives small business the option of moving out of the dreaded Meh quadrant.
  • What is the pinnacle of good design? Is it a design in which graphics and text form a seamless whole that is better than the individual components? Many designers forget that not everyone can handle all the components. A fun audio overlay is lost on a person who is hard of hearing. Without proper alternate text, a complex infographic or chart is not usable by someone who relies on a screen reader.
  • The vast majority of content does not need or deserve the high-design treatment.
  • An advantage of visual monoculture is that readers know what to expect and where.
  • All of these examples are for relatively low volumes of content. I don’t think these approaches scale.


What do you think? Scriptorium builds systems that automate as much as possible, and then use additional resources as necessary for the final fit and finish. Only some limited subset of content is worth this investment. I know that I have a bias toward efficient content production so that companies can focus on better information across more languages.

For more on this topic, come see my session at Big Design Dallas in mid-September.

Making metadata in DITA work for you

August 6, 2015 by

Metadata is one of the most important factors in making the most of your DITA XML-based content environment. Whether you’re converting legacy content into DITA or creating new structured content, it’s important to know what metadata (or data about data) your files will keep track of and why. Coming up with a plan for using metadata can be tricky, so here are some tips to make the process easier.

Using metadata to track content

Metadata can be a powerful aid in organizing your content. It can help speed up your production workflow by allowing you to find all content by a certain author or all content that needs to be reviewed. It can help with distribution by filtering content according to intended audience or location. It can also make content searches easier for both internal and external users. Because metadata exists at both the topic level and the element level, it offers lots of flexibility in content filtering and search.
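As a minimal sketch of element-level metadata (the attribute values here are hypothetical), filtering attributes such as audience and product can mark individual elements inside a topic:

```xml
<ul>
  <li>Connect the device to power.</li>
  <!-- Shown only in internal builds -->
  <li audience="internal">Log the serial number in the service database.</li>
  <!-- Shown only for one product variant -->
  <li product="widget-pro">Attach the optional stabilizer arm.</li>
</ul>
```

At publishing time, conditional processing can include or exclude these items without touching the topic itself.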

Before you begin converting or authoring content, think about what metadata you’ll need to track in your content and why. You may need to distinguish between different types of content (informational, instructional, legal), different audiences (internal, external), or different products. Depending on the type and volume of content you create, your metadata needs may be very specific and complex. Making a list of required metadata and how you plan to use it will make it easier to implement your new structured content workflow.

Standard metadata

Learn as much as you can about the DITA standard and what it offers when it comes to metadata. In DITA, you will have certain metadata attributes and elements available by default:

  • author
  • audience
  • product
  • and many others

To determine how well the DITA standard metadata supports your needs, compare it with your metadata requirements and see if you can find an existing attribute or element that matches each item on your list. You may have some requirements with close but not exact matches, or others that are too specific to your company for a match. In that case, you may want to consider metadata specialization.
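For instance, several common requirements map directly onto standard topic-level metadata in the topic’s prolog. This sketch uses hypothetical author and product values:

```xml
<prolog>
  <author>jsmith</author>
  <metadata>
    <!-- Standard audience element with a predefined type value -->
    <audience type="administrator"/>
    <prodinfo>
      <prodname>Widget Pro</prodname>
      <vrmlist><vrm version="2.1"/></vrmlist>
    </prodinfo>
  </metadata>
</prolog>
```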


Metadata specialization

Specializing metadata attributes and elements can help you organize and filter your content in ways that are tailored to your company’s unique needs. You may need to capture much more information about your product than the DITA standard metadata allows. You may also need to filter content in a hierarchical or multi-faceted way – for example, distributing content to certain locations, and within those locations, only to employees with certain job titles. DITA allows you to specialize metadata elements to include multiple values, which makes this kind of filtering possible.

Although specialization can be highly useful, it can also present challenges to implementation. If you do specialize, be sure to choose a content management system and other tools that can support your changes to the standard metadata attributes and elements. It’s always better to stick to the standard and only specialize if you absolutely must—that way, you’ll be able to choose from a wider selection of tools that will support your content.
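Whether you filter on standard or specialized attributes, the mechanics are the same: a DITAVAL file tells the publishing process which values to include or exclude. A sketch with hypothetical values:

```xml
<val>
  <!-- Exclude content flagged for internal audiences -->
  <prop att="audience" val="internal" action="exclude"/>
  <!-- Include content flagged for this product variant -->
  <prop att="product" val="widget-pro" action="include"/>
</val>
```

Passing this file to the publishing process produces an external, product-specific deliverable from the same source topics.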

Content management

A component content management system (or CCMS) can use metadata to enhance your content development workflow. A CCMS may be equipped to filter on metadata, which helps authors and reviewers find the content they need and tells the publication engines what output to produce. It may also connect with an authoring tool to populate certain metadata values automatically, such as author or status, when a content creator logs into the system.

When you evaluate CCMS options, you should not only ask whether each CCMS you’re considering supports your metadata needs, but also how it manages metadata. How does the CCMS use the metadata in your content to help with workflow? How flexible is the CCMS if you start with standard metadata now but need to specialize it later? What happens to metadata that is created and managed by the CMS if you ever need to move your content into a different system? Having a solid plan in place for metadata use before you choose a CCMS and other tools will help you ask the right questions and make the best decision.