Scriptorium Publishing

content strategy for technical communication

Structured authoring AND the web

May 15, 2013 by

We read Tom Johnson’s post, “Structured authoring versus the web,” with some dismay. Tom is a persuasive, influential writer, but his article misses the mark in important ways.

Sarah O’Keefe and Alan Pringle contributed to this post.

Let’s start from the conclusion:

I don’t want to come across as being against structured authoring. As I mentioned in the introduction, clearly structured authoring is a trend many companies are following. However, structured authoring has a few challenges before it can live in a web environment. In this post, I mentioned a few trends that I think pose challenges to structured authoring:

  • SaaS decreases the requirement to support versioned content.
  • Agile makes print publications potentially out of date every two weeks.
  • Collaboration requires a form that SMEs can edit, update, and potentially author themselves.
  • Mobile works best on a website with responsive design.
  • Budget cutbacks force small teams to figure out their own publishing solutions.
  • Open source platforms provide a lot of capabilities that we can easily leverage.
  • Browser-based editing simplifies the update process, which helps us keep up with rapidly changing information.

Here are some counterpoints:

SaaS. What percentage of companies are producing just SaaS products? The dubious assertion that SaaS eliminates the requirement to version documents dismisses medical devices, pharmaceuticals, and all hardware products from consideration. Many companies that offer SaaS options also have packaged software.

Agile. Agile is definitely a challenge for print publications. But what does print have to do with structured authoring? The true question is, “Should we create print or not?” The underlying methodology is irrelevant—unless you are conceding that web-based authoring doesn’t really support print. And again, what of industries that require print or PDF?

Collaboration and SME editing. It’s possible to give subject matter experts direct editing access to either XML or HTML. Don’t want SMEs to have final say in the modifications they make to content? Set up their access to enable changes that require approval; there are hosted systems for managing structured content that provide tiered account access.

Responsive design. This one had us truly puzzled. Generating content that uses responsive design is a configuration issue, so it’s easily done in whatever technology you have that produces HTML and CSS. But then Tom asserts that using responsive design from XML means that you are ignoring multichannel publishing. This is true only if your multiple channels are “desktop” and “mobile.” But most companies are also producing PDF/print, and we are seeing increasing requirements for EPUB (yes, EPUB!!!), XML for consumption by other systems, and more.

Budget cutbacks. From the body of the post:

With many tech writing teams constrained with small budgets and few resources, hiring a dedicated publishing engineer to handle the transforms, or contracting out the work at a high cost, really isn’t a practical solution.

Look at all those scary adjectives! In response, we’ll ask our standard consultant questions: Is there a business case? Nobody hires “publishing engineers” (which, by the way, is a great description of the job) or contracts work out at “high cost” (another happy phrase when we’re on the receiving end!) unless there is value in doing so. What is the value?

Open source platforms. You’ll get no argument from us that open source platforms “provide a lot of capabilities”: WordPress has its place, as does the DITA Open Toolkit and all the open source technologies it includes. What’s important to remember about open source tools is that they usually require tailoring and the addition of new features to fully meet a company’s particular needs. Yep, it’s time to haul out the “free but not cheap” chestnut: the default implementation of any open source technology is rarely enough to meet your requirements, so be prepared to spend time and money to implement all the features you need. Some companies may calculate it’s cheaper to buy a tool that fulfills their requirements out of the box instead of trying to build a comparable open source solution.

Browser-based editing. Tom says that “you shouldn’t have to author outside the browser to publish on the web.” Well, you don’t have to. Multiple hosted systems for managing structured content include browser-based authoring and editing. That said, there are good reasons why a company may not want browser-based editing or any hosted tools: bandwidth limitations, security, and so on.

What’s missing from Tom’s discussion is a recognition that business requirements should drive the decision on structured authoring, the formats in which a company provides content, the selection of open source or proprietary tools, and so on. Instead, his post creates a narrow rhetorical funnel (if you make SaaS software and eliminate PDF and have no resources) and then asserts that a particular solution is best. Given the constraints Tom describes, you can make a case for web-only authoring. But what percentage of tech comm actually falls into this very narrow description?

Finally, we didn’t see any recognition of key tech comm requirements—localization and conditional content—that often drive tool decisions.

Structured authoring and the web aren’t mutually exclusive. You can combine both into a useful, dynamic approach to content.

Making a commitment to accessibility

May 13, 2013 by

Are you delivering accessible content?

Over the weekend, I ran across a not-so-delightful article about a copyright treaty that might help with accessibility:

For the last several years, negotiators at the World Intellectual Property Organization have been working on a copyright treaty that would make it easier for blind people to get accessible versions of books, like well-annotated audio books or large-print editions. But aggressive lobbying by the Motion Picture Association of America (MPAA), the Association of American Publishers (AAP), and other US copyright interests threatens to derail the negotiations, according to several advocates for the blind who spoke to Ars.

For most technical content, the copyright issue is secondary. Instead, content is inaccessible simply because the content creators (that’s us!) have not made the effort.

For example, Scriptorium webcasts are currently delivered on Slideshare. We upload the audio file and sync it with the slides to produce a “slidecast” with video and audio. However, there is currently no text version, caption, or transcript, although many of our webcasts do have companion blog posts or white papers. We are still trying to figure out the best solution. (Any ideas out there? The best suggestion so far appears to be to post the video on YouTube, which will automatically caption it.)

As professional communicators, you and I have an obligation to deliver content that our readers can use. Building in accessibility is not particularly difficult for most documents.

We routinely tailor content to our audience (for example, by writing at a sixth-grade level) without any prompting from management. We should do the same for accessibility.

Webcast: The state of the tech comm industry

April 30, 2013 by

In this webcast recording, Sarah O’Keefe, Scott Abel (The Content Wrangler), Race Bannon (Oracle), and Paul Perrotta (Juniper Networks) discuss the state of the technical communication industry.

Scott shares the results of his benchmark survey. Scott, Race, and Paul then provide insights on tech comm industry trends, challenges, and innovations, based on the survey results along with their real-world experiences.

Scott’s benchmarking survey, which we discussed during the webcast, is also available on Slideshare.

Cheap writers can be expensive

April 29, 2013 by

Given the choice between an inexpensive writer with a limited skill set and a professional technical communicator, which should you choose?

First, a disclaimer. All of these numbers are estimates and based on anecdotal experience rather than solid research. If you want something academically defensible, you are in the wrong place.

I’ve been considering whether cheap writers are actually cheap. So I did some basic calculations:

Let’s compare a cheap writer at $40,000 per year to an expensive technical communicator at $80,000 per year.

                             Cheap      Expensive
Base salary                  $40,000    $80,000
Benefits                     30%        30%
Total annual cost            $52,000    $104,000
Cost per hour (1800 hours)   $28.89     $57.78

Hmmm. Doesn’t look too good for the expensive technical communicator.

Now, let’s describe our two contenders:

  • Cheap writer: Hired into a technical writing job with little or no previous experience in tech comm. Perhaps a background in marketing writing or in technical support. Not much expertise with tech comm tools or templates. Definitely lacking an understanding of writing for localization. Little interest in learning the profession.
    Primary reason for hire: Cost.
    Motto: “Style guides are annoying constraints on my creativity.”
  • Expensive technical communicator: Previous technical writing experience (5-10 years). Knows and understands tech comm tools. Follows templates and style guides religiously. Always looking for opportunities to make the content production process more efficient. Writes short, clear, jargon-free sentences as a matter of course. Looks for opportunities to improve knowledge of products (sitting in with technical support, attending customer training classes).
    Primary reason for hire: Combination of writing ability, domain knowledge, and tools.
    Motto: “Reuse is my friend.”

(Yes, I exaggerate. Remember, this is a blog. If you are looking for reasoned academic discourse, please move on immediately.)

Now for the part where I invent productivity numbers that you can challenge.

Here are my assumptions:

                                Cheap   Expensive
Time per topic, hours           4       2
Yearly output, topics           250     500
Editing load, percentage        30%     15%
Editing time, hours per topic   1.2     0.3
Localization efficiency         5%      25%

Some explanations:

  • I assume that each writer has 1,000 hours of actual writing time per year. One writer takes about four hours to write a topic; the other takes about two hours. This differential is based on more efficient use of tools (less flailing around looking for the right way to do something in the authoring tool), better understanding of the products (less research required), and better productivity with the actual writing.
  • The editing load is the amount of time required to edit or review a topic. The more experienced technical communicator produces content that requires less editing and less time asking the editor (or a mentor) questions.
  • Localization efficiency is mostly a measure of reuse. The expensive writer produces content in which 25% of information can be reused and therefore translated automatically. (There are other considerations, such as the correct application of templates. That cost is included in the editing load.)

Based on these numbers, we discover the following:

                            Cheap        Expensive
Total cost per topic        $150.22      $132.89
Total cost per 250 topics   $37,555.56   $33,222.22

Not looking so cheap any more, that cheap writer…

(The cost formula is to take the loaded hourly rate and multiply by the total time per topic. For the cheaper resource, that’s $28.89 per hour times (4 hours writing plus 1.2 hours editing).)
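For readers who prefer code to a spreadsheet, here is a minimal Python sketch of that formula. The function and variable names are made up for illustration; the inputs are the salary, benefits, hours, and editing-load figures from the tables above. Running it should reproduce the per-topic and per-250-topic costs.

```python
def loaded_hourly_rate(base_salary, benefits_rate=0.30, hours_per_year=1800):
    """Salary plus benefits, spread over the hours in a working year."""
    return base_salary * (1 + benefits_rate) / hours_per_year


def cost_per_topic(hourly_rate, writing_hours, editing_load):
    """Loaded hourly rate times (writing time plus editing time)."""
    editing_hours = writing_hours * editing_load
    return hourly_rate * (writing_hours + editing_hours)


# Figures from the tables above: (base salary, hours per topic, editing load).
writers = {
    "Cheap": (40_000, 4, 0.30),
    "Expensive": (80_000, 2, 0.15),
}

costs = {}
for label, (salary, writing_hours, editing_load) in writers.items():
    rate = loaded_hourly_rate(salary)
    costs[label] = cost_per_topic(rate, writing_hours, editing_load)
    print(f"{label}: ${costs[label]:.2f} per topic, "
          f"${costs[label] * 250:,.2f} per 250 topics")
```

Change any of the assumptions (benefits rate, hours per topic, editing load) to see how quickly the gap moves.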

Now we come to the Big Kahuna—localization cost:

                                                         Cheap      Expensive   Difference
Number of words to localize per 250 topics               59,375     46,875      12,500
Cost per language @ 25 cents per word                    $14,844    $11,719     $3,125
Translation cost for 10 languages                        $148,438   $117,188    $31,250
Total cost of creation, 250 topics, 11 total languages   $185,993   $150,410    $35,583

(Number of words to localize is 62,500 words minus the localization efficiency factor.)

The details:

  • Here, we factor in localization efficiency. Both writers have a total of 62,500 words in their 250 topics. But the cheap writer gets only 5% credit for localization efficiency whereas the expensive writer gets 25%. Therefore, the actual word count for localization is much higher for the cheap resource.
  • We then calculate the cost of localization of the actual word count sent for localization.
  • The cost of translation into 10 languages is over $30,000 lower for the better content. This dwarfs the cost savings on writing the original content.
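The localization arithmetic can be sketched the same way. This is a rough Python illustration, not the actual spreadsheet; the names are invented, and the authoring costs are copied from the earlier table rather than recalculated.

```python
TOTAL_WORDS = 62_500       # words in 250 topics, for either writer
RATE_PER_WORD = 0.25       # dollars per word, per language
TARGET_LANGUAGES = 10


def translation_cost(localization_efficiency):
    """Cost of translating the non-reused words into every target language."""
    words_to_localize = TOTAL_WORDS * (1 - localization_efficiency)
    return words_to_localize * RATE_PER_WORD * TARGET_LANGUAGES


# Authoring cost for 250 topics, taken from the earlier table.
authoring_cost = {"Cheap": 37_555.56, "Expensive": 33_222.22}
efficiency = {"Cheap": 0.05, "Expensive": 0.25}

totals = {}
for label in authoring_cost:
    totals[label] = authoring_cost[label] + translation_cost(efficiency[label])
    print(f"{label}: ${totals[label]:,.0f} total for 11 languages")

print(f"Difference: ${totals['Cheap'] - totals['Expensive']:,.0f}")
```

Adjusting the efficiency values lets you test other levels of reuse against your own word counts and translation rates.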

It’s also worth noting that 25% is relatively low. In more mature content creation environments, a more typical number is around 50%.

The expensive technical communicator who produces consistent, reusable content saves huge amounts of money in localization. Cheap writers can be very expensive.

The full spreadsheet is available as a public Google doc. Feel free to make yourself a copy and do your own calculations.

The truth is out there

April 4, 2013 by

One of the most important issues in technical content is to establish a single source of truth for technical data. More often than not, our workflow assessments uncover multiple sources of dubious accuracy.

Given a workflow in which information is extracted from a database, dropped into a layout tool, and then published to print, you might assume that the database is the source of truth. Unfortunately, the database is usually incomplete or error-ridden, and the technical communication team is not permitted to change the database. This results in the following workflow:

  1. Extract iffy information from the database.
  2. Import bad information into layout files.
  3. Make corrections and updates in the layout files.
  4. Do random additional formatting tweaks because you’re spending so much time in the layout anyway.
  5. Publish corrected information.

Notice the absence of:

  1. Correct information in the database.

Without corrections in the database, the most accurate source of truth is the layout files, in which information has been scrubbed and reviewed.

This is not a good thing.

One of the most common recommendations we have for improved workflows boils down to:

Fix the $#@!$#@! data.

(We generally phrase it more tactfully.)

If you move the source of truth back into the database (the base of the data), you can create a workflow like this:

  1. Update the database with good information.
  2. Extract information from the database.
  3. Import into layout files and publish.

A few things make this workflow compelling:

  1. In the original version, the correction work is infinite. The technical communication team must either rework the content every time they export from the database, or they must maintain their layout files as a repository of truth and integrate updates one by one. Neither approach is appealing or efficient.
  2. In the new version, the database updates will, over time, make the database better and better instead of being a Sisyphean task.
  3. If the database content is accurate, it’s possible to import and publish with minimal (or zero) intervention in the actual layout files. The process might take an hour instead of weeks or months.

If your workflow is nothing but a series of workarounds that are driven by bad data, it’s time to consider a new plan of attack.

Rebranding as a business case for XML

March 18, 2013 by

Reuse and automated formatting are the most common justifications for XML, but recently, we have heard a new reason from several customers: rebranding.


It’s a common scenario: The organization you work for gets acquired (and renamed), has a new branding campaign (new logo, new name), changes its location (new address), or spins off your division (new everything).

In traditional layout files, this can result in a huge rebranding effort. You end up having to change the copyright pages, headers and footers, inline references to the company name and product names, and so on. Good templates help, but someone still has to open each file, apply the new settings, and regenerate the output.

If the files are less than completely consistent, applying templates can be problematic. The effort to make these changes can be shockingly high, especially if your content is subject to any sort of regulatory approval.

In several recent cases, our clients have calculated that the cost of minor formatting updates (which require extensive manual intervention) is comparable to the cost of converting the files to XML. So the solution to the problem of changing the logo is to move the content to XML and use automated formatting.

Given the assumption that there will be similar changes in the future, it’s a sound investment.

The politics of DITA

March 6, 2013 by

Deciding on a content model is a critical step in many of our projects. Should it be DITA or something else? The answer, it seems, often has more to do with our client’s corporate culture than with actual technical requirements.

DITA adoption in Germany

Adoption of DITA in the German market is much lower than in the North American market. Some interesting factors are at play here:

  • A lot of German tech comm is heavy machinery, which is governed by the European Union’s machinery directive. A lot of North American tech comm is software, which is governed by, well, nothing.
  • Many German companies standardized their tech comm efforts a long time ago and view DITA as a half-baked newbie.
  • The German market is full of (relatively) inexpensive content management systems, most of which use proprietary content models. CMS evaluation in Germany is often driven by workflow and not content models.
  • There is at least a touch of Not Invented Here syndrome.
  • Overall, German tech comm is more concerned than North American tech comm about localization and less concerned about non-PDF deliverables.

The lure of custom XML

Building a custom content model is appealing to some clients. They are typically found in industries that demand precision (that is, rarely software but rather medical devices or other regulated industries) and have staff with a high degree of technical expertise. The interest in custom XML rests on the following assumptions:

  • Their content is special and no mere standard can support it. (This belief is almost always incorrect except for their metadata requirements.)
  • Implementing custom XML will make the transition easier for the content creators, who are highly qualified in the subject matter (such as nuclear power plants) that they write about but not comfortable learning new publishing technology.
  • Using custom XML makes it possible to clone the existing (implicit) structure onto a content model, thus neatly avoiding change management issues. (Highly unlikely.)

There is also a strong correlation between custom XML advocates and FrameMaker aficionados. The thinking seems to be that a transition from unstructured FrameMaker to structured FrameMaker is easier than moving to a non-FrameMaker XML editor. (As if!) And since structured FrameMaker can happily support custom XML, then why not use it?

Data interchange

DITA is or was billed as a way to exchange content. For data interchange, however, we find that DITA is not compelling. We have worked with several customers who had a data source (usually a database) and needed to extract data, format it, and align it with additional information from elsewhere. There are two basic options to attack this problem:

  1. Database to XML, XML to publishing tool, integrate with additional content in publishing tool
  2. Database to XML, XML to DITA, integrate with additional content in DITA, DITA to publishing tool

Most customers chose option 1. The advantages of integration in the DITA layer are not compelling enough to justify the investment required to build the database to DITA configuration system.

This is the outlier case where technical considerations drove the decision.
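Both options start with the same hop: database to XML. As a rough illustration only, here is what that step might look like in Python with the standard library. The `parts` table, its columns, and the element names are hypothetical; a real project would follow whatever schema the downstream publishing tool expects.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(connection):
    """Export each row of the hypothetical `parts` table as a <part> element."""
    root = ET.Element("parts")
    cursor = connection.execute(
        "SELECT part_number, description FROM parts ORDER BY part_number"
    )
    for part_number, description in cursor:
        part = ET.SubElement(root, "part", number=part_number)
        ET.SubElement(part, "description").text = description
    return ET.tostring(root, encoding="unicode")


# Throwaway in-memory database standing in for the real data source.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE parts (part_number TEXT, description TEXT)")
connection.executemany(
    "INSERT INTO parts VALUES (?, ?)",
    [("A-100", "Hinge"), ("A-200", "Door seal")],
)

xml_text = rows_to_xml(connection)
print(xml_text)
```

In option 1, XML like this goes straight to the publishing tool. Option 2 adds a further transform into DITA before publishing, which is the extra investment most customers declined.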

Technology risk

DITA (and XML) are perceived as much riskier than publishing or help authoring tools. The argument here is “What if Key Technical Person leaves? Nobody else would be able to maintain the DITA system.”

Does DITA offer compelling benefits? If so, why would an organization allow a single person to be the only expert on this technology?

We refer to this as the Bus Problem. The success or failure of your system should never be dependent on a single person. What if that person gets hit by a bus? (Or leaves the organization? Or retires? Or joins the witness protection program??)

If you have only one person who is capable of understanding the intricacies of a DITA implementation and no possibility of hiring or training more people, then you have a serious problem. It’s not just a DITA problem, though.

Adapt or die: Managing increasing content velocity

February 12, 2013 by

Content velocity is the speed at which we create and produce content, the speed of the publishing process itself, and the speed of change in content requirements—what we need to produce and the delivery mechanisms.

This is a summary of a presentation delivered at the Intelligent Content Conference on February 8, 2013, in San Francisco.

We must adapt to changes (Image: NASA)

It’s not as though publishing has been trivial until now. Surviving and thriving as a content creator has always been difficult. But now, the environment is changing drastically.

We must adapt, even though we are not sure what we need to survive in the new environment. The asteroid—digital publishing—has hit us. We are seeing changes already, but are these temporary or permanent? What are the most profound changes? By comparison, the initial asteroid impact would have caused a huge shock wave, tsunamis, and other immediate disasters. But it was the dust thrown into the atmosphere that caused climate change and wiped out the dinosaurs.*

The impact of digital publishing is that content creators must stop operating as a cottage industry or in an artisanal bubble. We no longer have a margin for error in this new world.

There are dinosaur descendants in today’s world, but they look completely different than their ancestors.

What are the implications for us after the arrival of digital publishing? How do we climb out of the crater and deal with the new landscape?

One key is velocity. Publishing speed is increasing in every content dimension. Most publishing systems are ill-equipped for flexible, fast, and changeable requirements. They are equipped to support a manufacturing process, not a digital process.

Authoring velocity

A requirement for faster authoring means that collaborative authoring will become the norm. A modular, collaborative approach, along with controlled language and terminology, speeds up authoring at the expense of individual voices. To make authoring faster, publishing and formatting responsibility will be taken away from authors.

Editing velocity

Software will take on some of the traditional human editor responsibilities, particularly enforcement of required content structure and controlled language. The conformance edit will largely move into the authoring phase.

Production and distribution velocity

The process of production editing—putting the final polish on a specific output format—will disappear for almost all content. Instead, we will speed up the velocity of content production with automated formatting.

For online content, the friction of distribution is eliminated. This is not exactly new information. But most workflows, especially in technical communication, are still built on the assumption that print or maybe PDF is the most critical business driver. Creating a physical book takes time, so delays in content creation are acceptable. But now, creating the “artifact”—the actual thing that people can read—doesn’t take any time. Blog posts are distributed in a split second. That means we can’t hide behind the inefficiencies of the distribution process any more.

If you have done traditional book publishing, you probably remember these phrases:

  • Bluelines
  • Galleys
  • AAs — no, not AA, but author’s alterations
  • Film plates
  • Stripping
  • Page signatures
  • Old type

They used to be necessary to produce a book; now they are headed for historical status.

Localization velocity

Translation delays should be measured in days, not weeks. To achieve this, reuse of translated content is critical. Other factors that help increase translation velocity:

  • Reuse in source files
  • Machine translation
  • Translation memory
  • Terminology management and controlled language

Velocity of new ideas

Right now, we’re all talking Kindle and EPUB production, along with mobile strategies. But what’s next?

Intelligent content, meaning information integrated with the product. We have this in software as context-sensitive help; now think about the equivalent in hardware. Your refrigerator’s light bulb burns out, so you get a notice on the fridge screen, along with instructions on how to swap out the light bulb.

In version 2, the fridge orders its own light bulb from Amazon.

In version 3, the light bulb is printed on your in-house 3D printer and delivered to the fridge by your house robot.

In version 4, who knows??

We don’t know what’s coming, but we do know that we must shred the veil that separates information, data, and products. That’s what intelligent content is all about. And when we free content from its physical bindings, we can start to see the real potential of the information age.

Digital is the technology.

Velocity is the requirement.

Intelligent content helps us solve the velocity problem by making the content itself richer and by making it possible to connect the content with the product.

* At least, that’s the best current scientific theory. None of us actually observed the event directly.

Webcast: Trends in technical communication, 2013

January 17, 2013 by

Our trends webcast has become an annual event, and it’s our most popular webcast! Each year, we take our best shot at trends for the upcoming year with a mixture of serious and not-quite-serious predictions. In this webcast recording, Sarah O’Keefe and special guest Bill Swallow, aka techcommdood, share their perspectives on trends for 2013.