Scriptorium Publishing

content strategy for technical communication

Structured authoring AND the web

May 15, 2013 by

We read Tom Johnson’s post on Structured authoring versus the web with some dismay. Tom is a persuasive, influential writer, but his article misses the mark in important ways.

Sarah O’Keefe and Alan Pringle contributed to this post.

Let’s start from the conclusion:

I don’t want to come across as being against structured authoring. As I mentioned in the introduction, clearly structured authoring is a trend many companies are following. However, structured authoring has a few challenges before it can live in a web environment. In this post, I mentioned a few trends that I think pose challenges to structured authoring:

  • SaaS decreases the requirement to support versioned content.
  • Agile makes print publications potentially out of date every two weeks.
  • Collaboration requires a form that SMEs can edit, update, and potentially author themselves.
  • Mobile works best on a website with responsive design.
  • Budget cutbacks force small teams to figure out their own publishing solutions.
  • Open source platforms provide a lot of capabilities that we can easily leverage.
  • Browser-based editing simplifies the update process, which helps us keep up with rapidly changing information.

Here are some counterpoints:

SaaS. What percentage of companies are producing just SaaS products? The dubious assertion that SaaS eliminates the requirement to version documents dismisses medical devices, pharmaceuticals, and all hardware products from consideration. Many companies that offer SaaS options also have packaged software.

Agile. Agile is definitely a challenge for print publications. But what does print have to do with structured authoring? The true question is, “Should we create print or not?” The underlying methodology is irrelevant—unless you are conceding that web-based authoring doesn’t really support print. And again, what of industries that require print or PDF?

Collaboration and SME editing. It’s possible to give subject matter experts direct editing access to either XML or HTML. Don’t want SMEs to have final say in the modifications they make to content? Set up their access to enable changes that require approval; there are hosted systems for managing structured content that provide tiered account access.

Responsive design. This one had us truly puzzled. Generating content that uses responsive design is a configuration issue, so it’s easily done in whatever technology you have that produces HTML and CSS. But then Tom asserts that using responsive design from XML means that you are ignoring multichannel publishing. This is true only if your multiple channels are “desktop” and “mobile.” But most companies are also producing PDF/print, and we are seeing increasing requirements for EPUB (yes, EPUB!!!), XML for consumption by other systems, and more.

Budget cutbacks. From the body of the post:

With many tech writing teams constrained with small budgets and few resources, hiring a dedicated publishing engineer to handle the transforms, or contracting out the work at a high cost, really isn’t a practical solution.

Look at all those scary adjectives! In response, we’ll ask our standard consultant questions: Is there a business case? Nobody hires “publishing engineers” (which, by the way, is a great description of the job) or contracts work out at “high cost” (another happy phrase when we’re on the receiving end!) unless there is value in doing so. What is the value?

Open source platforms. You’ll get no argument from us that open source platforms “provide a lot of capabilities”: WordPress has its place, as does the DITA Open Toolkit and all the open source technologies it includes. What’s important to remember about open source tools is that they usually require tailoring and the addition of new features to fully meet a company’s particular needs. Yep, it’s time to haul out the “free but not cheap” chestnut: the default implementation of any open source technology is rarely enough to meet your requirements, so be prepared to spend time and money to implement all the features you need. Some companies may calculate it’s cheaper to buy a tool that fulfills their requirements out of the box instead of trying to build a comparable open source solution.

Browser-based editing. Tom says that “you shouldn’t have to author outside the browser to publish on the web.” Well, you don’t have to. Multiple hosted systems for managing structured content include browser-based authoring and editing. That said, there are good reasons why a company may not want browser-based editing or any hosted tools: bandwidth limitations, security, and so on.

What’s missing from Tom’s discussion is a recognition that business requirements should drive the decision on structured authoring, the formats in which a company provides content, the selection of open source or proprietary tools, and so on. Instead, his post creates a narrow rhetorical funnel (if you make SaaS software and eliminate PDF and have no resources) and then asserts that a particular solution is best. Given the constraints Tom describes, you can make a case for web-only authoring. But what percentage of tech comm actually falls into this very narrow description?

Finally, we didn’t see any recognition of key tech comm requirements—localization and conditional content—that often drive tool decisions.

Structured authoring and the web aren’t mutually exclusive. You can combine both into a useful, dynamic approach to content.

Making a commitment to accessibility

May 13, 2013 by

Are you delivering accessible content?

Over the weekend, I ran across a not-so-delightful article about a copyright treaty that might help with accessibility:

For the last several years, negotiators at the World Intellectual Property Organization have been working on a copyright treaty that would make it easier for blind people to get accessible versions of books, like well-annotated audio books or large-print editions. But aggressive lobbying by the Motion Picture Association of America (MPAA), the Association of American Publishers (AAP), and other US copyright interests threatens to derail the negotiations, according to several advocates for the blind who spoke to Ars.

For most technical content, the copyright issue is secondary. Instead, content is inaccessible simply because the content creators (that’s us!) have not made the effort.

For example, Scriptorium webcasts are currently delivered on Slideshare. We upload the audio file and sync it with the slides to produce a “slidecast” with video and audio. However, there is currently no text version, caption, or transcript, although many of our webcasts do have companion blog posts or white papers. We are still trying to figure out the best solution. (Any ideas out there? The best suggestion so far appears to be to post the video on YouTube, which will automatically caption it.)

As professional communicators, you and I have an obligation to deliver content that our readers can use. Building in accessibility is not particularly difficult for most documents.

We routinely tailor content to our audience; for example, by writing at a sixth-grade level, without any prompting from management. We should do the same for accessibility.

 

From toilets to techcomm: tallying tool risks

May 9, 2013 by

I’m about to replace an old toilet, not-so-affectionately nicknamed the Lazy River.


As you might guess, the Lazy River is barely doing its job. I’ll give the toilet points for longevity—it was installed in the mid-1980s—but the Lazy River uses three gallons of water for each leisurely operation. It’s beyond time for a more efficient model.

Lazy river (flickr: archer10)

The leading contender to replace the Lazy River was an aesthetically pleasing option (well, as far as toilets can be) with an impressive warranty and a highly competitive price. And then I read a forum post about that product that changed everything.

The flushing mechanism and water supply line for this toilet rely on proprietary technology. (Proprietary. Toilet. Technology.) That would probably mean ordering any replacement parts directly from the manufacturer. What if the company decided to discontinue making toilets? The so-called “universal” replacement parts wouldn’t work, and I’d be stuck with a nice-looking toilet that I couldn’t repair. No thanks!

I ended up choosing another option. Replacement parts for it are more widely available, and universal parts will work, too. I have no worries about buying a unit that will become obsolete if the manufacturer quits making that model.

The lessons I learned with my toilet purchase most definitely apply to the technologies and tools that produce technical content. Consider the long-term risks associated with choosing a particular tool.

Proprietary storage formats for source files are a large consideration, and you should account for both a toolmaker’s longevity and the continuity of its products. For example, is there a history of products being poorly supported or discontinued altogether?

Poking around industry forums and wikis can help you gather this crucial intelligence. Talk to your peers at conferences and industry events—and don’t get snowed by fancy vendor demos and slick marketing that camouflage risks.

Before you choose a solution, figure out an escape hatch that will let you extract your content from the tool. For example, can you export source files to a generic format that you could transform and modify further? This is where having XML as a base technology is very useful: you can often programmatically transform one set of XML tags to another with minimal human intervention. In a sense, XML is a universal replacement part.
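To make the "universal replacement part" idea concrete, here is a minimal Python sketch of a tag-to-tag transformation. The legacy tag names and the mapping are hypothetical, invented for illustration; a real migration would more likely use XSLT and would have to handle attributes, nesting rules, and content that has no clean equivalent.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a legacy export vocabulary to target tags.
TAG_MAP = {"chapter": "topic", "heading": "title", "para": "p"}


def rename_tags(xml_text, tag_map):
    """Return xml_text with every mapped tag renamed; unmapped tags are kept."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = tag_map.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


legacy = "<chapter><heading>Install</heading><para>Unpack the box.</para></chapter>"
converted = rename_tags(legacy, TAG_MAP)
print(converted)
```

The point is less the ten lines of code than the fact that the source is open to this kind of processing at all. A binary proprietary format offers no equivalent.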

Open-source tools should not be exempt from scrutiny. Sure, an open-source technology may not rely upon a proprietary file format like many off-the-shelf tools do, but how well supported is that technology across the industry? Has it reached a critical mass that means you could, for example, choose from several consultants to help you develop your implementation? Or are you pretty much on your own with an open-source tool that has spotty or nonexistent documentation? And always remember: the free license associated with an open-source tool does not mean it will be cheap to implement and maintain.

While evaluating content development tools, it’s critical to consider the longer-term risks of choosing a particular solution and to understand your options if you choose to move to another tool later. Otherwise, you may find yourself up a not-so-lazy river without a paddle.

Rapids (flickr: amerune)

Cheap writers can be expensive

April 29, 2013 by

Given the choice between an inexpensive writer with a limited skill set and a professional technical communicator, which should you choose?

First, a disclaimer. All of these numbers are estimates and based on anecdotal experience rather than solid research. If you want something academically defensible, you are in the wrong place.

I’ve been considering whether cheap writers are actually cheap. So I did some basic calculations:

Let’s compare a cheap writer at $40,000 per year to an expensive technical communicator at $80,000 per year.

                              Cheap       Expensive
Base salary                   $40,000     $80,000
Benefits                      30%         30%
Total annual cost             $52,000     $104,000
Cost per hour (1800 hours)    $28.89      $57.78

Hmmm. Doesn’t look too good for the expensive technical communicator.

Now, let’s describe our two contenders:

  • Cheap writer: Hired into a technical writing job with little or no previous experience in tech comm. Perhaps a background in marketing writing or in technical support. Not much expertise with tech comm tools or templates. Definitely lacking an understanding of writing for localization. Little interest in learning the profession.
    Primary reason for hire: Cost.
    Motto: “Style guides are annoying constraints on my creativity.”
  • Expensive technical communicator: Previous technical writing experience (5-10 years). Knows and understands tech comm tools. Follows templates and style guides religiously. Always looking for opportunities to make the content production process more efficient. Writes short, clear, jargon-free sentences as a matter of course. Looks for opportunities to improve knowledge of products (sit in with technical support, attend customer training classes).
    Primary reason for hire: Combination of writing ability, domain knowledge, and tools.
    Motto: “Reuse is my friend.”

(Yes, I exaggerate. Remember, this is a blog. If you are looking for reasoned academic discourse, please move on immediately.)

Now for the part where I invent productivity numbers that you can challenge.

Here are my assumptions:

                                 Cheap       Expensive
Time per topic, hours            4           2
Yearly output, topics            250         500
Editing load, percentage         30%         15%
Editing time, hours per topic    1.2         0.3
Localization efficiency          5%          25%

Some explanations:

  • I assume that each writer has 1,000 hours of actual writing time per year. One writer takes about four hours to write a topic; the other takes about two hours. This differential is based on more efficient use of tools (less flailing around looking for the right way to do something in the authoring tool), better understanding of the products (less research required), and better productivity with the actual writing.
  • The editing load is the amount of time required to edit or review a topic. The more experienced technical communicator produces content that requires less editing and less time asking the editor (or a mentor) questions.
  • Localization efficiency is mostly a measure of reuse. The expensive writer produces content in which 25% of information can be reused and therefore translated automatically. (There are other considerations, such as the correct application of templates. That cost is included in the editing load.)

Based on these numbers, we discover the following:

                             Cheap         Expensive
Total cost per topic         $150.22       $132.89
Total cost per 250 topics    $37,555.56    $33,222.22

Not looking so cheap any more, that cheap writer…

(The cost formula is to take the loaded hourly rate and multiply by the total time per topic. For the cheaper resource, that’s $28.89 per hour times (4 hours writing plus 1.2 hours editing).)
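If you would rather poke at the formula in code than in a spreadsheet, here is a rough Python sketch of the same calculation. The function and parameter names are mine; the numbers come from the tables above. Like the formula, it charges editing time at the writer's own loaded rate.

```python
def loaded_hourly_rate(base_salary, benefits_rate=0.30, hours_per_year=1800):
    """Salary plus benefits, spread over the paid hours in a year."""
    return base_salary * (1 + benefits_rate) / hours_per_year


def cost_per_topic(base_salary, writing_hours, editing_hours):
    """Loaded hourly rate times total time (writing plus editing) per topic."""
    return loaded_hourly_rate(base_salary) * (writing_hours + editing_hours)


cheap = cost_per_topic(40_000, writing_hours=4, editing_hours=1.2)
expensive = cost_per_topic(80_000, writing_hours=2, editing_hours=0.3)

for label, cost in (("Cheap", cheap), ("Expensive", expensive)):
    print(f"{label}: ${cost:,.2f} per topic, ${cost * 250:,.2f} per 250 topics")
```

Change the salaries or the hours per topic to match your own team and see where the break-even point lands.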

Now we come to the Big Kahuna—localization cost:

                                                        Cheap       Expensive   Difference
Number of words to localize per 250 topics
(62,500 words minus localization efficiency factor)     59,375      46,875      12,500
Cost per language @ 25 cents per word                   $14,844     $11,719     $3,125
Translation cost for 10 languages                       $148,438    $117,188    $31,250
Total cost of creation, 250 topics, 11 total languages  $185,993    $150,410    $35,583

The details:

  • Here, we factor in localization efficiency. Both writers have a total of 62,500 words in their 250 topics. But the cheap writer gets only 5% credit for localization efficiency whereas the expensive writer gets 25%. Therefore, the actual word count for localization is much higher for the cheap resource.
  • We then calculate the cost of localization of the actual word count sent for localization.
  • The cost of translation into 10 languages is over $30,000 lower for the better content. This dwarfs the cost savings on writing the original content.
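The localization arithmetic can be sketched the same way. Again, the function names are my own, and the authoring costs are simply copied from the "Total cost per 250 topics" row earlier in this post. Treat it as a back-of-the-envelope model and not a quote from a translation vendor.

```python
TOTAL_WORDS = 62_500  # words in 250 topics, same for both writers


def words_to_localize(total_words, reuse_rate):
    """Words actually sent out for translation after reuse is credited."""
    return total_words * (1 - reuse_rate)


def translation_cost(total_words, reuse_rate, rate_per_word=0.25, languages=10):
    """Translation cost across all target languages."""
    return words_to_localize(total_words, reuse_rate) * rate_per_word * languages


# Authoring cost per 250 topics, taken from the earlier table.
authoring = {"cheap": 37_555.56, "expensive": 33_222.22}
reuse = {"cheap": 0.05, "expensive": 0.25}

totals = {
    who: authoring[who] + translation_cost(TOTAL_WORDS, reuse[who])
    for who in authoring
}
for who, total in totals.items():
    print(f"{who}: ${total:,.0f} for 11 total languages")
print(f"difference: ${totals['cheap'] - totals['expensive']:,.0f}")
```

Try setting the expensive writer's reuse rate to 0.50 to model a more mature content environment.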

It’s also worth noting that 25% is relatively low. In more mature content creation environments, a more typical number is around 50%.

The expensive technical communicator who produces consistent, reusable content saves huge amounts of money in localization. Cheap writers can be very expensive.

The full spreadsheet is available as a public Google doc. Feel free to make yourself a copy and do your own calculations.

Five tips for converting content to DITA

April 18, 2013 by

So, you’ve decided to move to a DITA-based workflow. Before you convert your existing content to DITA, consider these five tips, which encompass both big-picture and coding-specific issues.

Meat grinder photo (flickr: klwatts)


  1. Read a book about DITA best practices before you convert. Educating yourself about what coding works well (and doesn’t work so well) in the real world can save you a lot of headaches and rework. Merely reading the DITA specification is not going to give you advice, for example, on the best way to code commands in your content. The DITA Style Guide by Tony Self is a good resource, and I’m not saying that just because Scriptorium Press published it. That book has provided me with really useful information while working on DITA projects.
  2. Don’t assume the first sentences in a section are the short description for a DITA topic. There is a strong temptation to convert the first bit of information in a section to a topic’s short description (shortdesc element). Don’t succumb to that temptation. In my experience, it is rare that the first sentences in legacy content are a true short description, which should be a standalone summary of a topic’s content. For information on best practices for short descriptions and how different outputs (HTML, PDF, help) from the DITA Open Toolkit use shortdesc elements, see Kristen Eberlein’s Art of the short description.
  3. Use the right topic types for your content. DITA offers four topic types: generic topic, concept, task, and reference, and you should convert your content to match the purposes of those topic types. For example, don’t shoehorn a procedure better suited to a task topic into a concept topic. Yes, the DITA spec will let you code an ordered list in a concept topic that may seem like a sensibly coded task. However, when it comes time for transforming your DITA content to HTML or PDF, the styling for procedures may rely on coding specific to the task element. An ordered list in a concept may not be formatted the same.
  4. Consider how cross-references are processed by the DITA Open Toolkit. During conversion, it is a good idea to add ID attributes to items that are commonly referenced (tables and figures, for example); you need those ID attributes to create cross-references to elements. However, just because the DITA spec enables you to put an ID attribute on the title element within a fig or table element, that does not mean you should point to that title element when creating cross-references. For example, in output based on the default XHTML plugin that comes with the DITA Open Toolkit, a cross-reference to a figure will not work when the xref element points to the title element within the fig element instead of the fig element itself. (Screenshot: incorrect cross-reference from an xref element pointing to the title in a fig element.)
  5. Know that valid DITA content is not the same as good DITA content. Don’t be fooled when a conversion vendor makes a big deal about how quickly it can convert your legacy information into valid DITA. The problems I mentioned in tips 2–4 can exist in valid DITA topics. The validation feature in a DITA authoring tool is not going to tell you, for example, that the two sentences you converted to a short description are not a true short description.

    Valid DITA ≠ semantically correct, useful DITA.
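Because a validating editor will not catch the cross-reference problem from tip 4, a small script can. The following is a minimal Python sketch, not a full linter: the sample topic is invented, and the check only handles same-file references of the form #topicid/elementid in a topic without a DOCTYPE declaration.

```python
import xml.etree.ElementTree as ET

# Invented sample topic: one xref targets the fig title, one targets the fig.
SAMPLE_TOPIC = """<concept id="pumps">
  <title>Pumps</title>
  <conbody>
    <fig id="fig_pump">
      <title id="fig_pump_title">Pump assembly</title>
      <image href="pump.png"/>
    </fig>
    <p>See <xref href="#pumps/fig_pump_title"/>.</p>
    <p>See <xref href="#pumps/fig_pump"/>.</p>
  </conbody>
</concept>"""


def xrefs_to_titles(topic_xml):
    """Return the href of every same-file xref whose target is a title element."""
    root = ET.fromstring(topic_xml)
    tag_by_id = {el.get("id"): el.tag for el in root.iter() if el.get("id")}
    flagged = []
    for xref in root.iter("xref"):
        href = xref.get("href", "")
        if "#" not in href:
            continue  # no fragment, so nothing to check in this sketch
        element_id = href.split("#", 1)[1].split("/")[-1]
        if tag_by_id.get(element_id) == "title":
            flagged.append(href)
    return flagged


print(xrefs_to_titles(SAMPLE_TOPIC))
```

Both cross-references in the sample are valid DITA, but only the second one points at the fig element itself. That is the gap between valid and good.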

There are many other tips I could offer, but these five are a good starting point. Feel free to share your own conversion tips and war stories below.

 

The truth is out there

April 4, 2013 by

One of the most important issues in technical content is to establish a single source of truth for technical data. More often than not, our workflow assessments uncover multiple sources of dubious accuracy.

Given a workflow in which information is extracted from a database, dropped into a layout tool, and then published to print, you might assume that the database is the source of truth. Unfortunately, the database is usually incomplete or error-ridden, and the technical communication team is not permitted to change the database. This results in the following workflow:

  1. Extract iffy information from the database.
  2. Import bad information into layout files.
  3. Make corrections and updates in the layout files.
  4. Do random additional formatting tweaks because you’re spending so much time in the layout anyway.
  5. Publish corrected information.

Notice the absence of:

  1. Correct information in the database.

Without corrections in the database, the most accurate source of truth is the layout files, in which information has been scrubbed and reviewed.

This is not a good thing.

One of the most common recommendations we have for improved workflows boils down to:

Fix the $#@!$#@! data.

(We generally phrase it more tactfully.)

If you move the source of truth back into the database (the base of the data), you can create a workflow like this:

  1. Update the database with good information.
  2. Extract information from the database.
  3. Import into layout files and publish.

A few things make this workflow compelling:

  1. In the original version, the correction work is infinite. The technical communication team must either rework the content every time they export from the database, or they must maintain their layout files as a repository of truth and integrate updates one by one. Neither approach is appealing or efficient.
  2. In the new version, the database updates will, over time, make the database better and better instead of being a Sisyphean task.
  3. If the database content is accurate, it’s possible to import and publish with minimal (or zero) intervention in the actual layout files. The process might take an hour instead of weeks or months.
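Here is a toy illustration of the difference, using Python's built-in sqlite3 module. The table, column names, and values are all made up. The only point is that a correction made in the database shows up in every later export, while a correction made in a layout file has to be redone each time.

```python
import sqlite3


def export_specs(conn):
    """Extract the data exactly as the publishing step would receive it."""
    query = "SELECT part_no, torque_nm FROM specs ORDER BY part_no"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE specs (part_no TEXT PRIMARY KEY, torque_nm REAL)")
conn.executemany(
    "INSERT INTO specs VALUES (?, ?)",
    [("A-100", 12.0), ("A-200", 0.0)],  # 0.0 stands in for a known-bad value
)

# Fix the data at the source instead of patching the layout file.
conn.execute("UPDATE specs SET torque_nm = ? WHERE part_no = ?", (18.5, "A-200"))
conn.commit()

first_export = export_specs(conn)
second_export = export_specs(conn)  # a later release: no rework needed
print(first_export)
```

Every export after the update carries the corrected value, so the correction work is done once.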

If your workflow is nothing but a series of workarounds that are driven by bad data, it’s time to consider a new plan of attack.

Pay no GREAT attention to that man behind the curtain

March 25, 2013 by

Every department has its resident tech wizard: the maintainer of the templates, the DITA Open Toolkit, the wiki, and so on. What happens when that wizard flies off to a new kingdom?

Companies have preparedness plans for natural disasters, and they should have one for important personnel, too: the departure of a key technical resource can be debilitating.

To ensure your department doesn’t experience complete brain drain when the techie leaves, here are two things you can do:

  • Choose another technically savvy employee as a second resource. If you are implementing a new publishing process, have the primary system maintainer and another employee attend the training on the new system. When the maintainer makes changes, the backup should either observe or even help with the modifications. Managers need to account for the time and money it will take for the backup to be involved in training and process modifications. If the backup isn’t involved in training or maintenance, you have a backup in name only, and that’s not too helpful to the department (or fair to the second-line resource). Also, designating a backup doesn’t mean waiting until your tech expert leaves and then telling someone, “You’re it!”
  • Document your system. Most of the folks reading this blog entry are somehow involved in creating content. We know how to explain things with the written word. Apply those skills and create internal user guides, readme files, wiki entries, code comments, or whatever to document your processes and tools. When changes are made, update the content accordingly. Don’t dismiss the documentation of internal systems as overhead “you’ll get to later.” Later will be too late. Guaranteed.

I know these suggestions are commonsensical. However, common sense is often the first casualty in the heat of implementing a new process—or when you’re reeling from the announcement that your primary technical staffer is leaving.

Thinking about the departure of your technical resource before that happens and implementing a backup strategy is essential. At some point, you will need to replace the (wo)man behind the curtain.*

Leave your own departure-preparedness suggestions in the comments below.

 

* Hat-tip to Leigh White, who left the phrase “(wo)man behind the curtain” in a comment on an earlier post.

Rebranding as a business case for XML

March 18, 2013 by

Reuse and automated formatting are the most common justifications for XML, but recently, we have heard a new reason from several customers: rebranding.


It’s a common scenario: The organization you work for gets acquired (and renamed), has a new branding campaign (new logo, new name), changes its location (new address), or spins off your division (new everything).

In traditional layout files, this can result in a huge rebranding effort. You end up having to change the copyright pages, headers and footers, inline references to the company name and product names, and so on. Good templates help, but someone still has to open each file, apply the new settings, and regenerate the output.

If the files are less than completely consistent, applying templates can be problematic. The effort to make these changes can be shockingly high, especially if your content is subject to any sort of regulatory approval.

In several recent cases, our clients have calculated that the cost of minor formatting updates (which require extensive manual intervention) is comparable to the cost of converting the files to XML. So the solution to the problem of changing the logo is to move the content to XML and use automated formatting.

Given the assumption that there will be similar changes in the future, it’s a sound investment.

From tech writer to DITA superhero

March 11, 2013 by

In the world of superheroes, technical writers could just slide down a pole or do a clandestine spin to transform themselves into DITA technologists. Of course, nothing is that easy, so what does the transformation from tech writer to DITA superhero really require?

If you’re thinking about becoming a DITA technologist (or are a manager looking to transform a tech writer into a DITA specialist), consider these prerequisite special powers:

  • Are you a self-starter? You must be willing to jump right in and poke around the DITA Open Toolkit and to learn about the many technologies interwoven into it (Ant, XSLT, XSL-FO, and so on). If you have job responsibilities that consume most of your work day, you will need to find a way to reduce your workload to accommodate this exploration, or you’ll need to do it on your own time. Don’t like the idea of exploring, using, and inevitably breaking the toolkit? Then you’re not ready to become a DITA superhero.
  • Can you learn by experience? A big part of mastering the DITA Open Toolkit is learning by experience. You figure out a fix and then reuse that technique to solve similar problems. If you can’t retain and apply (and reapply) your experiences, please don’t report for duty. (By the way, passive learning isn’t going to work—just sitting through a class or reading a book will not give you all the skills you need. You must get your hands dirty to learn the OT.)
  • Can you think and work like a programmer? If the idea of tinkering with code does not appeal to you, maintaining the XSLT and XSL-FO transformations in the toolkit is absolutely not for you.
  • Are you a dogged troubleshooter? The DITA Open Toolkit can be an annoyingly delicate creature. A coding error (a missing angle bracket, for example) in either source files or a transform will completely break your process, and the error messages you get from the toolkit can be extremely unhelpful. Be prepared to spend hours looking for needles in haystacks; Herculean patience and strong determination are required. Getting angry is not going to transform you into a DITA-smashing beast.

What special powers do you think DITA superheroes should have? Let me know in the comments below.

The politics of DITA

March 6, 2013 by

Deciding on a content model is a critical step in many of our projects. Should it be DITA or something else? The answer, it seems, often has more to do with our client’s corporate culture than with actual technical requirements.

DITA adoption in Germany

Mardi Gras beads

Adoption of DITA in the German market is much lower than in the North American market. Some interesting factors are at play here:

  • A lot of German tech comm is heavy machinery, which is governed by the European Union’s machinery directive. A lot of North American tech comm is software, which is governed by, well, nothing.
  • Many German companies standardized their tech comm efforts a long time ago and view DITA as a half-baked newbie.
  • The German market is full of (relatively) inexpensive content management systems, most of which use proprietary content models. CMS evaluation in Germany is often driven by workflow and not content models.
  • There is at least a touch of Not Invented Here syndrome.
  • Overall, German tech comm is more concerned than North American tech comm about localization and less concerned about non-PDF deliverables.

The lure of custom XML

2009 04 19 - 4703 - Washington DC - Natural History Museum - Mackay Emerald and Diamond Necklace

Building a custom content model is appealing to some clients. They are typically found in industries that demand precision (that is, rarely software but rather medical devices or other regulated industries) and have staff with a high degree of technical expertise. The interest in custom XML rests on the following assumptions:

  • Their content is special and no mere standard can support it. (This belief is almost always incorrect except for their metadata requirements.)
  • Implementing custom XML will make the transition easier for the content creators, who are highly qualified in the subject matter (such as nuclear power plants) that they write about but not comfortable learning new publishing technology.
  • Using custom XML makes it possible to clone the existing (implicit) structure onto a content model, thus neatly avoiding change management issues. (Highly unlikely.)

There is also a strong correlation between custom XML advocates and FrameMaker aficionados. The thinking seems to be that a transition from unstructured FrameMaker to structured FrameMaker is easier than moving to a non-FrameMaker XML editor. (As if!) And since structured FrameMaker can happily support custom XML, then why not use it?

Data interchange

Pearls

DITA is or was billed as a way to exchange content. For data interchange, however, we find that DITA is not compelling. We have worked with several customers who had a data source (usually a database) and needed to extract data, format it, and align it with additional information from elsewhere. There are two basic options to attack this problem:

  1. Database to XML, XML to publishing tool, integrate with additional content in publishing tool
  2. Database to XML, XML to DITA, integrate with additional content in DITA, DITA to publishing tool

Most customers chose option 1. The advantages of integration in the DITA layer are not compelling enough to justify the investment required to build the database to DITA configuration system.

This is the outlier case where technical considerations drove the decision.

Technology risk

DITA (and XML) are perceived as much riskier than publishing or help authoring tools. The argument here is “What if Key Technical Person leaves? Nobody else would be able to maintain the DITA system.”

Does DITA offer compelling benefits? If so, why would an organization allow a single person to be the only expert on this technology?

We refer to this as the Bus Problem. The success or failure of your system should never be dependent on a single person. What if that person gets hit by a bus? (Or leaves the organization? Or retires? Or joins the witness protection program??)

If you have only one person who is capable of understanding the intricacies of a DITA implementation and no possibility of hiring or training more people, then you have a serious problem. It’s not just a DITA problem, though.