Palimpsest

The promise of XML publishing

September 2nd, 2010 by David Kelly

Being a technological optimist, I always watch for ideas that provide more, better, faster, cheaper, healthier, and throw in a heap of wow factor while you’re at it. On the other hand, after observing the progress of technological ideas over a few decades, I am beginning to formulate what I call “The Principle of Minimum Adequacy.”

The principle goes like this: Most new technological ideas reach their optimum implementation in five to ten years, and afterwards merely undergo cosmetic changes with few new ideas to transform them. Examples: cars are still a frame with four wheels and an engine; airplanes still have wings, ailerons, and an engine; and the Web still uses HTTP over a network and displays text and graphics in a browser. The basics of these technologies have been in place from the beginning, and little about those basics has changed since their introductions.

This is not to say these technologies don’t have room for improvement. Maglev cars with minimal friction, personal jetpacks, and neurological implants for Internet connection are just around the corner. Plan on it!

The Principle of Minimum Adequacy comes into play because most successful technologies fulfill their functions well to begin with, which is why they are successful. Major changes are not needed. If changes are needed (and possible), people make them rapidly. Occasionally you get the jet engine thirty or forty years after the first flight, and the technology gets a big boost. The remaining changes are mostly refinement, adaption to taste, and marketing hype.

The same principle can be applied to the use of XML for document publishing. Text is structured in tags, style sheets transform the tagged content to format-specific markup, then a rendering device such as a browser or FO processor creates the formatted output. The results are satisfyingly fast, consistent, and reasonably cheap (well, sometimes…). And it has been that way for about twenty years.

But, for me, the adoption of XML for publishing has not yet reached a jet engine type of transformation, something that gives it value far beyond its basic function of getting people (or content) from here to there. It seems to me that one of the great powers of XML is to free information from being merely text on a page, and to give it other kinds of roles. And this is precisely where I see the adoption of XML for publishing as having slowed down, per the Principle of Minimum Adequacy. XML fulfills its role adequately as a vehicle for publishing text, so the information remains text on a page or browser window.

But I do see the occasional example where XML intended for traditional publishing straps on a jetpack and takes on greater value. Here are a few examples or ideas I am aware of:

  • Integration with software and testing tools: XML for a command line syntax not only gets published in requirements and user documentation, but is used to form command line test cases for testing the software.
  • Accessibility to Web 2.0/feedback: Chunks of text are structured in a way that allows the audience to add responses to specific pieces of information.
  • Dynamic publishing: A field troubleshooting procedure includes logic for conditional flows and structures for the user to indicate procedural choices. The text changes based on choices made during execution of the procedure.
  • Automated generation of graphs and flowcharts based on text: Using SVG, well-structured XML provides source for graphics or even animations.
  • Customized, self-documenting software and hardware: Software code includes comments; these comments are harvested by publishing software, and documents are assembled for only the software modules that are present at a given installation.
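
As a rough illustration of the first idea in the list, here is a minimal Python sketch that reads a command-syntax description in XML and produces one test command line per combination of documented option values. The markup (the command, option, and value elements and the flag attribute) is invented for this example and is not taken from any real schema; a documentation team would substitute its own element names.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical markup: element and attribute names are invented for this sketch.
SYNTAX_XML = """
<command name="backup">
  <option flag="--mode"><value>full</value><value>incremental</value></option>
  <option flag="--compress"><value>on</value><value>off</value></option>
</command>
"""


def command_lines(syntax_xml):
    """Return one command line per combination of documented option values."""
    command = ET.fromstring(syntax_xml)
    choices = []
    for option in command.findall("option"):
        flag = option.get("flag")
        # Each option contributes a list of "flag value" alternatives.
        choices.append([f"{flag} {value.text}" for value in option.findall("value")])
    return [
        " ".join([command.get("name"), *combo])
        for combo in itertools.product(*choices)
    ]


for line in command_lines(SYNTAX_XML):
    print(line)
```

The same source that is published as the syntax reference then drives the test list, so the documentation and the test cases cannot drift apart unnoticed.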

These are not original ideas – but they are ideas I don’t see discussed often in publication scenarios. I’m looking for reinforcement of my technological optimism here.  I am curious how much people are leveraging their investments in XML, and what kinds of ideas are out there for freeing information from its traditional role as text on a page.

How about you? Have you heard of any interesting applications for XML-based text, or do you have thoughts for unusual applications? I’d love to hear about it.



Extracting deliverables from DITA

August 18th, 2010 by Sarah O'Keefe

In this webcast, Sarah O’Keefe of Scriptorium surveys DITA’s publishing options and weighs their practical implications.



The mission of technical communication

August 17th, 2010 by Sarah O'Keefe

The mission of technical communication is light.

The developers build the cathedral. The technical communicators bring light into the building so that it can be used.

Rose window in Toledo // flickr: wentzelepsy

What type of light sources should be used? Halogen? Sunlight? LEDs?

How should the light source be handled? Chandeliers? Stained glass? Floodlights?

In technical communication, external constraints help to determine how information is delivered. Here are a few typical ones:

  • Medical device documentation is regulated, so the documentation must meet the requirements of the regulatory body.
  • The form factor of the product can drive documentation decisions. (A printer with a tiny, one-line screen versus a printer with a large, color touchscreen.)
  • A product that is downloaded (as opposed to purchased in a box) will require documentation that accompanies the download or is available online.


All your followers are belong to us?

August 13th, 2010 by Sarah O'Keefe

In our latest hiring round, I’m seeing something new: candidates with existing social media networks. If we hire one of these candidates, we will need to figure out how social media participation as a Scriptorium employee can co-exist with established personal blogs, Twitter accounts, and the like.

Blogs

Blogs seem like the easier problem to address, as we expect consultants to blog at least occasionally, and generally about work-related topics. I think we can fairly easily define a set of blog topics that need to go on the Scriptorium site; other content can stay with the personal blog. For example, the following would go on our site:

  • Trends in technical communication
  • Ideas or opinions sparked by work

Twitter

Here’s where things are going to get dicey. The “cleanest” approach would be to have a new consultant create a new Twitter account and use that for professional tweeting. But the loss of existing followers makes that impractical at best.

Is it reasonable for Scriptorium (or any employer) to make demands about an existing Twitter account as a condition of employment? Can we afford not to do so?

I think our policy will have to be a variation of:

“People will know you work for us, so you are representing us even in the social media channels that you established before coming to work here. Therefore, you need to be careful about what you publish.”

Next interesting question: If you are in technical communication and we hire you, I expect that you will use your existing Twitter leverage to support Scriptorium’s social media efforts. Is that reasonable?

What about a separation?

Blog posts written for scriptorium.com on company time are clearly our intellectual property. But, what if someone leaves the company? Do we keep their posts on the site? (So far, we have.) Do we keep their name on the posts or change it to a generic “posted by Scriptorium”? (I can see the arguments for either approach—I’m especially concerned about potential confusion if a former employee’s name is prominent within our blog archives.)

What if a departing employee wants to move or copy their posts to a personal blog?

Am I just borrowing trouble here?

I can’t find a precedent.

Sometimes, being out on the bleeding edge hurts. I looked around for recommendations in this area and couldn’t find anything. There are, of course, social media policies, but they’re mostly about employees who start to blog, how employees might tweet, and keeping Facebook profiles relatively family-friendly. I couldn’t find anything about how existing social media activity might be assimilated (uh-oh) with corporate social media activity.

There is an interesting article from 2009 about Jeremiah Owyang, who left Forrester (big analyst firm) to become a partner in the Altimeter Group. In summary: He used social media to build his industry visibility and thus his personal brand, which then gave him options outside the friendly confines of Forrester.

The resumé of the future

Will we see resumé headings like this?

Joe Consultant
@twitterhandle (1,200 followers)
blog: joetheconsultant.example.com (10,000 unique visits per month)
1911 Evans Road, Cary, North Carolina 27513
phone: 919-481-2701
email: joetheconsultant@example.com

What are your thoughts?



Balancing user advocacy and corporate responsibilities

August 10th, 2010 by Sarah O'Keefe

Anne Gentle, in the post Writing Engaging Technical Documentation, says this:

I love it when I hear people say, “I no longer work for development. I work for the user.” They say it with disruption and evolution in their hearts and minds. They fully intend to serve the user the best they can.

Anne has a lot of experience with open-source projects, and I can see where this perspective might work in that area. But I feel that the “user advocate” position is problematic in commercial operations for a variety of reasons:

  • Conflict of interest. Unlike an ombudsman or a guardian ad litem, technical communicators are not explicitly assigned to represent users. If you are on a corporation’s payroll, it’s impossible to be a pure user advocate. (I wrote about this in a post, Web 2.0 and Truth, back in June of 2008.)
  • Isolation. Claiming the user advocacy role sequesters technical communicators from others within the organization. What about quality assurance? What about customer support? Are they not user advocates as well? Everyone, including developers, should have user advocacy as part of their role. When technical communicators claim this role, the implication is that others are not user advocates.

The biggest problem, however, is that what’s best for the user may not always be what’s best for the employer organization. Technical communicators need to balance those competing priorities.

And that led me to a chart:

Tech comm needs to balance strategic/tactical and corporate versus user roles. User-generated content is all user and tactical. Press releases are all tactical and corporate. Third-party books could be user/strategic.

I think that technical communication needs to balance user advocacy against corporate positioning. It also needs to balance tactical information (how to do something) against strategic information (why to do something). Here are some other types of communication:

  • White papers. Usually conceptual information, with a distinct whiff of corporate positioning.
  • Press releases. “Yay, us!”
  • User-generated content. How to do something; highly specific; usually less conceptual.
  • Customer support. Answers the customer’s question; may have some conceptual components.
  • Third-party books. Very user-focused, in-depth on concepts. (You could convince me that some third-party books lie along the X-axis toward users instead of way up in the top right corner.)
  • Bad tech comm. The writers didn’t or couldn’t take advantage of their inside access.


Retail therapy for tech comm (and I don’t mean shopping)

August 4th, 2010 by Alan Pringle

“She’s stupid.”

That’s what a shopper recently said about a coworker’s daughter, who is working a part-time retail job. The daughter had been explaining the store’s coupon policy to the customer, who didn’t believe the information. A manager came over and then gave the customer the same information. “She’s stupid,” was the customer’s response to the manager.

As a former retail salesperson and bank teller, I cringe when I hear stories like that. Even though it’s been a few decades since I worked those jobs, the lessons I learned in customer service (and dealing with difficult situations and people) still serve me well today.

Treating a customer with courtesy—even when said customer may not be behaving in a way to deserve courtesy—is an essential component of good customer service. Courtesy itself can help defuse tension.

That rule applies to our work in technical communication, too, even if we’re not running multiple drive-through lanes at a bank. (Yes, I did that.)

When you’re creating technical content, you have multiple “customers” to consider, and you can offer courteous service to those groups in different ways. Some of these customers include:

  • Product developers. Getting information from product developers can demand diplomatic skills that rival those of ambassadors. In Technical Writing 101, Sarah O’Keefe and I offer “(Almost) 30 ways to get information from developers” (pp. 84–85). Many of those suggestions reflect common courtesy, including number 4: “Be respectful of the developer’s time and other commitments. Try to group your questions instead of interrupting her constantly.”
  • Writers, editors, and other tech comm team members. Projects go a lot more smoothly when a department works as a team, and there are lots of ways you can show courtesy to your coworkers and make work more pleasant for everyone. For example, following your department’s style guide does a lot more than just create consistent content—adherence to the guidelines demonstrates that you are respectful of others’ schedules. They don’t have to take the time to point out or clean up the inconsistencies you created. Also, when you make a commitment about delivering a draft, attending a meeting, or whatever, you follow through. Think about how many times service providers have disappointed you by not keeping promises. That can put things in perspective when you don’t follow through yourself.
  • End users. Your end user is the most important customer you have, so all the good customer service you’ve offered developers, writers, editors, and others in your organization is ultimately for that user. You also treat your users well by writing to the correct audience level; writing content that is above the technical level of the product’s users is the equivalent of saying, “You’re stupid.” (If retail salespeople call customers stupid, they get fired, and rightfully so.) Making product information available in audience-appropriate formats is another way to offer good customer service, as is making content accessible to all users. Keeping end users happy is essential. Losing them means your employer loses demand for its products, which in turn means your employer may no longer have a need for your services.

Even though my job titles and responsibilities have changed a lot over the years, the lessons I learned about good customer service and courtesy from my first jobs still ring true today. No amount of technology, single sourcing, or structured authoring has changed those basic rules.

P.S. Here’s the follow-up to the story about the customer who called my coworker’s daughter stupid. The husband of the customer was so mortified by her comment, he asked her to go to the car. He then apologized and paid for the transaction.



Using Ant to find a needle in a haystack

July 30th, 2010 by Simon Bate

Many content management systems (CMSs) take over the responsibility of file naming. For the most part, this is fine and is actually necessary for maintaining cross-references and conrefs within the CMS. When you use the CMS to build a DITA map, the CMS uses its own names in the <topicref> elements. In the final output, all the links work and it’s not really important that the file names aren’t human-readable; they don’t need to be.

Except in the case where your users (or your Help system) require some files to have specific names. Then you’re up the proverbial creek. Or are you?

Here’s a situation I encountered.  The DITA topics for a help system are managed in a CMS, which uses its own file names when exporting files. I’m customizing the DITA Open Toolkit HTML Help transformation to create a CHM file. The help project file (HHP) refers to an HTML file with a specific, static file name (a client requirement). I’m using the DITA OT, so I’m using Apache Ant to build the help. How can I find the specific HTML file (with a CMS-generated name) and make it available to the HHP file with a static name?

The solution involves the ID attribute for the source topics. The dita2xhtml transform preserves the source topic’s ID (“my_topic_id” in this example) in an HTML <meta> tag:

<meta name="DC.Identifier" content="my_topic_id">

With a combination of the <copy> task, the <contains> selector, and the <mergemapper> file mapper, I can get what I need:

<!-- Copy the one HTML file that carries the topic ID to a static file name -->
<copy todir="${output.dir}" overwrite="true">
   <fileset dir="${output.dir}" includes="**/*.html">
      <contains text='name="DC.Identifier" content="my_topic_id"' casesensitive="no"/>
   </fileset>
   <mergemapper to="MyTopic.html"/>
</copy>

The <fileset> with the <contains> selector finds the HTML file containing the string I’m looking for (provided the ID is unique). The <mergemapper> then tells Ant that the copy of the file must have the static name “MyTopic.html”.

Now a target file with the appropriate name exists and the HTML Help Compiler can run without complaining.
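
For readers who want to see the lookup logic outside of Ant, here is a minimal Python sketch of the same idea: scan the output folder for the HTML file whose DC.Identifier meta tag carries the topic ID, then copy it to the static name. This is an illustration only, not part of the DITA Open Toolkit build; the function name, the demo file names, and the folder layout are all assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path


def copy_topic_to_static_name(output_dir, topic_id, static_name):
    """Copy the HTML file whose DC.Identifier meta tag matches topic_id
    to output_dir/static_name. Returns the source path, or None when
    no file matches."""
    needle = f'name="DC.Identifier" content="{topic_id}"'.lower()
    output_dir = Path(output_dir)
    for html_file in sorted(output_dir.rglob("*.html")):
        if html_file.name == static_name:
            continue  # skip a copy left behind by an earlier run
        text = html_file.read_text(encoding="utf-8", errors="ignore")
        if needle in text.lower():  # case-insensitive, like casesensitive="no"
            shutil.copyfile(html_file, output_dir / static_name)
            return html_file
    return None


def demo():
    """Build a throwaway folder with two CMS-style file names and run the lookup."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp)
        (out / "GUID-0001.html").write_text(
            '<meta name="DC.Identifier" content="other_topic">', encoding="utf-8")
        (out / "GUID-0002.html").write_text(
            '<meta name="DC.Identifier" content="my_topic_id">', encoding="utf-8")
        found = copy_topic_to_static_name(out, "my_topic_id", "MyTopic.html")
        copied = (out / "MyTopic.html").read_text(encoding="utf-8")
        return found.name, copied
```

As with the Ant version, the approach depends on the topic ID being unique; the sketch simply takes the first match it finds.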



A contrarian view of the future of publishing

July 29th, 2010 by Sarah O'Keefe

Based on a quick Google search, things don’t look too hot for publishing:

Google search for, in quotes, death of publishing results in 104,000 hits

What’s dying, though, is not publishing itself but the current model of distributing books. Unfortunately for traditional publishers, this also destroys their current business model, which was based on their effective monopoly on distribution channels.

Traditional publishers are supposed to provide the following:

  • Gatekeeping—a quality filter to ensure that high-quality (or at least commercially appealing) content is published
  • Distribution—the ability to have a book available in lots of bookstores
  • Marketing—making sure that potential buyers find out about the book
  • Production—turning a writer’s raw manuscript into a finished product, which includes book design, cover design, editing, and more

What happens in a world that includes print-on-demand and ePub?

Gatekeeping

The publisher’s brand position becomes critical. Publishers that have a reputation for excellence (such as O’Reilly for technical content or Penguin for classics) can credibly claim to be gatekeepers in their focus areas. Huge, unspecialized publishers who are mostly interested in finding The Next Dan Brown, The Next Nora Roberts, or The Next Nicholas Sparks? Not so much.

Distribution

The value of widespread distribution isn’t zero, but it’s headed that way.

Marketing

Authors are expected to do their own marketing. Big publishers can arguably help A-list authors with book tours and the like, but A-list authors aren’t the ones who need marketing help! For the mid-list or niche authors, marketing falls entirely on the author—this is not new.

Production

Oddly enough, this is where I see a big opportunity for publishers. It’s just not the one they seem to be pursuing. Publishers are currently fighting a rearguard action that amounts to arguing that “paper is better” and trying to get people to stay with paper. They are also lurching toward creating eBooks.

There are a couple of strategic problems here: For starters, the vast majority of publishing workflows are built around print production with eBooks as an afterthought. There’s still a lot of friction and manual labor in creating ePub or Kindle formats.

But I think that publishing houses need to focus on production to survive. There’s value in producing a nice, shiny ePub—and right now, it’s difficult. The tools and the workflows will improve, and ePub production will be a commodity. That’s not where the publishers should go.

The business opportunity for publishers is in creating the next generation of content—at this point, most eBooks are little more than print-based pages moved to digital page-equivalents. As we move beyond that very basic approach, I expect to see:

  • Books that include live puzzles. Dorothy Sayers wrote several books in which puzzles or ciphers were an integral part of the mystery. Imagine an eBook edition where you cannot proceed until you actually solve the puzzle, instead of simply reading past it.
  • Books that include computer programs. Neal Stephenson’s Cryptonomicon contains Perl code. In the digital version, you should be able to execute the code and see what happens.
  • Books with soundtracks. Imagine the emotional effect of a Nicholas Sparks book with the tear-jerking music from the movies included…on second thought, don’t. But I can easily envision a book that describes an army on the march including sound effects.
  • Poetry books that include audio with the poetry being read.
  • Books with embedded video, which seems especially useful for technical content.
  • Mixed media. Content that includes audio, video, text, graphics, links, and interactive components to create something that is more than a book. We have this sort of experience already with video games, but what if those techniques are applied to telling stories and to nonfiction or technical content?

This is where the opportunity lies. Authors will need help to create content that uses a variety of media. And I believe that consumers will be willing to pay for these interactive content experiences (I need a better name for them).



The role of the gatekeeper is changing

July 27th, 2010 by Sarah O'Keefe

Thanks to Peg Mulligan for hosting my guest post at her blog Content for a Convergent World. I wrote about the evolving role of the gatekeeper and the implications for technical communicators. Read the whole thing.



Tech Tips: Quick Word to DITA table conversion

July 21st, 2010 by Simon Bate

The other day I had to convert a large table from Word to DITA. I started looking at Word XML output and thought about transforming it with XSL (which I have done in the past), but that seemed to be too much trouble for this document. Then I remembered a technique an old SQL coder showed me for loading large amounts of data into a SQL table.  I realized this technique could be readily adapted to DITA.

The solution hinges on two great behaviors in Word and Excel (or OpenOffice.org Text and Spreadsheet). First, if you copy a table from Word to Excel, the table columns and rows populate columns and rows in Excel. Second, when you copy rows and columns from Excel to a text document (or, more precisely, an XML editor in text mode), the text in each row is taken as a single line of text.

Now comes the fun part: in Excel you can add columns before, between, and after the original table columns. In those new columns you can add DITA (or SQL) markup (such as “<row><entry>”, “</entry><entry>”, or “</entry></row>”) and quickly duplicate that markup over the length of the spreadsheet (by dragging the cell’s drag handle to the bottom of the table, or double-clicking the handle).

Thus, you can copy a table from Word into Excel, add new columns between the columns from the original table, add DITA markup in those columns, then cut and paste the table into your XML editor. Voilà, you have the body of a new DITA table.

All you have to do is add the appropriate <table>, <tgroup>, and <tbody> elements around the table contents and you’re done.
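
If you would rather script the step than drag markup down spreadsheet columns, the same wrapping can be sketched in a few lines of Python. This is a different route to the same result, not the spreadsheet technique itself. It assumes the table has been pasted out of the spreadsheet as tab-separated text, one row per line, and the function name is made up for the example.

```python
from xml.sax.saxutils import escape


def rows_to_dita_table(tab_separated_text):
    """Wrap tab-separated rows in DITA <row>/<entry> markup and add the
    enclosing <table>, <tgroup>, and <tbody> elements."""
    rows = [line.split("\t")
            for line in tab_separated_text.splitlines() if line.strip()]
    if not rows:
        raise ValueError("no table rows found")
    column_count = max(len(cells) for cells in rows)
    lines = ["<table>", f'  <tgroup cols="{column_count}">', "    <tbody>"]
    for cells in rows:
        # Pad short rows so every row has the same number of entries.
        cells = cells + [""] * (column_count - len(cells))
        # escape() turns &, <, and > in the cell text into XML entities.
        entries = "".join(
            f"<entry>{escape(cell.strip())}</entry>" for cell in cells)
        lines.append(f"      <row>{entries}</row>")
    lines += ["    </tbody>", "  </tgroup>", "</table>"]
    return "\n".join(lines)


sample = "Option\tMeaning\n-v\tVerbose output\n-o <file>\tWrite to <file>"
print(rows_to_dita_table(sample))
```

One thing the script handles that the spreadsheet approach leaves to you: characters such as & and < in the cell text are escaped before they land in the XML.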

With a bit more thought, this technique can be used to add all sorts of markup to text as you convert it to DITA. How could you apply this technique?



Scriptorium Publishing | Post Office Box 12761 Research Triangle Park, NC 27709 | (919) 481 2701 | info@scriptorium.com