
Author: Sarah O'Keefe

XML

The ABCs of XML

STC Intercom, September/October 2009

XML is rapidly becoming part of the required knowledge for technical communicators. This article discusses the three most important reasons that you should consider XML: automation, baseline architecture, and consistency.

Download the PDF file (144K)

News

XML 101

My latest XML Strategist article, “The ABCs of XML,” is available as a PDF file (144K). This article was originally published in the September/October 2009 issue of Intercom.

The technical side of XML is not much more difficult than HTML; if you can handle a few HTML angle brackets you can learn XML. […] If […] you don’t like using styles and prefer to format everything as you go, you are going to loathe structured authoring.

Just trying to make sure that there are no surprises. The article itself is a very basic introduction to the principles that make XML important for technical communication: automation, baseline architecture (sorry…I had problems with B), and consistency.

Opinion

A strident defense of mediocre formatting

In addition to a gratuitous (and entertaining) swipe at “noisome” DITA “fanboys,” Roger Hart argues that we need to reconsider the disadvantages of automated formatting:

The thing is, [separation of content and formatting has] all been taken rather stridently to heart in certain quarters, leading to a knee jerk reaction whenever author-controlled formatting/pagination/lineation is mentioned as anything other than bleak, sulphurous devilry. This is twaddle. […]

Uncertainty in meaning is anathema to user intelligibility. If we’re going to make sure we’re not writing poetry, there’s definitely value in having poetry’s level of control over semantic blocks.

Of course, it’s fully possible that this is an expensive distraction.

Possible? It’s definitely expensive. It’s possible that it’s a distraction.

I think Hart perhaps unintentionally put his finger on the real issue: value. How much value (in the form of improved comprehension) is added to a technical document when you are able, in the words of commenter Brian Harris, to “lovingly handcraft” each page?

How much value (in the form of cost avoidance) is added to an organization when you are able to spit out a reasonably formatted document in a few minutes?

Actually, I have a different question. How far should we take this argument? Here’s an example of the pinnacle of handcrafting:

[Book of Kells image]

Can we all agree that this might perhaps take handcrafting a little too far?

Compared to the Book of Kells (above), the Gutenberg Bible looks quite pedestrian:

[Gutenberg Bible image]

You can just imagine the scribes with their quills, lapis, gold leaf, and other implements muttering, “That Gutenberg and his noisome fanboys. He can’t even render two colors without our help. Poser. It’ll never last.”

Formatting automation removes cost from the process of creating and delivering content. For technical documents that change often and are perhaps delivered in multiple languages, it removes a lot of cost. Let’s assume that handcrafted pages can improve ease of reading and comprehension with careful copy-fitting and adjusted spacing (Hart’s article mentions “headings, line breaks, intra-word, etc”). This increases the cost of the content.

What happens when content is expensive? Fewer people get to see it.

The number of books in Europe grew from 50,000 before Gutenberg to 12 million 50 years later.

I think we can all agree that e-books offer none of the typographic sophistication in question here. Bill Gates (yes, that Bill Gates) wrote in 1999:

It is hard to imagine today, but one of the greatest contributions of e-books may eventually be in improving literacy and education in less-developed countries. Today people in poor countries cannot afford to buy books and rarely have access to a library. 

Essentially, we can produce documents inexpensively and give more people access to them as a direct result of lower cost, or we can climb on our typographic high horse and whine about word spacing.

I’m with the noisome fanboys.

Conferences

Got plans for May 2010?

After my summer of complaints and criticism of STC and its various issues, I was more than a little surprised to be asked to manage the Design, Architecture, and Publishing track for next year’s STC Summit.

Hoist on my own petard (my obsession with Wordnik continues)…what could I do but agree? Or go into exile.

Several of the other conference organizers are people I know quite well:

  • The author of Managing Writers: A Real World Guide To Managing Technical Documentation, Richard Hamilton, is the track manager for Managing People, Projects, and Business. He knows his stuff.
  • The principal of UserAid, Paul Mueller, is track manager for three (THREE!) tracks: Education and Training, Web Technologies, and Emerging Technologies. He’s also the Deputy Chair of the conference. (private note to Paul: I take it you were not able to retrieve the goat pictures. Sorry about that.) Another excellent choice.
  • Ant Davey of the UK and Ireland chapter has the Communication and Interpersonal Skills and Professional Development tracks. I’ve worked on STC-related matters with Ant, and he’s a great choice for these tracks.
  • Rachel Houghton, Program Chair. She did great work on last year’s conference.
  • Alan Houser, conference manager. You may remember him as the guy who retrieved David Pogue from a poorly timed bathroom break during the opening session. I’ve known Alan for many years, and I expect another well-organized event, in which he solves the inevitable emergencies with typical aplomb.

(I’m sure that the other track managers are excellent as well, but I don’t know them personally.)

Here is the description of the Design, Architecture, and Publishing track:

Choice of appropriate design and architectures can improve the efficiency, usability, and quality of an organization’s technical publishing. This track explores issues in information design and system architectures for publishing, with particular emphasis on systems and solutions for organization-wide publishing. Suggested session topics include:

  • Visual communication, integrating text and graphics, page layout
  • Single-source publishing, for multiple delivery formats, multiple purposes, and multiple audiences
  • Methodologies and solutions for content management
  • Comparing and selecting delivery formats
  • Issues in structured authoring and publishing, including migration, design, and deployment
  • XML-based publishing
  • Using industry-standard publishing architectures, such as DITA
  • Accommodating localization workflows in the publishing process
  • Moving unstructured content to structure

And now I need your help in two areas:

  1. Submit your proposals. The quality of the conference is determined by the quality of the presentations. And that, of course, is determined by the quality of the proposals submitted. Please send in your best stuff. I suppose you can look into the other tracks if you must.
  2. Help review proposals. I need two or three people to help out in reviewing conference proposals in this track. I’ve done this in the past; it’s a relatively limited time commitment. You will be asked to read lots of proposals and evaluate them, probably in mid-October. Along with reviewers, I will eventually generate a list of recommendations for which proposals to accept. If you have significant expertise in topics in this track, and especially if you do not intend to submit a proposal of your own, please consider volunteering to help with this effort.

Some notes on this year’s process:

  • The deadline for proposal submission is October 5, 2009 at 10 a.m. Eastern time.
  • This is a direct quote from the conference page: “With the smaller number of sessions (for the most part) only one proposal per speaker will be accepted.” (You can still submit multiple proposals, but do not expect to have more than one accepted.)
  • Two speaker references are required (unless you have presented at this conference in the past four years, in which case we will review your evaluations). I personally intend to put a significant weighting on previous highly rated speaking experience.
  • In 2009, sessions were recorded. I assume this will happen again.
  • The conference is May 2-5, 2010, in Dallas, Texas.

Get started with a proposal

If you have questions, leave a comment or contact me. I look forward to seeing lots of compelling proposals.

News

Liberated type

(or should that be “Liberated typoes?”)

We have opened up free access to two of our white papers:

  • Hacking the DITA Open Toolkit, available in HTML or PDF (435 KB, 19 pages)
  • FrameMaker 8 and DITA Technical Reference, available in PDF (5 MB, 55 pages)

These used to be paid downloads.

Why the change of heart? Most of our business is consulting. To get consulting, we have to show competence. These white papers are one way to demonstrate our technical expertise.

(By this logic, our webcasts should also be free, but I’m not ready to go there. Why? We have fixed costs associated with the webcast hosting platform. Plus, once we schedule a webcast, we have to deliver it at the scheduled time, even if we’d rather be doing paying work. By contrast, we can squeeze in white paper development at our convenience.)

What are your thoughts? We are obviously not the only organization dealing with this issue…

News

Is this thing on?

If you are reading this, then we have succeeded in migrating our web site over to WordPress.

Of course, managing our own content always takes a back seat to working with our customers’ content, so the process took longer than you might expect.

We did learn a couple of things, most of which should sound awfully familiar if you are working on your own content strategy:

  • It’s not until you try to move into a new system that you recognize all the mistakes you made in the previous system.
  • PHP stands for Picky Hypochondriac Programming. I had several cases where code absolutely refused to work for no apparent reason. I had the resident PHP expert (Simon) look it over. Eventually, I gave up and retyped the code, and then it worked.
  • Learn to work with the tool and not against it. I have to credit a former coworker, Bruce Bicknell, for this little gem, which he originally applied to Word versus FrameMaker. When moving from Dreamweaver-based HTML to WordPress, take some time to learn best practices for WordPress. Don’t try to impose your existing Way of Doing Things onto the new system. It’s inefficient and it probably won’t work.
  • Content migration is always awful. To transfer our blog, I found a Blogger-to-WordPress converter. That worked pretty well, except that a couple of posts now have my name on them even though I didn’t write them. Transferring comments was a travesty that involved the support people at Haloscan (helpful) and cleaning out random comment triplication (gross manual labor).

But I hope you like the new site and blog. Please poke around and leave us feedback.

Webinar

Learn DITA and XML at your desk

For August and September, our webinar schedule is as follows:

DITA 101, August 18 at 11 a.m. Eastern time

Participants will learn about basic Darwin Information Typing Architecture (DITA) concepts, the business case for implementing DITA, and some typical uses of DITA. This webinar is ideal for those who are considering a move to structured authoring based on the DITA standard.

Demystifying DITA to PDF Publishing, September 10 at 11 a.m. Eastern time

When a company implements a DITA-based workflow, the most difficult technical obstacle is often setting up a PDF/print publishing workflow. This session discusses the advantages and disadvantages of using the DITA Open Toolkit, FrameMaker, InDesign, and other options to create PDF output from DITA content. Basic familiarity with DITA, Extensible Markup Language (XML), and related technologies is helpful but not required.

What Do Movable Type and XML Have in Common?, September 22 at 11 a.m. Eastern time

The invention of movable type changed the economics of information by making the process of copying a book by hand obsolete. More than 500 years later, XML seems to be doing the same to desktop publishing. But where movable type changed the economics of a mechanical process (creating printed copies), XML changes the economics of content authoring, formatting, and customization. This webinar takes a look at how publishing technologies revolutionize the way people consume information and how those technologies affect authors.

Each webinar is $20.

During the sessions, you can interact with the presenter and other students through the chat interface or the audio connection. There is a question-and-answer session at the end of each webinar. The Q&A is not included in session recordings, which are available for download later. Participants in the sessions receive a free recording.

To register for these webcasts, or to purchase recordings of past webinars, go to our online store.

Reviews

Let the conversation begin

Conversation and Community: The Social Web for Documentation (XML Press, ISBN: 9780982219119) by Anne Gentle provides technical communicators with a roadmap for integrating social media (blogs, wikis, and much more) into their content development efforts. This is critical because, as Anne notes in the preface, “professional writers now have the tools to collaborate with their audience easily for the first time in history.”

Anne provides overviews of all the major social media concepts, from aggregation to syndication, wikis, discussion, presence, and much more. But it is Chapter 3, “Defining a Writer’s Role with the Social Web,” that will make this book a classic. Here, Anne lays out a detailed strategy for determining whether and how to introduce social media in an organization. Consider this:

It’s important to find a balance between allowing an individual’s authentic voice to speak on behalf of an organization and the requirements of institutional messaging and brand preservation. […] It’s also possible that you are ahead of the curve and need to help others see ways to apply social technologies for the company.

She goes on to explain just how to accomplish these things.
Wikis and blogs each get a chapter of their own, in which Anne discusses how to start and maintain these types of environments.
After reading so much of Anne’s work on her blog, it’s a bit odd to see her writing on paper in an actual book. The feeling that I’ve wandered into the wrong medium is augmented by extensive footnotes, most of which point to web site resources, and the many examples of web-based content (such as videos or interactive mashups). However, it’s likely that the book’s target audience is more comfortable with paper.
Conversation and Community: The Social Web for Documentation provides an excellent introduction to wikis, blogs, forums, and numerous other social media technologies for the professional content creator. There is valuable (and perhaps career-preserving) information about how to develop a strategy for user-generated content that is compatible with your organization’s corporate culture.
If you think that community participation in your documentation is coming soon, read this book immediately. If you think that it’s not coming, you’re wrong, and you especially need to read this book.
[Disclosure: I reviewed an early draft of this book. I have met Anne in person a few times and we have ongoing email and blog correspondence.]

Opinion

Manifest(o) destiny

Tom Johnson issues a polite manifesto about moving STC’s publications online. (I am distracted by the use of the word manifesto and more so by its Wordnik page. I’d like to blame this problem on the Internet, but I’m pretty sure that the Internet just lets me manifest (!) my attention problem more easily. OK, I’m banning “manifesto” from the rest of this post.) Here’s Tom:

When I hear these discussions, it blows me away because I can hardly believe what I’m hearing. I admit, the look and feel of paper can provide a comfortable reading experience if you’re immersed in a 200 page novel lying on your bed on a rainy day. But the Intercom and other professional magazines or journals are not novels. With professional publications like these, the online format better matches the reading behavior of the audience. In fact, online formats provide more than a dozen advantages that print formats lack, including everything from interactivity to portability, feeds, metrics, multimedia, and more.

I am fundamentally in agreement with Tom’s manif….er, declaration of principles. For balance, I would like to address the advantages of printed content over online content. They include the following:
Higher resolution
The printed page generally has a resolution of 600 dpi (printed at the office) or 1200 dpi (printed on a printing press). On-screen, you have a resolution of around 100 dpi. Therefore, printed content packs around 36 times as many dots into the same area as screen content. (100 dots per inch is 100 pixels times 100 pixels, or 10,000 pixels per square inch. 600 dpi is 360,000 dots per square inch.)
There are other technical reasons (such as light being absorbed/reflected on paper versus being emitted from a screen) why text on paper is easier to read than text on screen.
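The 36x figure compares dot density by area, that is, the square of the linear dpi values. Here is a quick sketch of that arithmetic in Python; the function and constant names are invented for illustration, and the dpi values are the rough ones quoted above, not measurements.

```python
# Linear resolutions quoted in the post, in dots (or pixels) per inch.
SCREEN_DPI = 100
OFFICE_PRINT_DPI = 600
PRESS_PRINT_DPI = 1200


def dots_per_square_inch(dpi: int) -> int:
    """Dots in a one-inch square at the given linear resolution."""
    return dpi * dpi


def density_ratio(print_dpi: int, screen_dpi: int) -> float:
    """How many times more dots per square inch print has than the screen."""
    return dots_per_square_inch(print_dpi) / dots_per_square_inch(screen_dpi)


if __name__ == "__main__":
    print(dots_per_square_inch(SCREEN_DPI))             # 10000
    print(dots_per_square_inch(OFFICE_PRINT_DPI))       # 360000
    print(density_ratio(OFFICE_PRINT_DPI, SCREEN_DPI))  # 36.0
    print(density_ratio(PRESS_PRINT_DPI, SCREEN_DPI))   # 144.0
```

By the same measure, a 1200 dpi printing press comes out at 144 times the screen's density.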
Batteries and electrical power
Paper doesn’t require batteries or electricity to operate. This matters most for toilets and airplanes. And airplane toilets.
Universal access format
Once you have a paper copy, you can access your data. The same thing is not necessarily true online. For instance, you can have browser compatibility issues with HTML, problems with PDF versions, digital rights management obstacles, problems with logons for private content, and so on.
Better layout
Print (and PDF) give you sophisticated options for layout that go far beyond what you can do online with HTML.
Familiarity
As a society, we have hundreds of years of experience with books and magazines. This is not true for online content.
Engaging your senses of smell and touch
I think this issue is often overlooked when evaluating print versus online. The physical experience of holding a book, the smell and feel of high-quality paper, the sensation of pages sliding past your fingers as you turn the page — all of these are lost in the digital experience.
Authority
Printed content conveys authority in a way that web-based content does not. I believe that this is related to some of the factors I’ve outlined above. We know how to evaluate printed publications for quality: we look for attractive design, glossy paper, high-impact color, and so on. There’s a reason why the cliché is that you shouldn’t judge a book by its cover. We do. (See also: “Understanding Judgment of Information Quality and Cognitive Authority in the WWW,” Soo Young Rieh and Nicholas J. Belkin)
But even though I can make a decent argument for the merits of printed publications, Tom is absolutely right, at least as it pertains to STC, when he says that:

Any organization or company would be crazy not to convert their paper-based magazine, journal, or newsletter into an interactive online format. 

He’s laid out (cough) the arguments for online content in some detail, so I am going to focus on something a little different. I’d like to take a look at the business case for moving publications from print to online. I do not have any useful information from STC on the actual costs, so I’m just going to make some estimates. (I would be happy to get the official cost information. Anyone?)
We have around 11,000 members, so let’s assume a print run of about that. Further, let’s assume that printing runs about $2 per copy (?) and postage about $1 (I have no idea). That gives us an estimate of $33,000 in direct printing and postage costs per issue. Multiply that by 10 issues per year, and you get somewhere around $330,000 in direct printing and postage costs per year. I am leaving out international postage and other complicating factors. There’s also the fact that STC is collecting additional funding for sending printed publications.
In addition, each printed issue incurs design and layout costs. Best guess? 100 hours per issue at oh, $50 per hour. That’s $5,000 per issue, or somewhere around $50,000 per year in layout costs.
Some things I am not taking into account:
  • Initial magazine design. My 100-hour estimate is for flowing content into an existing design, placing graphics, generating the table of contents, and doing print production.
  • Editing.
  • Working with recalcitrant authors.
  • Planning the magazine content/setting the editorial content.
  • The income side of the equation (fees specifically for international postage, for example).
What would the equivalent costs look like for an XML or HTML-based workflow?
We eliminate printing and postage, so we save $330,000 per year. We probably save on the layout costs as well because publishing into HTML is so much less work. Total cost savings? Conservatively, it’s $330,000, if we assume no cost savings from reduction in layout work. (Note: If we continue to publish a PDF version of the magazine, we must keep the PDF layout costs as a line item and add a smaller amount for HTML-based publishing, so maybe $300,000.)
I have been told that STC will lose advertising income if the magazine goes online only. I would agree that advertisers will pay less for online advertising as opposed to print advertising, but surely the advertising income would not drop all the way to zero. Let’s assume, however, that it does. The best estimate I have for advertising income is $143,159 (from Paul Bernstein’s detailed cost breakdown on the STC Ideas forum, accessible here to registered members of the forum).
So, even if advertising drops to zero, we have a net positive of more than $150,000 per year ($300,000 to $330,000 in savings minus $143,159 in lost advertising). Implementing an XML or HTML-based magazine for the first time will cost a lot less than that. Therefore, the return on investment appears quite compelling.
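For anyone who wants to plug in better numbers, the estimate above can be written as a small Python sketch. Every input is one of this post's guesses (print run, per-copy costs, layout hours) or the advertising figure quoted above, not official STC data, and the function names are made up for illustration.

```python
# Back-of-the-envelope model of the print-versus-online estimate.
# All inputs are guesses from the post, not official STC numbers.
PRINT_RUN = 11_000             # assumed: one copy per member
PRINT_COST_PER_COPY = 2        # dollars, guess
POSTAGE_PER_COPY = 1           # dollars, guess
ISSUES_PER_YEAR = 10
LAYOUT_HOURS_PER_ISSUE = 100   # guess
LAYOUT_RATE_PER_HOUR = 50      # dollars, guess
ADVERTISING_INCOME = 143_159   # dollars per year, figure quoted in the post


def print_and_postage_per_year() -> int:
    """Direct printing and postage costs across all issues in a year."""
    per_issue = PRINT_RUN * (PRINT_COST_PER_COPY + POSTAGE_PER_COPY)
    return per_issue * ISSUES_PER_YEAR


def layout_per_year() -> int:
    """Design and layout labor across all issues in a year."""
    return LAYOUT_HOURS_PER_ISSUE * LAYOUT_RATE_PER_HOUR * ISSUES_PER_YEAR


def net_gain(savings: int, lost_advertising: int = ADVERTISING_INCOME) -> int:
    """Savings minus advertising income, assuming all advertising is lost."""
    return savings - lost_advertising


if __name__ == "__main__":
    print(print_and_postage_per_year())            # 330000
    print(layout_per_year())                       # 50000
    print(net_gain(print_and_postage_per_year()))  # 186841
    print(net_gain(300_000))                       # 156841
```

The last two lines bracket the result: about $187,000 with the full $330,000 in savings, and about $157,000 with the more cautious $300,000.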
You should be aware that I have no confidence in any of the numbers I have compiled here. I do not know the following with any certainty:
  • Intercom print run
  • Cost per printed copy
  • Cost of postage
  • Income from advertising
However, based on my experience in the industry, I think that the general ballpark figures are probably accurate. I would be delighted to update this post if someone can give me the real numbers.
So, Tom has laid out the argument for moving magazine content online based on quality. I have given you the argument based on cost, along with the reasons why you might prefer print.
What do you think?

Opinion

Authoring tools do matter

“I can write in anything.”
“The tool doesn’t matter.”
“I can learn any new tool.”

Most of the time, I agree. But then, there are the exceptions.

One of our customers is using FrameMaker to produce content that is delivered in HTML. (They use structured FrameMaker, generate XML, and then transform via XSLT into HTML.) Their rationale for using FrameMaker was:

  • The project was on an extreme deadline.
  • The writers already knew FrameMaker.
  • FrameMaker is already installed on the writers’ systems.

All valid points.

But.

We have had a continuous stream of requests from the writers to make adjustments to the FrameMaker formatting. Things like “the bullets seem a little too far from the text; can you move them over?”

FrameMaker is being used as an authoring tool only. FrameMaker formatting is discarded on export; HTML formatting is controlled mainly by CSS. However, even after repeated explanations, we continue to receive requests to modify the FrameMaker formatting.
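To make the point concrete, here is a minimal sketch of a structure-only transform. It uses Python's standard xml.etree.ElementTree as a stand-in for the XSLT step; it is not the customer's actual stylesheet, and the element names, class names, and sample content are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured source. Element names are invented for this sketch.
SOURCE_XML = """
<topic>
  <title>Installing the widget</title>
  <list>
    <item>Unpack the box.</item>
    <item>Plug in the widget.</item>
  </list>
</topic>
"""

# Maps each assumed source element to an HTML element and a CSS class.
ELEMENT_MAP = {
    "topic": ("div", "topic"),
    "title": ("h1", "topic-title"),
    "list": ("ul", "steps"),
    "item": ("li", "step"),
}


def to_html(source: ET.Element) -> ET.Element:
    """Recursively copy structure and text; no formatting is carried over."""
    tag, css_class = ELEMENT_MAP[source.tag]
    target = ET.Element(tag, {"class": css_class})
    target.text = source.text
    target.tail = source.tail
    for child in source:
        target.append(to_html(child))
    return target


def render(xml_text: str) -> str:
    """Parse the XML source and serialize the mapped HTML tree as a string."""
    root = ET.fromstring(xml_text.strip())
    return ET.tostring(to_html(root), encoding="unicode")


if __name__ == "__main__":
    print(render(SOURCE_XML))
```

Nothing about bullet indents, leading, or kerning appears anywhere in the source or the output. In a setup like this, the distance between a bullet and its text would be decided by a CSS rule attached to a class such as the hypothetical "steps", not by anything the writer adjusts in the authoring tool.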

In this specific case, the authoring tool does matter. Writers are focusing on the wrong set of issues (leading, kerning, print formatting), none of which is actually relevant for the output.

Why are they focused on this stuff? Because they can. It seems to me that moving authors to a WYSIOO (what you see is one option) tool, such as oXygen or XMetaL, instead of a WYSIWYG tool (FrameMaker) would eliminate the obsession with irrelevant formatting.
