Palimpsest
Thursday, June 26, 2008
 
Web 2.0 and Truth
My presentation at X-Pubs was about the impact of Web 2.0 or user-generated content on technical communication. (You can view the presentation at the bottom of this post.)

A phrase I heard repeatedly in reference to professional content was "a single version of the truth," which alludes to the idea that you should only have one instance of any given piece of content.

And that got me thinking. There are many areas of tech comm where this idea makes sense.

User-generated content, though, is in direct conflict with a single, unchanging, objective truth. Wikis, by definition, have content that is constantly evolving.

Furthermore, there's truth and then there's, well, truth. Compare and contrast these two snippets:
"The ABC feature is unusable. Use the XYZ as a work-around."

"You can use ABC to do blah blah. Here's how:
(many annoying steps)"

Which one is truth? Both? More importantly, which one is more useful to the reader?

It takes a brave or maybe foolish corporate technical writer to criticize their own product explicitly. (This, in turn, is probably why third-party computer trade books sell so well. Somehow, I don't see a title like Word Annoyances getting the Microsoft seal of approval.)

But even though technical writers try to act as user advocates, there's a built-in conflict of interest -- technical writers are paid by corporations, not by users.

User-generated content meets a need that corporate technical publications do not (or perhaps cannot). It provides unfiltered, opinionated, and user-biased coverage of technical topics.

Why is there a gap between professionally created technical publications and the end users?

1. Updates can take a long time to get into the official documentation because of lengthy review, approval, and publishing processes.

2. Annotation capabilities are rarely provided to users. If they are, they're usually fairly limited.

3. The documentation is not sufficiently candid.

What are the implications for technical writers?

1. Document publishing needs to accelerate.
2. Online documents should allow for comments and discussion.
3. The documentation needs to be explicit about product limitations and workarounds.

In effect, technical writers need to have more of an editorial voice.

Here is my Web 2.0 presentation:

Notes: Use the arrow keys to navigate through the slides. The first slide may take a few seconds to come up; the presentation file is quite large.



Wednesday, June 25, 2008
 
English lessons
I'm at London's Heathrow airport, getting ready to return home. Many thanks to the organizers of the STC UK and X-Pubs events for wonderful hospitality (special thanks to Ant Davey who picked me up at the airport when I arrived at 6:30 in the morning).

Some observations about my week in the UK:
I had some very interesting discussions with participants at both events, ate a lot of great food, and had a generally wonderful time.



Tuesday, June 24, 2008
 
XPubs: Information integration and the needs of the (product) maintainer
Chris Wood
BAE Systems

Tech pubs manager at BAE, contributor to the S1000D standard.

Electronic maintenance (interactive electronic technical manual, or IETM) has been shown to deliver an increase in fault-finding success, a reduction in troubleshooting time, and a reduction in maintenance errors. "Fairly comforting."

Market drivers for integrated information...output-based contracts. The Royal Air Force is asking vendors to take on more maintenance activities. The drivers for success for the commercial organization are different from the drivers for the military.

BAE must guarantee that a specific number of aircraft ("platforms") are available to fly at all times. Financial penalties for not meeting those goals.

Offshore commodity outsourcing is putting pressure on the prices that BAE can quote. Price "per page" needs to be on a downward slope.

IETM capability offers an opportunity to integrate support information applications and processes.

ATTAC Contract = a certain number of Tornado aircraft must be available 24x7. BAE is responsible for preflight, postflight, AND other maintenance. Spare parts come out of BAE's budget. Therefore, reducing spare parts "footprint" saves money.

18 million pounds (double that for dollars) over 10 years. Their target is to save more than 18M pounds by including rich data (photos, video, 3D animation), aligning content with actual maintenance activities, and putting tech pubs people on-base as part of an integrated engineering team.

Nice example of specific changes in tech docs leading to large cost savings due to fewer returns for repair.

Aha. They improved the official documentation by picking up information that was "plastered on the wall" in the aircraft hangar. In other words, user-generated content!

Information integration...the issue
Too much information, which is necessary and can be integrated, but...who generates it? where? who approves it? who can receive it? is there a recognized authority?

What about information generated by maintenance personnel for use by engineers (the stuff on the wall)? Is there an approval route? How authoritative is it?

In the past, the separation between maintenance and design authority was clear. As the maintenance and design operations move closer (or become the same, in BAE's case), the needed separation of content becomes much more challenging. Does linking from engineering authority content to non-engineering authority content corrupt the authoritative content?

What level of authority does information have? Has it been tested? Has it gone through an approval route?

Approved data architecture. The challenge is to define a data architecture that includes all information issued by the design authority for the purpose of operating and/or maintaining the platform in service, ensuring it is efficient, effective, and safe.

"This is a major content management issue." Indeed.

Many information deliverables go through rigorous approval process, but maintainers have access to other information, too. Official deliverables must be more integrated. Reference data and maintenance procedures come from different places in the organization, but they need to be in alignment. And there are "modifications," which must go in both places.

"This is not a trivial challenge." Yep.

The conflict here is really between data (approved content) and lore (unofficial information about how things really work). The mechanics have the "lore," and need to be persuaded to share it to improve the official documentation over time.



Monday, June 23, 2008
 
XPubs: DITA implementation in progress
Chris Hadley of Micro Focus
Noz Urbina of Mekon

Micro Focus
12 writers in four locations
rapidly growing team, but also 20-year company veterans

Content is in XML, written using XMetaL, stored in CVS, DTDs and XSLT developed in-house.

Acquired companies have content in FrameMaker and Word.

Delivery in CHM, HTML, Help 2.

Lots of reuse in places; none in others.

No localization.

Problem very interesting because the second generation of XML implementations is beginning. System was developed in-house, which was cutting edge at the time but is now showing its age. It is costly to maintain, and the experts have left. Company cannot manage, debug, or improve the processes that exist.

Cannot continue to rely on in-house solutions because of risks of breakdowns. Change was required.

Business case 1: Value to customer
Improve quality of content with reuse, consistency, and accuracy.
Improve search. ("Content is great...when I can find it.")
Publishing to different formats.

Business case 2: Value to business
More accurate documentation with information easier to find means lower support cost due to fewer calls.
Lower cost on development. If support can find answers in documentation, there's no need to ask development team. Development team also uses documentation.
Reducing time on non-writing tasks such as builds means more time available for writing content.

Business case 3: Value to doc team
Replace aging systems and processes before they affect ability to deliver
Sustainable and manageable systems that can work even if there is employee turnover.
Align with industry standards (key reason). Open source expertise is transferable from other organizations, helps with recruitment.

Project started at the end of 2007. They engaged Mekon to help with the analysis of the current situation. By February 2008, they chose a CMS (TriSoft). Engaged Mekon to convert content to DITA. Installed and configured TriSoft. Primary consideration was cost, and they were required to give a budget figure to their board before they knew what they were going to be doing.

Why a CMS? They needed software that was supported externally instead of being produced in-house. Secondary priorities were reuse ("a single version of the truth") and publishing issues (multiple formats and audience, building and testing, link management).

Why DITA? The decision to go with DITA was very difficult. But the same arguments that applied to externally supported software applied to the DITA issues. Migration will be painful for acquired companies no matter what.

Why Mekon? Industry experience, skills range across the company, can talk about ROI, are independent, they are based locally (except Noz!).

Why the project will succeed. It must. Team is enthusiastic and motivated -- no issues with change resistance and recognized that the old system was unsustainable, already knew XML and XMetaL. Company's charter includes the phrase "debate passionately, get on board." And they did. Buy-in from the executive board was critical. Had to get senior VP to understand, support, and present to the board for approval. Early wins -- they already have demonstrable results. This is the right solution at the right time for this company.

Complexities and pitfalls
Didn't know what the end product was going to be. How do you learn what you need to when you don't know what you don't know? No in-house DITA experience, limited CMS knowledge. Also need to demonstrate results quickly while learning. Couldn't design perfect solution before starting, need to make changes during implementation. Balance between up-front research and management of progress expectations.
Content migration to DITA did not make the original timeline. Big reason for that is the "day job" -- writing content. So the initiative has to take a back seat to getting content out the door. Geographically dispersed team is difficult -- two locations haven't even looked at XML. Don't have a fully developed training structure yet.
The pilot project hasn't quite happened yet.
Preparing for migration is a horrible pain. Don't underestimate the cost of migration.
Publishing is complex; it's taking time to get to a push-button process, and it's not ready yet.

Planning to map to DITA, migrate to CMS. Will run old and new systems in parallel until they're sure that the new stuff is really working.

Need to re-architect content, integrate acquisitions, and refine and improve the process.

Business case for documentation team is very different from business case to board. The doc team's need for CMS needed to be presented in a way that would get approval.

This may not be the right solution for everyone, but if it is, "get on with it."

CMS licenses -- some have a high up-front cost, but the per-seat cost drops quickly with bigger implementations. Others have a low up-front cost, but the per-seat licensing doesn't drop much with greater volume. That makes the cost evaluation different for different-sized organizations.
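To make the licensing point concrete, here is a minimal sketch of the arithmetic in Python. Every figure in it is hypothetical (no prices were quoted in the session), and it assumes a simple linear model: total cost equals the up-front fee plus a flat per-seat price. Real CMS pricing is likely to be tiered, so treat this as an illustration of the shape of the trade-off only.

```python
def total_cost(upfront, per_seat, seats):
    """Total license cost under a simple linear model (an assumption)."""
    return upfront + per_seat * seats


def break_even_seats(upfront_a, per_seat_a, upfront_b, per_seat_b):
    """Seat count at which two linear pricing models cost the same."""
    return (upfront_a - upfront_b) / (per_seat_b - per_seat_a)


# Hypothetical vendors: A has a high entry price and cheap seats,
# B has a low entry price and expensive seats.
VENDOR_A = {"upfront": 50_000, "per_seat": 500}
VENDOR_B = {"upfront": 5_000, "per_seat": 2_000}

for seats in (12, 100):
    cost_a = total_cost(VENDOR_A["upfront"], VENDOR_A["per_seat"], seats)
    cost_b = total_cost(VENDOR_B["upfront"], VENDOR_B["per_seat"], seats)
    print(f"{seats} seats: vendor A = {cost_a}, vendor B = {cost_b}")

crossover = break_even_seats(
    VENDOR_A["upfront"], VENDOR_A["per_seat"],
    VENDOR_B["upfront"], VENDOR_B["per_seat"],
)
print(f"Break-even at {crossover:.0f} seats")
```

With these made-up numbers, a 12-writer team is better off with the low-entry vendor, and a 100-seat organization is better off with the high-entry one.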

Another very interesting presentation. A rather aggressive timeline and the unique circumstance of lots of high-end skills (like XSLT development) in the documentation organization.



 
XPubs: XSL-FO for Documentation Formatting
Mike Miller, Antenna House

For starters, XSL-FO is an XML standard.

XSL-FO is "a pagination markup language describing a rendering vocabulary capturing the semantics of formatting information for paginated presentation." (Ken Holman)

Or, as I like to say, "A document layout described in a text file."

XSL-FO is black box formatting. Can't go back and "tweak" the files to fix them. With FO, you're typically talking about a minimum of a couple hundred pages. Much faster to render automatically rather than by hand in InDesign or FrameMaker.

First commercial products in 2001 from Antenna House and RenderX. Also, open source FOP from Apache in 2001. FO successful in the sense that both commercial companies are doing quite well.

FO more successful than any other technical publishing application other than perhaps TeX and FrameMaker. Probably attributable to the availability of open source (free) and trial versions from commercial vendors (free).

XSL-FO is only concerned with visual display of XML data, which means that the FO file has no semantic content, only formatting instructions.
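To show what "no semantic content, only formatting instructions" looks like, here is a minimal sketch in Python that builds a one-page FO document with the standard library. The page size, margins, font size, and the helper names (`fo`, `build_minimal_fo`) are my own assumptions, not anything from the talk. In practice the FO would be generated by an XSLT stylesheet rather than by hand.

```python
import xml.etree.ElementTree as ET

# The XSL-FO namespace; elements are conventionally written with an fo: prefix.
FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return a namespace-qualified tag name for ElementTree."""
    return f"{{{FO_NS}}}{tag}"


def build_minimal_fo(text):
    """Build a single-page FO document containing one block of text."""
    root = ET.Element(fo("root"))

    # Page geometry: one A4 page master with 20mm margins (assumed values).
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "page",
        "page-width": "210mm",
        "page-height": "297mm",
        "margin-top": "20mm",
        "margin-bottom": "20mm",
        "margin-left": "20mm",
        "margin-right": "20mm",
    })
    ET.SubElement(page, fo("region-body"))

    # Content: text flows into the body region of the page master above.
    sequence = ET.SubElement(root, fo("page-sequence"),
                             {"master-reference": "page"})
    flow = ET.SubElement(sequence, fo("flow"),
                         {"flow-name": "xsl-region-body"})
    block = ET.SubElement(flow, fo("block"),
                          {"font-size": "11pt", "space-after": "6pt"})
    block.text = text
    return ET.tostring(root, encoding="unicode")


print(build_minimal_fo("Step 1: Remove the access panel."))
```

Nothing in the output says "step" or "procedure." There are only pages, regions, flows, and blocks, which is the point: the meaning stays behind in the source XML.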

The FO stylesheet specifies:
Advantages:
Antenna House has been personally involved in about 30 different DITA projects.

Most business documents can be formatted automatically as FO. Rule of thumb: "If it's XML, FO can be applied."

Other applications for FO might include faxes, German railway tickets, correspondence from financial institutions and government.

Typesetting is very complex with issues like widows and orphans and hyphenation. Software can handle this. Human typesetters have been removed from the process, and this shows in amateurish mistakes. But you can use FO to configure something that follows typography rules and gives you a professional look and feel.

"Overwhelming benefits" of using FO. Which begs the question: "Why aren't more people using it?" A slide with the benefits of XML showing The Usual (cost, time-to-market, less redundancy, standards-based, localization for cost justification, etc.).

People who use FO: auto manufacturers, cell phone manufacturers, banks, aerospace, government, military, educational

FO not appropriate for documents that are "artistically created."

FO extensions provide support for:
Thus, if you need one of these features, you might get somewhat locked into your rendering engine...the extensions are specific to a particular FO engine.

DITA Open Toolkit reduces the complexity of getting set up to produce PDF. Could be configured and producing PDF in "a couple of hours." (Perhaps, but making it look the way you want is going to take a while.) According to Mike, somewhere between a few days and a few months, depending on the complexity of your requirements.

PDF output from DITA
Stages:
Several software components are required -- DITA Open Toolkit provides all the components you need.

Why not FrameMaker or InDesign?
You need WYSIWYG if:
If you need WYSIWYG, you need a layout engine like FrameMaker or InDesign. If you need WYDSIWYN, you need XSL-FO.

On the low end, FO is free with FOP. Antenna House is most expensive at $1,250 for a stand-alone license or $5,000 for a server license.

FO supports more languages than any other solution currently available.

Solving the real problem:
XSL-FO is delivering on the XML promise. Don't underestimate it.

First question: Flowing text into typesetting engine results in line breaks that will cause readers difficulty. And this annoys him (as a professional typesetter). We want powerful, automated formatting AND the ability to do WYSIWYG tweaks. Thinks there is a role for a WYSIWYG stage after the automation bit.

I've noticed this on the BBC, too. British people ask really pointed questions.

And in response, Mike says that Antenna House has a solution for this where you create INX (InDesign XML) content (4 minutes) and then you can pull it into InDesign (half an hour), and do some cleanup.

Do all the XSL-FO tools cover 100% of the FO standard? "No, definitely not."



 
X-Pubs keynote: Transforming Legislation Publishing
Brief introduction from Noz Urbina and an overview of the conference from Julian Murfitt. Some X-Pubs housekeeping items, including a flight announcement...
"Should a presentation be boring and sleep deprivation set in, oxygen masks will drop from the ceiling. Please put on your own mask before assisting others."
Hehe.

On to the keynote...John Sheridan, Head of e-Services at the Office of Public Sector Information, National Archives. Eeek, slight problem with slides -- and the presenter just launches right in without them. I bet he's terrified right now, but he looks perfectly composed.

We have slides. "Transforming Legislation Publishing"

Publishing legislation may seem dry, but in fact it's quite relevant to the people at large -- and ignorance of the law is no excuse. Legislation documents use XML under the covers. Have been publishing legislation online since 1996 and of course print for a long time before that.

Strengths of their service:
Much of the content they were republishing online had not been touched in 10 years or so. However, they had as many as 500,000 external links (that is, links from other web sites pointing to their content). Breaking links was out of the question because the number of links helps their content's Google rankings.

Key performance indicator: two clicks. 80 percent of information should be available in two clicks:
  1. Google search button.
  2. Click link on first page of results.
Their audience is demanding. Challenges included:
Primary motivation to improve usability and accessibility of content. Existing HTML content didn't look contemporary.

Persistent linking, "web continuity", overall 60 percent of links to official information are broken. Their solution to "persist" the 500,000 existing links was to provide redirection behavior, so that every URL resolves either to live content or to the government's archive on the web.
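The redirection behavior can be sketched in a few lines of Python. This is my own illustration, not the actual implementation: the function name, the lookup tables, the choice of status codes, and every URL below are hypothetical.

```python
def resolve(url, live_pages, redirects, archive_base):
    """Return (status, location) so that no legacy URL ends in a 404.

    live_pages:   set of URLs that still serve content directly
    redirects:    dict mapping a legacy URL to its new live URL
    archive_base: prefix of a web archive that holds retired pages
    """
    if url in live_pages:
        return 200, url                  # still live, serve as-is
    if url in redirects:
        return 301, redirects[url]       # moved permanently to new location
    return 302, archive_base + url       # retired, hand off to the archive


# Hypothetical data for illustration only.
LIVE = {"http://legislation.example.org/acts/2008/1"}
MOVED = {
    "http://legislation.example.org/old/act1.htm":
        "http://legislation.example.org/acts/2008/1",
}
ARCHIVE = "http://archive.example.org/"

for legacy_url in (
    "http://legislation.example.org/acts/2008/1",
    "http://legislation.example.org/old/act1.htm",
    "http://legislation.example.org/old/gone.htm",
):
    print(legacy_url, "->", resolve(legacy_url, LIVE, MOVED, ARCHIVE))
```

The useful property is that the function has no "not found" branch: anything that isn't live or remapped falls through to the archive.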

XML is the key to solving these assorted issues.

Trying to "future-proof" their work, especially by providing a way to allow for changing web standards (HTML/web standard may change, but we can keep underlying XML).

Legislative documents are highly structured but also have variations over time. Very difficult to capture in a structure. "Parliament trumps your XML schema." You can't say, "Sorry, but that won't work in our schema, so you can't pass that legislation." Must find the balance between accommodating what's needed and "allowing everything."

They developed Crown XML:
The Crown XML Schema for Legislation provides a full and comprehensive encoding for all United Kingdom primary and secondary legislation. It has been written using the World Wide Web Consortium XML Schema language and is the Government's official and authoritative data standard for legislation. Once a piece of legislation has been enacted or made, it is stored using this Schema format. Schema compliant legislation is available in XML for onward supply to legal publishers and others.
They provide sample documents, which even so cover only about half the possibilities in the full schema.

Users have options for various views of the legislation.

Their work leads to the concept of the web as a platform. Not just providing for users to consume, but also to reuse, aggregate, and combine.

Mixing data...hey, cookie dough!

The government's response to Web 2.0 trends. Government should enable information so that citizens can use the information. Doing so will lead not only to better public services, but also to other services, both commercial and noncommercial.

Problems include culture, rights, licensing, intellectual property, and technology challenges. Information becomes infrastructure and potentially as important as roads and other physical infrastructure. Legislation is widely cited content, which becomes infrastructure for other things. Legislation needs to be addressable with fragment identifiers, so that people can cite specific sections or paragraphs rather than an entire act.

Why couldn't lawyers add editorial value to legislation in a wiki-type format? Not a job for the state, but something that could be enabled or inhibited by how the legislation is published. Providing addressable content and using standards would allow for third parties to use the legislation as a starting point for additional work.

They provide Atom (RSS) feeds for new legislation.

Library of Congress is an example of a re-user of UK legislation. UK legislation of interest for comparison purposes. They have a "PDF thing going on." Really wanted access to PDF versions of the information. Subscribe to the Atom feed, and the PDF will pop up there as a link.
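Here is roughly what a re-user's side of that could look like: a minimal Python sketch that pulls PDF links out of an Atom feed. The sample feed, its titles, and its URLs are invented for illustration, and I'm assuming the PDFs are exposed as `link` elements with `type="application/pdf"`; the real feed may be structured differently.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed content, made up for this example.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>New legislation</title>
  <entry>
    <title>Example Act 2008</title>
    <link rel="alternate" type="text/html"
          href="http://legislation.example.org/acts/2008/1"/>
    <link rel="alternate" type="application/pdf"
          href="http://legislation.example.org/acts/2008/1.pdf"/>
  </entry>
  <entry>
    <title>Example Order 2008</title>
    <link rel="alternate" type="text/html"
          href="http://legislation.example.org/orders/2008/7"/>
  </entry>
</feed>"""


def pdf_links(feed_xml):
    """Return (entry title, href) pairs for every PDF link in the feed."""
    feed = ET.fromstring(feed_xml)
    results = []
    for entry in feed.findall(f"{ATOM}entry"):
        title = entry.findtext(f"{ATOM}title", default="")
        for link in entry.findall(f"{ATOM}link"):
            if link.get("type") == "application/pdf":
                results.append((title, link.get("href")))
    return results


for title, href in pdf_links(SAMPLE_FEED):
    print(f"{title}: {href}")
```

A subscriber would fetch the feed on a schedule and run something like `pdf_links` over it; entries without a PDF link are simply skipped.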

Expect reuse for very granular areas...discussion of specific industries or topics. (If mad cow disease were to reoccur, expect footpaths to be closed, and a map could show in real-time what's open and what's closed.)

Providing sufficient flexibility into structure without descending into tag soup.

First question: Is the raw XML available to the public?
Yikes. The presenter hesitates and is quite uncomfortable. Seemed like a harmless enough question but apparently not. The answer is that it's available by subscription -- that is, lawyers pay to get access to it. They must balance between their economics and subscription income. They would like to publish XML; seems to be the direction that public policy is going. But "don't want to spend taxpayers' money to subsidize Lexis-Nexis."

Second question: Would these policies extend to others, like the Department for Transport?
Again, this sounds harmless to me, but appears to be quite controversial. Information produced as a core public task ("which is nowhere defined clearly") is public.

Really, when will government policy help the questioner push his employer into using structure? Interesting. I don't think we'd get that question in the U.S., other than in the negative.

Conferences here are so civilized, with the opening session at 10 a.m. Ahhhhh. Tea and cookies, er, biscuits at the breaks. Luvely.
