Palimpsest has moved. Please visit our blog in its new location for the most recent posts from Scriptorium.
Palimpsest
Web 2.0 and Truth
Thursday, June 26, 2008 — posted by Sarah
My presentation at X-Pubs was about the impact of Web 2.0, or user-generated content, on technical communication. (You can view the presentation at the bottom of this post.)
A phrase I heard repeatedly in reference to professional content was "a single version of the truth," which alludes to the idea that you should have only one instance of any given piece of content.
And that got me thinking. There are many areas of tech comm where this idea makes sense.
User-generated content, though, is in direct conflict with a single, unchanging, objective truth. Wikis, by definition, have content that is constantly evolving.
Furthermore, there's truth and then there's, well, truth. Compare and contrast these two snippets:
"The ABC feature is unusable. Use the XYZ as a work-around."
"You can use ABC to do blah blah. Here's how: (many annoying steps)"
Which one is truth? Both? More importantly, which one is more useful to the reader?
It takes a brave or maybe foolish corporate technical writer to criticize their own product explicitly. (This, in turn, is probably why third-party computer trade books sell so well. Somehow, I don't see a title like Word Annoyances getting the Microsoft seal of approval.)
But even though technical writers try to act as user advocates, there's a built-in conflict of interest -- technical writers are paid by corporations, not by users.
User-generated content meets a need that corporate technical publications do not (or perhaps cannot). It provides unfiltered, opinionated, and user-biased coverage of technical topics.
Why is there a gap between professionally created technical publications and the end users?
1. Updates can take a long time to get into the official documentation because of lengthy review, approval, and publishing processes.
2. Annotation capabilities are rarely provided to users. If they are, they're usually fairly limited.
3. The documentation is not sufficiently candid.
What are the implications for technical writers?
1. Document publishing needs to accelerate.
2. Online documents should allow for comments and discussion.
3. The documentation needs to be explicit about product limitations and workarounds.
In effect, technical writers need to have more of an editorial voice.
Here is my Web 2.0 presentation:
Notes: Use the arrow keys to navigate through the slides. The first slide may take a few seconds to come up; the presentation file is quite large.
Labels: analysis, conferences, web 2.0, xpubs
11:21 AM Permalink

English lessons
Wednesday, June 25, 2008 — posted by Sarah
I'm at London's Heathrow airport, getting ready to return home. Many thanks to the organizers of the STC UK and X-Pubs events for wonderful hospitality (special thanks to Ant Davey, who picked me up at the airport when I arrived at 6:30 in the morning).
Some observations about my week in the UK:
- During conference sessions, you can expect that participants will not ask any questions until the end of the session.
- Questions are often quite pointed, far more so than in the U.S. I've noticed this on the BBC news as well. The question is phrased politely, but the general paraphrase is something like, "Well, that's all very nice, but isn't it true that blah completely undermines what you are saying?" For instance, you might get a question like, "You're really pushing for XML here, but isn't it true that XML isn't actually applicable for many situations?" Um. (Incidentally, participants in Germany frequently challenge the presenter as well. Not that there's anything wrong with that...)
- There is no such thing as a "British" accent. I heard an unbelievable variety of different accents and inflections, including Irish, Northern Irish, Scottish, northern England, and southern England. The natives here can place accents with remarkable accuracy. And let's not forget variations from the non-native speakers, such as Germans and eastern Europeans. It's quite fascinating, and for me, at least, some accents are much harder to understand than others. In particular, I found that long sentences were much easier to understand than a quick question. As a result, I was constantly asking service personnel to repeat themselves (they tend to ask short questions like, "checking in?").
- The X-Pubs attendees were mostly men, not too surprising with the emphasis on the defense (defence) industry and aerospace. But quite a different demographic from STC.
- There's been recent discussion about the relative lack of blogging or twittering at STC, but at these events, there was none (except for me). With wireless access clocking in at around $30 per day, it's not that surprising.
- The current exchange rate is really, really, really painful.
Labels: conferences, presentations, stc2008, xpubs
4:26 AM Permalink

XPubs: Information integration and the needs of the (product) maintainer
Tuesday, June 24, 2008 — posted by Sarah
Chris Wood, BAE Systems
Tech pubs manager at BAE, contributor to the S1000D standard.
Electronic maintenance (interactive electronic technical manual, or IETM) has been shown to deliver an increase in fault-finding success, a reduction in troubleshooting time, and a reduction in maintenance errors. "Fairly comforting."
Market drivers for integrated information...output-based contracts. The Royal Air Force is asking vendors to take on more maintenance activities. The drivers for success for the commercial organization are different from the drivers for the military.
BAE must guarantee that a specific number of aircraft ("platforms") are available to fly at all times. Financial penalties for not meeting those goals.
Offshore commodity outsourcing is putting pressure on the prices that BAE can quote. Price "per page" needs to be on a downward slope.
IETM capability offers an opportunity to integrate support information applications and processes.
ATTAC Contract = a certain number of Tornado aircraft must be available 24x7. BAE is responsible for preflight, postflight, AND other maintenance. Spare parts come out of BAE's budget. Therefore, reducing spare parts "footprint" saves money.
18 million pounds (double that for dollars) over 10 years. Their target is to save more than 18M pounds by including rich data (photos, video, 3D animation), aligning documentation with actual maintenance activities, and putting tech pubs people on-base as part of an integrated engineering team.
Nice example of specific changes in tech docs leading to large cost savings due to fewer returns for repair.
Aha. They improved the official documentation by picking up information that was "plastered on the wall" in the aircraft hangar. In other words, user-generated content!
Information integration...the issue
Too much information, which is necessary and can be integrated, but...who generates it? where? who approves it? who can receive it? is there a recognized authority?
What about information generated by maintenance personnel for use by engineers (the stuff on the wall)? Is there an approval route? How authoritative is it?
In the past, the separation between maintenance and design authority was clear. As the maintenance and design operations move closer (or become the same, in BAE's case), the needed separation of content becomes much more challenging. Does linking from engineering authority content to non-engineering authority content corrupt the authoritative content?
What level of authority does information have? Has it been tested? Has it gone through an approval route?
Approved data architecture. The challenge is to define a data architecture that includes all information issued by the design authority for the purpose of operating and/or maintaining the platform in service, ensuring it is efficient, effective, and safe.
"This is a major content management issue." Indeed.
Many information deliverables go through rigorous approval process, but maintainers have access to other information, too. Official deliverables must be more integrated. Reference data and maintenance procedures come from different places in the organization, but they need to be in alignment. And there are "modifications," which must go in both places.
"This is not a trivial challenge." Yep.
The conflict here is really between data (approved content) and lore (unofficial information about how things really work). The mechanics have the "lore," and need to be persuaded to share it to improve the official documentation over time.
Labels: conferences, xml, xpubs
8:43 AM Permalink

XPubs: DITA implementation in progress
Monday, June 23, 2008 — posted by Sarah
Chris Hadley of Micro Focus
Noz Urbina of Mekon
Micro Focus
12 writers in four locations
rapidly growing team, but also 20-year company veterans
Content is in XML, written using XMetaL, stored in CVS, DTDs and XSLT developed in-house.
Acquired companies have content in FrameMaker and Word.
Delivery in CHM, HTML, Help 2.
Lots of reuse in places; none in others.
No localization.
The problem is very interesting because the second generation of XML implementations is beginning. The system was developed in-house, which was cutting edge at the time, but it is now showing its age. It is costly to maintain, and the experts have left. The company cannot manage, debug, or improve the processes that exist.
Cannot continue to rely on in-house solutions because of risks of breakdowns. Change was required.
Business case 1: Value to customer
Improve quality of content with reuse, consistency, and accuracy.
Improve search. ("Content is great...when I can find it.")
Publishing to different formats.
Business case 2: Value to business
More accurate documentation with information easier to find means lower support cost due to fewer calls.
Lower cost on development. If support can find answers in documentation, there's no need to ask development team. Development team also uses documentation.
Reducing time on non-writing tasks such as builds means more time available for writing content.
Business case 3: Value to doc team
Replace aging systems and processes before they affect ability to deliver
Sustainable and manageable systems that can work even if there is employee turnover.
Align with industry standards (key reason). Open source expertise is transferable from other organizations, helps with recruitment.
Project started at the end of 2007. They engaged Mekon to help with the analysis of the current situation. By February 2008, they chose a CMS (TriSoft). Engaged Mekon to convert content to DITA. Installed and configured TriSoft. Primary consideration was cost, and they were required to give a budget figure to their board before they knew what they were going to be doing.
Why a CMS? They needed software that was supported externally instead of being produced in-house. Secondary priorities were reuse ("a single version of the truth") and publishing issues (multiple formats and audience, building and testing, link management).
Why DITA? Decision to DITA was very difficult. But the same arguments that applied to externally supported software applied to the DITA issues. Migration will be painful for acquired companies no matter what.
Why Mekon? Industry experience, skills range across the company, can talk about ROI, are independent, they are based locally (except Noz!).
Why the project will succeed. It must. Team is enthusiastic and motivated -- no issues with change resistance and recognized that the old system was unsustainable, already knew XML and XMetaL. Company's charter includes the phrase "debate passionately, get on board." And they did. Buy-in from the executive board was critical. Had to get senior VP to understand, support, and present to the board for approval. Early wins -- they already have demonstrable results. This is the right solution at the right time for this company.
Complexities and pitfalls
Didn't know what the end product was going to be. How do you learn what you need to when you don't know what you don't know? No in-house DITA experience, limited CMS knowledge. Also need to demonstrate results quickly while learning. Couldn't design perfect solution before starting, need to make changes during implementation. Balance between up-front research and management of progress expectations.
Content migration to DITA did not make the original timeline. Big reason for that is the "day job" -- writing content. So the initiative has to take a back seat to getting content out the door. Geographically dispersed team is difficult -- two locations haven't even looked at XML. Don't have a fully developed training structure yet.
The pilot project hasn't quite happened yet.
Preparing for migration is a horrible pain. Don't underestimate the cost of migration.
Publishing is complex, and it's taking time to get to a push-button process. It's not ready yet.
Planning to map to DITA, migrate to CMS. Will run old and new systems in parallel until they're sure that the new stuff is really working.
Need to re-architect content, integrate acquisitions, and refine and improve the process.
Business case for documentation team is very different from business case to board. The doc team's need for CMS needed to be presented in a way that would get approval.
This may not be the right solution for everyone, but if it is, "get on with it."
CMS licenses -- some have a high up-front cost, but the per-seat price drops quickly with bigger implementations. Others have a low up-front cost, but per-seat licensing doesn't drop much with greater volume. This makes the cost evaluation different for different-sized organizations.
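A rough way to see how the two licensing shapes play out at different team sizes is to total them up. The sketch below is illustrative only: the vendor names, up-front fees, and per-seat prices are made-up numbers, not figures from the presentation or from any real CMS vendor.

```python
# Hypothetical comparison of two CMS licensing models.
# All prices are invented for illustration.
INF = float("inf")

def tiered_cost(upfront, tiers, seats):
    """Total license cost.

    tiers is a list of (seats_in_tier, price_per_seat) pairs applied in
    order; use INF as the size of the last tier so it has no upper bound.
    """
    total = upfront
    remaining = seats
    for tier_size, price in tiers:
        used = min(remaining, tier_size)
        total += used * price
        remaining -= used
        if remaining == 0:
            break
    return total

# Model A: high up-front cost, per-seat price drops sharply after 10 seats.
VENDOR_A = {"upfront": 50_000, "tiers": [(10, 2_000), (INF, 500)]}
# Model B: low up-front cost, flat per-seat price regardless of volume.
VENDOR_B = {"upfront": 5_000, "tiers": [(INF, 3_000)]}

for seats in (5, 12, 50):
    a = tiered_cost(seats=seats, **VENDOR_A)
    b = tiered_cost(seats=seats, **VENDOR_B)
    print(f"{seats} seats: model A = {a}, model B = {b}")
```

With these invented numbers, the flat per-seat model is cheaper for a 12-writer team, and the high up-front model only wins once the seat count gets much larger. Real quotes would shift the crossover point.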
Another very interesting presentation. A rather aggressive timeline and the unique circumstance of lots of high-end skills (like XSLT development) in the documentation organization.
9:14 AM Permalink

XPubs: XSL-FO for Documentation Formatting
— posted by Sarah
Mike Miller, Antenna House
For starters, XSL-FO is an XML standard.
XSL-FO is "a pagination markup language describing a rendering vocabulary capturing the semantics of formatting information for paginated presentation." (Ken Holman)
Or, as I like to say, "A document layout described in a text file."
XSL-FO is black box formatting. Can't go back and "tweak" the files to fix them. With FO, you're typically talking about a minimum of a couple hundred pages. Much faster to render automatically rather than by hand in InDesign or FrameMaker.
First commercial products in 2001 from Antenna House and RenderX. Also, open source FOP from Apache in 2001. FO successful in the sense that both commercial companies are doing quite well.
FO more successful than any other technical publishing application other than perhaps TeX and FrameMaker. Probably attributable to the availability of open source (free) and trial versions from commercial vendors (free).
XSL-FO is only concerned with visual display of XML data, which means that the FO file has no semantic content, only formatting instructions.
The FO stylesheet specifies:
- page areas and sets of pages to be used to compose a document for paper (master pages)
- Text flows, areas on pages into which the text and graphics are filled
- Blocks within flow areas (paragraphs)
- Inline areas (character-level formatting)
- Processing and formatting are consistent and automatic.
- Formatting rules are stored separately from the data.
- FO is non-proprietary and human-readable (well, sort of)
- FO less complicated than programming Java or Perl and the like
- Can use stylesheets with different XSLT processors (DITA Open Toolkit)
- Easier integration with other XML standards compliant applications (not trivial, but much easier than other non-standard approaches)
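To make the building blocks listed above (master pages, flows, blocks, inline areas) a little more concrete, here is a minimal sketch that assembles a one-paragraph FO document. It was not part of the presentation. In practice the FO is normally generated by an XSLT stylesheet and then handed to a formatter; Python is used here only to build a small sample, and the page size, margins, and font settings are arbitrary choices.

```python
# Build a minimal XSL-FO document with the standard library.
# Page dimensions and text properties are arbitrary sample values.
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)

def fo(tag):
    """Return the namespace-qualified name for an FO element."""
    return f"{{{FO_NS}}}{tag}"

def build_minimal_fo(text, emphasized):
    root = ET.Element(fo("root"))

    # Master pages: the page geometry used to compose the document.
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "A4",
        "page-width": "210mm",
        "page-height": "297mm",
        "margin": "20mm",
    })
    ET.SubElement(page, fo("region-body"))

    # Flow: the area on the pages into which content is poured.
    seq = ET.SubElement(root, fo("page-sequence"), {"master-reference": "A4"})
    flow = ET.SubElement(seq, fo("flow"), {"flow-name": "xsl-region-body"})

    # Block: a paragraph within the flow.
    block = ET.SubElement(flow, fo("block"),
                          {"font-size": "11pt", "space-after": "6pt"})
    block.text = text

    # Inline: character-level formatting inside the block.
    inline = ET.SubElement(block, fo("inline"), {"font-weight": "bold"})
    inline.text = emphasized
    return root

print(ET.tostring(build_minimal_fo("A paragraph with ", "bold text"),
                  encoding="unicode"))
```

The output contains only formatting instructions. Nothing in it says what the paragraph means.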
Most business documents can be formatted automatically as FO. Rule of thumb: "If it's XML, FO can be applied."
Other applications for FO might include faxes, German railway tickets, correspondence from financial institutions and government.
Typesetting is very complex with issues like widows and orphans and hyphenation. Software can handle this. Human typesetters have been removed from the process, and this shows in amateurish mistakes. But you can use FO to configure something that follows typography rules and gives you a professional look and feel.
"Overwhelming benefits" of using FO. Which begs the question: "Why aren't more people using it?" A slide with the benefits of XML showing The Usual (cost, time-to-market, less redundancy, standards-based, localization for cost justification, etc.).
People who use FO: auto manufacturers, cell phone manufacturers, banks, aerospace, government, military, educational
FO not appropriate for documents that are "artistically created."
FO extensions provide support for:
- Document info in PDF
- Bookmarks for PDF
- Column footnotes
- Revision bars
- MathML
- Embedding PDF within PDF
- Column rules
- Punctuation spacing
- Table autospace
- Floats
- Advanced hyphenation
- Barcodes
- Several hundred extensions altogether. Antenna House addresses multilingual requirements with extensions, such as special spacing requirements in Japanese or justification in Arabic through kashidas.
DITA Open Toolkit reduces the complexity of getting set up and producing PDF. Could be configured and producing PDF in "a couple of hours." (Perhaps, but making it look the way you want is going to take a while.) According to Mike, somewhere between a few days and a few months, depending on the complexity of your requirements.
PDF output from DITA
- XSL-FO
- FrameMaker
- troff
- Preprocessing. Information is parsed and assembled.
- Transformation. Formatted and generated.
Why not FrameMaker or InDesign?
- Formatting is the tip of the iceberg. (WYSIWYG)
- WYDSIWYN -- What you don't see is what you need, which includes content management, automated formatting, multilingual formatting, global access, project tracking, electronic delivery, network integration
- You need to manually lay out pages.
- No fixed page style
- Need to modify page layout
- Unstructured document formats
- Document format is continuously changing
- Unstructured content
On the low end, FO is free with FOP. Antenna House is most expensive at $1,250 for a stand-alone license or $5,000 for a server license.
FO supports more languages than any other solution currently available.
Solving the real problem:
- Improve the total process, not just individual tasks
- Improve organizational effectiveness
First question: Flowing text into typesetting engine results in line breaks that will cause readers difficulty. And this annoys him (as a professional typesetter). We want powerful, automated formatting AND the ability to do WYSIWYG tweaks. Thinks there is a role for a WYSIWYG stage after the automation bit.
I've noticed this on the BBC, too. British people ask really pointed questions.
And in response, Mike says that Antenna House has a solution for this where you create INX (InDesign XML) content (4 minutes) and then you can pull it into InDesign (half an hour), and do some cleanup.
Do all the XSL-FO tools cover 100% of the FO standard? "No, definitely not."
Labels: conferences, dita, xml, xpubs, xsl
7:50 AM Permalink

X-Pubs keynote: Transforming Legislation Publishing
— posted by Sarah
Brief introduction from Noz Urbina and an overview of the conference from Julian Murfitt. Some X-Pubs housekeeping items, including a flight announcement..."Should a presentation be boring and sleep deprivation set in, oxygen masks will drop from the ceiling. Please put on your own mask before assisting others."
Hehe.
On to the keynote...John Sheridan, Head of e-Services at the Office of Public Sector Information, National Archives. Eeek, slight problem with slides -- and the presenter just launches right in without them. I bet he's terrified right now, but he looks perfectly composed.
We have slides. "Transforming Legislation Publishing"
Publishing legislation may seem dry, but in fact it's quite relevant to the people at large -- and ignorance of the law is no excuse. Legislation documents use XML under the covers. Have been publishing legislation online since 1996 and of course print for a long time before that.
Strengths of their service:
- Immediacy: published online simultaneously with print versions. Important because some measures go into effect the day of their enactment.
- Accuracy: value of service hinges on knowing that the rendition of the online content is the same as the official vellum statement signed by the Queen. (Vellum? Really??)
- Trust: Do customers trust that what they see online is an official source? Based on eye-tracking software, they found when asked about trust, customers looked at the official crest and then responded positively.
- Reach: 1.5 million users, mostly in the UK

Key performance indicator: two clicks. 80 percent of information should be available in two clicks:
- Google search button.
- Click link on first page of results.
- Legacy workflows
- Multiple document inputs. Coming from Parliament, government lawyers in 21 departments, Scotland, secondary legislation, Welsh measures, legislation from Northern Ireland, church materials, dual languages in Wales (English and Welsh).
- Tools include: FrameMaker for some groups; Word for others
- Legacy content: 55,000 documents that needed to be repurposed from SGML to XML to improve web publishing.
Persistent linking, "web continuity", overall 60 percent of links to official information are broken. Their solution to "persist" the 500,000 existing links was to provide redirection behavior, so that every URL resolves either to live content or to the government's archive on the web.
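The redirection behavior described above boils down to a simple rule: serve the page if it is still live, otherwise send the visitor to an archived copy. Here is a minimal sketch of that rule. The paths, the archive address, and the choice of a 301 status are all assumptions made for illustration, not details from the talk.

```python
# Sketch of "every URL resolves to live content or to the archive".
# LIVE_PAGES and ARCHIVE_PREFIX are hypothetical placeholders.
LIVE_PAGES = {"/acts/2008/example-act"}
ARCHIVE_PREFIX = "https://archive.example.org/snapshot"

def resolve(path, live_pages=LIVE_PAGES):
    """Return (status, location) for a requested path.

    Live pages are served as-is; anything else is redirected to the
    archived copy instead of returning a "not found" error.
    """
    if path in live_pages:
        return 200, path
    return 301, ARCHIVE_PREFIX + path

print(resolve("/acts/2008/example-act"))
print(resolve("/acts/1999/withdrawn-page"))
```

The point of the design is that an old link never dead-ends. The worst case is an archived snapshot.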
XML is the key to solving these assorted issues.
Trying to "future-proof" their work, especially by providing a way to allow for changing web standards (HTML/web standard may change, but we can keep underlying XML).
Legislative documents are highly structured but also have variations over time. Very difficult to capture in a structure. "Parliament trumps your XML schema." You can't say, "Sorry, but that won't work in our schema, so you can't pass that legislation." Must find the balance between accommodating what's needed and "allowing everything."
They developed Crown XML:
The Crown XML Schema for Legislation provides a full and comprehensive encoding for all United Kingdom primary and secondary legislation. It has been written using the World Wide Web Consortium XML Schema language and is the Government's official and authoritative data standard for legislation. Once a piece of legislation has been enacted or made, it is stored using this Schema format. Schema compliant legislation is available in XML for onward supply to legal publishers and others.
They provide sample documents, which even so cover only about half the possibilities in the full schema.
Users have options for various views of the legislation.
Their work leads to the concept of the web as a platform. Not just providing for users to consume, but also to reuse, aggregate, and combine.
Mixing data...hey, cookie dough!
The government's response to Web 2.0 trends. Government should enable information so that citizens can use the information. Doing so will lead not only to better public services, but also to other services, both commercial and noncommercial.
Problems include culture, rights, licensing, intellectual property, and technology challenges. Information becomes infrastructure and potentially as important as roads and other physical infrastructure. Legislation is widely cited content, which becomes infrastructure for other things. Legislation needs to be addressable with fragment identifiers, so that people can cite specific sections or paragraphs rather than an entire act.
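As a small illustration of what addressable legislation could look like, the sketch below builds a citation link that points at a section or paragraph instead of the whole act. The URL and the "section-N-M" fragment naming are invented for the example. The actual identifier scheme was not described in the talk.

```python
# Build a citation URL with a fragment identifier.
# The base URL and fragment naming convention are hypothetical.
from urllib.parse import urlsplit, urlunsplit

def cite(act_url, section=None, paragraph=None):
    """Return act_url, optionally narrowed to a section or paragraph."""
    parts = urlsplit(act_url)
    fragment = ""
    if section is not None:
        fragment = f"section-{section}"
        if paragraph is not None:
            fragment += f"-{paragraph}"
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, parts.query, fragment)
    )

base = "https://legislation.example.org/acts/2008/1"
print(cite(base))                          # the entire act
print(cite(base, section=12))              # one section
print(cite(base, section=12, paragraph=3)) # one paragraph of that section
```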
Why couldn't lawyers add editorial value to legislation in a wiki-type format? Not a job for the state, but something that could be enabled or inhibited by how the legislation is published. Providing addressable content and using standards would allow for third parties to use the legislation as a starting point for additional work.
They provide Atom (RSS) feeds for new legislation.
Library of Congress is an example of a re-user of UK legislation. UK legislation of interest for comparison purposes. They have a "PDF thing going on." Really wanted access to PDF versions of the information. Subscribe to the Atom feed, and the PDF will pop up there as a link.
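For a re-user who only wants the PDFs, consuming the feed could look something like the sketch below. The sample feed is invented (titles and URLs are placeholders) and only assumes that each entry carries a link whose type is application/pdf. The real feed's structure may differ.

```python
# Pull PDF links out of an Atom feed using the standard library.
# SAMPLE_FEED is a made-up stand-in for a real legislation feed.
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>New legislation (invented sample)</title>
  <entry>
    <title>Example Act 2008</title>
    <link rel="alternate" type="text/html"
          href="https://legislation.example.org/acts/2008/1"/>
    <link rel="alternate" type="application/pdf"
          href="https://legislation.example.org/acts/2008/1.pdf"/>
  </entry>
  <entry>
    <title>Example Order 2008</title>
    <link rel="alternate" type="text/html"
          href="https://legislation.example.org/orders/2008/7"/>
  </entry>
</feed>"""

def pdf_links(feed_xml):
    """Return (entry title, PDF URL) pairs found in an Atom feed."""
    root = ET.fromstring(feed_xml)
    results = []
    for entry in root.iter(f"{ATOM}entry"):
        title = entry.findtext(f"{ATOM}title", default="")
        for link in entry.findall(f"{ATOM}link"):
            if link.get("type") == "application/pdf":
                results.append((title, link.get("href")))
    return results

for title, href in pdf_links(SAMPLE_FEED):
    print(title, href)
```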
Expect reuse for very granular areas...discussion of specific industries or topics. (If mad cow disease were to reoccur, expect footpaths to be closed, and a map could show in real-time what's open and what's closed.)
Providing sufficient flexibility into structure without descending into tag soup.
First question: Is the raw XML available to the public?
Yikes. The presenter hesitates and is quite uncomfortable. Seemed like a harmless enough question but apparently not. The answer is that it's available by subscription -- that is, lawyers pay to get access to it. They must balance between their economics and subscription income. They would like to publish XML; seems to be the direction that public policy is going. But "don't want to spend taxpayers' money to subsidize Lexis-Nexis."
Second question: Would these policies extend to others, like the Department for Transport?
Again, this sounds harmless to me, but appears to be quite controversial. Information produced as a core public task ("which is nowhere defined clearly") is public.
Really, when will government policy help the questioner push his employer into using structure? Interesting. I don't think we'd get that question in the U.S., other than in the negative.
Conferences here are so civilized, with the opening session at 10 a.m. Ahhhhh. Tea and cookies, er, biscuits at the breaks. Luvely.
Labels: conferences, web 2.0, xml, xpubs
7:39 AM Permalink

