XPubs: Information integration and the needs of the (product) maintainer
Chris Wood
BAE Systems
Tech pubs manager at BAE, contributor to S1000D standard.
Electronic maintenance (interactive electronic technical manual, or IETM) has been shown to deliver an increase in fault-finding success, a reduction in troubleshooting time, and a reduction in maintenance errors. "Fairly comforting"
Market drivers for integrated information...output-based contracts. The Royal Air Force is asking vendors to take on more maintenance activities. The drivers for success for the commercial organization are different from the drivers for the military.
BAE must guarantee that a specific number of aircraft ("platforms") are available to fly at all times. Financial penalties for not meeting those goals.
Offshore commodity outsourcing is putting pressure on the prices that BAE can quote. Price "per page" needs to be on a downward slope.
IETM capability offers an opportunity to integrate support information applications and processes.
ATTAC Contract = a certain number of Tornado aircraft must be available 24x7. BAE is responsible for preflight, postflight, AND other maintenance. Spare parts come out of BAE's budget. Therefore, reducing spare parts "footprint" saves money.
18 million pounds (double that for dollars) over 10 years. Their target is to save more than 18M pounds by including rich data (photos, video, 3D animation), aligning documentation with actual maintenance activities, and putting tech pubs people on base as part of an integrated engineering team.
Nice example of specific changes in tech docs leading to large cost savings due to fewer returns for repair.
Aha. They improved the official documentation by picking up information that was "plastered on the wall" in the aircraft hangar. In other words, user-generated content!
Information integration...the issue
Too much information, which is necessary and can be integrated, but...who generates it? where? who approves it? who can receive it? is there a recognized authority?
What about information generated by maintenance personnel for use by engineers (the stuff on the wall)? Is there an approval route? How authoritative is it?
In the past, the separation between maintenance and design authority was clear. As the maintenance and design operations move closer (or become the same, in BAE's case), the needed separation of content becomes much more challenging. Does linking from engineering authority content to non-engineering authority content corrupt the authoritative content?
What level of authority does information have? Has it been tested? Has it gone through an approval route?
Approved data architecture. The challenge is to define a data architecture that includes all information issued by the design authority for the purpose of operating and/or maintaining the platform in service, ensuring it is efficient, effective, and safe.
"This is a major content management issue." Indeed.
Many information deliverables go through rigorous approval process, but maintainers have access to other information, too. Official deliverables must be more integrated. Reference data and maintenance procedures come from different places in the organization, but they need to be in alignment. And there are "modifications," which must go in both places.
"This is not a trivial challenge." Yep.
The conflict here is really between data (approved content) and lore (unofficial information about how things really work). The mechanics have the "lore," and need to be persuaded to share it to improve the official documentation over time.
Labels: conferences, xml, xpubs
XPubs: DITA implementation in progress
Chris Hadley of Micro Focus
Noz Urbina of Mekon
Micro Focus
12 writers in four locations
rapidly growing team, but also 20-year company veterans
Content is in XML, written using XMetaL, stored in CVS, DTDs and XSLT developed in-house.
Acquired companies have content in FrameMaker and Word.
Delivery in CHM, HTML, Help 2.
Lots of reuse in places; none in others.
No localization.
The problem is very interesting because the second generation of XML implementations is beginning. The system was developed in-house, which was cutting edge at the time but is now showing its age. It is costly to maintain, and the experts have left. Company cannot manage, debug, or improve the processes that exist.
Cannot continue to rely on in-house solutions because of risks of breakdowns. Change was required.
Business case 1: Value to customer
Improve quality of content with reuse, consistency, and accuracy.
Improve search. ("Content is great...when I can find it.")
Publishing to different formats.
Business case 2: Value to business
More accurate documentation with information easier to find means lower support cost due to fewer calls.
Lower cost on development. If support can find answers in documentation, there's no need to ask development team. Development team also uses documentation.
Reducing time on non-writing tasks such as builds means more time available for writing content.
Business case 3: Value to doc team
Replace aging systems and processes before they affect ability to deliver
Sustainable and manageable systems that can work even if there is employee turnover.
Align with industry standards (key reason). Open source expertise is transferable from other organizations, helps with recruitment.
Project started at the end of 2007. They engaged Mekon to help with the analysis of the current situation. By February 2008, they chose a CMS (TriSoft). Engaged Mekon to convert content to DITA. Installed and configured TriSoft. Primary consideration was cost, and they were required to give a budget figure to their board before they knew what they were going to be doing.
Why a CMS? They needed software that was supported externally instead of being produced in-house. Secondary priorities were reuse ("a single version of the truth") and publishing issues (multiple formats and audience, building and testing, link management).
Why DITA? Decision to DITA was very difficult. But the same arguments that applied to externally supported software applied to the DITA issues. Migration will be painful for acquired companies no matter what.
Why Mekon? Industry experience, skills range across the company, can talk about ROI, are independent, they are based locally (except Noz!).
Why the project will succeed. It must. Team is enthusiastic and motivated -- no issues with change resistance and recognized that the old system was unsustainable, already knew XML and XMetaL. Company's charter includes the phrase "debate passionately, get on board." And they did. Buy-in from the executive board was critical. Had to get senior VP to understand, support, and present to the board for approval. Early wins -- they already have demonstrable results. This is the right solution at the right time for this company.
Complexities and pitfalls
Didn't know what the end product was going to be. How do you learn what you need to when you don't know what you don't know? No in-house DITA experience, limited CMS knowledge. Also need to demonstrate results quickly while learning. Couldn't design perfect solution before starting, need to make changes during implementation. Balance between up-front research and management of progress expectations.
Content migration to DITA did not make the original timeline. Big reason for that is the "day job" -- writing content. So the initiative has to take a back seat to getting content out the door. Geographically dispersed team is difficult -- two locations haven't even looked at XML. Don't have a fully developed training structure yet.
The pilot project hasn't quite happened yet.
Preparing for migration is a horrible pain. Don't underestimate the cost of migration.
Publishing is complex; it's taking time to get to a push-button process, and it's not ready yet.
Planning to map to DITA, migrate to CMS. Will run old and new systems in parallel until they're sure that the new stuff is really working.
Need to re-architect content, integrate acquisitions, and refine and improve the process.
Business case for documentation team is very different from business case to board. The doc team's need for CMS needed to be presented in a way that would get approval.
This may not be the right solution for everyone, but if it is, "get on with it."
CMS licenses -- some have high up-front cost but drop quickly per seat with bigger implementations. Others have low up-front cost but licensing per-seat doesn't drop much with greater volume. Makes cost evaluation different for different-sized organizations.
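To make the sizing point concrete, here is a minimal sketch with invented numbers. None of the figures come from the presentation or from any vendor; `cost_high_upfront` and `cost_low_upfront` are hypothetical models of the two license shapes described above.

```python
# Hypothetical license models; every figure below is invented for illustration.

def cost_high_upfront(seats: int) -> int:
    """High up-front fee; the per-seat price drops as the seat count grows."""
    upfront = 50_000
    if seats <= 10:
        per_seat = 3_000
    elif seats <= 50:
        per_seat = 1_500
    else:
        per_seat = 800
    return upfront + seats * per_seat


def cost_low_upfront(seats: int) -> int:
    """Low up-front fee; the per-seat price stays flat."""
    upfront = 5_000
    per_seat = 2_500
    return upfront + seats * per_seat


for seats in (5, 12, 100):
    a, b = cost_high_upfront(seats), cost_low_upfront(seats)
    cheaper = "high up-front" if a < b else "low up-front"
    print(f"{seats:>3} seats: A={a:>7,} B={b:>7,} -> {cheaper} model is cheaper")
```

With these made-up prices, a 12-writer team comes out ahead on the low up-front model, while a 100-seat organization comes out ahead on the other one. The crossover point depends entirely on the real quotes.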
Another very interesting presentation. A rather aggressive timeline and the unique circumstance of lots of high-end skills (like XSLT development) in the documentation organization.
XPubs: XSL-FO for Documentation Formatting
Mike Miller, Antenna House
For starters, XSL-FO is an XML standard.
XSL-FO is "a pagination markup language describing a rendering vocabulary capturing the semantics of formatting information for paginated presentation." (Ken Holman)
Or, as I like to say, "A document layout described in a text file."
XSL-FO is black box formatting. Can't go back and "tweak" the files to fix them. With FO, you're typically talking about a minimum of a couple hundred pages. Much faster to render automatically rather than by hand in InDesign or FrameMaker.
First commercial products in 2001 from Antenna House and RenderX. Also, open source FOP from Apache in 2001. FO successful in the sense that both commercial companies are doing quite well.
FO more successful than any other technical publishing application other than perhaps TeX and FrameMaker. Probably attributable to the availability of open source (free) and trial versions from commercial vendors (free).
XSL-FO is only concerned with visual display of XML data, which means that the FO file has no semantic content, only formatting instructions.
The FO stylesheet specifies:
- page areas and sets of pages to be used to compose a document for paper (master pages)
- Text flows, areas on pages into which the text and graphics are filled
- Blocks within flow areas (paragraphs)
- Inline areas (character-level formatting)
- Processing and formatting are consistent and automatic.
- Formatting rules are stored separately from the data.
- FO is non-proprietary and human-readable (well, sort of)
- FO less complicated than programming Java or Perl and the like
- Can use stylesheets with different XSLT processors (DITA Open Toolkit)
- Easier integration with other XML standards compliant applications (not trivial, but much easier than other non-standard approaches)
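The four things an FO file specifies (page masters, flows, blocks, inline areas) fit in a very small document. The sketch below is my own illustration, not something shown in the talk: it uses Python's standard library to build a minimal FO tree, and the master name, page size, and sample text are all assumptions. The output would still need an FO processor (FOP, Antenna House, RenderX) to become PDF.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag: str) -> str:
    """Return the namespace-qualified name for an FO element."""
    return f"{{{FO_NS}}}{tag}"


def build_fo_document(paragraph: str, emphasized: str) -> ET.Element:
    root = ET.Element(fo("root"))

    # 1. Page master: the page geometry (the "master page").
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "A4-page",  # hypothetical name
        "page-height": "297mm",
        "page-width": "210mm",
        "margin": "20mm",
    })
    ET.SubElement(page, fo("region-body"))

    # 2. Flow: content poured into the body region of pages using that master.
    sequence = ET.SubElement(root, fo("page-sequence"),
                             {"master-reference": "A4-page"})
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})

    # 3. Block: a paragraph-level area.
    block = ET.SubElement(flow, fo("block"), {"font-size": "11pt"})
    block.text = paragraph + " "

    # 4. Inline: character-level formatting inside the block.
    inline = ET.SubElement(block, fo("inline"), {"font-weight": "bold"})
    inline.text = emphasized
    return root


doc = build_fo_document("Formatting rules live apart from the data.",
                        "No hand tweaking.")
print(ET.tostring(doc, encoding="unicode"))
```

In a real workflow you would not write the FO by hand like this; an XSLT stylesheet generates it from the source XML, which is what keeps the formatting rules separate from the content.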
Most business documents can be formatted automatically as FO. Rule of thumb: "If it's XML, FO can be applied."
Other applications for FO might include faxes, German railway tickets, correspondence from financial institutions and government.
Typesetting is very complex with issues like widows and orphans and hyphenation. Software can handle this. Human typesetters have been removed from the process, and this shows in amateurish mistakes. But you can use FO to configure something that follows typography rules and give you a professional look and feel.
"Overwhelming benefits" of using FO. Which begs the question: "Why aren't more people using it?" A slide with the benefits of XML showing The Usual (cost, time-to-market, less redundancy, standards-based, localization for cost justification, etc.).
People who use FO: auto manufacturers, cell phone manufacturers, banks, aerospace, government, military, educational
FO not appropriate for documents that are "artistically created."
FO extensions provide support for:
- Document info in PDF
- Bookmarks for PDF
- Column footnotes
- Revision bars
- MathML
- Embedding PDF within PDF
- Column rules
- Punctuation spacing
- Table autospace
- Floats
- Advanced hyphenation
- Barcodes
- several hundred extensions altogether. Antenna House uses extensions to meet multilingual requirements, such as special spacing requirements in Japanese or justification in Arabic through kashidas.
DITA Open Toolkit reduces the complexity of getting set up and producing PDF. Could be configured and producing PDF in "a couple of hours." (Perhaps, but making it look the way you want is going to take a while.) According to Mike, somewhere between a few days and a few months, depending on the complexity of your requirements.
PDF output from DITA
- XSL-FO
- FrameMaker
- troff
- Preprocessing. Information is parsed and assembled.
- Transformation. Formatted and generated.
Why not FrameMaker or InDesign?
- Formatting is the tip of the iceberg. (WYSIWYG)
- WYDSIWYN -- What you don't see is what you need, which includes content management, automated formatting, multilingual formatting, global access, project tracking, electronic delivery, network integration
- You need to manually lay out pages.
- No fixed page style
- Need to modify page layout
- Unstructured document formats
- Document format is continuously changing
- Unstructured content
On the low end, FO is free with FOP. Antenna House is most expensive at $1250 for stand-alone or server license for $5,000.
FO supports more languages than any other solution currently available.
Solving the real problem:
- Improve the total process, not just individual tasks
- Improve organizational effectiveness
First question: Flowing text into typesetting engine results in line breaks that will cause readers difficulty. And this annoys him (as a professional typesetter). We want powerful, automated formatting AND the ability to do WYSIWYG tweaks. Thinks there is a role for a WYSIWYG stage after the automation bit.
I've noticed this on the BBC, too. British people ask really pointed questions.
And in response, Mike says that Antenna House has a solution for this where you create INX (InDesign XML) content (4 minutes) and then you can pull it into InDesign (half an hour), and do some cleanup.
Do all the XSL-FO tools cover 100% of the FO standard? "No, definitely not."
Labels: conferences, dita, xml, xpubs, xsl
X-Pubs keynote: Transforming Legislation Publishing
Brief introduction from Noz Urbina and an overview of the conference from Julian Murfitt. Some X-Pubs housekeeping items, including a flight announcement...
"Should a presentation be boring and sleep deprivation set in, oxygen masks will drop from the ceiling. Please put on your own mask before assisting others." Hehe.
On to the keynote...John Sheridan, Head of e-Services at the Office of Public Sector Information, National Archives. Eeek, slight problem with slides -- and the presenter just launches right in without them. I bet he's terrified right now, but he looks perfectly composed.
We have slides. "Transforming Legislation Publishing"
Publishing legislation may seem dry, but in fact it's quite relevant to the people at large -- and ignorance of the law is no excuse. Legislation documents use XML under the covers. Have been publishing legislation online since 1996 and of course print for a long time before that.
Strengths of their service:
- Immediacy: published online simultaneously with print versions. Important because some measures go into effect the day of their enactment.
- Accuracy: value of service hinges on knowing that the rendition of the online content is the same as the official vellum statement signed by the Queen. (Vellum? Really??)
- Trust: Do customers trust that what they see online is an official source? Based on eye-tracking software, they found when asked about trust, customers looked at the official crest and then responded positively.
- Reach: 1.5 million users, mostly in the UK

Key performance indicator: two clicks. 80 percent of information should be available in two clicks:
- Google search button.
- Click link on first page of results.
Issues:
- Legacy workflows
- Multiple document inputs. Coming from Parliament, government lawyers in 21 departments, Scotland, secondary legislation, Welsh measures, legislation from Northern Ireland, church materials, dual languages in Wales (English and Welsh).
- Tools include: FrameMaker for some groups; Word for others
- Legacy content: 55,000 documents that needed to be repurposed from SGML to XML to improve web publishing.
Persistent linking, "web continuity", overall 60 percent of links to official information are broken. Their solution to "persist" the 500,000 existing links was to provide redirection behavior, so that every URL resolves either to live content or to the government's archive on the web.
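The redirection behavior is simple to state in code. This is only a sketch of the idea as I understood it; the archive prefix, the URLs, and the `resolve` function are hypothetical, not the actual government implementation.

```python
# Hypothetical archive location; the real service's URLs were not given in the talk.
ARCHIVE_PREFIX = "https://webarchive.example/"


def resolve(url: str, live_urls: set[str]) -> tuple[int, str]:
    """Return (status, location) so that a known URL never dead-ends."""
    if url in live_urls:
        return 200, url
    # Content has moved or gone: redirect to the archived copy instead of a 404.
    return 301, ARCHIVE_PREFIX + url


live = {"https://dept.example/guidance/current"}
print(resolve("https://dept.example/guidance/current", live))
print(resolve("https://dept.example/guidance/2003-report", live))
```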
XML is the key to solving these assorted issues.
Trying to "future-proof" their work, especially by providing a way to allow for changing web standards (HTML/web standard may change, but we can keep underlying XML).
Legislative documents are highly structured but also have variations over time. Very difficult to capture in a structure. "Parliament trumps your XML schema." You can't say, "Sorry, but that won't work in our schema, so you can't pass that legislation." Must find the balance between accommodating what's needed and "allowing everything."
They developed Crown XML:
The Crown XML Schema for Legislation provides a full and comprehensive encoding for all United Kingdom primary and secondary legislation. It has been written using the World Wide Web Consortium XML Schema language and is the Government's official and authoritative data standard for legislation. Once a piece of legislation has been enacted or made, it is stored using this Schema format. Schema compliant legislation is available in XML for onward supply to legal publishers and others.
They provide sample documents, which even so cover only about half the possibilities in the full schema.
Users have options for various views of the legislation.
Their work leads to the concept of the web as a platform. Not just providing for users to consume, but also to reuse, aggregate, and combine.
Mixing data...hey, cookie dough!
The government's response to Web 2.0 trends. Government should enable information so that citizens can use the information. Doing so will lead not only to better public services, but also to other services, both commercial and noncommercial.
Problems include culture, rights, licensing, intellectual property, and technology challenges. Information becomes infrastructure and potentially as important as roads and other physical infrastructure. Legislation is widely cited content, which becomes infrastructure for other things. Legislation needs to be addressable with fragment identifiers, so that people can cite specific sections or paragraphs rather than an entire act.
Why couldn't lawyers add editorial value to legislation in a wiki-type format? Not a job for the state, but something that could be enabled or inhibited by how the legislation is published. Providing addressable content and using standards would allow for third parties to use the legislation as a starting point for additional work.
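A fragment identifier is just the part of a URL after the `#`. Here is a minimal sketch of citing a single section instead of a whole act; the host name, path, and section ID scheme are invented for illustration and are not the service's real addressing scheme.

```python
from typing import Optional
from urllib.parse import urldefrag, urlunsplit


def cite(act_path: str, section: Optional[str] = None) -> str:
    """Build a citation URL; the fragment narrows it to one section."""
    # "legislation.example" is a placeholder host, not a real service.
    return urlunsplit(("https", "legislation.example", act_path, "", section or ""))


whole_act = cite("/acts/2008/1")
one_section = cite("/acts/2008/1", "section-3")
print(whole_act)
print(one_section)

# Splitting a citation back into the document and the cited part.
document, fragment = urldefrag(one_section)
print(document, fragment)
```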
They provide Atom (RSS) feeds for new legislation.
Library of Congress is an example of a re-user of UK legislation. UK legislation of interest for comparison purposes. They have a "PDF thing going on." Really wanted access to PDF versions of the information. Subscribe to the Atom feed, and the PDF will pop up there as a link.
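For a re-user who mostly wants the PDFs, consuming the feed could look something like the sketch below. The sample feed is made up (the real feed's structure may differ), and `pdf_links` is a hypothetical helper; it only assumes standard Atom `link` elements with a `type` attribute.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Invented sample data standing in for a downloaded feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>New legislation (sample)</title>
  <entry>
    <title>Example Act 2008</title>
    <link rel="alternate" type="text/html" href="https://legislation.example/acts/2008/1"/>
    <link rel="alternate" type="application/pdf" href="https://legislation.example/acts/2008/1.pdf"/>
  </entry>
  <entry>
    <title>Example Order 2008</title>
    <link rel="alternate" type="text/html" href="https://legislation.example/orders/2008/7"/>
  </entry>
</feed>
"""


def pdf_links(feed_xml: str) -> list[tuple[str, str]]:
    """Return (entry title, PDF URL) for every entry that links to a PDF."""
    root = ET.fromstring(feed_xml)
    found = []
    for entry in root.findall(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        for link in entry.findall(ATOM + "link"):
            if link.get("type") == "application/pdf":
                found.append((title, link.get("href")))
    return found


print(pdf_links(SAMPLE_FEED))
```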
Expect reuse for very granular areas...discussion of specific industries or topics. (If mad cow disease were to reoccur, expect footpaths to be closed, and a map could show in real-time what's open and what's closed.)
Providing sufficient flexibility into structure without descending into tag soup.
First question: Is the raw XML available to the public?
Yikes. The presenter hesitates and is quite uncomfortable. Seemed like a harmless enough question but apparently not. The answer is that it's available by subscription -- that is, lawyers pay to get access to it. They must balance between their economics and subscription income. They would like to publish XML; seems to be the direction that public policy is going. But "don't want to spend taxpayers' money to subsidize Lexis-Nexis."
Second question: Would these policies extend to others, like the Department for Transport?
Again, this sounds harmless to me, but appears to be quite controversial. Information produced as a core public task ("which is nowhere defined clearly") is public.
Really, when will government policy help the questioner push his employer into using structure? Interesting. I don't think we'd get that question in the U.S., other than in the negative.
Conferences here are so civilized, with the opening session at 10 a.m. Ahhhhh. Tea and cookies, er, biscuits at the breaks. Luvely.
Labels: conferences, web 2.0, xml, xpubs
STC UK...almost live, part 2...Managing change
Ant Davey, Rail Safety and Standards Board (RSSB)
Another excellent session. Ant provided a discussion of change management with quite a lot of references to more detailed resources.
Knowledge is being lost.
Information has value.
Web is changing search methods and expectations.
Web is changing ability to contribute and review content.
Findable information requires chunking.
Chunks are potentially reusable.
Ultimately, you have single sourcing.
Not "how we have always done it."
Chunking requires modular and collaborative writing.
Not "how we have always done it."
Single sourcing @ RSSB
* Still finding our way
* Technology led (in part)
* Starting to introduce standard templates
* Paving the way by communicating
* Planning an XML pilot
Change management
* Linking people and processes toward a desired change
* change is not where you are now
* need to know where you are going and tell those who you want to come with you
* "you are almost certainly under-communicating by a factor of at least 10 and possibly 100"
Carrying people with you. People view change as an attack on their current competence. Need to begin by celebrating what they have been doing right. 5-25% of people can't or won't be able to work with the new processes (Emma Hamer)
The change equation:
C = (ABD) > X
C = change
A = dissatisfaction with status quo
B = desirability of the new end state
D = risk and disruption to get there
X = cost of changing (effort, discomfort, difficulty, risk)
Carrying people with you
* celebrate what is working
* explain what isn't and why
* describe how it will be with the new methods
* what's in it for them
* what's in it for the company
* what's in it for the clients
WIIFM = What's in it for me
* Active supporters
* Active dissenters
* Passive supporters
* Passive dissenters
Creating a change team
* You can't do all this by yourself.
* Special skills, talents, and leadership
* Where you can't carry, you may have to push or reallocate
First, Break All the Rules
Marcus Buckingham
Leadership is different from management. (That is SO true.)
Team members
* Champion or sponsor
* sustaining sponsor
* implementer
* change agent
* advocate
* group
* different styles, methods, and needs
* different personality types
* similar team gets quick results
* team with differences gets better result
Where change management goes wrong
* too much complacency
* lack of power in the guiding team
* not having real vision
* under-communicating (effectively)
* allowing obstacles to block the vision
* no short-term wins
* declaring victory too soon
* not embedding changes in practice
learning cycle
* concrete experience for activists
* reflective observation for reflectors
* theoretical concepts for theorists
* practical experimentation for pragmatists
(Experiential Learning, Kolb)
change
* needs leadership and vision
* needs good management
* needs metrics
* because ultimately it's about money
* increase revenue
* decrease costs
* make people's lives easier
* concentrate on the outcomes
* leave individuals to develop their own implementation plans
business process re-engineering
Customer led analysis method for business re-engineering
1. establish the scope
2. target the customer
3. model the process
4. analyze the structure
5. create the opportunity
6. redesign the process
7. refine the customer experience
8. ??
Why?
* customers dissatisfied
* position in value chain changes
* move from product to service or vice versa
* merger with another organization
* has to be customer-led
* good business case
* beware of targets (wrong targets lead to undesired behavior)
Influencing others
* getting results without authority
* you can't change other people
* you can change what you do, which may change how others react to you
* need to be politically savvy
Effective influence
* openness
* honesty
* integrity
* loyalty
* rapport
* adult to adult communication and relationships
* maximal listening
* dovetailing needs outcomes
Strategies
* Logic
* Personal appeal
* Networking
* Bargaining
* Assertiveness
* Hierarchical appeal
Great presentation, and happy to see that my anecdotal experience has some amount of overlap with Ant's much more research-backed approach.
Labels: change management, stc2008, xml
STC 2008: Wrap-up
Many thanks to those of you who stopped by the booth to meet us. We especially appreciate visitors who tell us that they read and enjoy our content, whether books, white papers, or this blog.
I had numerous requests for my paradigm shift presentation slides, so I am making them available here:
My next round of conferences will be in the UK. I'm leading an XSL workshop for STC UK on June 22 and giving a presentation on June 21 as part of the Trends in Technical Communication event. Then, it's onward to X-Pubs, where I'll be discussing the implications of Web 2.0 on technical communication.
As far as I know, after that I'm done with the conference circuit until the fall. However, senior technical consultant Simon Bate will be attending the Gilbane conference in San Francisco and participating on a DITA panel. Please contact us if you'd like to set up a meeting at the conference.
Labels: change management, conferences, stc2008, xml
A Quarky new approach?
Recently, Quark has announced their new dynamic publishing concept and/or solution.
Where to start?
Although traditional publishing allows each author to hand-craft the appearance of each page, the limitation is that it ties information to the way it is presented. This means that if you want to publish the same information in print, Web, and electronic formats, then you have to create an entirely separate version of your information for each media type.
Fascinating, but it sounds oddly familiar. Where could I have heard this before? Wait! This sounds like an argument for...single sourcing!
[S]ingle sourcing means writing information once and using it many times. It does not mean writing it and then copying and pasting it into another source, or modifying the information for different needs such that you have multiple sources.
That would be from The Impact of Single Sourcing and Technology by Ann Rockley, published in Technical Communication in 2001.
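The distinction is easy to show in a few lines. This is a toy sketch of my own, not anything from Rockley's article: one presentation-free topic, two renderers. The `TOPIC` data and both function names are invented for illustration.

```python
from html import escape

# The single source: content stored without any presentation details.
TOPIC = {
    "title": "Replacing the filter",
    "steps": ["Power off the unit.", "Open the cover.", "Swap the filter."],
}


def to_html(topic: dict) -> str:
    """Render the topic for the Web."""
    items = "".join(f"<li>{escape(step)}</li>" for step in topic["steps"])
    return f"<h1>{escape(topic['title'])}</h1><ol>{items}</ol>"


def to_text(topic: dict) -> str:
    """Render the same topic as plain text."""
    lines = [topic["title"], "=" * len(topic["title"])]
    lines += [f"{n}. {step}" for n, step in enumerate(topic["steps"], start=1)]
    return "\n".join(lines)


print(to_html(TOPIC))
print(to_text(TOPIC))
```

Fix a typo in `TOPIC` and both outputs pick it up. Copy the text into two separate files instead and you have exactly the "multiple sources" the quote warns about.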
The term "single sourcing" also appears in Designing Windows 95 Help: A Guide to Creating Online Documents, which was published in 1996 (!). You can see excerpts via Google Books. I'm sure there's more, but 1996 is plenty early.
Anyway, back to Quark:
Dynamic publishing is a different way to create and share information. Dynamic publishing lets you create information as reusable components of information that you can easily combine for different uses - different types of documents and different audiences.
Dynamic publishing also automates the page formatting process, so you can automatically produce print, Web, and electronic content from a single source of information.
Sorry, guys, but what you're describing is "single sourcing" and it's been around for a while. And I don't think redefining "dynamic publishing" is going to work, either, because that term already means something. Dynamic publishing can refer to the following:
- Publishing on the fly: The information presented is based on the end user's requests and/or profile. Information is assembled when the user requests it (and not ahead of time).
- Customized publishing (or variable data publishing): The process of publishing content where the information varies but the overall organization stays the same. Financial statements are a good example of this type of publishing -- each customer needs their specific transactions on the page.
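The financial statement case can be sketched in a few lines: the layout is fixed, the data varies per customer, and the page is assembled only when it is requested. The template, names, and amounts below are all made up for illustration.

```python
from string import Template

# Fixed organization of the page; only the data varies per customer.
STATEMENT = Template("Statement for $name\n$rows\nClosing balance: $balance")


def render_statement(name: str, transactions: list[tuple[str, int]], opening: int) -> str:
    """Assemble one customer's statement at request time."""
    rows = "\n".join(f"{label}: {amount:+d}" for label, amount in transactions)
    balance = opening + sum(amount for _, amount in transactions)
    return STATEMENT.substitute(name=name, rows=rows, balance=balance)


statement = render_statement("A. Customer", [("Deposit", 200), ("Fee", -15)], opening=100)
print(statement)
```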
Arbortext. Hmmmm. There's something about Arbortext....
And here is where the situation gets truly weird. Take a look at the Quark executive biographies page. Of the ten people listed, five are ex-Arbortext, including the CEO, CIO, marketing VP, and two of three sales VPs.
So, Quark is the recipient of some sort of a multiple-organ management transplant from Arbortext. Given the rumors that the Arbortext-PTC merger hasn't been exactly a lovefest, the departure of senior management and others isn't surprising. It's their reappearance at a single company that's striking. And furthermore, it appears that they are trying to create Arbortext, MarComm Edition.
Will this work? The landscape is pretty bleak.
Here is an excerpt from Eric Kuhnen's analysis (published on TheContentWrangler.com, and you should read the entire thing):
Quark, in proposing to integrate a CMS into its Dynamic Publishing Solution, has just added a well known set of problems to their offering. There are literally dozens of CMS-enabled solutions on the market already; Quark’s entry is nothing new (well, it is to Quark but not to its customers). It’s not that adding the CMS itself is the wrong idea, but that incorporating a traditional CMS will yield fewer benefits to the customers in the markets it serves, and will not do much to displace the leading ECM vendors in the markets it would like to serve. So, Quark will follow the road it has always taken.
(Emphasis mine)
A variation on this theme is found in an interview with Raymond Schiavone conducted by Pariah S. Burke, editor of QuarkVsInDesign.com (again, read it all, especially the analysis of the interview on the third and fourth pages). This excerpt is from Burke's analysis:
I think QuarkXPress will continue to have utility on its own, but its primary role will be to function as a desktop client for an as-yet unrevealed enterprise-grade suite of systems.
XPress 8 will be the first stage, I predict. [... Schiavone's] realistic goal for the XPress 8 generation of products will be to make the market take notice of Quark again, to open a dialog with large workflow managers who will help refine Schiavone’s vision for XPress 9.
By the time XPress 9 and its matching systems do release (probably less than 12 months following the release of version 8), QuarkXPress will be little more than a client application. All the real power will reside on the server-side systems. More importantly, by abandoning the so-called “feature war” with InDesign, Quark will create a lopsided conundrum for potential users -- you can have near total automation of your publishing and production, with output to print, PDF, PDF/X, HTML, XML, and everything else you can think of, but without certain creativity, composition, and proofing features the competition will have had for generations.
The existence of InDesign Server notwithstanding, I think the overall analysis makes sense. Basically, transitioning Quark into a server-based publishing system requires moving away from freelancers and small business customers. They can't afford and don't need server-based publishing. Instead, Quark needs to make inroads into large companies with large marketing departments. And there, they run up against the twin buzzsaws of InDesign and existing competition in the content management space. This might work if Quark's offering was deeply compelling, unique, and game-changing. In its current version, it appears to be none of the above.
The most difficult part of any change in technology is end user adoption. I've discussed change management on this blog and elsewhere. Bringing XML and automation into a marketing or publishing workflow is going to present some unique challenges.
In publishing (not technical publications), the deliverable is in fact the product. As a book publisher, you care greatly about the appearance of your final product, the book. In technical publishing, the appearance of the documentation is often negotiable, and making the inevitable compromises on formatting to get better automation is an acceptable tradeoff. This may not be true for most magazine and book publishers. (It's worth noting that the most technical of trade book publishers, O'Reilly Media, was also the first, as far as I know, to move to XML-based publishing.) Quark grudgingly acknowledges the challenge in the description of their solution:
Dynamic publishing started in the realm of technical documentation, where large manufacturers and some types of publishers have implemented dynamic publishing to produce user guides, service manuals, parts catalogs, legal documentation, and similar types of information.
Some publishers have built their own dynamic publishing systems for publications that have more elaborate layout requirements than technical documentation, but these systems have been cobbled together from multiple technologies. In many cases, they have achieved some of their business goals but at the expense of far higher process costs.
"Cobbled together"?
"Pot? This is Kettle. How you doin'?"
Here is a description of what's in Quark's DPS (from the Quark DPS FAQ):
Quark Dynamic Publishing Solution (DPS) is publishing software. It consists of multiple software components, some from Quark and some from third parties, including:
- Optional desktop products for creating content: QuarkXPress, QuarkCopyDesk®, Xpress™ Author for Microsoft® Word, Adobe® InCopy® and InDesign®
- Standard server-based publishing software: QuarkXPress Server and Quark Transformation Engine, for publishing to print and electronic media
- Standard server-based product for automating workflow: Quark Publishing System
- Optional browser-based product for content creation, final document edits and reviewing
- Integration with server-based products for content management partners such as Alfresco®
(Image from Quark's web site: http://dynamicpublishing.quark.com/dps/how_it_works.html)
(emphasis mine)
Here is a really accurate bit of information. In response to the question, "How will dynamic publishing affect me and my employees?", we have this:
The primary impact is on the authoring process. Dynamic publishing shifts the authoring focus from hand-crafting pages to creating information that is independent of any specific media type, which means that authors stop worrying about how the information looks and instead focus on writing it. Authors also shift from creating monolithic documents to writing small, reusable components of information.
There is a world of pain hidden in those three sentences. In my experience, the more creative technical writers have a more difficult time with XML than the more engineering-oriented writers. Let's graph from most technical to least technical:
engineers >> technical writers >> marketing writers
Uh-oh. Getting marketing people to follow structured authoring concepts is going to be really difficult.
A couple of final notes:
- The Quark-written content attempts to position this solution as the logical response to non-single source workflows. This is silly. I'd like to see a discussion of what makes Quark's approach to single sourcing better, faster, and/or cheaper than others.
- There's a discussion of return on investment, which includes this gem: "the return on investment can take from six to eighteen months." Indeed. It can also take forever. Not every organization will be able to show ROI for this solution, and claiming otherwise is ridiculous.
DocTrain: Dynamic Publishing
Once Content is in XML. Now what?
Learn How Dynamic Publishing Can Help You Improve the Re-use and Value of XML Content
Joshua Duhl
Quark
He begins with a lengthy explanation of why single-sourcing is a Good Thing, which I rather think might be unnecessary for this audience.
According to Mr. Duhl, most organizations are using print-based workflows or print-based workflows with an add-on for the web. Again, wrong audience.
The web, mobile devices, and electronic communications have altered the fundamental principles of publishing: Content everywhere.
Pitfalls of traditional publishing
- Processes are costly
- Updates are slow
- Information is often out-of-date
- Content is prone to errors
- Customers are unhappy
- Deadlines are missed
Graphing complexity against volume
- high complex/low volume: tech doc
- high volume/low complex: statements, invoices
- in the middle: correspondence
- need to work with content from multiple sources
- publishing to multiple sources or for multiple sources
- enable content for re-use beyond the tech doc
- to use a single system that holds all information
- to have an automated workflow that ensures approved content is automatically published for each edition and different devices
Core principles
- content centric
- single source
- reuse strategy
Content is created regardless of format, layout, or media (content first)
single source
plan for reuse, support for variations and alternatives
leveraging XML
format versus structure
What is Quark Dynamic Publishing Solution
* QuarkXPress
* plus workflow
* dynamic publishing
OK, so I finally understand my issues with this...in a world where people are componentizing and picking and choosing their solutions, why would they go to a monolithic approach?
Create
* QXP
* InDesign
* Word
* XML
* Web
Manage
* Workflow system/check-in/out etc.
Publish
* QXP server
* Quark transformation engine
* XML transformation rules
Delivery
* Rendered formats
Sorry the notes are so messy; this presentation went very fast due to some scheduling issues that were not the presenter's fault.
But overall, Quark is proposing a "dynamic publishing solution" that enables single-sourcing workflows based on XML.
Labels: doctrainwest08, xml
DocTrain: XML in the Wilderness
Joe Gollner
Vice President
Stilo International
Likes to present "gory details on big projects gone wrong." I like him already.
The wilderness archetype is present in many different cultures. Going into the wilderness forces a person to change.
Next slide...the Patron Saint of Content Management! St. Jerome is officially the patron saint of libraries, librarians, archivists, and encyclopaedists.
And now, we're going to talk about what St. Jerome and XML have in common.
Oh, my goodness, his license plate reads: XML
Even better, his wife got it for him. I don't know either of them, but I predict a long and happy marriage.
And we're off to a cruise through the history of content processing. Some very cool information, but impossible to translate into a blog without his slides. (Check the DocTrain web site for slide decks; his are not posted at the moment.)
Now a discussion of SGML, what it achieved, and why it was hard for developers.
Here's an interesting bit about XML:
"The driving focus for XML has been facilitating a revolution in the way technology applications are designed, developed, and deployed."
And critically, we're now talking about technology and XML, not content and XML.
And this has enabled the so-called Web 2.0. Joe is focusing on the fact that you can build very quickly and stay in "perpetual beta" in the "participatory web." People don't often talk about how XML-based technologies are what is making Web 2.0 possible.
What does XML mean for authors? Two contradictory challenges:
- Too much markup, which gets in the way of creating content, forces a reliance on unfamiliar tools, and adds a level of technical complexity to what is a creative task.
- Not enough markup...some content demands precision. Authors need clear guidance and useful feedback in order to satisfy this demand. As more content is delivered to applications, this is more common.
- Restrictions on syntax (XML took away some of the options that were in SGML to make it easier for computers to process.)
- Models mirror communication patterns less naturally than before
- New language (XML Schema) for declaring rules
- Schema modeling tools not helpful for content modeling
- XML is verbose
- Complexities reintroduced and application challenges remain
- Happy!
- Single sourcing
- Multiformat automatic publishing
He somewhat likes DITA, especially because it's an "assemblage of SGML Dirty Tricks." DITA gives us the ability to handle variability and change. DITA's approach is simple markup by default, but specialization allows for more specific markup.
XML has been in the (data) wilderness, but now it is finally returning home to where it should be (content). And DITA represents a serious effort in that direction.
St. Jerome went into the Syrian desert, learned Hebrew, and was able to create a new Latin translation of the Bible (the Vulgate). Likewise, XML has learned some things from life in the data world.
If you're looking for more coverage, Anne Gentle is sitting next to me with her laptop.
I also found Richard Hamilton, Antoine Giraud, and Scott Nesbitt. And someone writing Boarding the DocTrain.
Kudos to the DocTrain team for picking a lovely city and hotel. And for providing wireless coverage in the ballrooms!
Labels: doctrainwest08, xml
WritersUA: DITA pilot techniques
Mark Wallis of IBM ISS on how to run a successful DITA pilot. Some great information in this presentation on how to reduce risks.
He recommends selecting your pilot project based on the following items:
- Right timeframe -- don't choose the project that has an imminent release
- Choose a manageable documentation set size
- Reduce risk by avoiding the strongest (or most critical) product
- Identify a product with a known need to improve the user experience
The ideal team for a pilot will need cross-functional and complementary skills:
- Project management skills
- Tools and technology strengths
- Product knowledge and understanding
- Architecture and design skills
- Editor for standards and styles
- No autopilot writing
- Don't just migrate existing content; you'll get trapped in old paradigms (this assumes that existing content does not fit the DITA topic paradigm)
- Perform use case analysis and task analysis
- Determine the critical scenarios to document
- Focus on tasks; backfill supporting information as needed
They set up a DITA War Room in a small conference room and met at least daily (1.5 to 2 hours per day. Yikes). They set weekly goals and used small tasks to build momentum.
There was also heavy use of an internal wiki to put up initial "straw man" design, then revise, comment, and discuss.
Layering deliverables
Implementation deliverables were split out into smaller tasks, such as:
- Creating topic files, links, and navigation
- Testing links from code and navigation
- Creating task and reference topics
- Validating help against the user interface
- Creating concept topics for principles, guidelines, and best practices ("deep concept")
- Validating content in the expert community
Choosing the DITA toolset
Task Modeler (free) for building and managing ditamaps, defining relationships between topics, and creating skeleton topics (stub files).
DITA-compliant editor to edit your topics.
Compiler (part of open source toolkit). Compiler? What are they compiling? HTML Help? Oh. He just referred to Ant as a compiler. Ohhhhhkay.
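For readers who have not seen what a ditamap and a set of stub topics look like, here is a rough sketch in Python. It is not Task Modeler's output and not anything shown in the session; the topic IDs, titles, and the "TODO" placeholder are made up, and the DOCTYPE declarations that a DITA-aware editor or the toolkit would normally expect are left out to keep it short.

```python
import xml.etree.ElementTree as ET


def make_task_stub(topic_id, title):
    """Build a skeleton DITA task topic with a single placeholder step."""
    task = ET.Element("task", id=topic_id)
    ET.SubElement(task, "title").text = title
    body = ET.SubElement(task, "taskbody")
    steps = ET.SubElement(body, "steps")
    step = ET.SubElement(steps, "step")
    ET.SubElement(step, "cmd").text = "TODO"  # placeholder for the writer
    return task


def make_map(map_title, topics):
    """Build a ditamap that points at each stub topic by file name."""
    ditamap = ET.Element("map")
    ET.SubElement(ditamap, "title").text = map_title
    for topic_id, _title in topics:
        ET.SubElement(ditamap, "topicref", href=topic_id + ".dita", type="task")
    return ditamap


# Hypothetical pilot content: two task topics.
topics = [
    ("install_sensor", "Installing the sensor"),
    ("configure_alerts", "Configuring alerts"),
]
stubs = {
    topic_id: ET.tostring(make_task_stub(topic_id, title), encoding="unicode")
    for topic_id, title in topics
}
map_xml = ET.tostring(make_map("Pilot help set", topics), encoding="unicode")

print(map_xml)
for xml_text in stubs.values():
    print(xml_text)
```

The point of stubbing first is that the map and the topic list can be reviewed before anyone writes a word of content.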
Proof of concept
They picked a subset of the pilot to do the proof of concept.
The presenter's boss is quoted as saying, "There's no such thing as bad weather, only insufficient clothing." I'm guessing that she's never been to Minnesota in winter.
The objectives for the proof of concept:
- Learn and evaluate tools
- Address technical obstacles
- Specify end-to-end requirements
Managing costs
Purchase toolsets only for pilot team.
After completing proof of concept (successfully!), invest in tools for the remaining writers.
Wiki
They used their wiki to capture conventions and guidelines.
Improving acceptance
They paid attention to the change management issues. He doesn't mention it here, but I would assume that the combination of an acquisition by IBM plus the requirement to change the authoring environment could have caused significant angst. Their approach included presentations, wiki content, email discussions, and online training.
At the point of transition, DITA boot camp was offered.
They used collaborative walkthroughs, or reviews, to help standardize their content development. Interesting. This sounds as though it could be a) threatening and b) an unbelievable time sink. But just maybe it might also c) help improve the content.
Other lessons learned
Think more, write less. (Don't document the obvious, don't document common user interface convention, write only if you're really adding value.)
Don't squander your ignorance. (If something makes you stumble in the interface, that will probably also cause problems for your users, so capture it.)
The more structured your content, the easier the transition to DITA.
Documenting the obvious teaches readers to ignore your text, so don't document the obvious.
The handouts are available here: http://www.writersua.com/ohc/suppmatl/
Labels: change management, dita, writersua2008, xml
WritersUA: Day 3, Morning
Dave Gash (hypertrain.com) leads off the festivities with a discussion of the UA Holy Grail. And no, it's not DITA.
He is discussing True Separation of Content, Structure, Format, and Behavior.
Interesting, because we normally hear about separation of content and presentation -- he's making finer distinctions.
According to Dave, the current authoring method is to use WYSIWYG and code editors, often in combination. And as we work, we insert what's needed wherever it's needed. The result is that documents work -- once -- but are very difficult or impossible to update, maintain, and control.
Spaghetti-code documents make our own jobs harder.
The conventional wisdom is to separate content and formatting. Content is "stuff on the page"; therefore format must be "everything that is not content."
Content could include HTML, CSS, and JavaScript. Separating out CSS still leaves "junk" in the content pages.
Dave proposes a more refined model: content, structure, formatting, and behavior.
* Content is XML
* Structure is XSLT
* Format is CSS
* Behavior is JavaScript (JS)
This will be more maintainable, which means:
* Ability to change any components without breaking the others
* Ability to reuse any component in other pages or projects
* Ability to control each component's resource allocation (that is, who creates each piece?)
How to improve your pages:
1. Identify and externalize JS behavior.
* Find the embedded scripts (<script> tags) and replace them with a reference to an external foo.js file.
<script language="javascript" src="foo.js"></script>
2. Identify JS behavior that could be CSS and convert it to CSS rules.
"If you can encode with CSS and make it declarative instead of procedural, you're way ahead of the game."
* Catch "sneaky" JavaScript behavior, such as mouseover events, that could be CSS rather than JavaScript. Event handlers that call JavaScript almost always start with "on" -- easy to identify and many can be replaced with CSS hover pseudoclasses.
.expterm:hover {font-style:italic; }
.expterm {text-decoration:none;}
Removing the code from the HTML greatly simplifies the page.
3. Identify and externalize CSS styles, recode any local formatting as classes.
Get rid of "deprecated tags and doo-doo like that."
Get rid of style attributes, font tags, b tags (become span tags).
"It's said that comments are for someone who comes behind you six months later and needs to update your code. This is not true. Comments are so that YOU can figure out six months later what you were doing in the code."
So you should comment your code.
4. Semantically mark up content as XML.
Dave's definition of semantic markup? "call things what they are."
5. Identify desired HTML output structure, write XSL transforms to produce it.
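Steps 1 through 3 all start with "identify," and that part can be automated. The following is my own rough sketch, not Dave's code: a small Python audit that uses the standard library's HTML parser to flag inline scripts, "on..." event-handler attributes, style attributes, and presentational tags. The tag list and the sample page are assumptions for illustration only.

```python
from html.parser import HTMLParser

# Assumption: tags to recode as CSS classes; extend to match your own cleanup list.
PRESENTATIONAL_TAGS = {"font", "b"}


class SeparationAudit(HTMLParser):
    """Collect (line, message) pairs for behavior or formatting embedded in a page."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        line, _column = self.getpos()
        names = {name for name, _value in attrs}
        if tag == "script" and "src" not in names:
            self.findings.append((line, "inline script: move to an external .js file"))
        for name in sorted(names):
            # Heuristic from the talk: event handlers almost always start with "on".
            if name.startswith("on"):
                self.findings.append((line, f"event handler '{name}': externalize or use CSS"))
        if "style" in names:
            self.findings.append((line, f"style attribute on <{tag}>: recode as a class"))
        if tag in PRESENTATIONAL_TAGS:
            self.findings.append((line, f"presentational tag <{tag}>: recode as a class"))


def audit(html_text):
    parser = SeparationAudit()
    parser.feed(html_text)
    parser.close()
    return parser.findings


# Made-up sample page with one problem of each kind.
sample = """<html><body>
<script>function hi() { alert('hi'); }</script>
<script src="foo.js"></script>
<p style="color:red" onmouseover="hi()">Hover <font size="2">here</font></p>
</body></html>"""

findings = audit(sample)
for line, message in findings:
    print(f"line {line}: {message}")
```

The external foo.js reference on line 3 of the sample is left alone, which is the state the other script should end up in.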
So...what's in it for me?
Discrete, maintainable, controllable components
* You can change one component without breaking others
* You can share components with other pages
* You can separate work load by skill sets
* Set it and forget it! (for everything except the content)
Code examples are available at Dave's web site: www.hypertrain.com
Questions about tools. No, he won't recommend tools. Question about schemas...Dave says the first thing that comes to mind is...DocBook???
Yikes. In an answer to a question about print and XSL-FO, somebody recommended asking....me! (I swear I didn't pay her for that, and I don't think she even knew I was in the room. Quite surreal.)
##
My only disagreement with this session is with the separation of XML as "content" and XSLT as "structure." It's my opinion that the XML includes the structure, and XSLT just gives me a way to express that structure into HTML or other formats.
I also question some of his tag names, such as <expander> for a term/definition group. The expander tag name is really a description of the desired behavior (expandable text) rather than the semantic function of the content (definition of a term). I would probably choose something like <glossaryitem> for the container, leaving open the option of changing the behavior to something other than expansion in the future. Same quibble with <ddblock> (drop-down block).
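To make the naming quibble concrete, here is a minimal sketch of my own, with Python's ElementTree standing in for the XSLT layer. The <glossaryitem> container is the name I suggested above; the <glossary>, <term>, and <definition> elements and the choice of HTML output elements are assumptions for illustration, not anything from Dave's examples.

```python
import xml.etree.ElementTree as ET

# Hypothetical semantic source: it says what the content is, not how it behaves.
SOURCE = """<glossary>
  <glossaryitem>
    <term>XSLT</term>
    <definition>A language for transforming XML documents.</definition>
  </glossaryitem>
</glossary>"""


def render(source_xml, expandable):
    """Turn glossary markup into HTML; the behavior is chosen here, not in the content."""
    html = ET.Element("div", {"class": "glossary"})
    for item in ET.fromstring(source_xml).iter("glossaryitem"):
        term = item.findtext("term")
        definition = item.findtext("definition")
        if expandable:
            # Expandable presentation: term as the summary, definition hidden until opened.
            block = ET.SubElement(html, "details")
            ET.SubElement(block, "summary").text = term
            ET.SubElement(block, "p").text = definition
        else:
            # Plain presentation: an ordinary definition list.
            block = ET.SubElement(html, "dl")
            ET.SubElement(block, "dt").text = term
            ET.SubElement(block, "dd").text = definition
    return ET.tostring(html, encoding="unicode")


print(render(SOURCE, expandable=True))
print(render(SOURCE, expandable=False))
```

The same source produces both outputs. Had the container been named <expander>, the second rendering would make the tag name a lie.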
I do like the use of the
Great presentation from an energetic presenter whose motto is, "If I have to be awake, you do, too!"
Side note: I'm pretty sure that if you tied Dave's hands behind his back, he would lose his ability to speak.
Labels: presentations, writersua2008, xml, xsl
XFL: He Hate Me Not
(For those of you with a life, the title is a reference to this.)
According to Colin Moock, the next version of Flash will have an XML-based format. He writes:
Flash CS4 will be able to export *and* import a new source format called XFL. An XFL file is a .zip file that contains the source material for a Flash document. Within the .zip file resides an XML file describing the structure of the document and a folder with the document's assets (graphics, sounds, etc). The exact details of the XFL format are not yet available, but Richard [Galvan, Flash authoring-product manager] assures me that Adobe intends to document them publicly, allowing third-party tools to import and export XFL.
This is important. Currently, it's fairly impossible to integrate Flash and non-Flash content. Other than, of course, with our 80s friend, Mr. Cut-and-Paste.
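Since the XFL details are not public yet, the following is only a guess at what "a .zip file with an XML file and an assets folder" would mean for a third-party tool. It is a rough Python sketch that opens a zip package, parses whatever XML members it finds, and lists everything else as assets. Every file and element name in it is made up, and the demo package is built in memory rather than taken from Flash.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def inspect_package(zip_bytes):
    """Return ({xml member: root tag}, [other member names]) for a zip package."""
    xml_roots = {}
    assets = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as package:
        for name in package.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            if name.lower().endswith(".xml"):
                xml_roots[name] = ET.fromstring(package.read(name)).tag
            else:
                assets.append(name)
    return xml_roots, assets


# Build a stand-in package in memory; all names below are hypothetical.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("document.xml", "<document><scene name='intro'/></document>")
    demo.writestr("assets/logo.png", b"not a real image")

xml_roots, assets = inspect_package(buffer.getvalue())
print(xml_roots)
print(assets)
```

Once the structure is ordinary XML inside an ordinary zip, the usual XML tooling applies to it.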
If Flash speaks XML, we can develop a process along these lines:
And that has major implications for development of e-learning content and other things that you might expect to find in Flash. At some point, when it's not five minutes before the Duke-Carolina game, I'll try to be more specific.
(h/t John Nack)
PS "Carolina Goodnight"? I don't think so. See note 4.
"Once you start down the DITA path, forever will it dominate your destiny"
Eliot Kimber has a lovely article on using DITA for narrative documents. I'm trundling through it, nodding in agreement, and then we have this horror:
[...] DITA offers at least two compelling advantages over any other candidate XML application:
- The initial cost of ownership is low, approaching zero, and the ongoing cost of ownership is low.
- It offers a number of sophisticated features in terms of modularity, extensibility, and linking that either are not provided by other applications or would cost a prohibitively large amount to build from scratch.
That is, the cost of applying DITA is almost always going to be significantly lower than the cost of any alternative (and at worst will be no more expensive than any other alternative).
Now, he does qualify this statement by saying that these assertions apply only if DITA is a reasonable fit for your problem. But the overall thrust of the argument appears to be that since DITA can do narrative documents (which it was emphatically not designed for), it can potentially be applied to an enormous new set of content.
Ugh.
Before I begin today's DITA-bashing session, I need to point out that we are currently using DITA for several projects here at Scriptorium. DITA slices! DITA dices! DITA advocacy raises your IQ, improves your health, and makes you irresistible. I like DITA just fine.
Moving right along...
"1. The initial cost of ownership is low, approaching zero, and the ongoing cost of ownership is low."
Just because it's free doesn't mean it's cheap. The default output from the DITA Open Toolkit ranges somewhere between unattractive (HTML) and fugly (PDF). If you care about the appearance of your final documents, you are going to have to do a lot of work to get the look and feel you want. And although the OT offers a starting point, customizing it is kind of like a trip to the dentist. The impacted-wisdom-tooth-removing kind of trip.
Getting your output working properly is Not Easy because of the, er, unique design of the OT. If the set of tags you need is small, you might be better off building a nice petite NovelML and then writing the transformations you need for NovelML instead of wrestling with DITA's complexities.
"2. It offers a number of sophisticated features in terms of modularity, extensibility, and linking that either are not provided by other applications or would cost a prohibitively large amount to build from scratch."
I agree that DITA has some lovely features in this area. However, I fail to see how they apply to the example at hand -- a narrative document such as Moby Dick. If you need modularity, extensibility, and linking features, you should consider DITA. If you don't, then these features will just get in the way.
"That is, the cost of applying DITA is almost always going to be significantly lower than the cost of any alternative (and at worst will be no more expensive than any other alternative)."
If DITA is overkill for your requirements, then applying DITA may not be cheaper.
But the issue that upsets me the most is this: when you attack a problem by assuming (or hoping) that DITA will work, you necessarily look for DITA features you can use. You may not think carefully about non-DITA features that you might like to have. For fiction content, I can think of several things that would be quite useful (and for which DITA offers no immediate support):
- For a book that is part of a series (like a science fiction trilogy), a listing of the entire series and an indication of where the current book falls in the series.
- Metadata to identify the point of view. Many novels switch from one narrator to another, or from a first-person point of view to an omniscient point of view. It would be lovely to filter the content to see only the first-person content (after reading the book from cover to cover as the author intended).
- Similarly, metadata that helps with scene location and time could be invaluable for studying literature written with numerous flashbacks. The Time Traveler's Wife and anything by Jasper Fforde come to mind.
- The ability to index by character occurrence. This is more often seen in nonfiction books, especially biographies. But imagine scanning the entire Harry Potter series for scenes with Severus Snape to determine whether his ultimate allegiance was consistent.
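As a rough illustration of the wish list above, here is a sketch in Python against a made-up "NovelML." Nothing here is DITA or any existing schema; the element and attribute names (series, scene, pov, narrator, when, characters) and the sample text are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical NovelML: series position plus per-scene metadata.
NOVEL = """<novel>
  <series name="Example Trilogy" position="2" of="3"/>
  <chapter>
    <scene pov="first" narrator="Ana" when="1998" characters="Ana Ben">I waited.</scene>
    <scene pov="omniscient" when="1975" characters="Ben">Ben had left years before.</scene>
    <scene pov="first" narrator="Ana" when="1999" characters="Ana">I stopped waiting.</scene>
  </chapter>
</novel>"""

root = ET.fromstring(NOVEL)


def scenes_where(root, **criteria):
    """Scenes whose attributes match every keyword, e.g. pov='first'."""
    return [
        scene for scene in root.iter("scene")
        if all(scene.get(key) == value for key, value in criteria.items())
    ]


def scenes_with(root, character):
    """Scenes in which the named character appears (index by character occurrence)."""
    return [
        scene for scene in root.iter("scene")
        if character in scene.get("characters", "").split()
    ]


series = root.find("series")
print(f"Book {series.get('position')} of {series.get('of')} in {series.get('name')}")

first_person = [scene.text for scene in scenes_where(root, pov="first")]
ben_scenes = [scene.text for scene in scenes_with(root, "Ben")]
# Untangle the flashbacks: reorder scenes by when they happen, not where they appear.
chronological = [
    scene.text
    for scene in sorted(root.iter("scene"), key=lambda scene: int(scene.get("when")))
]

print(first_person)
print(ben_scenes)
print(chronological)
```

None of this is hard to build, but you only think to build it if you start from the content rather than from the tag set you already have.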
As Eliot says, the advantages of DITA can be significant. But I fear that a generation of documents will be crammed into DITA, resulting in documents that are not as well structured as they need to be.
I will now await my smackdown from the DITA Disciples.
Signed,
DITA Dissident
Crickets
It's been a busy couple of months and my blogging has suffered accordingly.
However, I do have a new article available in STC's magazine, Intercom. I will be writing a regular column entitled "The XML Strategist."
The first installment is "When is XML the Wrong Answer?" If you are an STC member, you should be able to read the article online as a PDF here. Here is a short excerpt:
XML offers some interesting features, but are they of value to your workflow? If you are happy with your current authoring and publishing system, and nothing is compelling you to move to XML, why make the effort? The XML tools are not as mature as “traditional” desktop publishing tools. Over time, the cost of implementing an XML-based workflow will drop, and your business case will look more attractive.
I welcome comments and ideas for future columns.
Labels: xml
Reactions to the TechComm Suite
Bloggers are starting to comment on the TC Suite. Here are a few I spotted this morning:
Bill Swallow ("TechCommDood") writes on waxing techcomm:
I'll admit, I'm both impressed at the package (the monetary deal for the payload of technology is quite appealing) and at Adobe's direct acknowledgement of the techcomm market. [...]
The workflow is still unidirectional; FrameMaker to RoboHelp to online output. There is no going back from RoboHelp should you make changes (which you can, since RoboHelp also remains an authoring tool) once you import the FrameMaker content.
This is where the similarities between RoboHelp and the likes of WebWorks Publisher and Mif2Go end. RoboHelp allows you the option to continue to edit content in the built-in (or external) HTML editor after import.
This is an important point (and a highly problematic one). If you link your FrameMaker content into a RoboHelp project and then make changes to the FrameMaker-sourced content in RoboHelp, then you end up with two copies of the content. Not good, and the temptation to just "tweak a few things" is always there. (I'd be happy to be proven wrong on this point.)
Bob Doyle writes on his techwr-l.com blog:
You can include Help in FrameMaker projects, eLearning in RoboHelp and in Frame, 3D animations in Help and Frame and in PDF documents, RoboHelp screen captures from Frame, etc, etc. All the tools include direct access to aspects of the others from within the tool. You do not have to leave one tool to “Edit with…” another tool. And no longer are conversions needed to reuse assets.
This is the first reference I've seen to reusing RoboHelp content in FrameMaker. I don't believe that this is actually possible.
Another positive initial review from Ron Miller:
[...] Adobe appears to have taken care to put integration on the front burner to make it easier for training and tech writing departments to share content.
[... T]hey appear to have answered all the criticisms I had of RH6 and then some with RH 7. What's more they have integrated it with Frame to create a fully featured publishing environment.
Until I take it through its paces with a project, it's hard to judge but the first impressions were good and it appears clear that Adobe wants to claim a place in the tech writing market.
Dan Ortega of Astoria (via Charles Jeter) clearly identifies the strategic problem with the Suite:
[...] Adobe still appears to be focused on a desktop paradigm. [... W]hen they reference workflow, they refer to workflow integration between the products in the TC Suite. [...]
If Adobe plans to succeed in the enterprise, they have to take a much broader view of how technical documentation teams work by moving beyond the creation perspective. They need to adopt a perspective that encompasses the entire production cycle[...].
Adobe's products are evolving and becoming more integrated, but they are doing so inside the Adobe walls. Conveniently, FrameMaker and RoboHelp are now neighbors, where before they were more like rival gangs with a turf war. But the XML and XSL barbarians are at the gates, and it's time to let them in and accept them as citizens. (This metaphor has clearly run, er, amok.)
The era of proprietary content files is over. Baseline content needs to be in XML because of the "production cycle" that Mr. Ortega describes. XML is:
- Supported by content management systems
- Advantageous for localization workflows
- Enforceable (that is, you can enforce your preferred structure)
- An excellent starting point for automated content production (via XSL, FrameMaker, or even InDesign)
Labels: FrameMaker, robohelp, TechComm Suite, xml
Inside our XML workshop...
Leanne Rollins of the Southwestern Ontario STC posted a summary of the workshop I did in March. It gives you an excellent flavor of what these workshops are Really Like.
I particularly enjoyed this bit:
The planning requirements and cost implementation alone were enough to scare the entire the room into reassessing their *actual* authoring and publishing needs.
I call that success.
Labels: change management, xml
When you have a hammer...
...everything looks like a nail.
We all suffer from this syndrome to a certain extent. Once you develop familiarity with a particular tool or technology, you see possibilities everywhere.
Sean McGrath refers to this as the Just Use X Club. He is none too happy with the rising membership in Just Use X and proposes a counter-organization:
A second club needs to be formed called "When not to Just Use X" club. This group should devote itself to taking all values of X from the "just use X" club and listing off all the scenarios in which each X should not be used. They should also list off all of the things which will not automatically be true by virtue of the use of X. They should also list off all the areas where good old fashioned thought and design and hard work cannot be replaced by the simple gambit of using X.
This fall's Intercom will launch my new column, tentatively titled The XML Strategist. The first installment is devoted to scenarios in which XML is not appropriate.
It would appear that Mr. McGrath and I are kindred spirits.
Labels: change management, xml
Understanding change resistance
Implementing new technology presents numerous challenges -- choosing new software, training staff on new technology and processes, setting up new workflows, and so on. For technical writers, the transition from traditional desktop publishing to XML-based workflows requires a significant shift in mindset. Instead of focusing on the appearance of the final deliverable (usually on paper), writers must now give up control over formatting, follow a set of structure rules, and assume that the end result will be formatted automatically.
You should not underestimate the difficulty that this transition presents. With that in mind, I was disappointed to see the following at Accelerated Authoring:
If Pete decides to go for DITA, he’ll have to [...] persuade management, get a budget, train writers and figure out how to manage the transition. Not easy. And, if the transition is not smooth, Pete could be penalized.
On the other hand, Pete could get through the transition period to DITA and leverage the same team that he had yesterday to produce more documents, more focused documents, better documents. Is there risk in the transition? Of course, but that’s what life is about - adapt or disappear.
No.
"Pete" must first determine that the benefits of XML-based authoring outweigh the costs. Then, Pete needs to think about whether DITA is the right choice for his organization's content.
DITA is not right for everyone. XML is not right for everyone.
Keep in mind that the benefits of XML generally go to management and the difficulties (worse tools, less control, more constrained authoring) are imposed on authors.
If you're interested in m


