Palimpsest has moved. Please visit our blog in its new location for the most recent posts from Scriptorium.
Palimpsest
Learn DITA and XML at your desk
Monday, August 10, 2009 — posted by Sarah O'Keefe
Labels: dita, DITA Open Toolkit, PDF, webcasts, xml
9:02 AM

Top five reasons to like XMetaL and oXygen
Thursday, June 11, 2009 — posted by Sheila Loring
Full disclosure: We're an XMetaL Services Provider and have no particular affiliation with oXygen.
I'm in the fortunate situation of having access to both XMetaL 5.5 and oXygen 9.3. Both are excellent XML editors for different reasons. I'd hate for Scriptorium to make me choose one over the other.
From the viewpoint of authoring XML and XSLT, here are my top five features of both editors:
oXygen
- Apply XSLT on the fly: You can associate an XML file with an XSLT and transform the XML within oXygen. Goodbye, command line! XMetaL will convert the document to a selected output format. You don't choose the XSLT--it hasn't been a big concern for me.
- Indented code: The pretty-print option makes working with code so easy. You can set oXygen to do this automatically when you open a file or on demand. The result is code indented according to the structure. XMetaL doesn't have pretty print.
- Autocompleting tags: As you type an element, oXygen pops up a list of elements beginning with the typed string. You press Enter when you find the right tag, and the end tag is inserted for you. The valid attributes at any particular point are also shown in a drop-down list. XMetaL doesn't have autocompleting tags.
- Find/replace in one or more documents: I've often needed to search and replace strings in an entire directory. In XMetaL, you can only find and replace in the current document.
- Comparing two documents or directories: Compare files by content or timestamp. In a directory, you can even filter by type so only XML files, for example, are compared. XMetaL doesn't offer this feature.
XMetaL
- Auto-tagging content: You can copy and paste content from an unstructured document (a web page, for example), and XMetaL automatically wraps the content in elements. Even tables and lists are wrapped correctly. This can be handy if you have a few documents to convert. In oXygen, the content is pasted as plain text.
- Auto-assignment of ID attributes: Never worry about coming up with unique IDs. XMetaL will assign them to the types of elements you select. Warning: The strings are quite long, as in "topic_BBEC2A36C97A4CADB130784380036FD6." oXygen only inserts IDs on the top-level element but full support will be added in version 10.3.
- Auto-insertion of basic elements: When you create a document, XMetaL inserts placeholders for elements such as title, shortdesc, body, and p. It's a small convenience. oXygen will also insert elements if you have Content Completion selected in the Preferences.
- WYSIWYG view of tables: The table is displayed as you'd see it in a Word or FrameMaker document. In oXygen, all you see are the table element tags.
- Reader-friendly tag view: The tags are a bit easier to read in XMetaL than oXygen. In XMetaL, the opening and closing tags are displayed on one line when possible. This feature saves space on the page and makes the document easier to read in tag view. For example, you might have a short sentence wrapped in p tags. In XMetaL, the p tags are displayed on the same line. In oXygen, the p tags are always on separate lines. This is another convenience that doesn't sound like a big deal, but it really makes a difference while you're authoring.
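To make two of the features above concrete (indenting code by structure and assigning unique IDs to selected element types), here is a minimal Python sketch. It is not how either editor works internally. The sample topic, the function names, and the choice of a UUID-style string are all assumptions made for illustration; the ID format simply mimics the long "topic_..." string quoted above. `ET.indent` needs Python 3.9 or later.

```python
import uuid
import xml.etree.ElementTree as ET

# Hypothetical sample content, written on one line like unindented XML.
SAMPLE = (
    "<topic><title>Install</title><body><p>Step one.</p>"
    "<section><p>More.</p></section></body></topic>"
)

def assign_ids(root, tags):
    """Add an id attribute to every element whose tag is in `tags` and lacks one."""
    for elem in root.iter():
        if elem.tag in tags and "id" not in elem.attrib:
            # Assumption: a tag prefix plus 32 uppercase hex digits.
            elem.set("id", f"{elem.tag}_{uuid.uuid4().hex.upper()}")
    return root

def pretty(root):
    """Return the tree as text, indented according to its structure."""
    ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")

root = assign_ids(ET.fromstring(SAMPLE), {"topic", "section"})
print(pretty(root))
```

Only the element types you pass in get IDs, which matches the idea of choosing the types of elements that should receive them.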
What I'd like to see in XMetaL: The ability to indent code, the ability to drag and drop topics in the map editor.
What I'd like to see in oXygen: The ability to view a table--lines and all--in the WYSIWYG view instead of just the element tags.
So how do I choose which editor to use at a particular moment? When I'm casually authoring in XML, I choose XMetaL for all of the reasons you read above. The WYSIWYG view is more user-friendly to me. But when I'm writing XSLT or just want to get at the code of an XML document, oXygen is my choice.
Get the scoop on oXygen from http://oxygenxml.com. Read more about XMetaL at http://na.justsystems.com/index.php.
Update 6/15/09:
I'm thrilled to report that two deficiencies I reported in oXygen 9 have been addressed in the latest version of oXygen -- 10.2.
- In Author view, tables are now displayed in WYSIWYG format. Just like in your favorite word processor, you can drag and drop column rulings to resize columns. After you resize columns, the colwidth attribute in the colspec element is updated automatically. This is much easier than manually editing the colwidth.
- In Author view, the tags are now displayed on one line when possible. Before, the tags were always on separate lines from the content.
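For anyone who has never edited colwidth by hand, this is roughly what the editor does for you after a column drag. The sketch below is only an illustration in Python: the sample table, the function name, and the proportional "1*"/"3*" values are assumptions, and it uses the CALS-style table/tgroup/colspec structure that DITA tables follow.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-column table with equal proportional widths.
TABLE = """<table><tgroup cols="2">
<colspec colname="c1" colwidth="1*"/>
<colspec colname="c2" colwidth="1*"/>
<tbody><row><entry>Key</entry><entry>Value</entry></row></tbody>
</tgroup></table>"""

def set_colwidths(table, widths):
    """Write new colwidth values onto the colspec elements, in column order."""
    colspecs = table.findall("./tgroup/colspec")
    if len(colspecs) != len(widths):
        raise ValueError("need exactly one width per colspec")
    for colspec, width in zip(colspecs, widths):
        colspec.set("colwidth", width)
    return table

table = set_colwidths(ET.fromstring(TABLE), ["1*", "3*"])
print([c.get("colwidth") for c in table.iter("colspec")])
```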
Labels: analysis, oxygen, reviews, xmetal, xml, XSLT
1:54 PM

Death to Recipes!
Tuesday, May 19, 2009 — posted by Sarah O'Keefe
I love food. I enjoy cooking and I especially enjoy eating. One of my favorite web sites is epicurious.com, and the kitchen shelf devoted to cookbooks sags alarmingly. Many Saturday mornings, you will find me here.
But I am not happy about how recipes have insinuated themselves into my work life. For some reason, the recipe is the default example of structured content. Look at what happens when you search Google for xml recipe example. Recipes are everywhere, not unlike high fructose corn syrup. Unfortunately, I am not immune to the XML recipe infiltration myself.
I understand the appeal. Recipes are:
- highly structured content
- well understood
I'm considering using a glossary as an example. After all, it's a highly structured piece of content whose organization is well understood. Maybe I'll use food items as my glossary entries. Baby steps...
PS It's totally unrelated, but this article about two chefs eating their way through Durham ("nine restaurants in one night, at least five hours of eating and drinking") is quite fun.
11:17 AM

The evolution of books
Tuesday, February 17, 2009 — posted by Ethan Duty
Sarah O'Keefe has brought a lot of attention to how XML and structured authoring are revolutionizing the economy of authoring, but there has not been a lot of discussion about the paradigm shift being experienced by readers.
Some might find this frightening, but mobile devices like iPods and smartphones are replacing printed documentation. My iPod has already replaced my need for printed reference materials. Yes, that's right, my iPod has replaced just about every printed bit of information I need.
My iPod is my atlas, calendar, cookbook, dictionary, encyclopedia, newspaper, shopping list, translator, and TV guide. I can find all the how-to and instructional media I want online. Better yet, I don't even have to read most of it. YouTube and similar sites are filled with videos on everything from folding your clothes more efficiently to off-pan VW restorations.
The information is constantly updated and spans an international audience (try finding books in Mandarin at your local Barnes and Noble). It doesn't take up extra space on my book shelf or coffee table and it reduces the amount of paper I throw away.
There are only a few reasons why I still have to reach for paper:
- I need to look at or write on a bigger page
- I want to leave a note somewhere for others to see (I'm not about to tape my iPod to a coworker's door like a Post-It note)
- I can't mark up a document on my iPod (although you can do this on a tablet PC)
- I can't compare two documents side by side
- writing can be faster (typing on an iPod, even with a stylus, is a slow and difficult operation)
- I can't find an Internet connection
- The battery died
A sci-fi dream for the distant future? I don't know. I'm not too far from living a paperless life already. How about you?
Labels: predictions, xml
3:50 PM

DITA webinar now available...
Tuesday, February 10, 2009 — posted by Sarah O'Keefe
If you just want the slides, they are embedded below via Slideshare.
DITA 101 -- Why the Buzz
Labels: dita, presentations, xml
10:48 AM

Essential tools of an XML workflow in the publishing industry
Monday, February 02, 2009 — posted by Sheila Loring
Communications from DMN provided a link to a webcast on Essential Tools of an XML Workflow. The webcast focuses on the book publishing industry. It's interesting to hear that some publishing houses still allow authors and editors to use Microsoft Word. These folks are often viewed as incapable of learning an XML authoring tool. Many times the Word content is sent to an indexer for tagging.
The companies I've worked with don't give their employees the choice of publishing tools, but if you're Stephen King, you probably won't be forced to use an XML tool.
Technical writers, if you know how to work with XML, your skills are portable to publishing houses. Don't overlook this in a job search.
http://toc.oreilly.com/2009/01/webcast-video-essential-tools.html
Labels: Microsoft Word, publishing, xml
10:34 AM

Take our survey on structured authoring--and get a free report on the results
Thursday, January 29, 2009 — posted by Alan Pringle
Curious about other folks' experiences with considering, planning, and implementing a structured authoring/XML environment? Well, now is your chance to get that information by participating in our survey on structured authoring.
We want input from everyone: those who have implemented structured authoring, are planning to implement it, or have decided against it. The short survey will take no more than 10 minutes of your time. The deadline for responses is March 1, 2009.
In April, we will release our analysis of the results. If you participate in the survey and provide your contact information (which is entirely optional), we will give you a free copy of the report, which will otherwise cost $200. We are also going to give a $50 amazon.com gift certificate to two randomly selected people who provide contact information. (By the way, if you provide your contact information, we will not share that information with any other company.)
Take the survey today. We appreciate your help.
Labels: structured authoring, survey, xml
8:30 AM

Beat the post-holiday blahs
Wednesday, December 03, 2008 — posted by Sarah O'Keefe
It's never too early to start thinking about fun things to do in February.
On Thursday, February 5, 2009, at 9 a.m. Pacific Time (noon Eastern time/5 p.m. London time/etc.), I'll be offering a webinar in conjunction with Madcap Software. Not sure this qualifies as "fun," but it's better than complaining about the weather, which is our major activity here in late winter.
DITA 101: Why the Buzz? DITA, the Darwin Information Typing Architecture, is the new buzzword in technical communication. But why? In this webinar, you'll learn about DITA concepts, business case, and typical scenarios where DITA is used.
If you are not at all familiar with DITA and want some introductory information, join me for this session.
You can then evaluate for yourself whether DITA makes sense for your content. Best of all, the webinar is free, which is the right price in this economic climate.
The webinar is free, but registration is required here.
Labels: dita, madcap, presentations, xml
8:00 AM

WYSIwar?
Monday, November 17, 2008 — posted by Sarah O'Keefe
A dispute has recently been brewing over the relative merits of WYSIWYG and WYSIOO (What You See Is One Option).
In the WYSIWYG corner, we have Vivek Jain, Group Product Manager, Technical Communication products, of Adobe. He recently posted this:
WYSIWYG (What you see is what you get) is the hallmark of a good authoring and publishing tool. Publishing process is the last stage of any technical communication project. Having a WYSIWYG tool, whether you are authoring for Print, PDF or HTML, enables you to preview the expected output as you develop the content and reduces surprises in the end. Surprises during the publishing stage are often the reason for project delays and nightmare for all project members.
And in the WYSIOO corner, we have Sean Wheller with an excellent overview of the argument for XML, which includes a discussion of WYSIOO and WYSIWYG:
[The WYSIWYG] approach is not possible with XML since it was designed to describe data and to focus on what data is, not how data will be displayed or formatted. Authors composing texts stored in XML must use a "Structured Editor." This means the editor is focused on two tasks: text composition and valid markup thereof. Since formatting information cannot be saved inside the XML file, the best an editor can do is render temporary formatting and layout for the duration of the editing session. What authors see during the editing session is "One Option" of what they may get in the final product, WYSIOO.
WYSIOO and WYSIWYG are opposite ends of the same stick. The two end-points cannot be brought together without breaking the stick. Try as you may, if you do manage to bring these polarities together you will find that you have violated the fundamental laws of one or both of these paradigms.
Jain counterpunches:
[W]hile authoring XML documents in FrameMaker, the styles are applied through a template. [...] You can replace your [...] template at any time during authoring or publishing.
I have often found the argument about WYSIOO (What you see is one option) very hard to understand. If you are publishing for two channels, you can use two templates to preview the final output (WYSIWYG view for two channels). In any case, you need these two templates for publishing, why not share these templates with the authors.
[...] If you can get WYSIWYG for both print and HTML and synchronized content, isn't WYSIOO an argument in favor of poor authoring experience.
And this is where I join the fray and attempt to provide some additional perspective. WYSIWYG has been the norm for around 20 years as part of the desktop publishing paradigm. Desktop publishing became the dominant paradigm for lots of fun reasons that I won't get into here. (Wikipedia has the scoop, as usual.)
Today, the entire publishing industry is in the throes of a shift from desktop publishing to structured authoring. Our Structured Authoring and XML white paper has all the gory details. When you start focusing on document structure rather than on page (or screen) presentation, your priorities change.
Dismissing WYSIOO as simply a "poor authoring experience" is too simplistic. In a collaborative, distributed, modular authoring environment, it may be impossible to provide a WYSIWYG view because each end user might see a different version of a document. A simple example would be a web site that allows different users to choose the look and feel of the document instead of accepting the formatting choices that the web site developer made. For print, the final pagination of a document would depend on which modules are assembled for the document, and if the end user has the ability to choose which content to include, the author can't predict where the page breaks will fall in the document.
Finally, there is the question of the author's mindset. If you author in an environment that shows you a WYSIWYG print representation of your final document (as in FrameMaker), would that cause you to focus more on printed deliverables rather than considering all the possible destinations for your content, such as print and PDF (which could be different!), HTML, CHM, Eclipse help, and so on?
We are moving away from providing static deliverables, such as HTML pages or PDF files. Instead, readers may be able to control presentation, change fonts, aggregate content, and assemble their own deliverables. In this environment, we need WYSIOO.
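A tiny sketch may help make "one option" concrete. The Python below takes one structured source and renders it two ways; neither rendering is stored in the XML, so whatever an author sees while editing is just one of the possible results. The sample topic and both renderer functions are hypothetical, and a real workflow would use XSLT stylesheets or editor templates rather than hand-written functions.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical structured source: content only, no formatting.
SOURCE = (
    "<topic><title>Reset the device</title>"
    "<p>Hold the button for five seconds.</p></topic>"
)

def to_html(topic):
    """Option 1: render the topic as an HTML fragment."""
    parts = [f"<h1>{html.escape(topic.findtext('title'))}</h1>"]
    parts += [f"<p>{html.escape(p.text or '')}</p>" for p in topic.findall("p")]
    return "\n".join(parts)

def to_text(topic):
    """Option 2: render the same topic as plain text."""
    title = topic.findtext("title").upper()
    lines = [title, "=" * len(title)]
    lines += [p.text or "" for p in topic.findall("p")]
    return "\n".join(lines)

topic = ET.fromstring(SOURCE)
print(to_html(topic))
print(to_text(topic))
```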
PS I'd really like to know who coined WYSIOO. The earliest reference I've found is 2004.
4:00 PM

Structure: resistance is futile (and a waste of time, really)
Tuesday, September 23, 2008 — posted by Alan Pringle
It's been our experience that there are always a few writers in a technical writing department who will resist adapting to a new structured authoring workflow. Apparently, that sort of behavior is not limited to writers:
It is common among both developers and technical writers who work with xml to look on schemas (the XSD files) as a necessary evil. Something they grudgingly have to accept because it’s been directed from high, but not something they would use if they didn’t have to. They do their coding, or their writing, first, and then try to force it to fit the schema.
...
You might think it strange that I am picking on developers in a technical writing blog, but let me tell you, a lot of tech writers have the same attitude. The same people who happily accept that a style manual is necessary, and that a document template is compulsory, consider the schema a nuisance—something that makes their job harder, restricts their freedom and adds extra work.
It's nice to see this topic addressed so directly and succinctly. Yes, the learning curve for working in a structured environment can be a bit of a challenge, particularly if you've been writing in a template-based workflow for a long time. Also, there are some cases of poorly planned and implemented structures: if many authors are struggling with a structure, that may indicate there is a problem with the hierarchy itself. Beyond that, however, it is much harder to justify (or understand) the behavior of authors or developers who resist following an established structure, even though it's there to make work easier.
Labels: xml
12:27 PM

Tout de Suite...too many suites?
Wednesday, September 10, 2008 — posted by Sarah
You may have missed Madcap's recent announcements of their sundry product upgrades somehow. Perhaps you were on a deep-sea expedition or out in the desert? I fully expect, though, that you would have received an announcement from Madcap via SMS on your satellite phone. But I digress...the topic of this post is not supposed to be the awesome power of the Madcap Marketing Machine™.
The Adobe Army has the Tech Comm Suite to face off against the MadCap Minions with their MadPak. Adobe's marketing is a little less...um, aggressive than their competitor's. And let's not forget Author-it, which describes itself as a tightly integrated product suite.
So many suites...so little time. I feel like a kid in a candy shop. (And I'm a bit of an expert on candy shops.)
Except for one problem. Take a look at my top three requirements for authoring software:
- Creates XML content in my preferred structure
- Validates XML content against my preferred structure
- Publishes XML content through XSL and XSL-FO to create HTML, PDF, and other deliverable formats
But as I've said in many publications and presentations, the current trend is to take away publishing responsibilities from content creators. Instead of authoring books, authors are creating bits of content, which are then assembled into the final deliverables. And the use of a suite seems to go against that trend because authors are once again placed at the center of the publishing effort.
Am I the only one who would like to see a shift in focus?
PS I apologize to the French language. I am well aware that I completely murdered the translation of "tout de suite" -- which actually means "right away." I'm afraid I'm just powerless against the joy of really bad puns, especially really bad multilingual puns.
Labels: adobe, analysis, madcap, xml
3:14 PM

Learning the DITA Open Toolkit
Thursday, September 04, 2008 — posted by Sarah
(Scriptorium Publishing is a JustSystems Services Partner.)
Simon Bate's webinar, An Overview of the DITA Open Toolkit, is now available. This event was jointly sponsored by Scriptorium Publishing and JustSystems. The recorded version is available here (registration required).
During the presentation, we did some audience polling.
Are you currently...? (choose one)
- 31%, Using XML
- 32%, Transitioning to XML
- 16%, Planning to transition
- 17%, Considering a transition
- 3%, Not considering it at all
- 59%, DITA
- 0%, DocBook
- 1%, Other publicly available
- 10%, Developed in-house
- 3%, Not considering it at all
I liked the last poll:
What formats do you currently or plan to publish to?
- 50%, print
- 92%, PDF for download
- 49%, web site
- 79%, online help
- 18%, other
The problem, from our customers' point of view, is that producing nice PDF from DITA content is really quite challenging. (From our point of view as consultants, this is not necessarily a bad thing.) What makes PDF so challenging? Basically, you are reverse engineering your layout engine (think FrameMaker or InDesign) in the XSL-FO programming language.
Simon's presentation provides an excellent introduction to the Open Toolkit, which many find quite intimidating. This was apparent from some of the questions and comments that Simon got:
Is there a GUI for OT that could be used by documentation production staff rather than command line?
I haven't typed a command into DOS in twenty years.
What's the difficulty level of using OT to get HTML output that is more professional-looking, like a WebWorks HTML generation?
Can you please define the purpose of ANT files?
It's worth noting that running the Open Toolkit is vastly less difficult than configuring the Open Toolkit. The person doing the configuration work will need to understand Ant, type DOS commands (!), and rework the default transformation templates to produce the desired output. The person generating output with the configured OT will need to type in one command or just double-click a batch file to start processing.
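For a sense of what the "one command" for a configured toolkit looks like, here is a hedged Python sketch that only assembles the command line and prints it; it does not run anything. It assumes Ant is on the PATH and that the command is issued from the toolkit directory. The map path and output folder are hypothetical, and the property names (args.input, transtype, output.dir) are the Ant parameters I would expect for this era of the toolkit, so check them against the documentation for your version.

```python
import shlex

def build_ot_command(ditamap, transtype="xhtml", output_dir="out"):
    """Assemble an Ant invocation for a DITA Open Toolkit build (not executed here)."""
    return [
        "ant",
        f"-Dargs.input={ditamap}",    # assumption: input map property
        f"-Dtranstype={transtype}",   # assumption: output format property
        f"-Doutput.dir={output_dir}", # assumption: output folder property
    ]

# Hypothetical sample map; substitute your own.
command = build_ot_command("samples/hierarchy.ditamap")
print(shlex.join(command))
```

Saving the printed line in a batch file is what gives production staff the double-click option mentioned above; passing the list to `subprocess.run` would be the scripted equivalent.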
Many of our customers have turned to us for the Scary Configuration Bits. If you're looking for help, keep us in mind.
This session was the second in a series of three webinars we are doing jointly with JustSystems. The last session, on September 23, will provide more details on customizing the DITA Open Toolkit. The webinar is free, but advance registration is required here. Hope to see you there.
Labels: dita, presentations, xmetal, xml
3:41 PM

Surprise! It's about quality.
Tuesday, August 19, 2008 — posted by Sarah
(Scriptorium Publishing is a JustSystems Services Partner.)
On Monday, August 18, I delivered a webinar on making the transition from desktop publishing to structured authoring. This event was jointly sponsored by Scriptorium Publishing and JustSystems. The recorded version is available here (registration required).
During the presentation, we did some audience polling. And here, there were some surprises for me. We asked:
How are you authoring content today? (choose any)
- 31%, Word
- 14%, HTML authoring tool (e.g. Dreamweaver)
- 69%, Desktop publishing tool (e.g. unstructured FrameMaker, PageMaker, InDesign)
- 20%, Help authoring tool (e.g. RoboHelp, Flare, AuthorIT)
- 37%, XML authoring tool (e.g. XMetaL or structured FrameMaker)
In poll 2, things got very interesting:
What is the level of authoring at your organization? (choose one)
- 9%, #1. Chaos. No consistency
- 4%, #2. Documents match on paper
- 16%, #2.5 We have a template and sometimes follow it.
- 60%, #3. Template-based authoring. Repeatable process for creating consistently formatted documents
- 10%, #4. Structured authoring. Programmatic enforcement of required organization
When I ask this question of a roomful of people, it's rare to get an admission of level 1. I've never seen anything like 10 percent of a live audience choose number 1. Perhaps the relative anonymity of a webinar is a contributor?
We asked some questions about skillsets with nothing of particular interest to report. Finally, we inquired about the business driver for structure implementation:
What is your critical business driver behind looking to improve how you manage content?
(choose one main driver)
- 10%, Speed up time-to-market
- 30%, Improve satisfaction with customer-facing documentation
- 3%, Comply with regulatory requirements
- 12%, Reduce localization cost
- 27%, Improve staff productivity
- 13%, Reduce production cost
- 4%, Other
The surprise here was that, at least in this group, the single most common response was a quality answer ("improve satisfaction") rather than a cost-reduction answer.
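The arithmetic behind that observation is easy to check. The short Python tally below uses the percentages from the poll above. Treating only the two options that literally mention "cost" as cost-reduction answers is my assumption; you could argue that staff productivity belongs in that group too, which would change the comparison.

```python
# Percentages from the business-driver poll above.
POLL = {
    "Speed up time-to-market": 10,
    "Improve satisfaction with customer-facing documentation": 30,
    "Comply with regulatory requirements": 3,
    "Reduce localization cost": 12,
    "Improve staff productivity": 27,
    "Reduce production cost": 13,
    "Other": 4,
}

top_answer = max(POLL, key=POLL.get)
total = sum(POLL.values())  # rounding means this need not be exactly 100

# Assumption: "cost-reduction answers" are the options with "cost" in the label.
cost_answers = sum(pct for label, pct in POLL.items() if "cost" in label.lower())

print(top_answer, POLL[top_answer])
print("cost-labelled answers combined:", cost_answers)
print("total:", total)
```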
My session was the first in a series of three webinars we are doing jointly with JustSystems. The next two sessions will focus on the DITA Open Toolkit. Simon Bate, Senior Technical Consultant with Scriptorium, will deliver an overview of the Open Toolkit on August 26 and a session on troubleshooting and customizing the Open Toolkit on September 23. The webinars are free, but advance registration is required here. Hope to see you there.
Labels: change management, presentations, xmetal, xml
11:27 PM

XPubs: Information integration and the needs of the (product) maintainer
Tuesday, June 24, 2008 — posted by Sarah
Chris Wood, BAE Systems
Tech pubs manager at BAE, contributor to S1000D standard.
Electronic maintenance (interactive electronic technical manual, or IETM) has been shown to deliver an increase in fault finding success, reduction in troubleshooting time, and reduction in maintenance errors. "Fairly comforting."
Market drivers for integrated information...output-based contracts. The Royal Air Force is asking vendors to take on more maintenance activities. The drivers for success for the commercial organization are different from the drivers for the military.
BAE must guarantee that a specific number of aircraft ("platforms") are available to fly at all times. Financial penalties for not meeting those goals.
Offshore commodity outsourcing is putting pressure on the prices that BAE can quote. Price "per page" needs to be on a downward slope.
IETM capability offers an opportunity to integrate support information applications and processes.
ATTAC Contract = a certain number of Tornado aircraft must be available 24x7. BAE is responsible for preflight, postflight, AND other maintenance. Spare parts come out of BAE's budget. Therefore, reducing spare parts "footprint" saves money.
18 million pounds (double that for dollars) over 10 years. Their target is to save more than 18M pounds by including rich data (photos, video, 3D animation), align with actual maintenance activities, tech pubs people on-base as part of integrated engineering team.
Nice example of specific changes in tech docs leading to large cost savings due to fewer returns for repair.
Aha. They improved the official documentation by picking up information that was "plastered on the wall" in the aircraft hangar. In other words, user-generated content!
Information integration...the issue
Too much information, which is necessary and can be integrated, but...who generates it? where? who approves it? who can receive it? is there a recognized authority?
What about information generated by maintenance personnel for use by engineers (the stuff on the wall)? Is there an approval route? How authoritative is it?
In the past, the separation between maintenance and design authority was clear. As the maintenance and design operations move closer (or become the same in BAE's case), the needed separation of content becomes much more challenging. Does linking from engineering authority content to non-engineering authority corrupt the authoritative content?
What level of authority does information have? Has it been tested? Has it gone through an approval route?
Approved data architecture. The challenge is to define a data architecture that includes all information issued by the design authority for the purpose of operating and/or maintaining the platform in service, ensuring it is efficient, effective, and safe.
"This is a major content management issue." Indeed.
Many information deliverables go through rigorous approval process, but maintainers have access to other information, too. Official deliverables must be more integrated. Reference data and maintenance procedures come from different places in the organization, but they need to be in alignment. And there are "modifications," which must go in both places.
"This is not a trivial challenge." Yep.
The conflict here is really between data (approved content) and lore (unofficial information about how things really work). The mechanics have the "lore," and need to be persuaded to share it to improve the official documentation over time.
Labels: conferences, xml, xpubs
8:43 AM

XPubs: DITA implementation in progress
Monday, June 23, 2008 — posted by Sarah
Chris Hadley of Micro Focus
Noz Urbin of Mekon
Micro Focus
12 writers in four locations
rapidly growing team, but also 20-year company veterans
Content is in XML, written using XMetaL, stored in CVS, DTDs and XSLT developed in-house.
Acquired companies have content in FrameMaker and Word.
Delivery in CHM, HTML, Help 2.
Lots of reuse in places; none in others.
No localization.
Problem very interesting because second generation of XML implementations are beginning. System was developed in-house, which was cutting edge at the time but now showing its age. It is costly to maintain, and the experts have left. Company cannot manage, debug, or improve the processes that exist.
Cannot continue to rely on in-house solutions because of risks of breakdowns. Change was required.
Business case 1: Value to customer
Improve quality of content with reuse, consistency, and accuracy.
Improve search. ("Content is great...when I can find it.")
Publishing to different formats.
Business case 2: Value to business
More accurate documentation with information easier to find means lower support cost due to fewer calls.
Lower cost on development. If support can find answers in documentation, there's no need to ask development team. Development team also uses documentation.
Reducing time on non-writing tasks such as builds means more time available for writing content.
Business case 3: Value to doc team
Replace aging systems and processes before they affect ability to deliver
Sustainable and manageable systems that can work even if there is employee turnover.
Align with industry standards (key reason). Open source expertise is transferable from other organizations, helps with recruitment.
Project started at the end of 2007. They engaged Mekon to help with the analysis of the current situation. By February 2008, they chose a CMS (TriSoft). Engaged Mekon to convert content to DITA. Installed and configured TriSoft. Primary consideration was cost, and they were required to give a budget figure to their board before they knew what they were going to be doing.
Why a CMS? They needed software that was supported externally instead of being produced in-house. Secondary priorities were reuse ("a single version of the truth") and publishing issues (multiple formats and audience, building and testing, link management).
Why DITA? Decision to DITA was very difficult. But the same arguments that applied to externally supported software applied to the DITA issues. Migration will be painful for acquired companies no matter what.
Why Mekon? Industry experience, skills range across the company, can talk about ROI, are independent, they are based locally (except Noz!).
Why the project will succeed. It must. Team is enthusiastic and motivated -- no issues with change resistance and recognized that the old system was unsustainable, already knew XML and XMetaL. Company's charter includes the phrase "debate passionately, get on board." And they did. Buy-in from the executive board was critical. Had to get senior VP to understand, support, and present to the board for approval. Early wins -- they already have demonstrable results. This is the right solution at the right time for this company.
Complexities and pitfalls
Didn't know what the end product was going to be. How do you learn what you need to when you don't know what you don't know? No in-house DITA experience, limited CMS knowledge. Also need to demonstrate results quickly while learning. Couldn't design perfect solution before starting, need to make changes during implementation. Balance between up-front research and management of progress expectations.
Content migration to DITA did not make the original timeline. Big reason for that is the "day job" -- writing content. So the initiative has to take a back seat to getting content out the door. Geographically dispersed team is difficult -- two locations haven't even looked at XML. Don't have a fully developed training structure yet.
The pilot project hasn't quite happened yet.
Preparing for migration is a horrible pain. Don't underestimate the cost of migration.
Publishing is complex, and it's taking time to get to a push-button process. It's not ready yet.
Planning to map to DITA, migrate to CMS. Will run old and new systems in parallel until they're sure that the new stuff is really working.
Need to re-architect content, integrate acquisitions, and refine and improve the process.
Business case for documentation team is very different from business case to board. The doc team's need for CMS needed to be presented in a way that would get approval.
This may not be the right solution for everyone, but if it is, "get on with it."
CMS licenses -- some have high up-front cost but drop quickly per seat with bigger implementations. Others have low up-front cost but licensing per-seat doesn't drop much with greater volume. Makes cost evaluation different for different-sized organizations.
Another very interesting presentation. A rather aggressive timeline and the unique circumstance of lots of high-end skills (like XSLT development) in the documentation organization.
9:14 AM Permalink | |

XPubs: XSL-FO for Documentation Formatting
— posted by Sarah
Mike Miller, Antenna House
For starters, XSL-FO is an XML standard.
XSL-FO is "a pagination markup language describing a rendering vocabulary capturing the semantics of formatting information for paginated presentation." (Ken Holman)
Or, as I like to say, "A document layout described in a text file."
XSL-FO is black box formatting. Can't go back and "tweak" the files to fix them. With FO, you're typically talking about a minimum of a couple hundred pages. Much faster to render automatically rather than by hand in InDesign or FrameMaker.
First commercial products in 2001 from Antenna House and RenderX. Also, open source FOP from Apache in 2001. FO successful in the sense that both commercial companies are doing quite well.
FO more successful than any other technical publishing application other than perhaps TeX and FrameMaker. Probably attributable to the availability of open source (free) and trial versions from commercial vendors (free).
XSL-FO is only concerned with visual display of XML data, which means that the FO file has no semantic content, only formatting instructions.
The FO stylesheet specifies:
- page areas and sets of pages to be used to compose a document for paper (master pages)
- Text flows, areas on pages into which the text and graphics are filled
- Blocks within flow areas (paragraphs)
- Inline areas (character-level formatting)
- Processing and formatting are consistent and automatic.
- Formatting rules are stored separately from the data.
- FO is non-proprietary and human-readable (well, sort of)
- FO less complicated than programming Java or Perl and the like
- Can use stylesheets with different XSLT processors (DITA Open Toolkit)
- Easier integration with other XML standards compliant applications (not trivial, but much easier than other non-standard approaches)
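To make the four things an FO stylesheet specifies (page masters, flows, blocks, inlines) a bit more concrete, here is a minimal sketch in Python that builds a one-page FO tree with the standard library. The element names come from the XSL-FO vocabulary; the master name "A4-page", the page dimensions, and the helper function names are assumptions made for illustration, not anything from the presentation.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the namespace-qualified name for an fo: element."""
    return f"{{{FO_NS}}}{tag}"


def build_minimal_fo(text, emphasized):
    """Build a one-page FO tree: page master, flow, block, inline."""
    root = ET.Element(fo("root"))

    # Page areas: a master page describing paper size and margins.
    # "A4-page" and the dimensions are illustrative assumptions.
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(
        masters,
        fo("simple-page-master"),
        {"master-name": "A4-page", "page-height": "297mm",
         "page-width": "210mm", "margin": "20mm"},
    )
    ET.SubElement(page, fo("region-body"))

    # Text flow: content poured into the body region of that master.
    sequence = ET.SubElement(
        root, fo("page-sequence"), {"master-reference": "A4-page"}
    )
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})

    # Block: a paragraph. Inline: character-level formatting inside it.
    block = ET.SubElement(flow, fo("block"), {"font-size": "11pt"})
    block.text = text
    inline = ET.SubElement(block, fo("inline"), {"font-weight": "bold"})
    inline.text = emphasized
    return root


fo_document = ET.tostring(
    build_minimal_fo("Formatting is ", "automatic"), encoding="unicode"
)
print(fo_document)
```

In a real workflow an XSLT stylesheet would generate this markup from the source XML, and an FO processor would turn it into pages. The sketch only shows how the four layers nest.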
Most business documents can be formatted automatically as FO. Rule of thumb: "If it's XML, FO can be applied."
Other applications for FO might include faxes, German railway tickets, correspondence from financial institutions and government.
Typesetting is very complex with issues like widows and orphans and hyphenation. Software can handle this. Human typesetters have been removed from the process, and this shows in amateurish mistakes. But you can use FO to configure something that follows typography rules and give you a professional look and feel.
"Overwhelming benefits" of using FO. Which begs the question: "Why aren't more people using it?" A slide with the benefits of XML showing The Usual (cost, time-to-market, less redundancy, standards-based, localization for cost justification, etc.).
People who use FO: auto manufacturers, cell phone manufacturers, banks, aerospace, government, military, educational
FO not appropriate for documents that are "artistically created."
FO extensions provide support for:
- Document info in PDF
- Bookmarks for PDF
- Column footnotes
- Revision bars
- MathML
- Embedding PDF within PDF
- Column rules
- Punctuation spacing
- Table autospace
- Floats
- Advanced hyphenation
- Barcodes
- several hundred extensions altogether. Antenna House uses multilingual requirements with extensions, such as special spacing requirements in Japanese or justification in Arabic through kashidas.
DITA Open Toolkit reduces the complexity of getting set up and producing PDF. Could be configured and producing PDF in "a couple of hours." (Perhaps, but making it look the way you want is going to take a while.) According to Mike, somewhere between a few days and a few months, depending on the complexity of your requirements.
PDF output from DITA
- XSL-FO
- FrameMaker
- troff
- Preprocessing. Information is parsed and assembled.
- Transformation. Formatted and generated.
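The two stages named above (parse and assemble, then format and generate) can be sketched roughly as follows. This is a deliberately simplified stand-in written in Python: the real toolkit drives XSLT stylesheets, and the in-memory TOPICS dictionary, the sample map, and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"

# Hypothetical stand-ins for topic files on disk, keyed by href.
TOPICS = {
    "intro.dita": "<topic id='intro'><title>Introduction</title>"
                  "<body><p>Welcome.</p></body></topic>",
    "setup.dita": "<topic id='setup'><title>Setup</title>"
                  "<body><p>Install it.</p></body></topic>",
}
MAP = "<map><topicref href='intro.dita'/><topicref href='setup.dita'/></map>"


def preprocess(map_xml, topics):
    """Stage 1: parse the map and assemble the referenced topics in order."""
    merged = ET.Element("merged")
    for ref in ET.fromstring(map_xml).iter("topicref"):
        merged.append(ET.fromstring(topics[ref.get("href")]))
    return merged


def transform(merged):
    """Stage 2: generate formatting objects (blocks) from the merged content."""
    flow = ET.Element(f"{{{FO_NS}}}flow", {"flow-name": "xsl-region-body"})
    for topic in merged.iter("topic"):
        title = ET.SubElement(flow, f"{{{FO_NS}}}block", {"font-weight": "bold"})
        title.text = topic.findtext("title")
        for para in topic.iter("p"):
            ET.SubElement(flow, f"{{{FO_NS}}}block").text = para.text
    return flow


flow = transform(preprocess(MAP, TOPICS))
print([block.text for block in flow])
```

Keeping the stages separate means the assembled content can be checked before any formatting decisions are applied to it.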
Why not FrameMaker or InDesign?
- Formatting is the tip of the iceberg. (WYSIWYG)
- WYDSIWYN -- What you don't see is what you need, which includes content management, automated formatting, multilingual formatting, global access, project tracking, electronic delivery, network integration
- You need to manually lay out pages.
- No fixed page style
- Need to modify page layout
- Unstructured document formats
- Document format is continuously changing
- Unstructured content
On the low end, FO is free with FOP. Antenna House is most expensive at $1250 for stand-alone or server license for $5,000.
FO supports more languages than any other solution currently available.
Solving the real problem:
- Improve the total process, not just individual tasks
- Improve organizational effectiveness
First question: Flowing text into typesetting engine results in line breaks that will cause readers difficulty. And this annoys him (as a professional typesetter). We want powerful, automated formatting AND the ability to do WYSIWYG tweaks. Thinks there is a role for a WYSIWYG stage after the automation bit.
I've noticed this on the BBC, too. British people ask really pointed questions.
And in response, Mike says that Antenna House has a solution for this where you create INX (InDesign XML) content (4 minutes) and then you can pull it into InDesign (half an hour), and do some cleanup.
Do all the XSL-FO tools cover 100% of the FO standard? "No, definitely not."
Labels: conferences, dita, xml, xpubs, xsl
7:50 AM Permalink | |

X-Pubs keynote: Transforming Legislation Publishing
— posted by Sarah
Brief introduction from Noz Urbina and an overview of the conference from Julian Murfitt. Some X-Pubs housekeeping items, including a flight announcement..."Should a presentation be boring and sleep deprivation set in, oxygen masks will drop from the ceiling. Please put on your own mask before assisting others." Hehe.
On to the keynote...John Sheridan, Head of e-Services at the Office of Public Sector Information, National Archives. Eeek, slight problem with slides -- and the presenter just launches right in without them. I bet he's terrified right now, but he looks perfectly composed.
We have slides. "Transforming Legislation Publishing"
Publishing legislation seems dry, but in fact it's quite relevant to the people at large -- and ignorance of the law is no excuse. Legislation documents use XML under the covers. Have been publishing legislation online since 1996 and of course print for a long time before that.
Strengths of their service:
- Immediacy: published online simultaneously with print versions. Important because some measures go into effect the day of their enactment.
- Accuracy: value of service hinges on knowing that the rendition of the online content is the same as the official vellum statement signed by the Queen. (Vellum? Really??)
- Trust: Do customers trust that what they see online is an official source? Based on eye-tracking software, they found when asked about trust, customers looked at the official crest and then responded positively.
- Reach: 1.5 million users, mostly in the UK

Key performance indicator: two clicks. 80 percent of information should be available in two clicks:
- Google search button.
- Click link on first page of results.
- Legacy workflows
- Multiple document inputs. Coming from Parliament, government lawyers in 21 departments, Scotland, secondary legislation, Welsh measures, legislation from Northern Ireland, church materials, dual languages in Wales (English and Welsh).
- Tools include: FrameMaker for some groups; Word for others
- Legacy content: 55,000 documents that needed to be repurposed from SGML to XML to improve web publishing.
Persistent linking, "web continuity", overall 60 percent of links to official information are broken. Their solution to "persist" the 500,000 existing links was to provide redirection behavior, so that every URL resolves either to live content or to the government's archive on the web.
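The redirection behavior described above boils down to a simple rule: serve the page if it is still live, otherwise send the reader to the archived copy rather than returning an error. A minimal Python sketch of that rule follows; the host names, the archive prefix, the set of live paths, and the choice of a 302 status are all assumptions for illustration, not details from the talk.

```python
from urllib.parse import quote

# Hypothetical values: neither the archive address nor the paths are real.
ARCHIVE_PREFIX = "https://archive.example/web/"
LIVE_PATHS = {"/acts/2008/1", "/acts/2008/2"}


def resolve(url_path, site="https://legislation.example"):
    """Return (status, location): live content if present, else the archive."""
    if url_path in LIVE_PATHS:
        return 200, site + url_path
    # Never answer "not found": hand the reader on to the archived copy.
    return 302, ARCHIVE_PREFIX + quote(site + url_path, safe=":/")


print(resolve("/acts/2008/1"))
print(resolve("/acts/1996/old page"))
```

The point of the design is that an old link never dead-ends; whether the archive actually holds a copy is a separate question that this sketch does not address.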
XML is the key to solving these assorted issues.
Trying to "future-proof" their work, especially by providing a way to allow for changing web standards (HTML/web standard may change, but we can keep underlying XML).
Legislative documents are highly structured but also have variations over time. Very difficult to capture in a structure. "Parliament trumps your XML schema." You can't say, "Sorry, but that won't work in our schema, so you can't pass that legislation." Must find the balance between accommodating what's needed and "allowing everything."
They developed Crown XML:
The Crown XML Schema for Legislation provides a full and comprehensive encoding for all United Kingdom primary and secondary legislation. It has been written using the World Wide Web Consortium XML Schema language and is the Government's official and authoritative data standard for legislation. Once a piece of legislation has been enacted or made, it is stored using this Schema format. Schema compliant legislation is available in XML for onward supply to legal publishers and others.
They provide sample documents, which even so cover only about half the possibilities in the full schema.
Users have options for various views of the legislation.
Their work leads to the concept of the web as a platform. Not just providing for users to consume, but also to reuse, aggregate, and combine.
Mixing data...hey, cookie dough!
The government's response to Web 2.0 trends. Government should enable information so that citizens can use the information. Doing so will lead not only to better public services, but also to other services, both commercial and noncommercial.
Problems include culture, rights, licensing, intellectual property, and technology challenges. Information becomes infrastructure and potentially as important as roads and other physical infrastructure. Legislation is widely cited content, which becomes infrastructure for other things. Legislation needs to be addressable with fragment identifiers, so that people can cite specific sections or paragraph rather than an entire act.
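As a rough illustration of what addressable legislation means in practice, the Python sketch below resolves a URL fragment to one section of a document. The markup is invented for the example: the element names, the id values, and the example.com-style URL are hypothetical and are not taken from the Crown XML Schema.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# Hypothetical markup, not the real schema.
ACT = """
<act id="act-2008-1">
  <section id="section-1"><title>Short title</title><p>Sample text.</p></section>
  <section id="section-2"><title>Commencement</title><p>Sample text.</p></section>
</act>
"""


def cite(url):
    """Split a citation URL and return the title of the section it points at."""
    document_url, fragment = urldefrag(url)
    for element in ET.fromstring(ACT).iter():
        if element.get("id") == fragment:
            return document_url, element.findtext("title")
    # Unknown fragment: the citation still identifies the whole document.
    return document_url, None


print(cite("https://legislation.example/acts/2008/1#section-2"))
```

With stable ids on every section, a third party can cite or reuse one provision without copying the entire act.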
Why couldn't lawyers add editorial value to legislation in a wiki-type format? Not a job for the state, but something that could be enabled or inhibited by how the legislation is published. Providing addressable content and using standards would allow for third parties to use the legislation as a starting point for additional work.
They provide Atom (RSS) feeds for new legislation.
Library of Congress is an example of a re-user of UK legislation. UK legislation of interest for comparison purposes. They have a "PDF thing going on." Really wanted access to PDF versions of the information. Subscribe to the Atom feed, and the PDF will pop up there as a link.
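For a sense of what a re-user does with such a feed, here is a small Python sketch that pulls the PDF links out of an Atom document. The Atom namespace and the entry, title, and link elements are standard; the feed content, titles, and URLs are made up for the example, and a real subscriber would fetch the feed over HTTP rather than read it from a string.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical feed content; only the Atom element names are standard.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>New legislation</title>
  <entry>
    <title>Example Act 2008</title>
    <link rel="alternate" type="text/html"
          href="https://legislation.example/acts/2008/1"/>
    <link rel="alternate" type="application/pdf"
          href="https://legislation.example/acts/2008/1.pdf"/>
  </entry>
  <entry>
    <title>Example Order 2008</title>
    <link rel="alternate" type="text/html"
          href="https://legislation.example/orders/2008/7"/>
  </entry>
</feed>"""


def pdf_links(feed_xml):
    """Map each entry title to its PDF link, skipping entries without one."""
    links = {}
    for entry in ET.fromstring(feed_xml).findall("atom:entry", ATOM):
        for link in entry.findall("atom:link", ATOM):
            if link.get("type") == "application/pdf":
                title = entry.findtext("atom:title", namespaces=ATOM)
                links[title] = link.get("href")
    return links


print(pdf_links(FEED))
```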
Expect reuse for very granular areas...discussion of specific industries or topics. (If mad cow disease were to reoccur, expect footpaths to be closed, and a map could show in real-time what's open and what's closed.)
Providing sufficient flexibility into structure without descending into tag soup.
First question: Is the raw XML available to the public?
Yikes. The presenter hesitates and is quite uncomfortable. Seemed like a harmless enough question but apparently not. The answer is that it's available by subscription -- that is, lawyers pay to get access to it. They must balance between their economics and subscription income. They would like to publish XML; seems to be the direction that public policy is going. But "don't want to spend taxpayers' money to subsidize Lexis-Nexis."
Second question: Would these policies extend to others, like the Department for Transport?
Again, this sounds harmless to me, but appears to be quite controversial. Information produced as a core public task ("which is nowhere defined clearly") is public.
Really, when will government policy help the questioner push his employer into using structure? Interesting. I don't think we'd get that question in the U.S., other than in the negative.
Conferences here are so civilized, with the opening session at 10 a.m. Ahhhhh. Tea and cookies, er, biscuits at the breaks. Luvely.
Labels: conferences, web 2.0, xml, xpubs
7:39 AM Permalink | |

STC UK...almost live, part 2...Managing change
— posted by Sarah
Ant Davey, Rail Safety and Standards Board (RSSB)
Another excellent session. Ant provided a discussion of change management with quite a lot of references to more detailed resources.
Knowledge is being lost.
Information has value.
Web is changing search methods and expectations.
Web is changing ability to contribute and review content.
Findable information requires chunking.
Chunks are potentially reusable.
Ultimately, you have single sourcing.
Not "how we have always done it."
Chunking requires modular and collaborative writing.
Not "how we have always done it."
Single sourcing @ RSSB
* Still finding our way
* Technology led (in part)
* Starting to introduce standard templates
* Paving the way by communicating
* Planning an XML pilot
Change management
* Linking people and processes toward a desired change
* change is not where you are now
* need to know where you are going and tell those who you want to come with you
* "you are almost certainly under-communicating by a factor of at least 10 and possibly 100"
Carrying people with you. People view change as an attack on their current competence. Need to begin by celebrating what they have been doing right. 5-25% of people can't or won't be able to work with the new processes (Emma Hamer)
The change equation:
C = (ABD) > X
C = change
A = dissatisfaction with status quo
B = desirability of the new end state
D = risk and disruption to get there
X = cost of changing (effort, discomfort, difficulty, risk)
Carrying people with you
* celebrate what is working
* explain what isn't and why
* describe how it will be with the new methods
* what's in it for them
* what's in it for the company
* what's in it for the clients
WIIFM = What's in it for me
* Active supporters
* Active dissenters
* Passive supporters
* Passive dissenters
Creating a change team
* You can't do all this by yourself.
* Special skills, talents, and leadership
* Where you can't carry, you may have to push or reallocate
First, Break All the Rules
Marcus Buckingham
Leadership is different from management. (That is SO true.)
Team members
* Champion or sponsor
* sustaining sponsor
* implementer
* change agent
* advocate
* group
* different styles, methods, and needs
* different personality types
* similar team gets quick results
* team with differences gets better result
Where change management goes wrong
* too much complacency
* lack of power in the guiding team
* not having real vision
* under-communicating (effectively)
* allowing obstacles to block the vision
* no short-term wins
* declaring victory too soon
* not embedding changes in practice
learning cycle
* concrete experience for activists
* reflective observation for reflectors
* theoretical concepts for theorists
* practical experimentation for pragmatists
(Experiential Learning, Kolb)
change
* needs leadership and vision
* needs good management
* needs metrics
* because ultimately it's about money
* increase revenue
* decrease costs
* make people's lives easier
* concentrate on the outcomes
* leave individuals to develop their own implementation plans
business process re-engineering
Customer led analysis method for business re-engineering
1. establish the scope
2. target the customer
3. model the process
4. analyze the structure
5. create the opportunity
6. redesign the process
7. refine the customer experience
8. ??
Why?
* customers dissatisfied
* position in value chain changes
* move from product to service or vice versa
* merger with another organization
* has to be customer-led
* good business case
* beware of targets (wrong targets lead to undesired behavior)
Influencing others
* getting results without authority
* you can't change other people
* you can change what you do, which may change how others react to you
* need to be politically savvy
Effective influence
* open
* honesty
* integrity
* loyalty
* rapport
* adult to adult communication and relationships
* maximal listening
* dovetailing needs outcomes
Strategies
* Logic
* Personal appeal
* Networking
* Bargaining
* Assertiveness
* Hierarchical appeal
Great presentation, and happy to see that my anecdotal experience has some amount of overlap with Ant's much more research-backed approach.
Labels: change management, stc2008, xml
6:51 AM Permalink | |

STC 2008: Wrap-up
Thursday, June 05, 2008 — posted by Sarah
Many thanks to those of you who stopped by the booth to meet us. We especially appreciate visitors who tell us that they read and enjoy our content, whether books, white papers, or this blog.
I had numerous requests for my paradigm shift presentation slides, so I am making them available here:
My next round of conferences will be in the UK. I'm leading an XSL workshop for STC UK on June 22 and giving a presentation on June 21 as part of the Trends in Technical Communication event. Then, it's onward to X-Pubs, where I'll be discussing the implications of Web 2.0 on technical communication.
As far as I know, after that I'm done with the conference circuit until the fall. However, senior technical consultant Simon Bate will be attending the Gilbane conference in San Francisco and participating on a DITA panel. Please contact us if you'd like to set up a meeting at the conference.
Labels: change management, conferences, stc2008, xml
1:44 PM Permalink | |

A Quarky new approach?
Tuesday, May 13, 2008 — posted by Sarah
Recently, Quark has announced their new dynamic publishing concept and/or solution.
Where to start?
Although traditional publishing allows each author to hand-craft the appearance of each page, the limitation is that it ties information to the way it is presented. This means that if you want to publish the same information in print, Web, and electronic formats, then you have to create an entirely separate version of your information for each media type.
Fascinating, but it sounds oddly familiar. Where could I have heard this before? Wait! This sounds like an argument for...single sourcing!
[S]ingle sourcing means writing information once and using it many times. It does not mean writing it and then copying and pasting it into another source, or modifying the information for different needs such that you have multiple sources.
That would be from The Impact of Single Sourcing and Technology by Ann Rockley, published in Technical Communication in 2001.
The term "single sourcing" also appears in Designing Windows 95 Help: A Guide to Creating Online Documents, which was published in 1996 (!). You can see excerpts via Google Books. I'm sure there's more, but 1996 is plenty early.
Anyway, back to Quark:
Dynamic publishing is a different way to create and share information. Dynamic publishing lets you create information as reusable components of information that you can easily combine for different uses - different types of documents and different audiences.
Dynamic publishing also automates the page formatting process, so you can automatically produce print, Web, and electronic content from a single source of information.
Sorry, guys, but what you're describing is "single sourcing" and it's been around for a while. And I don't think redefining "dynamic publishing" is going to work, either, because that term already means something. Dynamic publishing can refer to the following:
- Publishing on the fly: The information presented is based on the end user's requests and/or profile. Information is assembled when the user requests it (and not ahead of time).
- Customized publishing (or variable data publishing): The process of publishing content where the information varies but the overall organization stays the same. Financial statements are a good example of this type of publishing -- each customer needs their specific transactions on the page.
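The second definition (same organization, different data for each reader) is easy to sketch. The Python example below fills one shared layout with a single customer's transactions; the layout, the field names, and the sample data are invented for illustration and do not describe any vendor's product.

```python
from string import Template

# Fixed organization: every customer gets the same layout.
STATEMENT = Template("Statement for $name\n$lines\nClosing balance: $balance")


def render_statement(customer):
    """Fill the shared layout with one customer's transactions."""
    lines = "\n".join(
        f"  {date}  {label:<12}{amount:>9.2f}"
        for date, label, amount in customer["transactions"]
    )
    balance = sum(amount for _, _, amount in customer["transactions"])
    return STATEMENT.substitute(
        name=customer["name"], lines=lines, balance=f"{balance:.2f}"
    )


# Hypothetical customer record.
customer = {
    "name": "A. Reader",
    "transactions": [
        ("2008-05-01", "Deposit", 100.0),
        ("2008-05-03", "Withdrawal", -40.5),
    ],
}
print(render_statement(customer))
```

Publishing on the fly works along the same lines, except that the assembly runs when the user asks for the page instead of in a batch ahead of time.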
Arbortext. Hmmmm. There's something about Arbortext....
And here is where the situation gets truly weird. Take a look at the Quark executive biographies page. Of the ten people listed, five are ex-Arbortext, including the CEO, CIO, marketing VP, and two of three sales VPs.
So, Quark is the recipient of some sort of a multiple-organ management transplant from Arbortext. Given the rumors that the Arbortext-PTC merger hasn't been exactly a lovefest, the departure of senior management and others isn't surprising. It's their reappearance at a single company that's striking. And furthermore, it appears that they are trying to create Arbortext, MarComm Edition.
Will this work? The landscape is pretty bleak.
Here is an excerpt from Eric Kuhnen's analysis (published on TheContentWrangler.com, and you should read the entire thing):
Quark, in proposing to integrate a CMS into its Dynamic Publishing Solution, has just added a well known set of problems to their offering. There are literally dozens of CMS-enabled solutions on the market already; Quark’s entry is nothing new (well, it is to Quark but not to its customers). It’s not that adding the CMS itself is the wrong idea, but that incorporating a traditional CMS will yield fewer benefits to the customers in the markets it serves, and will not do much to displace the leading ECM vendors in the markets it would like to serve. So, Quark will follow the road it has always taken.
(Emphasis mine)
A variation on this theme is found in an interview with Raymond Schiavone conducted by Pariah S. Burke, editor of QuarkVsInDesign.com (again, read it all, especially the analysis of the interview on the third and fourth pages). This excerpt is from Burke's analysis:
I think QuarkXPress will continue to have utility on its own, but its primary role will be to function as a desktop client for an as-yet unrevealed enterprise-grade suite of systems.
XPress 8 will be the first stage, I predict. [... Schiavone's] realistic goal for the XPress 8 generation of products will be to make the market take notice of Quark again, to open a dialog with large workflow managers who will help refine Schiavone’s vision for XPress 9.
By the time XPress 9 and its matching systems do release (probably less than 12 months following the release of version 8), QuarkXPress will be little more than a client application. All the real power will reside on the server-side systems. More importantly, by abandoning the so-called “feature war” with InDesign, Quark will create a lopsided conundrum for potential users -- you can have near total automation of your publishing and production, with output to print, PDF, PDF/X, HTML, XML, and everything else you can think of, but without certain creativity, composition, and proofing features the competition will have had for generations.
The existence of InDesign Server notwithstanding, I think the overall analysis makes sense. Basically, transitioning Quark into a server-based publishing system requires moving away from freelancers and small business customers. They can't afford and don't need server-based publishing. Instead, Quark needs to make inroads into large companies with large marketing departments. And there, they run up against the twin buzzsaws of InDesign and existing competition in the content management space. This might work if Quark's offering was deeply compelling, unique, and game-changing. In its current version, it appears to be none of the above.
The most difficult part of any change in technology is end user adoption. I've discussed change management on this blog and elsewhere. Bringing XML and automation into a marketing or publishing workflow is going to present some unique challenges.
In publishing (not technical publications), the deliverable is in fact the product. As a book publisher, you care greatly about the appearance of your final product, the book. In technical publishing, the appearance of the documentation is often negotiable, and making the inevitable compromises on formatting to get better automation is an acceptable tradeoff. This may not be true for most magazine and book publishers. (It's worth noting that the most technical of trade book publishers, O'Reilly Media, was also the first, as far as I know, to move to XML-based publishing.) Quark grudgingly acknowledges the challenge in the description of their solution:
Dynamic publishing started in the realm of technical documentation, where large manufacturers and some types of publishers have implemented dynamic publishing to produce user guides, service manuals, parts catalogs, legal documentation, and similar types of information.
Some publishers have built their own dynamic publishing systems for publications that have more elaborate layout requirements than technical documentation, but these systems have been cobbled together from multiple technologies. In many cases, they have achieved some of their business goals but at the expense of far higher process costs.
"Cobbled together"?
"Pot? This is Kettle. How you doin'?"
Here is a description of what's in Quark's DPS (from the Quark DPS FAQ):
Quark Dynamic Publishing Solution (DPS) is publishing software. It consists of multiple software components, some from Quark and some from third parties, including:
- Optional desktop products for creating content: QuarkXPress, QuarkCopyDesk®, Xpress™ Author for Microsoft® Word, Adobe® InCopy® and InDesign®
- Standard server-based publishing software: QuarkXPress Server and Quark Transformation Engine, for publishing to print and electronic media
- Standard server-based product for automating workflow: Quark Publishing System
- Optional browser-based product for content creation, final document edits and reviewing
- Integration with server-based products for content management partners such as Alfresco®
(emphasis mine)
(Image from Quark's web site: http://dynamicpublishing.quark.com/dps/how_it_works.html)
Here is a really accurate bit of information. In response to the question, "How will dynamic publishing affect me and my employees?", we have this:
The primary impact is on the authoring process. Dynamic publishing shifts the authoring focus from hand-crafting pages to creating information that is independent of any specific media type, which means that authors stop worrying about how the information looks and instead focus on writing it. Authors also shift from creating monolithic documents to writing small, reusable components of information.
There is a world of pain hidden in those three sentences. In my experience, the more creative technical writers have a more difficult time with XML than the more engineering-oriented writers. Let's graph from most technical to least technical:
engineers >> technical writers >> marketing writers
Uh-oh. Getting marketing people to follow structured authoring concepts is going to be really difficult.
A couple of final notes:
- The Quark-written content attempts to position this solution as the logical response to non-single source workflows. This is silly. I'd like to see a discussion of what makes Quark's approach to single sourcing better, faster, and/or cheaper than others.
- There's a discussion of return on investment, which includes this gem: "the return on investment can take from six to eighteen months." Indeed. It can also take forever. Not every organization will be able to show ROI for this solution, and claiming otherwise is ridiculous.
2:55 PM Permalink | |

DocTrain: Dynamic Publishing
Thursday, May 08, 2008 — posted by Sarah
Once Content is in XML. Now what? Learn How Dynamic Publishing Can Help You Improve the Re-use and Value of XML Content
Joshua Duhl
Quark
He begins with a lengthy explanation of why single-sourcing is a Good Thing, which I rather think might be unnecessary for this audience.
According to Mr. Duhl, most organizations are using print-based workflows or print-based workflows with an add-on for the web. Again, wrong audience.
The web, mobile devices, and electronic communications have altered the fundamental principles of publishing: Content everywhere.
Pitfalls of traditional publishing
- Processes are costly
- Updates are slow
- Information is often out-of-date
- Content is prone to errors
- Customers are unhappy
- Deadlines are missed
Graphing complexity against volume
- high complex/low volume: tech doc
- high volume/low complex: statements, invoices
- in the middle: correspondence
- need to work with content from multiple sources
- publishing to multiple sources or for multiple sources
- enable content for re-use beyond the tech doc
- to use a single system that holds all information
- to have an automated workflow that ensures approved content is automatically published for each edition and different devices
Core principles
- content centric
- single source
- reuse strategy
Content is created regardless of format, layout, or media (content first)
single source
plan for reuse, support for variations and alternatives
leveraging XML
format versus structure
What is Quark Dynamic Publishing Solution
* QuarkXPress
* plus workflow
* dynamic publishing
OK, so I finally understand my issues with this...in a world where people are componentizing and picking and choosing their solutions, why would they go to a monolithic approach?
Create
* QXP
* InDesign
* Word
* XML
* Web
Manage
* workflow system/check-in/out etc
Publish
* QXP server
* Quark transformation engine
* XML transformation rules
Delivery
* rendered formats
Sorry the notes are so messy; this presentation went very fast due to some scheduling issues that were not the presenter's fault.
But overall, Quark is proposing a "dynamic publishing solution" that enables single-sourcing workflows based on XML.
Labels: doctrainwest08, xml
1:34 PM Permalink | |

DocTrain: XML in the Wilderness
Wednesday, May 07, 2008 — posted by Sarah
Joe Gollner
Vice President
Stilo International
Likes to present "gory details on big projects gone wrong." I like him already.
The wilderness archetype is present in many different cultures. Going into the wilderness forces a person to change.
Next slide...the Patron Saint of Content Management! St. Jerome is officially the patron saint of libraries, librarians, archivists, and encyclopaedists.
And now, we're going to talk about what St. Jerome and XML have in common.
Oh, my goodness, his license plate reads: XML
Even better, his wife got it for him. I don't know either of them, but I predict a long and happy marriage.
And we're off to a cruise through the history of content processing. Some very cool information, but impossible to translate into a blog without his slides. (Check the DocTrain web site for slide decks; his are not posted at the moment.)
Now a discussion of SGML, what it achieved, and why it was hard for developers.
Here's an interesting bit about XML:
"The driving focus for XML has been facilitating a revolution in the way technology applications are designed, developed, and deployed."
And critically, we're now talking about technology and XML, not content and XML.
And this has enabled the so-called Web 2.0. Joe is focusing on the fact that you can build very quickly and stay in "perpetual beta" in the "participatory web." People don't often talk about how XML-based technologies are what is making Web 2.0 possible.
What does XML mean for authors? Two contradictory challenges:
- Too much markup, which gets in the way of creating content, forces a reliance on unfamiliar tools, and adds a level of technical complexity to what is a creative task.
- Not enough markup...some content demands precision. Authors need clear guidance and useful feedback in order to satisfy this demand. As more content is delivered to applications, this is more common.
- Restrictions on syntax (XML took away some of the options that were in SGML to make it easier for computers to process.)
- Models mirror communication patterns less naturally than before
- New language (XML Schema) for declaring rules
- Schema modeling tools not helpful for content modeling
- XML is verbose
- Complexities reintroduced and application challenges remain
- Happy!
- Single sourcing
- Multiformat automatic publishing
He somewhat likes DITA, especially because it's an "assemblage of SGML Dirty Tricks." DITA gives us the ability to handle variability and change. DITA's approach is simple markup by default, but specialization allows for more specific markup.
XML has been in the (data) wilderness, but now it is finally returning home to where it should be (content). And DITA represents a serious effort in that direction.
St. Jerome went into the Syrian desert, learned Hebrew, and was able to create a new Latin translation of the bible (Vulgate). Likewise, XML has learned some things from life in the data world.
If you're looking for more coverage, Anne Gentle is sitting next to me with her laptop.
I also found Richard Hamilton, Antoine Giraud, and Scott Nesbitt. And someone writing Boarding the DocTrain.
Kudos to the DocTrain team for picking a lovely city and hotel. And for providing wireless coverage in the ballrooms!
Labels: doctrainwest08, xml
12:37 PM Permalink | |

WritersUA: DITA pilot techniques
Wednesday, March 19, 2008 — posted by Sarah
Mark Wallis of IBM ISS on how to run a successful DITA pilot. Some great information in this presentation on how to reduce risks.
He recommends selecting your pilot project based on the following items:
- Right timeframe -- don't choose the project that has an imminent release
- Choose a manageable documentation set size
- Reduce risk by avoiding the strongest (or most critical) product
- Identify a product with a known need to improve the user experience
The ideal team for a pilot will need cross-functional and complementary skills:
- Project management skills
- Tools and technology strengths
- Product knowledge and understanding
- Architecture and design skills
- Editor for standards and styles
- No autopilot writing
- Don't just migrate existing content; you'll get trapped in old paradigms (this assumes that existing content does not fit the DITA topic paradigm)
- Perform use case analysis and task analysis
- Determine the critical scenarios to document
- Focus on tasks; backfill supporting information as needed
They set up a DITA War Room in a small conference room and met at least daily (1.5 to 2 hours per day. Yikes). They set weekly goals and used small tasks to build momentum.
There was also heavy use of an internal wiki to put up initial "straw man" design, then revise, comment, and discuss.
Layering deliverables
Implementation deliverables were split out into smaller tasks, such as:
- Creating topic files, links, and navigation
- Testing links from code and navigation
- Creating task and reference topics
- Validating help against the user interface
- Creating concept topics for principles, guidelines, and best practices ("deep concept")
- Validating content in the expert community
Choosing the DITA toolset
Task Modeler (free) for building and managing ditamaps, defining relationships between topics, and creating skeleton topics (stub files).
DITA-compliant editor to edit your topics.
Compiler (part of open source toolkit). Compiler? What are they compiling? HTML Help? Oh. He just referred to Ant as a compiler. Ohhhhhkay.
Proof of concept
They picked a subset of the pilot to do the proof of concept.
The presenter's boss is quoted as saying, "There's no such thing as bad weather, only insufficient clothing." I'm guessing that she's never been to Minnesota in winter.
The objectives for the proof of concept:
- Learn and evaluate tools
- Address technical obstacles
- Specify end-to-end requirements
Managing costs
Purchase toolsets only for pilot team.
After completing proof of concept (successfully!), invest in tools for the remaining writers.
Wiki
They used their wiki to capture conventions and guidelines.
Improving acceptance
They paid attention to the change management issues. He doesn't mention it here, but I would assume that the combination of an acquisition by IBM plus the requirement to change the authoring environment could have caused significant angst. Their approach included presentations, wiki content, email discussions, and online training.
At the point of transition, DITA boot camp was offered.
They used collaborative walkthroughs, or reviews, to help standardize their content development. Interesting. This sounds as though it could be a) threatening and b) an unbelievable time sink. But just maybe it might also c) help improve the content.
Other lessons learned
Think more, write less. (Don't document the obvious, don't document common user interface conventions, write only if you're really adding value.)
Don't squander your ignorance. (If something makes you stumble in the interface, that will probably also cause problems for your users, so capture it.)
The more structured your content, the easier the transition to DITA.
Documenting the obvious teaches readers to ignore your text, so don't document the obvious.
The handouts are available here: http://www.writersua.com/ohc/suppmatl/
Labels: change management, dita, writersua2008, xml
5:29 PM Permalink | |

WritersUA: Day 3, Morning
— posted by Sarah
Dave Gash (hypertrain.com) leads off the festivities with a discussion of the UA Holy Grail. And no, it's not DITA.
He is discussing True Separation of Content, Structure, Format, and Behavior.
Interesting, because we normally hear about separation of content and presentation -- he's making finer distinctions.
According to Dave, the current authoring method is to use WYSIWYG and code editors, often in combination. And as we work, we insert what's needed wherever it's needed. The result is that documents work -- once -- but are very difficult or impossible to update, maintain, and control.
Spaghetti-code documents make our own jobs harder.
The conventional wisdom is to separate content and formatting. Content is "stuff on the page"; therefore format must be "everything that is not content."
Content could include HTML, CSS, and JavaScript. Separating out CSS still leaves "junk" in the content pages.
Dave proposes a more refined model: content, structure, formatting, and behavior.
* Content is XML
* Structure is XSLT
* Format is CSS
* Behavior is JavaScript (JS)
This will be more maintainable, which means:
* Ability to change any components without breaking the others
* Ability to reuse any component in other pages or projects
* Ability to control each component's resource allocation (that is, who creates each piece?)
How to improve your pages:
1. Identify and externalize JS behavior.
* Find the embedded scripts (<script> tags) and replace them with a reference to an external foo.js file.
<script language="javascript" src="foo.js"></script>
2. Identify JS behavior that could be CSS and convert it to CSS rules.
"If you can encode with CSS and make it declarative instead of procedural, you're way ahead of the game."
* Catch "sneaky" JavaScript behavior, such as mouseover events, that could be CSS rather than JavaScript. Event handlers that call JavaScript almost always start with "on" -- easy to identify and many can be replaced with CSS hover pseudoclasses.
.expterm:hover {font-style:italic; }
.expterm {text-decoration:none;}
Removing the code from the HTML greatly simplifies the page.
3. Identify and externalize CSS styles, recode any local formatting as classes.
Get rid of "deprecated tags and doo-doo like that."
Get rid of style attributes, font tags, b tags (become span tags).
"It's said that comments are for someone who comes behind you six months later and needs to update your code. This is not true. Comments are so that YOU can figure out six months later what you were doing in the code."
So you should comment your code.
4. Semantically mark up content as XML.
Dave's definition of semantic markup? "call things what they are."
5. Identify desired HTML output structure, write XSL transforms to produce it.
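Steps 4 and 5 can be sketched roughly as follows. This is not Dave's code (his examples are on his site). It uses Python's standard xml.etree.ElementTree as a stand-in for an XSLT processor, and the <glossary>, <glossaryitem>, <term>, and <definition> element names are my own assumptions for illustration. The only thing borrowed from the session is the expterm class name from the CSS above.

```python
import xml.etree.ElementTree as ET

# Hypothetical semantic markup: elements are named for what they are,
# not for how they should look or behave.
SOURCE = """
<glossary>
  <glossaryitem>
    <term>DITA</term>
    <definition>Darwin Information Typing Architecture</definition>
  </glossaryitem>
  <glossaryitem>
    <term>XSLT</term>
    <definition>A language for transforming XML documents</definition>
  </glossaryitem>
</glossary>
"""


def glossary_to_html(xml_text):
    """Map each semantic element to the HTML structure we want.

    An XSLT stylesheet would express the same mapping as templates;
    plain Python is used here only to keep the sketch self-contained.
    """
    glossary = ET.fromstring(xml_text)
    dl = ET.Element("dl", {"class": "glossary"})
    for item in glossary.findall("glossaryitem"):
        # The class attribute is the hook for external CSS; no inline
        # styles or event handlers end up in the generated HTML.
        dt = ET.SubElement(dl, "dt", {"class": "expterm"})
        dt.text = item.findtext("term")
        dd = ET.SubElement(dl, "dd")
        dd.text = item.findtext("definition")
    return ET.tostring(dl, encoding="unicode")


html = glossary_to_html(SOURCE)
print(html)
```

The point of the sketch is the division of labor: the XML says what each piece is, the transform decides which HTML elements it becomes, and formatting and behavior stay in external CSS and JS files.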
So...what's in it for me?
Discrete, maintainable, controllable components
* You can change one component without breaking others
* You can share components with other pages
* You can separate work load by skill sets
* Set it and forget it! (for everything except the content)
Code examples are available at Dave's web site: www.hypertrain.com
Questions about tools. No, he won't recommend tools. Question about schemas...Dave says the first thing that comes to mind is...DocBook???
Yikes. In an answer to a question about print and XSL-FO, somebody recommended asking....me! (I swear I didn't pay her for that, and I don't think she even knew I was in the room. Quite surreal.)
##
My only disagreement with this session is with the separation of XML as "content" and XSLT as "structure." It's my opinion that the XML includes the structure, and XSLT just gives me a way to express that structure into HTML or other formats.
I also question some of his tag names, such as <expander> for a term/definition group. The expander tag name is really a description of the desired behavior (expandable text) rather than the semantic function of the content (definition of a term). I would probably choose something like <glossaryitem> for the container, leaving open the option of changing the behavior to something other than expansion in the future. Same quibble with <ddblock> (drop-down block).
I do like the use of the
Great presentation from an energetic presenter whose motto is, "If I have to be awake, you do, too!"
Side note: I'm pretty sure that if you tied Dave's hands behind his back, he would lose his ability to speak.
Labels: presentations, writersua2008, xml, xsl
1:16 PM Permalink | |

XFL: He Hate Me Not
Saturday, March 08, 2008 — posted by Sarah
(For those of you with a life, the title is a reference to this.)
According to Colin Moock, the next version of Flash will have an XML-based format. He writes:
Flash CS4 will be able to export *and* import a new source format called XFL. An XFL file is a .zip file that contains the source material for a Flash document. Within the .zip file resides an XML file describing the structure of the document and a folder with the document's assets (graphics, sounds, etc). The exact details of the XFL format are not yet available, but Richard [Galvan, Flash authoring-product manager] assures me that Adobe intends to document them publicly, allowing third-party tools to import and export XFL.
This is important. Currently, it's fairly impossible to integrate Flash and non-Flash content. Other than, of course, with our 80s friend, Mr. Cut-and-Paste.
If Flash speaks XML, we can develop a process along these lines:
And that has major implications for development of e-learning content and other things that you might expect to find in Flash. At some point, when it's not five minutes before the Duke-Carolina game, I'll try to be more specific. (h/t John Nack)
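In the meantime, here is a minimal sketch of why a zip-plus-XML source format matters for third-party tools. Since the XFL details have not been published, everything specific here is invented: the document.xml file name, the <asset> element, and the href attribute are placeholders, not the real format. The sketch builds a tiny package in memory and then reads it back with nothing but Python's standard library.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_sample_package():
    """Create an in-memory .zip that imitates the described layout.

    The file names and XML vocabulary are invented for this sketch;
    they are not the actual XFL format.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(
            "document.xml",
            "<document><asset href='assets/logo.png'/>"
            "<asset href='assets/intro.mp3'/></document>",
        )
        package.writestr("assets/logo.png", b"placeholder image bytes")
        package.writestr("assets/intro.mp3", b"placeholder sound bytes")
    buffer.seek(0)
    return buffer


def list_referenced_assets(package_file):
    """Return (asset path, present in the package?) for each reference."""
    with zipfile.ZipFile(package_file) as package:
        names = set(package.namelist())
        root = ET.fromstring(package.read("document.xml"))
        return [
            (asset.get("href"), asset.get("href") in names)
            for asset in root.iter("asset")
        ]


assets = list_referenced_assets(build_sample_package())
print(assets)
```

If the real format turns out to be anything like this, any tool that can unzip a file and parse XML can inspect or generate Flash source material, which is exactly what you cannot do with a closed binary file.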
PS "Carolina Goodnight"? I don't think so. See note 4.
8:13 PM Permalink | |

"Once you start down the DITA path, forever will it dominate your destiny"
Thursday, January 03, 2008 — posted by Sarah
Eliot Kimber has a lovely article on using DITA for narrative documents. I'm trundling through it, nodding in agreement, and then we have this horror:
[...] DITA offers at least two compelling advantages over any other candidate XML application:
- The initial cost of ownership is low, approaching zero, and the ongoing cost of ownership is low.
- It offers a number of sophisticated features in terms of modularity, extensibility, and linking that either are not provided by other applications or would cost a prohibitively large amount to build from scratch.
That is, the cost of applying DITA is almost always going to be significantly lower than the cost of any alternative (and at worst will be no more expensive than any other alternative).
Now, he does qualify this statement by saying that these assertions apply only if DITA is a reasonable fit for your problem. But the overall thrust of the argument appears to be that since DITA can do narrative documents (which it was emphatically not designed for), it can potentially be applied to an enormous new set of content.
Ugh.
Before I begin today's DITA-bashing session, I need to point out that we are currently using DITA for several projects here at Scriptorium. DITA slices! DITA dices! DITA advocacy raises your IQ, improves your health, and makes you irresistible. I like DITA just fine.
Moving right along...
"1. The initial cost of ownership is low, approaching zero, and the ongoing cost of ownership is low."
Just because it's free doesn't mean it's cheap. The default output from the DITA Open Toolkit ranges somewhere between unattractive (HTML) and fugly (PDF). If you care about the appearance of your final documents, you are going to have to do a lot of work to get the look and feel you want. And although the OT offers a starting point, customizing it is kind of like a trip to the dentist. The impacted-wisdom-tooth-removing kind of trip.
Getting your output working properly is Not Easy because of the, er, unique design of the OT. If the set of tags you need is small, you might be better off building a nice petite NovelML and then writing the transformations you need for NovelML instead of wrestling with DITA's complexities.
"2. It offers a number of sophisticated features in terms of modularity, extensibility, and linking that either are not provided by other applications or would cost a prohibitively large amount to build from scratch."
I agree that DITA has some lovely features in this area. However, I fail to see how they apply to the example at hand -- a narrative document such as Moby Dick. If you need modularity, extensibility, and linking features, you should consider DITA. If you don't, then these features will just get in the way.
"That is, the cost of applying DITA is almost always going to be significantly lower than the cost of any alternative (and at worst will be no more expensive than any other alternative)."
If DITA is overkill for your requirements, then applying DITA may not be cheaper.
But the issue that upsets me the most is this: when you attack a problem by assuming (or hoping) that DITA will work, you necessarily look for DITA features you can use. You may not think carefully about non-DITA features that you might like to have. For fiction content, I can think of several things that would be quite useful (and for which DITA offers no immediate support):
- For a book that is part of a series (like a science fiction trilogy), a listing of the entire series and an indication of where the current book falls in the series.
- Metadata to identify the point of view. Many novels switch from one narrator to another, or from a first-person point of view to an omniscient point of view. It would be lovely to filter the content to see only the first-person content (after reading the book from cover to cover as the author intended).
- Similarly, metadata that helps with scene location and time could be invaluable for studying literature written with numerous flashbacks. The Time Traveler's Wife and anything by Jasper Fforde come to mind.
- The ability to index by character occurrence. This is more often seen in nonfiction books, especially biographies. But imagine scanning the entire Harry Potter series for scenes with Severus Snape to determine whether his ultimate allegiance was consistent.
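To make a couple of these wishes concrete, here is a small sketch of what point-of-view filtering and a character index could look like. The markup is entirely hypothetical: the <novel>, <series>, and <scene> elements and the pov and characters attributes are invented for this example and belong to neither DITA nor any existing NovelML. Python's standard library does the processing.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical fiction markup; every element and attribute name is invented.
NOVEL = """
<novel title="Book Two">
  <series name="Example Trilogy" position="2" total="3"/>
  <scene id="s1" pov="first" characters="Ana Ben"/>
  <scene id="s2" pov="omniscient" characters="Ben Cal"/>
  <scene id="s3" pov="first" characters="Ana Cal"/>
</novel>
"""


def scenes_by_pov(root, pov):
    """Return the ids of scenes written from the given point of view."""
    return [s.get("id") for s in root.iter("scene") if s.get("pov") == pov]


def character_index(root):
    """Map each character name to the scenes in which it appears."""
    index = defaultdict(list)
    for scene in root.iter("scene"):
        for name in scene.get("characters", "").split():
            index[name].append(scene.get("id"))
    return dict(index)


root = ET.fromstring(NOVEL)
series = root.find("series")
first_person = scenes_by_pov(root, "first")
index = character_index(root)

print(series.get("name"), series.get("position"), "of", series.get("total"))
print(first_person)
print(index)
```

None of this is hard once the metadata exists. The catch is that you only get the metadata if someone thought to ask for it when the content model was designed.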
As Eliot says, the advantages of DITA can be significant. But I fear that a generation of documents will be crammed into DITA, resulting in documents that are not as well structured as they need to be.
I will now await my smackdown from the DITA Disciples.
Signed,
DITA Dissident
9:44 PM Permalink | |

Crickets
Tuesday, October 30, 2007 — posted by Sarah
It's been a busy couple of months and my blogging has suffered accordingly.
However, I do have a new article available in STC's magazine, Intercom. I will be writing a regular column entitled "The XML Strategist."
The first installment is "When is XML the Wrong Answer?" If you are an STC member, you should be able to read the article online as a PDF here. Here is a short excerpt:
XML offers some interesting features, but are they of value to your workflow? If you are happy with your current authoring and publishing system, and nothing is compelling you to move to XML, why make the effort? The XML tools are not as mature as “traditional” desktop publishing tools. Over time, the cost of implementing an XML-based workflow will drop, and your business case will look more attractive.
I welcome comments and ideas for future columns.
Labels: xml
2:31 PM Permalink | |

Reactions to the TechComm Suite
Wednesday, September 26, 2007 — posted by Sarah
Bloggers are starting to comment on the TC Suite. Here are a few I spotted this morning:
Bill Swallow ("TechCommDood") writes on waxing techcomm:
I'll admit, I'm both impressed at the package (the monetary deal for the payload of technology is quite appealing) and at Adobe's direct acknowledgement of the techcomm market. [...]
The workflow is still unidirectional; FrameMaker to RoboHelp to online output. There is no going back from RoboHelp should you make changes (which you can, since RoboHelp also remains an authoring tool) once you import the FrameMaker content.
This is where the similarities between RoboHelp and the likes of WebWorks Publisher and Mif2Go end. RoboHelp allows you the option to continue to edit content in the built-in (or external) HTML editor after import.
This is an important point (and a highly problematic one). If you link your FrameMaker content into a RoboHelp project and then make changes to the FrameMaker-sourced content in RoboHelp, then you end up with two copies of the content. Not good, and the temptation to just "tweak a few things" is always there. (I'd be happy to be proven wrong on this point.)
Bob Doyle writes on his techwr-l.com blog:
You can include Help in FrameMaker projects, eLearning in RoboHelp and in Frame, 3D animations in Help and Frame and in PDF documents, RoboHelp screen captures from Frame, etc, etc. All the tools include direct access to aspects of the others from within the tool. You do not have to leave one tool to “Edit with…” another tool. And no longer are conversions needed to reuse assets.
This is the first reference I've seen to reusing RoboHelp content in FrameMaker. I don't believe that this is actually possible.
Another positive initial review from Ron Miller:
[...] Adobe appears to have taken care to put integration on the front burner to make it easier for training and tech writing departments to share content.
[... T]hey appear to have answered all the criticisms I had of RH6 and then some with RH 7. What's more they have integrated it with Frame to create a fully featured publishing environment.
Until I take it through its paces with a project, it's hard to judge but the first impressions were good and it appears clear that Adobe wants to claim a place in the tech writing market.
Dan Ortega of Astoria (via Charles Jeter) clearly identifies the strategic problem with the Suite:
[...] Adobe still appears to be focused on a desktop paradigm. [... W]hen they reference workflow, they refer to workflow integration between the products in the TC Suite. [...]
If Adobe plans to succeed in the enterprise, they have to take a much broader view of how technical documentation teams work by moving beyond the creation perspective. They need to adopt a perspective that encompasses the entire production cycle[...].
Adobe's products are evolving and becoming more integrated, but they are doing so inside the Adobe walls. Conveniently, FrameMaker and RoboHelp are now neighbors, where before they were more like rival gangs with a turf war. But the XML and XSL barbarians are at the gates, and it's time to let them in and accept them as citizens. (This metaphor has clearly run, er, amok.)
The era of proprietary content files is over. Baseline content needs to be in XML because of the "production cycle" that Mr. Ortega describes. XML is:
- Supported by content management systems
- Advantageous for localization workflows
- Enforceable (that is, you can enforce your preferred structure)
- An excellent starting point for automated content production (via XSL, FrameMaker, or even InDesign)
Labels: FrameMaker, robohelp, TechComm Suite, xml
10:45 AM Permalink | |

Inside our XML workshop...
Wednesday, July 11, 2007 — posted by Sarah
Leanne Rollins of the Southwestern Ontario STC posted a summary of the workshop I did in March. It gives you an excellent flavor of what these workshops are Really Like.
I particularly enjoyed this bit:
The planning requirements and cost implementation alone were enough to scare the entire the room into reassessing their *actual* authoring and publishing needs.
I call that success.
Labels: change management, xml
9:28 AM Permalink | |

When you have a hammer...
— posted by Sarah
...everything looks like a nail.
We all suffer from this syndrome to a certain extent. Once you develop familiarity with a particular tool or technology, you see possibilities everywhere.
Sean McGrath refers to this as the Just Use X Club. He is none too happy with the rising membership in Just Use X and proposes a counter-organization:
A second club needs to be formed called "When not to Just Use X" club. This group should devote itself to taking all values of X from the "just use X" club and listing off all the scenarios in which each X should not be used. They should also list off all of the things which will not automatically be true by virtue of the use of X. They should also list off all the areas where good old fashioned thought and design and hard work cannot be replaced by the simple gambit of using X.
This fall's Intercom will launch my new column, tentatively titled The XML Strategist. The first installment is devoted to scenarios in which XML is not appropriate.
It would appear that Mr. McGrath and I are kindred spirits.
Labels: change management, xml
8:42 AM Permalink | |

Understanding change resistance
Wednesday, July 04, 2007 — posted by Sarah
Implementing new technology presents numerous challenges -- choosing new software, training staff on new technology and processes, setting up new workflows, and so on. For technical writers, the transition from traditional desktop publishing to XML-based workflows requires a significant shift in mindset. Instead of focusing on the appearance of the final deliverable (usually on paper), writers must now give up control over formatting, follow a set of structure rules, and assume that the end result will be formatted automatically.
You should not underestimate the difficulty that this transition presents. With that, I was disappointed to see the following at Accelerated Authoring:
If Pete decides to go for DITA, he’ll have to [...] persuade management, get a budget, train writers and figure out how to manage the transition. Not easy. And, if the transition is not smooth, Pete could be penalized.
On the other hand, Pete could get through the transition period to DITA and leverage the same team that he had yesterday to produce more documents, more focused documents, better documents. Is there risk in the transition? Of course, but that’s what life is about - adapt or disappear.
No.
"Pete" must first determine that the benefits of XML-based authoring outweigh the costs. Then, Pete needs to think about whether DITA is the right choice for his organization's content.
DITA is not right for everyone. XML is not right for everyone.
Keep in mind that the benefits of XML generally go to management and the difficulties (worse tools, less control, more constrained authoring) are imposed on authors.
If you're interested in more details, the slides from my Coping with the XML Paradigm Shift presentation are available here (PDF).
Labels: change management, xml
11:24 AM Permalink | |

For once, we're not the laggards
Thursday, June 14, 2007 — posted by Sarah
Over on the Really Strategies blog, there's a discussion about assumptions and best practices in "print composition." The company is somewhat similar to ours, except that they are focused on the publishing industry (magazines and books) rather than technical content.
The post basically lays out the argument that content should use consistent styles and file naming conventions, partly because it's A Good Idea, but also because this helps lay a foundation for an XML workflow.
I like to assess the "technical quality" of files using four levels:
- No consistency. A document created by Author A looks nothing like a document created by Author B.
- Visual consistency. A and B's documents look the same on paper (or whatever the final deliverable format is), but the implementation in the files is inconsistent.
- Template-driven authoring. Information is consistent on paper (orwtfdfi) and formatting is implemented consistently with paragraph, character, and other styles. It's easy to reproduce the correct look and feel by applying the appropriate styles.
- Structured authoring. Information follows the required structure, and formatting is driven by the hierarchical relationship of the various elements.
Why the difference? I believe it is because the publishing industry is still focused on print as a primary deliverable (this makes sense when you're selling magazines!) and is unwilling to compromise look and feel to gain automation. In our industry, that is not usually the case.
Labels: xml
3:54 PM Permalink | |



