Random thoughts about publishing


Palimpsest has moved. Please visit our blog in its new location for the most recent posts from Scriptorium.

Palimpsest

 

tekom report

Wednesday, November 12, 2008 — posted by Sarah O'Keefe

I hope that the cognitive impairment resulting from jet lag has dissipated enough to write this post.

Last week, I attended tekom/tcworld. With approximately 2200 attendees, plus 1200 trade show-only visitors, this is the largest gathering of technical communicators in the world. Over 180 vendors were at the trade show, along with some fairly impressive accessories.

Things you don't expect to see at a tech writing conference

I am sorry to tell you that the chocolate fountain people showed up with a juice bar this year.

Wednesday morning started off on a fun note, as several people stopped by to congratulate us on the election. Europeans were at least as interested in this year's U.S. election as we were. (Side note: During an extended beer-and-sausage dinner at the Ratskeller, a group of us were sitting behind a group of German bikers. They were in full biker regalia, with patches for something like "Rolling Thunder Wiesbaden." Lots of beards, beards with braids, long hair, and leather. In fact, apart from speaking German, they would have fit right in at Myrtle Beach during Bike Week. So, at one point, their table got loud(er), and we looked over to see them crashing their beer mugs together and yelling, "OBAMA! OBAMA!")

We had an opportunity to catch up with conference buddies and fellow consultants. Tony Self's description of Australia's fierce killer magpies was especially entertaining. I'm afraid I didn't quite believe the story at first, but Wikipedia says it's true. (Bike helmets with fake eyes on the back to fool the magpies into attacking the back of your head instead of pecking out your real eyes!)

On a work-related note, I delivered two sessions, one on XSL and one on Web 2.0. If you're interested in a (very) basic introduction to XSL, the content of the XSL workshop is now available. You'll need the instructions (PDF, 1.1MB), the XML sample file, and the CSS file for formatting. The workshop is based on information from our three-day XSL class, which is obviously far more detailed.

The Web 2.0 presentation, in Flash format, is available below:





Notes: Use the arrow keys to navigate through the slides. The first slide may take a few seconds to come up; the presentation file is quite large. If you prefer a narrative white paper version, we have one here.

A few final thoughts about the conference:




An incomplete puzzle: DITA OT stylesheets

Friday, September 05, 2008 — posted by Simon Bate

A recent post on the dita-users Yahoo group asked how to customize the DITA OT stylesheets in view of the fact that there isn't much documentation available.

From my work customizing and otherwise perverting the DITA OT, I can sympathize with these frustrations. When I started investigating OT customizations, I found many well-crafted tutorials on how to customize and specialize the OT. These were a great starting point, but they only got me so far. In its current state, the documentation is an incomplete jigsaw puzzle; the trees and buildings are filled in nicely, but the sky is still waiting for someone with patience. (Block that metaphor!)

Because there is no documentation available at the individual template level, you need to reconsider the task at hand. I look on it as debugging, decoding, or sleuthing. With that in mind, I find the following to be very useful:

Probably the best form of documentation that the OT could provide here is additional comments in the stylesheets, particularly about the order of processing. I find I add many comments about where to find the template that handles nodes from an <xsl:apply-templates> directive.
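
A minimal sketch of one possible sleuthing aid, written in Python rather than XSL: it lists every xsl:template in a stylesheet along with its match, mode, and name attributes, so you can search for the likely target of an <xsl:apply-templates> call. This is not part of the DITA OT and not a method described in this post; the function name index_templates and the sample stylesheet are made up for illustration, and the sketch does not follow xsl:import or xsl:include.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def index_templates(stylesheet_text, source_name="(inline)"):
    """Return one (source, match, mode, name) tuple per xsl:template element."""
    root = ET.fromstring(stylesheet_text)
    rows = []
    for tmpl in root.iter(f"{{{XSL_NS}}}template"):
        rows.append(
            (source_name, tmpl.get("match"), tmpl.get("mode"), tmpl.get("name"))
        )
    return rows


# Hypothetical stylesheet used only to exercise the function.
SAMPLE = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="note">
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match="p" mode="toc"/>
  <xsl:template name="emit-title"/>
</xsl:stylesheet>"""

if __name__ == "__main__":
    for row in index_templates(SAMPLE, "sample.xsl"):
        print(row)
```

Running the same function over each .xsl file in a plugin directory and sorting the rows by match pattern would give a rough cross-reference to keep next to your own comments.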

One further note. On Tuesday, September 23, I'll be presenting the third of our "Best Practices in Structured Authoring and Publishing" joint Webinar series with JustSystems. In this presentation I'll describe a number of approaches you can use to customize DITA OT output. For more information, visit the JustSystems web site.




XPubs: XSL-FO for Documentation Formatting

Monday, June 23, 2008 — posted by Sarah

Mike Miller, Antenna House

For starters, XSL-FO is an XML standard.

XSL-FO is "a pagination markup language describing a rendering vocabulary capturing the semantics of formatting information for paginated presentation." (Ken Holman)

Or, as I like to say, "A document layout described in a text file."

XSL-FO is black box formatting. Can't go back and "tweak" the files to fix them. With FO, you're typically talking about a minimum of a couple hundred pages. Much faster to render automatically rather than by hand in InDesign or FrameMaker.

First commercial products in 2001 from Antenna House and RenderX. Also, open source FOP from Apache in 2001. FO successful in the sense that both commercial companies are doing quite well.

FO more successful than any other technical publishing application other than perhaps TeX and FrameMaker. Probably attributable to the availability of open source (free) and trial versions from commercial vendors (free).

XSL-FO is only concerned with visual display of XML data, which means that the FO file has no semantic content, only formatting instructions.
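
To make "a document layout described in a text file" concrete, here is a minimal sketch that builds a one-page FO document with Python's standard library. The element and attribute names (fo:root, fo:layout-master-set, fo:simple-page-master, fo:region-body, fo:page-sequence, fo:flow, fo:block) come from the XSL-FO vocabulary; the helper names, the A4 page size, and the sample text are assumptions for illustration and were not part of the presentation.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)  # serialize with the conventional fo: prefix


def fo(tag, parent=None, **attrs):
    """Create an FO element; underscores in keyword names become hyphens."""
    attrs = {key.replace("_", "-"): value for key, value in attrs.items()}
    name = f"{{{FO_NS}}}{tag}"
    if parent is None:
        return ET.Element(name, attrs)
    return ET.SubElement(parent, name, attrs)


def build_minimal_fo(text):
    """Return a tiny FO document as a string: one page master, one block."""
    root = fo("root")
    masters = fo("layout-master-set", root)
    page = fo("simple-page-master", masters, master_name="A4",
              page_height="297mm", page_width="210mm")
    fo("region-body", page)
    sequence = fo("page-sequence", root, master_reference="A4")
    flow = fo("flow", sequence, flow_name="xsl-region-body")
    block = fo("block", flow, font_size="12pt", space_after="6pt")
    block.text = text  # only text and formatting properties, no semantic markup
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_minimal_fo("Hello, formatter."))
```

Nothing in the output says whether the block was a title, a step, or a note in the source XML. That information has already been turned into font sizes and spacing, which is the point about FO carrying only formatting instructions.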

The FO stylesheet specifies:

Advantages:

Antenna House has been personally involved in about 30 different DITA projects.

Most business documents can be formatted automatically as FO. Rule of thumb: "If it's XML, FO can be applied."

Other applications for FO might include faxes, German railway tickets, correspondence from financial institutions and government.

Typesetting is very complex with issues like widows and orphans and hyphenation. Software can handle this. Human typesetters have been removed from the process, and this shows in amateurish mistakes. But you can use FO to configure something that follows typography rules and gives you a professional look and feel.

"Overwhelming benefits" of using FO. Which raises the question: "Why aren't more people using it?" A slide with the benefits of XML showing The Usual (cost, time-to-market, less redundancy, standards-based, localization for cost justification, etc.).

People who use FO: auto manufacturers, cell phone manufacturers, banks, aerospace, government, military, educational

FO not appropriate for documents that are "artistically created."

FO extensions provide support for:

Thus, if you need one of these features, you might get somewhat locked into your rendering engine, because the extensions are specific to a particular FO engine.

DITA Open Toolkit reduces the complexity of getting set up to produce PDF. Could be configured and producing PDF in "a couple of hours." (Perhaps, but making it look the way you want is going to take a while.) According to Mike, that part takes somewhere between a few days and a few months, depending on the complexity of your requirements.

PDF output from DITA

Stages:

Several software components are required -- DITA Open Toolkit provides all the components you need.

Why not FrameMaker or InDesign?

You need WYSIWYG if:

If you need WYSIWYG, you need a layout engine like FrameMaker or InDesign. If you need WYDSIWYN, you need XSL-FO.

On the low end, FO is free with FOP. Antenna House is the most expensive, at $1,250 for a stand-alone license or $5,000 for a server license.

FO supports more languages than any other solution currently available.

Solving the real problem:

XSL-FO is delivering on the XML promise. Don't underestimate it.

First question: Flowing text into a typesetting engine results in line breaks that will cause readers difficulty. And this annoys him (as a professional typesetter). We want powerful, automated formatting AND the ability to do WYSIWYG tweaks. He thinks there is a role for a WYSIWYG stage after the automation bit.

I've noticed this on the BBC, too. British people ask really pointed questions.

And in response, Mike says that Antenna House has a solution for this where you create INX (InDesign XML) content (4 minutes) and then you can pull it into InDesign (half an hour), and do some cleanup.

Do all the XSL-FO tools cover 100% of the FO standard? "No, definitely not."




WritersUA: Day 3, Morning

Wednesday, March 19, 2008 — posted by Sarah

Dave Gash (hypertrain.com) leads off the festivities with a discussion of the UA Holy Grail. And no, it's not DITA.

He is discussing True Separation of Content, Structure, Format, and Behavior.

Interesting, because we normally hear about separation of content and presentation -- he's making finer distinctions.

According to Dave, the current authoring method is to use WYSIWYG and code editors, often in combination. And as we work, we insert what's needed wherever it's needed. The result is that documents work -- once -- but are very difficult or impossible to update, maintain, and control.

Spaghetti-code documents make our own jobs harder.

The conventional wisdom is to separate content and formatting. Content is "stuff on the page"; therefore format must be "everything that is not content."

Content could include HTML, CSS, and JavaScript. Separating out CSS still leaves "junk" in the content pages.

Dave proposes a more refined model: content, structure, formatting, and behavior.

* Content is XML
* Structure is XSLT
* Format is CSS
* Behavior is JavaScript (JS)

This will be more maintainable, which means:

* Ability to change any components without breaking the others
* Ability to reuse any component in other pages or projects
* Ability to control each component's resource allocation (that is, who creates each piece?)

How to improve your pages:

1. Identify and externalize JS behavior.

* Find the embedded scripts (<script> tags) and replace them with a reference to an external foo.js file.

<script type="text/javascript" src="foo.js"></script>

2. Identify JS behavior that could be CSS and convert it to CSS rules.

"If you can encode with CSS and make it declarative instead of procedural, you're way ahead of the game."

* Catch "sneaky" JavaScript behavior, such as mouseover events, that could be CSS rather than JavaScript. Event handlers that call JavaScript almost always start with "on" -- easy to identify, and many can be replaced with CSS hover pseudo-classes.

.expterm:hover {font-style:italic; }
.expterm {text-decoration:none;}

Removing the code from the HTML greatly simplifies the page.

3. Identify and externalize CSS styles, recode any local formatting as classes.

Get rid of "deprecated tags and doo-doo like that."

Get rid of style attributes, font tags, b tags (become span tags).

"It's said that comments are for someone who comes behind you six months later and needs to update your code. This is not true. Comments are so that YOU can figure out six months later what you were doing in the code."

So you should comment your code.
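
Steps 1 through 3 all begin with finding what is embedded in the page. A minimal sketch of that search, assuming a Python script built on the standard html.parser module (the class name, the tag list, and the sample page are made up for illustration and are not from Dave's code examples):

```python
from html.parser import HTMLParser


class InlineCodeAudit(HTMLParser):
    """Collect inline behavior and formatting that could be externalized."""

    # Tags the notes above single out for replacement with CSS classes.
    PRESENTATIONAL_TAGS = {"font", "b"}

    def __init__(self):
        super().__init__()
        self.findings = []  # list of (line number, description)

    def handle_starttag(self, tag, attrs):
        line, _column = self.getpos()
        attr_names = [name for name, _value in attrs]
        # Step 1: a <script> tag without src carries embedded code.
        if tag == "script" and "src" not in attr_names:
            self.findings.append((line, "embedded <script> block"))
        # Step 3: presentational tags and inline style attributes.
        if tag in self.PRESENTATIONAL_TAGS:
            self.findings.append((line, f"presentational <{tag}> tag"))
        for name in attr_names:
            # Step 2: event handler attributes almost always start with "on".
            if name.startswith("on"):
                self.findings.append((line, f"event handler attribute {name}"))
            elif name == "style":
                self.findings.append((line, "inline style attribute"))


def audit(html_text):
    """Return the findings for one HTML page, in document order."""
    parser = InlineCodeAudit()
    parser.feed(html_text)
    parser.close()
    return parser.findings


SAMPLE = """<html><body>
<script>function hi() { alert("hi"); }</script>
<a href="#" onmouseover="hi()" style="color:red">term</a>
<font size="2"><b>bold</b></font>
</body></html>"""

if __name__ == "__main__":
    for line, description in audit(SAMPLE):
        print(f"line {line}: {description}")
```

The script only reports candidates; deciding whether a given mouseover handler can become a :hover rule is still a judgment call.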

4. Semantically mark up content as XML.

Dave's definition of semantic markup? "call things what they are."

5. Identify desired HTML output structure, write XSL transforms to produce it.
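
A rough sketch of steps 4 and 5 together, using Python as a stand-in for the XSL transform (this is not XSLT and not Dave's code). The element names <topic>, <title>, <glossaryitem>, <term>, and <definition>, the class name, and the file foo.css are assumptions for illustration; foo.js is borrowed from step 1.

```python
import xml.etree.ElementTree as ET

# Step 4: content marked up by what it is, with no formatting or behavior.
CONTENT = """<topic>
  <title>Widgets</title>
  <glossaryitem>
    <term>widget</term>
    <definition>A small, self-contained component.</definition>
  </glossaryitem>
</topic>"""


def to_html(xml_text):
    """Step 5: map the semantic elements onto the desired HTML structure."""
    topic = ET.fromstring(xml_text)
    html = ET.Element("html")
    head = ET.SubElement(html, "head")
    # Format and behavior stay in external files; the page only refers to them.
    ET.SubElement(head, "link", rel="stylesheet", href="foo.css")
    ET.SubElement(head, "script", src="foo.js")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = topic.findtext("title")
    for item in topic.findall("glossaryitem"):
        entry = ET.SubElement(body, "dl", {"class": "glossary"})
        ET.SubElement(entry, "dt").text = item.findtext("term")
        ET.SubElement(entry, "dd").text = item.findtext("definition")
    return ET.tostring(html, encoding="unicode", method="html")


if __name__ == "__main__":
    print(to_html(CONTENT))
```

Changing how glossary entries look or behave then means editing foo.css or foo.js; the XML content and the mapping function stay as they are.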

So...what's in it for me?

Discrete, maintainable, controllable components:

* You can change one component without breaking others
* You can share components with other pages
* You can separate work load by skill sets
* Set it and forget it! (for everything except the content)

Code examples are available at Dave's web site: www.hypertrain.com

Questions about tools. No, he won't recommend tools. Question about schemas...Dave says the first thing that comes to mind is...DocBook???

Yikes. In an answer to a question about print and XSL-FO, somebody recommended asking....me! (I swear I didn't pay her for that, and I don't think she even knew I was in the room. Quite surreal.)

##

My only disagreement with this session is with the separation of XML as "content" and XSLT as "structure." It's my opinion that the XML includes the structure, and XSLT just gives me a way to express that structure in HTML or other formats.

I also question some of his tag names, such as <expander> for a term/definition group. The expander tag name is really a description of the desired behavior (expandable text) rather than the semantic function of the content (definition of a term). I would probably choose something like <glossaryitem> for the container, leaving open the option of changing the behavior to something other than expansion in the future. Same quibble with <ddblock> (drop-down block).

I do like the use of the tag name for step results.

Great presentation from an energetic presenter whose motto is, "If I have to be awake, you do, too!"

Side note: I'm pretty sure that if you tied Dave's hands behind his back, he would lose his ability to speak.




Writing better XSL

Tuesday, May 01, 2007 — posted by Sarah

Jeni Tennison has a new blog. Her latest post has tips on when to use template matching, named templates, and for-each statements.

In my experience, most people who are new to XSL overuse for-each loops, because they most closely resemble familiar programming constructs.
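
A loose analogy in Python, not XSLT and not taken from Jeni's post, for why the loop feels familiar and what template matching offers instead. The sample document and the rule table are made up for illustration.

```python
import xml.etree.ElementTree as ET

DOC = ET.fromstring(
    "<steps><step>Open the file.</step><note>Save often.</note>"
    "<step>Close the file.</step></steps>"
)


def pull(root):
    """Pull style, roughly like xsl:for-each: one routine decides what to fetch.

    Anything it does not ask for (here, <note>) is silently skipped.
    """
    return [f"<li>{step.text}</li>" for step in root.findall("step")]


# Push style, roughly like xsl:apply-templates with matching templates:
# each element name has its own rule, and document order drives processing.
RULES = {
    "step": lambda el: f"<li>{el.text}</li>",
    "note": lambda el: f'<li class="note">{el.text}</li>',
}


def apply_templates(parent):
    """Process children in document order, dispatching on element name."""
    out = []
    for child in parent:
        rule = RULES.get(child.tag)
        if rule is not None:
            out.append(rule(child))
        else:
            # Simplified stand-in for a default rule: descend into children.
            out.extend(apply_templates(child))
    return out


if __name__ == "__main__":
    print(pull(DOC))
    print(apply_templates(DOC))
```

Adding a new element type in the push version means adding one rule; in the pull version it means revisiting every loop that walks the tree.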





Scriptorium Publishing | Post Office Box 12761 Research Triangle Park, NC 27709 | (919) 481 2701 | info@scriptorium.com