White paper on whitespace (and removing it)

Simon Bate / Tools

When I first started importing DITA and other XML files into structured FrameMaker, I was surprised by the excessive whitespace that appeared in the files. Even more surprising (in FrameMaker 8.0) were the red comments displayed via the EDD that said that some whitespace was invalid (these no longer appear in FrameMaker 9).

The whitespace was visible because of an odd decision by Adobe to handle all XML whitespace as if it were significant. (XML distinguishes significant whitespace from insignificant whitespace; most XML tools treat whitespace as insignificant except where it is needed, for example in <codeblock> elements.) This approach to whitespace exists in both FrameMaker and InDesign.
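To make the distinction concrete, here is a minimal sketch, assuming Python's standard xml.dom.minidom parser and a made-up DITA-like sample (neither comes from the white paper). It counts the whitespace-only text nodes that a parser hands to an application when a file has been pretty-printed. A tool that treats every one of those nodes as significant will show them all.

```python
from xml.dom import minidom

# Hypothetical DITA-like fragment, indented the way many XML editors save it.
SAMPLE = """<topic>
  <title>Example</title>
  <body>
    <p>Some text.</p>
  </body>
</topic>"""


def whitespace_only_text_nodes(node):
    """Return every text node under `node` that contains only whitespace."""
    found = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            if child.data.strip() == "":
                found.append(child)
        else:
            # Recurse into elements; childless nodes simply yield nothing.
            found.extend(whitespace_only_text_nodes(child))
    return found


doc = minidom.parseString(SAMPLE)
nodes = whitespace_only_text_nodes(doc.documentElement)
print(len(nodes))
```

For this sample the count is 5: three line-break-and-indent runs directly under <topic> and two under <body>. None of them carries content, yet each is a real node in the parsed tree.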

At first I handled the whitespace on a case-by-case basis, removing it by hand or through regular expressions. Eventually, I realized this was a more serious problem and created an XSL transform to eliminate the whitespace as part of preprocessing. By using XSL that was acceptable to Xalan (not that hard), the transform can be integrated into a FrameMaker structured application.
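The general idea behind this kind of preprocessing can be sketched in a few lines. What follows is not the XSL from the stylesheet; it is a rough Python illustration under stated assumptions: the PRESERVE set is a hypothetical list of elements whose whitespace must survive, and the rule "only strip inside elements that contain no real text of their own" is a simplification chosen so that spaces between inline elements are left alone. (In XSLT 1.0, the xsl:strip-space and xsl:preserve-space declarations address the same kind of problem.)

```python
from xml.dom import minidom

# Assumption: elements whose whitespace is significant (hypothetical list).
PRESERVE = {"codeblock", "pre", "screen"}

SAMPLE = """<topic>
  <title>Example</title>
  <body>
    <p>Run <b>this</b> <i>now</i>.</p>
    <codeblock>
  keep   this
</codeblock>
  </body>
</topic>"""


def strip_insignificant_whitespace(element):
    """Drop whitespace-only text nodes from element-only content."""
    if element.tagName in PRESERVE:
        return  # leave the whole subtree untouched
    children = list(element.childNodes)
    # Mixed content (real text next to child elements) is left as-is,
    # because a single space between two inline elements can matter.
    has_real_text = any(
        c.nodeType == c.TEXT_NODE and c.data.strip() for c in children
    )
    for child in children:
        if child.nodeType == child.TEXT_NODE and not has_real_text:
            element.removeChild(child)
        elif child.nodeType == child.ELEMENT_NODE:
            strip_insignificant_whitespace(child)


doc = minidom.parseString(SAMPLE)
strip_insignificant_whitespace(doc.documentElement)
result = doc.documentElement.toxml()
print(result)
```

In the result, the indentation between <topic>, <title>, <body>, and <p> is gone, while the space between the <b> and <i> elements and the contents of <codeblock> are unchanged. A production transform would need more care than this, for instance with leading and trailing line breaks inside paragraphs.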

I figured this whitespace problem must be affecting (and frustrating) more than a few of you out there, so I made the stylesheet available on the Scriptorium web site. I also wrote a white paper, “Removing XML whitespace in structured FrameMaker documents,” that describes the XSL that went into the stylesheet and how to integrate it with your FrameMaker structured applications.

The white paper is available on the Scriptorium web site. Information about how to download the stylesheet is in the white paper.

If the stylesheet and white paper are useful to you, let us know!

About the Author

Simon Bate

Involved in TechComm all my working life (since the time of vacuum tubes, core memory, punch cards, and bone implements). I've worked as a writer, as a manager, and, for the past score of years, as a builder of software tools for TechComm. My motto is "Let the computer do the work." Outside of work, I balance the calories I create and consume in the kitchen with weight-training sessions at the gym. I also sing tenor in various choirs and choruses.

2 Comments on “White paper on whitespace (and removing it)”

  1. The stylesheet does not seem to work with FrameMaker. Once I install it with my structured application, whenever I try to open a file, it tells me: Error at file c:\docume~1\username\LOCALS~1\temp\FMT689A.tmp, line 2 char 67, Message: could not open DTD file: c:\DOCUME~1\user\LOCALS~1\Temp\topic.dtd. Without it, I have no problems opening files. It’s too bad, because I need to remove whitespace created by XMetaL.

  2. Hi Carla,

    I’d have to see the file you’re trying to open in FrameMaker, but offhand it looks like there’s a problem with the existing DOCTYPE declaration in your files. The resolver is failing to find the public DTD via the catalog, and is falling back on the system DTD, but that isn’t available either.
