Introduction
Many XML editors add indentation and carriage returns to XML files so that they’re easier to read, a practice often called pretty printing. For the most part, this extra whitespace is fine because XML tools normally ignore it. For example:
<section>
<title>Match processing instruction
nodes</title>
<p>As mentioned earlier, FrameMaker
occasionally reacts badly to
non-FrameMaker processing
instructions. To forestall any problems, the template that
handles
processing instructions ignores all processing instructions except
those from
FrameMaker.</p>
<p>Using
the
<cmdname>&lt;xsl:choose&gt;</cmdname>statement
rather
than
<cmdname>&lt;xsl:if&gt;</cmdname>,
so that it
would be easier to modify the transform in
the future, should we want to handle
additional processing
instructions.</p>
Unfortunately, FrameMaker does not ignore whitespace when it imports XML documents. The whitespace that made the XML easier to read becomes extra spaces within text and between lines.

Here’s how the document looks in the structure view. In some cases, these extra <WHITESPACE> nodes cause FrameMaker 9 to crash when generating PDF files.

Because extra whitespace is found in so many XML files, I created an XSL transform to remove the whitespace before reading files into FrameMaker. You can run this transform separately or you can integrate it into a FrameMaker structured application. Here is the same block of text with the extra whitespace removed. (The actual output is all on one line, with no line breaks; it is wrapped here only for display.)
...<section><title>Match processing instruction nodes</title><p>As
mentioned earlier, FrameMaker occasionally reacts badly to non-FrameMaker
processing instructions. To forestall any problems, the template that
handles processing instructions ignores all processing instructions
except those from FrameMaker.</p><p>Using the <cmdname>&lt;xsl:choose&gt;</cmdname>statement
rather than <cmdname>&lt;xsl:if&gt;</cmdname>, so
that it would be easier to modify the transform in the future, should
we want to handle additional processing instructions.</p>...
Of course, there are some elements that preserve whitespace, such as the DITA <pre> or <codeblock> elements. The transform is aware of these and deals with them correctly.
In addition to handling whitespace, the transform also deletes all non-FrameMaker processing instructions. Although FrameMaker is supposed to ignore all non-FrameMaker processing instructions, some processing instructions cause FrameMaker to fail on reading XML files.
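To make the two behaviors above concrete, here is a minimal XSLT 1.0 sketch of the general approach. It is not the stylesheet presented at the end of this document, and it rests on some assumptions: that whitespace-only text nodes can be dropped, that <pre> and <codeblock> are the only whitespace-preserving elements, and that FrameMaker's processing instructions use the target name Fm (or FM). Check all three against your own files before relying on it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Do not let the processor add its own indentation on output. -->
  <xsl:output method="xml" indent="no"/>

  <!-- Identity template: copy everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Ordinary text: drop whitespace-only nodes and collapse runs of
       whitespace to one space. A single leading or trailing space is
       kept so that words next to inline elements do not run together. -->
  <xsl:template match="text()" priority="1">
    <xsl:variable name="collapsed" select="normalize-space(.)"/>
    <xsl:if test="$collapsed != ''">
      <xsl:if test="normalize-space(substring(., 1, 1)) = ''">
        <xsl:text> </xsl:text>
      </xsl:if>
      <xsl:value-of select="$collapsed"/>
      <xsl:if test="normalize-space(substring(., string-length(.))) = ''">
        <xsl:text> </xsl:text>
      </xsl:if>
    </xsl:if>
  </xsl:template>

  <!-- Assumption: pre and codeblock are the only elements whose
       whitespace must survive. Their text is copied untouched. -->
  <xsl:template match="text()[ancestor::pre or ancestor::codeblock]"
      priority="2">
    <xsl:copy/>
  </xsl:template>

  <!-- Assumption: FrameMaker processing instructions have the target
       Fm or FM. Those are copied; all others are discarded. -->
  <xsl:template match="processing-instruction()" priority="1">
    <xsl:choose>
      <xsl:when test="translate(name(), 'fm', 'FM') = 'FM'">
        <xsl:copy/>
      </xsl:when>
      <xsl:otherwise/>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The explicit priority attributes are there because the identity template and a bare text() match would otherwise have the same default priority. Dropping every whitespace-only text node is a simplification: a lone space between two inline elements would be lost, so a production transform needs more care there.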
The entire stylesheet is presented at the end of this document and is available for download from Scriptorium’s web site.
NOTE: This white paper assumes a basic understanding of XSL. However, it is possible to use the stylesheet without knowledge of XSL. Just start at the section named Using the stylesheet.
