Strange bedfellows: InDesign and DITA
or, What you need to know before you start working on a DITA to InDesign project.
There are a lot of ways to get your DITA content rendered into print/PDF. Most of them are notoriously difficult; DITA to InDesign, though, may have the distinction of the Greatest Level of Suck™.
InDesign XML formats
InDesign provides several XML formats. InDesign Markup Language (IDML) is the most robust. An IDML file is a zip container (similar to an EPUB). If you open up an IDML archive, you’ll find files that define InDesign components, such as pages, spreads, and stories. If you save a regular InDesign file to IDML, you can reopen the IDML file and get back your InDesign file, complete with layouts, graphics, formatting, customizations, and so on.
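Because an IDML file is an ordinary zip container, you can inspect one with any zip library. Here is a minimal Python sketch that groups the entries of a package by top-level folder. The demo archive is built in memory and is not a valid IDML file; the entry names and the mimetype string in it are illustrative assumptions, and the function names are hypothetical.

```python
import io
import zipfile
from collections import defaultdict


def list_package_parts(package):
    """Group the entries of a zip package by top-level folder.

    `package` is a file path or a binary file-like object.
    """
    parts = defaultdict(list)
    with zipfile.ZipFile(package) as archive:
        for name in archive.namelist():
            folder, sep, _rest = name.partition("/")
            # Entries that are not inside a folder are grouped under ".".
            parts[folder if sep else "."].append(name)
    return dict(parts)


def build_demo_package():
    """Build a tiny stand-in archive in memory (NOT a valid IDML file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Assumed entry names and mimetype value, for illustration only.
        archive.writestr("mimetype", "application/vnd.adobe.indesign-idml-package")
        archive.writestr("designmap.xml", "<Document/>")
        archive.writestr("Spreads/Spread_u1.xml", "<Spread/>")
        archive.writestr("Stories/Story_u2.xml", "<Story/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    parts = list_package_parts(build_demo_package())
    for folder, names in sorted(parts.items()):
        print(folder, names)
```

To look at a real file, pass the path of an IDML file you saved from InDesign to list_package_parts instead of the demo archive.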
IDML is both a file format and a markup language. The IDML language is used inside the IDML file. In addition, a subset of IDML markup is used in InCopy files (ICML). Where IDML can specify the entire InDesign file, ICML just describes a single text flow.
(There is also INX, but that format is for older versions of InDesign and has now been deprecated.)
If you are planning to output from DITA to InDesign, you probably want ICML. The IDML specification is available as a very user-friendly PDF on Adobe’s site. I spent many not-glorious hours plowing through that document.
My best tip: If you need to understand how a particular InDesign component is set up in IDML, create a small test file and then save the file out to InCopy (ICML) format. This will give you an almost manageable snippet to review. You’ll find that InDesign includes all possible settings in the exported file. When you create your DITA-to-ICML converter, you can probably create a snippet that is 90 percent smaller (and includes much less stuff). The challenge is figuring out which 10 percent you must keep.
Understanding the role of InDesign templates
Use an InDesign template to specify page masters, paragraph styles, character styles, table styles, and more. This template becomes your formatting specification document.
To import XML content, do the following:
- Create an ICML/IDML file that contains references to paragraph and other styles (more on this later).
- In InDesign, open a copy of the template file.
- Place the ICML file in your template copy. The style specifications in the template are then applied to the content in the ICML and you get a formatted InDesign file.
Of course, this nifty three-step procedure elides many months of heartbreak.
The mapping challenge
A basic paragraph, in DITA, looks like this:
<p>Paragraph text goes here.</p>
The equivalent output in IDML is this:
Paragraph text body goes here.
Some things to notice:
- The inline formatting (CharacterStyleRange) is specified even when there is no special formatting.
- The content is enclosed in a <Content> tag.
- The <Br/> tag toward the end is required. Without it, the paragraphs are run together. In other words, if you do not specify a line break, InDesign assumes that you do not want line breaks between paragraphs.
- Extra whitespace inside the <Content> tag (such as tabs or spaces) will show up in your output. You do not want this.
- Managing space between paragraph and character tags is highly problematic.
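The points above can be sketched in a few lines of Python. This is an illustration, not the converter described in this post: the function name and the "body" style name are hypothetical, and the attribute values follow my reading of the IDML markup rather than a verified export. It emits the character range even with no inline formatting, collapses stray whitespace before it reaches <Content>, and always appends the <Br/>.

```python
import xml.etree.ElementTree as ET


def paragraph_to_icml(text, paragraph_style="body"):
    """Build one ParagraphStyleRange for a plain DITA <p>.

    The style name "body" is a placeholder; use a name from your template.
    """
    para = ET.Element(
        "ParagraphStyleRange",
        AppliedParagraphStyle="ParagraphStyle/" + paragraph_style,
    )
    # A character range is emitted even when no inline formatting applies.
    chars = ET.SubElement(
        para,
        "CharacterStyleRange",
        AppliedCharacterStyle="CharacterStyle/$ID/[No character style]",
    )
    content = ET.SubElement(chars, "Content")
    # Collapse tabs, indentation, and line breaks from the source so they
    # do not show up as literal whitespace in the layout.
    content.text = " ".join(text.split())
    # Without <Br/>, this paragraph runs into the next one.
    ET.SubElement(chars, "Br")
    # Serialized on one line, without pretty-printing, so that the
    # serializer does not introduce whitespace of its own.
    return ET.tostring(para, encoding="unicode")


if __name__ == "__main__":
    print(paragraph_to_icml("  Paragraph text\n\tgoes here.  "))
```

Normalizing whitespace at this one point is a deliberate choice: it is easier than chasing down every source of stray spaces later in the pipeline.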
Other important information:
- You must declare the paragraph and character tags you are using at the top of the IDML file in the RootParagraphStyleGroup and RootCharacterStyleGroup elements, respectively.
- You cannot nest character tags in InDesign. Therefore, if you have nested inline elements in DITA, you must figure out how to flatten them:
<b><i>This is a problem in InDesign</i></b>
You have to create combination styles in InDesign.
- Generally, you will have more InDesign paragraph styles than DITA elements because DITA (and XML) have hierarchical structure. For example, a paragraph p tag might be equivalent to a regular body paragraph, an indented paragraph (inside a list), a table body paragraph, and more. You have to use the element’s context as a starting point for mapping it to InDesign equivalents.
- In addition to using hierarchical tags, if you want to maintain compatibility with specializations, you must use class attributes rather than elements for your matches. That leads to some highly awkward XSLT template match statements.
- In addition to paragraph and character styles, you need to declare graphics, cell styles, table styles, object styles, and colors. (There may be more. That’s what I found.)
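To make the flattening and context-mapping points concrete, here is a hedged Python sketch (a real Open Toolkit plugin would do this in XSLT). It matches on DITA class attribute tokens instead of element names, picks a paragraph style from the nearest matching ancestor, and flattens nested inline elements into runs with a combined style name. All style names and both mapping tables are hypothetical, and the sample assumes class attributes are already present in the input, as they are after DITA processing.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping tables; real names come from the InDesign template.
CONTEXT_STYLES = [("topic/entry", "table_body"), ("topic/li", "list_body")]
INLINE_STYLES = {"hi-d/b": "bold", "hi-d/i": "italic"}


def has_class(elem, token):
    """True if the DITA class attribute contains the given token."""
    return token in elem.get("class", "").split()


def paragraph_style(ancestors):
    """Pick a paragraph style from the nearest matching ancestor."""
    for ancestor in reversed(ancestors):
        for token, style in CONTEXT_STYLES:
            if has_class(ancestor, token):
                return style
    return "body"


def flatten_inline(elem, active=()):
    """Yield (combined_style, text) runs with no nesting."""
    own = tuple(s for t, s in INLINE_STYLES.items() if has_class(elem, t))
    styles = active + own
    name = "_".join(styles) if styles else "[No character style]"
    if elem.text:
        yield name, elem.text
    for child in elem:
        yield from flatten_inline(child, styles)
        if child.tail:
            # Tail text belongs to the parent, so it gets the parent's styles.
            yield name, child.tail


SAMPLE = (
    '<li class="- topic/li "><p class="- topic/p ">Plain '
    '<b class="+ topic/ph hi-d/b "><i class="+ topic/ph hi-d/i ">both</i> bold</b>'
    " end</p></li>"
)

if __name__ == "__main__":
    li = ET.fromstring(SAMPLE)
    print(paragraph_style([li]))
    for run in flatten_inline(li[0]):
        print(run)
```

Each combined name the sketch produces (such as bold_italic) still has to exist as a character style in the InDesign template. Because the matching is on tokens such as topic/li, a specialized element that inherits from li is picked up without any extra rules.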
Tables are not your friend. InDesign uses a particularly…unique table structure, in which it first declares the table grid and then just lists off all the cells. The grid coordinates start at 0:0. (Most “normal” table structures group the cells into rows explicitly.)
cell content goes here …
As you can see, this gets complicated fast.
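Here is a rough Python sketch of the conversion from row-grouped cells to the grid-then-cells layout. Treat the details as assumptions: the column:row order of the cell name and the attribute names reflect my understanding of IDML, the function name is hypothetical, and a real table needs many more attributes (widths, styles, identifiers) plus handling for spans and header rows.

```python
import xml.etree.ElementTree as ET


def rows_to_table(rows):
    """Turn row-grouped cell text into a grid declaration plus a flat cell list.

    Assumes a rectangular table with no spanned cells.
    """
    column_count = len(rows[0]) if rows else 0
    if any(len(row) != column_count for row in rows):
        raise ValueError("all rows must have the same number of cells")
    table = ET.Element(
        "Table",
        BodyRowCount=str(len(rows)),
        ColumnCount=str(column_count),
    )
    # First declare the grid: one Row and one Column element per index.
    for r in range(len(rows)):
        ET.SubElement(table, "Row", Name=str(r))
    for c in range(column_count):
        ET.SubElement(table, "Column", Name=str(c))
    # Then list every cell; its position lives only in the Name attribute.
    for r, row in enumerate(rows):
        for c, text in enumerate(row):
            # Assumed coordinate order: column index, then row index.
            cell = ET.SubElement(
                table, "Cell", Name=f"{c}:{r}", RowSpan="1", ColumnSpan="1"
            )
            para = ET.SubElement(cell, "ParagraphStyleRange")
            chars = ET.SubElement(para, "CharacterStyleRange")
            ET.SubElement(chars, "Content").text = text
    return table


if __name__ == "__main__":
    demo = rows_to_table([["a", "b"], ["c", "d"]])
    print(ET.tostring(demo, encoding="unicode"))
```

Note that the rows never wrap their cells in the output. A cell's place in the table is recoverable only from its name, which is why spans and merged cells take real care.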
There is so much more, but I think you get the idea. It is definitely possible to create a DITA to InDesign pipeline, but it is challenging. If you are looking at a project like this, you will need the following skills:
- Solid knowledge of InDesign
- Solid knowledge of the DITA tag set
- Ability to build DITA Open Toolkit plugins, which means knowledge of Ant and XSLT at a minimum
The open source DITA4Publishers project provides a pipeline for output from DITA to InDesign. We looked at using it as a starting point in mid-2013. At the time, we found that it would be too difficult to modify DITA4Publishers to support the extensive customization layers required by our client.
Our DITA to InDesign converter is a DITA Open Toolkit plugin (built on version 1.8). It supports multiple unrelated templates for different outputs and specialized content models. It also includes support for index markers, graphics, and other items not addressed in this document. Scriptorium is available for custom plugin work on InDesign and other output types. For more information, contact us.