DITA to InDesign: the gruesome details
We’ve written before on what lurks beneath the surface of an InDesign file, and how drastically it differs from the DITA standard. When you’re looking at going from DITA to InDesign, though, there’s a lot that you need to take into consideration before you jump in.
DITA separates formatting from content, but formatting content is one of the most powerful features that InDesign offers. You need to prepare your DITA content for the transition from a no- or low-design environment to a high-design platform. You also need to ensure that your InDesign environment is ready, or you’ll wind up with inconsistently formatted content, or worse, output that will crash the program when you try to import it.
The DITA side
Taxonomy: You need to make sure that you know your content. InDesign offers a wide range of ways to format your content, but there’s not always a direct mapping from DITA. For example, a paragraph element could have a basic body style applied, or perhaps it needs a style with a different margin. How do you determine this?
- Robust metadata will allow you to identify elements that need to be treated differently. The quickest way is to use the outputclass attribute, but for subtle variations on a style, you may need to consider…
- Specialization allows you to define custom structures. If you have a type of admonition that lacks a label and adds detail to text nearby, you might create a callout element.
- Don’t forget the stock offerings of the DITA specification. Images in particular can already specify things like alignment, which may fulfill your needs.
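As a rough illustration of the outputclass approach, here is a minimal Python sketch of a lookup from a DITA element and its outputclass to a paragraph style name. The style names ("Body", "Body Indented", and so on) and the outputclass value "indented" are hypothetical; a real transform would use the names defined in your own template.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: (element name, outputclass) -> paragraph style name.
# Replace these with the paragraph styles that exist in your template.
STYLE_MAP = {
    ("p", None): "Body",
    ("p", "indented"): "Body Indented",
    ("note", None): "Note",
    ("callout", None): "Callout",  # assumed specialized element
}


def paragraph_style_for(element):
    """Return the paragraph style name for a DITA element."""
    outputclass = element.get("outputclass")
    # Try the most specific match first, then the element's default style.
    for key in ((element.tag, outputclass), (element.tag, None)):
        if key in STYLE_MAP:
            return STYLE_MAP[key]
    raise KeyError(f"No style mapped for <{element.tag}>")


body = ET.fromstring(
    '<body><p>Plain.</p><p outputclass="indented">Indented.</p></body>'
)
styles = [paragraph_style_for(p) for p in body]
print(styles)
```

Falling back to the element's default style means an unrecognized outputclass still produces formatted output, while an unmapped element fails loudly instead of silently losing its style.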
Information overload: Since it’s a desktop publishing (DTP) platform, InDesign takes some shortcuts. Images, in particular, are a challenge. When you add images to your DITA content, be sure to include both height and width information, because of the way that InDesign stores image display information. Rather than saying that an image appears at a point in the document and is X pixels wide and Y pixels high, InDesign identifies an anchor point, then a series of four points that describes a frame, and then places the image within it. Without both height and width, or an image format that you can draw those dimensions from, you’ll have trouble defining how the image displays. The moral of the story: if you have the information available, you should include it.
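To see why both dimensions matter, consider this small Python sketch that turns a pixel width and height into the four corners of a frame measured in points. The function name, the 96 dpi default, and the coordinate convention (origin at the anchor, y increasing downward) are assumptions for illustration only; they are not taken from the IDML specification.

```python
def frame_corners(width_px, height_px, dpi=96):
    """Return four (x, y) corners of an image frame, in points.

    Assumes the origin is the anchor point and y grows downward.
    """
    if width_px is None or height_px is None:
        # With only one dimension there is no way to place all four corners.
        raise ValueError("both width and height are required to build a frame")
    width_pt = width_px * 72.0 / dpi   # 72 points per inch
    height_pt = height_px * 72.0 / dpi
    return [
        (0.0, 0.0),             # top left
        (0.0, height_pt),       # bottom left
        (width_pt, height_pt),  # bottom right
        (width_pt, 0.0),        # top right
    ]


corners = frame_corners(192, 96)
print(corners)
```

If either value is missing from the DITA source, the transform has to read it from the image file itself or stop, which is the situation the paragraph above warns about.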
Just plain weird stuff: While Adobe has made the IDML standard public, InDesign itself isn’t anticipating someone coming along and placing raw code into a template. This results in some very strange behavior.
- If you have a paragraph element that ends with bolded text, when you place your output into a template, all of the text following that paragraph element will be bolded until InDesign finds another character style element.
- If something goes wrong with your output and InDesign doesn’t like it, one of two things will happen: the offending content will be dropped, or InDesign will crash without any error. Debugging this can be an exercise in patience.
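One possible way to guard against the runaway bold text described above is to give every run of text an explicit character style and to close each paragraph with an unstyled range. The Python sketch below builds such a paragraph using the ParagraphStyleRange and CharacterStyleRange elements found in ICML stories. The helper name and the "Body" and "Bold" style names are hypothetical, and this is a workaround sketched under those assumptions, not necessarily what any particular plugin does.

```python
import xml.etree.ElementTree as ET

# Reference used for runs that have no character style applied.
NO_CHAR_STYLE = "CharacterStyle/$ID/[No character style]"


def paragraph_range(paragraph_style, runs):
    """Build a ParagraphStyleRange from (text, character_style_or_None) runs."""
    para = ET.Element(
        "ParagraphStyleRange",
        AppliedParagraphStyle=f"ParagraphStyle/{paragraph_style}",
    )
    for text, char_style in runs:
        applied = f"CharacterStyle/{char_style}" if char_style else NO_CHAR_STYLE
        rng = ET.SubElement(para, "CharacterStyleRange", AppliedCharacterStyle=applied)
        ET.SubElement(rng, "Content").text = text
    # Always end on an explicit unstyled range that carries the paragraph
    # break, so a trailing bold run cannot carry over into the next paragraph.
    closing = ET.SubElement(
        para, "CharacterStyleRange", AppliedCharacterStyle=NO_CHAR_STYLE
    )
    ET.SubElement(closing, "Br")
    return para


para = paragraph_range("Body", [("This ends in ", None), ("bold text", "Bold")])
print(ET.tostring(para, encoding="unicode"))
```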
The InDesign side
The most important part of preparing the InDesign portion of your workflow is getting your templates in order. They should either be prepared before you begin working on your DITA taxonomy requirements, or developed alongside them.
- Do you need more than one template, or can you use a master template? If you need specific layouts or master pages, you’ll need multiple templates. If the paragraph, character, or object styles between those templates differ, you’ll need to communicate that to whoever is working on your plugin.
- How well-defined are your object styles? You need to take into account not only things like margins, but also text wrap.
- Do any of your style names have special characters in them? Avoid them if you can. Otherwise, the style declarations on the DITA side need to be escaped, and if they’re not escaped properly, InDesign will crash when you try to place your content into the template.
- Do your paragraph styles have their hyphenation properties set up correctly? If you know you have tables that will be narrow, you need to be careful about this. If the text in a table cell is too short to become overset, but long enough to hyphenate and then become overset, InDesign will crash when you try to place your content into the template.
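Because a badly escaped style name can crash InDesign rather than raise an error, it may be worth checking style names before running the transform. The following Python sketch flags names that contain anything outside a conservative set of characters. The allowed set (letters, digits, spaces, underscores, hyphens) and the sample style names are assumptions; the characters that actually need escaping are not listed here.

```python
import re

# Assumed "safe" set: letters, digits, space, underscore, and hyphen.
SAFE_STYLE_NAME = re.compile(r"^[A-Za-z0-9 _-]+$")


def unsafe_style_names(names):
    """Return the style names that contain characters outside the safe set."""
    return [name for name in names if not SAFE_STYLE_NAME.match(name)]


# Hypothetical style names collected from a template.
template_styles = ["Body", "Table Cell", "Note: Warning", "Code/Inline"]
flagged = unsafe_style_names(template_styles)
print(flagged)
```

Any name that is flagged can then be renamed in the template or handed to whoever maintains the plugin so that the escaping is handled deliberately.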
While transforming DITA into an ICML file will allow you to quickly get your content into InDesign, it isn’t a smart process.
- Since an ICML file lacks any kind of page information, the only page breaks that will appear are those that are dictated by your paragraph styles.
- An image only knows where its anchor point is relative to the body content it appears near. This means that if you have multiple images in close proximity, there’s no way to prevent them from overlapping.
- When you auto-flow content into a page in a template, it uses the same master page throughout. If you have sections of your content that require a different master page, you’ll have to apply it by hand.
Despite these limitations, being able to leverage DITA’s portability and reusability with InDesign’s high-design environment remains a tantalizing prospect. PDF allows for quick, consistent publishing of your content, but any edits require new output, and any formatting updates require changes to the plugin. If you have a production pipeline that utilizes InDesign and you value the fine control that it grants you, a DITA to InDesign workflow may be worth it.
I recently gave a presentation on using desktop publishing applications to format DITA content (http://www.balisage.net/Proceedings/vol17/html/Cuellar01/BalisageVol17-Cuellar01.html), as we at Quark use a similar production process using QuarkXPress Server for layout of DITA content (and have a DITA Open Toolkit plugin that handles the transformation process for our customers). Thus, it’s great to hear that others have successfully managed the process using InDesign.
Thanks for describing some of the limitations. One of the comments to my presentation was that someone had tried going through IDML but was frustrated at the lack of support in IDML for constructs like footnotes, which probably goes back to your comment about there not always being a direct mapping. Did you experience a similar frustration, or feel there were enough workarounds to produce a well-formatted document?
Initially, yes, I found myself frustrated at the process. The IDML specification is less than friendly to read, and attempting to develop new features resulted in either a silent crash or content not appearing. The real turning point in my understanding was realizing that while InDesign might implement a feature differently, it still implemented it. The trick is just figuring out how they did it. Generally, this meant that I mocked up content in InDesign the way I wanted it to look, saved it out, then opened up the IDML and looked at the raw spreads to see how they wanted it done.
Developing a DITA to InDesign transform is definitely a unique challenge, but I wouldn’t say that a lack of support for some features was a hurdle; more that the way those features were implemented forced me to look at how I treated the content in a new way.