“Once you start down the DITA path, forever will it dominate your destiny”

Sarah O'Keefe / Opinion

Eliot Kimber has a lovely article on using DITA for narrative documents. I’m trundling through it, nodding in agreement, and then we have this horror:

[…] DITA offers at least two compelling advantages over any other candidate XML application:

  1. The initial cost of ownership is low, approaching zero, and the ongoing cost of ownership is low.
  2. It offers a number of sophisticated features in terms of modularity, extensibility, and linking that either are not provided by other applications or would cost a prohibitively large amount to build from scratch.

That is, the cost of applying DITA is almost always going to be significantly lower than the cost of any alternative (and at worst will be no more expensive than any other alternative).

Now, he does qualify this statement by saying that these assertions apply only if DITA is a reasonable fit for your problem. But the overall thrust of the argument appears to be that since DITA can do narrative documents (which it was emphatically not designed for), it can potentially be applied to an enormous new set of content.


Before I begin today’s DITA-bashing session, I need to point out that we are currently using DITA for several projects here at Scriptorium. DITA slices! DITA dices! DITA advocacy raises your IQ, improves your health, and makes you irresistible. I like DITA just fine.

Moving right along…

“1. The initial cost of ownership is low, approaching zero, and the ongoing cost of ownership is low.”

Just because it’s free doesn’t mean it’s cheap. The default output from the DITA Open Toolkit ranges somewhere between unattractive (HTML) and fugly (PDF). If you care about the appearance of your final documents, you are going to have to do a lot of work to get the look and feel you want. And although the OT offers a starting point, customizing it is kind of like a trip to the dentist. The impacted-wisdom-tooth-removing kind of trip.

Getting your output working properly is Not Easy because of the, er, unique design of the OT. If the set of tags you need is small, you might be better off building a nice petite NovelML and then writing the transformations you need for NovelML instead of wrestling with DITA’s complexities.

“2. It offers a number of sophisticated features in terms of modularity, extensibility, and linking that either are not provided by other applications or would cost a prohibitively large amount to build from scratch.”

I agree that DITA has some lovely features in this area. However, I fail to see how they apply to the example at hand — a narrative document such as Moby Dick. If you need modularity, extensibility, and linking features, you should consider DITA. If you don’t, then these features will just get in the way.

“That is, the cost of applying DITA is almost always going to be significantly lower than the cost of any alternative (and at worst will be no more expensive than any other alternative).”

If DITA is overkill for your requirements, then applying DITA may not be cheaper.

But the issue that upsets me the most is this: when you attack a problem by assuming (or hoping) that DITA will work, you necessarily look for DITA features you can use. You may not think carefully about non-DITA features that you might like to have. For fiction content, I can think of several things that would be quite useful (and for which DITA offers no immediate support):

  • For a book that is part of a series (like a science fiction trilogy), a listing of the entire series and an indication of where the current book falls in the series.
  • Metadata to identify the point of view. Many novels switch from one narrator to another, or from a first-person point of view to an omniscient point of view. It would be lovely to filter the content to see only the first-person content (after reading the book from cover to cover as the author intended).
  • Similarly, metadata that helps with scene location and time could be invaluable for studying literature written with numerous flashbacks. The Time Traveler’s Wife and anything by Jasper Fforde come to mind.
  • The ability to index by character occurrence. This is more often seen in nonfiction books, especially biographies. But imagine scanning the entire Harry Potter series for scenes with Severus Snape to determine whether his ultimate allegiance was consistent.
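To make the point-of-view and character-index ideas concrete, here is a minimal sketch in Python. It assumes a hypothetical NovelML: the element and attribute names (novel, scene, pov, narrator, characters) are invented for illustration and are not DITA markup or any existing standard. The sample content is placeholder text.

```python
import xml.etree.ElementTree as ET

# Hypothetical NovelML sample. The vocabulary (novel, scene, pov, narrator,
# characters) is an assumption made for this sketch, not part of DITA.
SAMPLE = """
<novel title="Sample Novel">
  <scene id="s1" pov="first-person" narrator="Ishmael" characters="Ishmael Queequeg">
    <p>Placeholder text for scene one.</p>
  </scene>
  <scene id="s2" pov="omniscient" characters="Ahab Starbuck">
    <p>Placeholder text for scene two.</p>
  </scene>
  <scene id="s3" pov="first-person" narrator="Ishmael" characters="Ishmael Ahab">
    <p>Placeholder text for scene three.</p>
  </scene>
</novel>
"""


def scenes_by_pov(root, pov):
    """Return the ids of all scenes written from the given point of view."""
    return [scene.get("id") for scene in root.iter("scene") if scene.get("pov") == pov]


def character_index(root):
    """Map each character name to the ids of the scenes in which it appears."""
    index = {}
    for scene in root.iter("scene"):
        # The characters attribute is a space-separated list of names.
        for name in scene.get("characters", "").split():
            index.setdefault(name, []).append(scene.get("id"))
    return index


root = ET.fromstring(SAMPLE.strip())

print(scenes_by_pov(root, "first-person"))  # ['s1', 's3']
print(character_index(root)["Ahab"])        # ['s2', 's3']
```

With only a handful of purpose-built attributes, both the filter and the index take a few lines each. Whether that is cheaper than specializing DITA depends on what else you need from the markup.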

Of course, you could pervert and/or specialize DITA to support these and other requirements. But if you start with a DITA-shaped box, how likely is it that you will think carefully about the possibilities outside the box?

As Eliot says, the advantages of DITA can be significant. But I fear that a generation of documents will be crammed into DITA, resulting in documents that are not as well structured as they need to be.

I will now await my smackdown from the DITA Disciples.


DITA Dissident

About the Author

Sarah O'Keefe


Content strategy consultant and founder of Scriptorium Publishing. Bilingual English-German, voracious reader, water sports, knitting, and college basketball (go Blue Devils!). Aversions to raw tomatoes, eggplant, and checked baggage.

8 Comments on ““Once you start down the DITA path, forever will it dominate your destiny””

  1. Sarah’s objections are legitimate but somewhat miss my point. But I also realized, in reading her comments, that I took a few things for granted.
    First, no standard, out-of-the-box XML application that might be usable for publishers will provide appropriate formatting or deliverable production. That is, the effort needed to generate published deliverables from XML is more or less a constant regardless of what XML markup scheme you start with.
    I will also mention that I’m on record as being in the process of trying to develop a general InDesign-based publishing system for DITA that will go a long way toward providing a much higher quality starting point than the current free stuff provided with the Open Toolkit can provide. There’s no technical barrier, it’s just a matter of finding the time to do it.
    The aspects of cost of ownership and use to which I most refer are the initial design and implementation of the markup itself: DITA absolutely makes that as low cost as it could be, both in terms of initial cost (the base models are reasonable starting points for at least experimentation) and ongoing cost (the specialization feature makes it as easy as possible to develop new and variant document types that will work with DITA-aware processors without modification required [modification is only needed if your specialized markup also requires special processing, which often it does not]).
    I’m not sure why you think that defining specialized metadata would be in some sense “perverting” DITA: that’s what the DITA 1.1 “data” element is designed for: to let you define and use whatever metadata you need, in almost any context (granted, this is new in DITA 1.1 so if you’re used to DITA 1.0 constraints I can understand your frustration). See the DITA 1.1 bookmap specialization for an example of defining a completely new set of publication-specific metadata elements.
    Of course, Moby Dick is a trivial or contrived example: I just wanted some quickly available content to use in my example.
    But if you think of slightly more sophisticated content you should see that it might be at least as good a fit as DocBook.
    Another point that I didn’t have time to make is that there is a significant value to having an all-DITA environment, meaning that all your content can take advantage of the same base DITA processing infrastructure. Some of your publications will really need that power and some won’t. But since it doesn’t cost any more to use DITA for the less-sophisticated publications, it makes sense to give it serious consideration.
    Also, I will suggest that even documents you didn’t think would need things like modularization or sophisticated linking or re-use might just turn out to be more sophisticated than you thought. While I don’t normally advocate engineering to potential requirements as a rule, in this case the engineering is already there to use and it doesn’t cost anything to have it at your disposal.

  2. Thanks for the response. Some response to your response…
    I think that the effort of creating reasonable-looking PDF through the OT could be *higher* than the effort of building the output from scratch — whether through a DITA-to-InDesign, DITA-to-FrameMaker, DITA-through-AntennaHouse, or whatever. There are obviously a lot of variables here, but if the formatting requirements are significantly different from what’s provided by default OT output, I don’t see the OT adding a whole lot of value. It looks as though we’re actually more or less in agreement on this point.
    Let me try a stupid analogy in an effort to clarify my main point: frozen pie crust.
    Now, frozen pie crust is a wonderful thing. It saves lots of time in baking, and it lets you focus on the main event (the pie filling). I happen to think that my scratch-made pie crust is better than the store-bought stuff, but the difference may only be apparent to a foodie. Most people won’t notice.
    However, when I start my baking process with a frozen pie crust, there are some issues. For example, key lime pie is really much better with graham cracker crust. I can’t make flour-based frozen pie crust into graham cracker crust.
    I can specialize pie crust into tarts, crostata, deep dish pie, lattice pie, turnovers, and lots of other things, but there are limits.
    Furthermore, and perhaps more importantly, when I start with a frozen pie crust, I assume that the answer to my problem is, well, something in the pie category.
    I have already eliminated angel food cake, creme brulee, buche de Noel, and strudel from consideration.
    If you choose frozen pie crust, you limit your dessert options.
    Excuse me while I go look for a snack. I’m suddenly quite hungry.

  3. With regard to the value of the OT: it’s important to remember that the OT operates in two phases.
    The first phase applies all the really difficult DITA-specific, output-independent processing to do things like resolve map references, resolve content references, and apply props= values (conditionality, labeling, etc.). It is this processing that would be very expensive to build yourself (I know, I’ve done it in the past) and likewise expensive to define the markup that supports it. This is the out-of-box part of DITA that represents a lot of its immediate value as a base: you get a lot for free.
    The second phase is output-specific. The second phases (the various Toolkit plug-ins) you get for free vary in quality, from pretty good (HTML) to barely usable (PDF2). But they’re free.
    Any processing of DITA for any output is going to want what the first phase does. But it’s free.
    The data processing you need to do to transform the output of the first phase into some rendition is not significantly different from what you would need for any other XML document type that otherwise met your needs. That is, taking DocBook to InDesign, or some custom schema to InDesign will not be significantly easier or harder than taking the pre-digested output of Phase 1 into InDesign. So that cost is essentially a constant regardless of what markup base you start with (have you ever tried to adapt the DocBook XSLT code to your own processing needs? It’s really hard, certainly no easier than adapting any of the Toolkit-supplied stuff. The main difference is that there’s a book about how to do the DocBook work and not yet one for DITA, but that reflects the fact that DocBook has been around for more than 10 years).
    So given that the Toolkit is free, and it works, and you can do whatever you want with the output of phase 1 it doesn’t really change the implementation cost to use it instead of not using it (that is, using a non-DITA-based XML application).
    Of course, this assumes that your requirements are sophisticated enough that something trivial wouldn’t be enough, but for Publishers it would be very rare indeed that their requirements were not that sophisticated, even for apparently “simple” publications.
    As for the pie crust analogy, I think that DITA is much closer to a set of ingredients from which you can make a variety of different kinds of crusts or to which you can easily add some new ingredients to make novel crust variants, rather than simply a frozen pie crust. Perhaps it’s packaged in such a way that if all you do is open the package and add water you get normal pie crust, but if you separate the parts in the package you have what you need to make graham crust or shortbread or even tortillas.

  4. Essentially, yes.
    If we were discussing DITA 1.0 I would be much more in sympathy with your analysis, but I think DITA 1.1 changes the equation a good bit, more than might be initially apparent.
    Another important development over the last year is the number of products that have released useful DITA support at remarkably low prices, such as both Syntext Serna and OxygenXML providing useful, graphical editing of DITA content with complete support for specializations and minimal setup effort for local shells or specializations. That alone goes a long way to lowering the cost of usage.
    In particular, being able to create a local specialization and have your editor “just work” really changes things because now you can experiment very quickly with new markup designs without having to make any initial investment in authoring tool setup or customization.
    That was definitely not the case even six months ago.

  5. What a vivid word picture with the key lime pie analogy.
    And yes, it made me hungry also.
    I’m wrestling DITA around in my brain for its importance also however…
    I might just stick in my eLearning hidey-hole for this discussion.
    And go find some pie. I’m now hungry for some reason or another.

  6. Sarah, you’re a treasure!
    It’s funny how anyone who says anything about DITA that might remotely be considered a criticism is SO careful to state something like:
    “… DITA slices! DITA dices! DITA advocacy raises your IQ, improves your health, and makes you irresistible. I like DITA just fine.”
    Considering how quickly the DITArati respond to criticism, it seems sometimes as if the TC has a SWAT team in place ready to deploy at a moment’s notice. They’ve been quite successful, evidenced by our almost reflexive tendency to qualify our statements on DITA.
    In any case, I so enjoyed your post that I was actually moved to post on my blog (a rare occurrence).
    See you at WritersUA!
