July 30, 2010

Using Ant to find a needle in a haystack

Many content management systems (CMSs) take over the responsibility of file naming. For the most part, this is fine and is actually necessary for maintaining cross-references and conrefs within the CMS. When you use the CMS to build a DITA map, the CMS uses its own names in the <topicref> elements. In the final output, all the links work and it’s not really important that the file names aren’t human-readable; they don’t need to be.

Except when your users (or your Help system) require some files to have specific names. Then you’re up the proverbial creek. Or are you?

Here’s a situation I encountered.  The DITA topics for a help system are managed in a CMS, which uses its own file names when exporting files. I’m customizing the DITA Open Toolkit HTML Help transformation to create a CHM file. The help project file (HHP) refers to an HTML file with a specific, static file name (a client requirement). I’m using the DITA OT, so I’m using Apache Ant to build the help. How can I find the specific HTML file (with a CMS-generated name) and make it available to the HHP file with a static name?

The solution involves the ID attribute for the source topics.  The dita2xhtml transform preserves the source topic’s ID (“my_topic_id” in this example) in an HTML <meta> tag:

<meta name="DC.Identifier" content="my_topic_id">

With a combination of the <copy> task, the <contains> selector, and the <mergemapper> file mapper, I can get what I need:

<copy todir="${output.dir}" overwrite="true">
   <fileset dir="${output.dir}" includes="**/*.html">
      <contains text='name="DC.Identifier" content="my_topic_id"' casesensitive="no"/>
   </fileset>
   <mergemapper to="MyTopic.html"/>
</copy>

The <fileset> with the <contains> selector finds the HTML file containing the string I’m looking for (provided the ID is unique). The <mergemapper> then tells Ant that the copy of the file must have the static name “MyTopic.html”.
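Because the approach depends on the ID being unique, it can help to make the build check that assumption. The following is a minimal sketch of a standalone build file, not part of the original customization: the project and target names, the topic.id and static.name properties, and the "out" directory are all hypothetical placeholders. It assumes Ant 1.7 or later for the <resourcecount> condition. It also excludes the static copy from the search, so that a rebuild does not match the file created by a previous run.

```xml
<project name="static-help-name" default="rename-topic" basedir=".">
  <!-- Hypothetical values; replace them with the ones your build uses. -->
  <property name="output.dir" location="out"/>
  <property name="topic.id" value="my_topic_id"/>
  <property name="static.name" value="MyTopic.html"/>

  <target name="rename-topic">
    <!-- Collect every HTML file whose DC.Identifier meta tag carries the topic ID.
         The static copy itself is excluded so reruns do not count it as a match. -->
    <fileset id="matching.topics" dir="${output.dir}"
             includes="**/*.html" excludes="${static.name}">
      <contains text='name="DC.Identifier" content="${topic.id}"' casesensitive="no"/>
    </fileset>

    <!-- Stop the build unless exactly one file matched. -->
    <fail message="Expected exactly one HTML file with topic ID ${topic.id} in ${output.dir}.">
      <condition>
        <not>
          <resourcecount refid="matching.topics" count="1"/>
        </not>
      </condition>
    </fail>

    <!-- Copy the single match to the static name the HHP file expects. -->
    <copy todir="${output.dir}" overwrite="true">
      <fileset refid="matching.topics"/>
      <mergemapper to="${static.name}"/>
    </copy>
  </target>
</project>
```

Failing early when zero or several files match means a missing or duplicated ID stops the build with a clear message, rather than letting an arbitrary file end up under the static name.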

Now a target file with the appropriate name exists and the HTML Help Compiler can run without complaining.