Skip to main content
DITA

Tech Tips: Quick Word to DITA table conversion

The other day I had to convert a large table from Word to DITA. I started looking at Word XML output and thought about transforming it with XSL (which I have done in the past), but that seemed to be too much trouble for this document. Then I remembered a technique an old SQL coder showed me for loading large amounts of data into a SQL table.  I realized this technique could be readily adapted to DITA.

Read More
DITA Webinar

Webcast: Demystifying DITA to PDF publishing

When you implement a DITA-based workflow, you face myriad new challenges, such as getting accustomed to topic-based writing, exploring reuse strategies, and specialization. The most difficult technical obstacle is usually setting up a PDF/print publishing workflow. The DITA Open Toolkit provides very basic PDF output, but for organizations who require attractive, professional-looking PDF content, extensive and expensive customization is required. FrameMaker is easier to configure than the Open Toolkit and produces lovely PDF files, but can you work around the limitations of the DITA support? InDesign offers the highest quality typography but has significant limitations in working with structured content. This session discusses the advantages and disadvantages of each approach to extracting PDF from DITA content.

This session is intended for individuals who are considering a DITA implementation and expect to need PDF output. Basic familiarity with DITA, XML, and related technologies is helpful but not required.
NOTE: During the recording, the presenters will mention polls. You will not see these polls while viewing the recording, but the presenters will describe the results.

Read More
DITA Webinar

2010: A DITA Odyssey

When you’re considering tools for authoring DITA content and creating output, there are many choices to evaluate. To make your journey toward DITA implementation easier, Scriptorium is offering free webinars in early 2010 to show you how three tools handle DITA-based information.

Odysseus in front of Scylla and Charybdis by Johann Heinrich Füssli (Wikipedia)

Odysseus in front of Scylla and Charybdis by Johann Heinrich Füssli (Wikipedia)

On January 19, Sarah O’Keefe will show you how MadCap Flare supports DITA constructs, and on February 16, Simon Bate will demonstrate the DITA features in the oXygen XML editor. On March 16, Scott Prentice of Leximation will demonstrate how the DITA-FMx plugin works with FrameMaker 9.

As an added bonus, attendees can win a free license of the tool shown during each demo! For more information about these sessions and to register, visit our events page.

If there are other topics you’d like to see covered in later free webcasts, please send suggestions to [email protected].

Read More
DITA-OT Industry insights

The long and winding roads from DITA to PDF

by Sheila Loring

DITA XML is of little use to readers unless it’s converted to some kind of output. The DITA Open Toolkit (DITA OT) provides transforms and scripts that convert DITA to PDF output and a long list of other formats.

Producing PDF output from DITA content can be challenging. DITA XML is converted to an XSL-FO file, a combination of content and formatting instructions. You must know XSL-FO to customize the PDF, even just to add simple content such as headers and footers, logos, and so on.

To forgo the programming, you can choose a page layout or help authoring tool, but these tools also have pitfalls. Page layout programs have varying degrees of DITA support. Help authoring tools let you style the PDF through CSS, but you can’t fine-tune page layout as you can in page layout programs.

These are just a few examples we discuss in our white paper “Creating PDF files from DITA content.” Read the white paper online (in HTML or PDF).

Read More
Webinar

New Webinar Series: Things to consider when moving to DITA

Scriptorium and JustSystems are announcing a three-webinar series on preparing to use DITA.

The first two webinars in the series describe the age-old problem of converting legacy content into DITA. Because a great deal of unstructured content is in either Adobe FrameMaker and Microsoft Word, we’re dedicating one webinar to converting Unstructured FrameMaker to DITA and the other to converting Microsoft Word to DITA.

The third webinar describes various re-use strategies you can apply to your DITA content.

The dates and times for the conversion webinars are:

  • Converting Unstructured FrameMaker to DITA – August 25, 2:00pm Eastern time.
  • Converting Microsoft Word to DITA – September 1, 2:00pm Eastern time.

The date and time for the third webinar (DITA reuse strategies) will be announced toward the end of August.

All of the webinars in the series are free, but you do have to register before attending. To sign up, follow this link to the JustSystems web site:

http://na.justsystems.com/webinars.php

Register now!

Read More
Webinar

Learn DITA and XML at your desk

For August and September, our webinar schedule is as follows:
DITA 101, August 18 at 11 a.m. Eastern time
Participants will learn about basic Darwin Information Typing Architecture (DITA) concepts, the business case for implementing DITA, and some typical uses of DITA. This webinar is ideal for those who are considering a move to structured authoring based on the DITA standard. Register
Demystifying DITA to PDF Publishing, September 10 at 11 a.m. Eastern time
When a company implements a DITA-based workflow, the most difficult technical obstacle is often setting up a PDF/print publishing workflow. This session discusses the advantages and disadvantages of using the DITA Open Toolkit, FrameMaker, InDesign, and other options to create PDF output from DITA content. Basic familiarity with DITA, Extensible Markup Language (XML), and related technologies is helpful but not required. Register
What Do Movable Type and XML Have in Common?, September 22 at 11 a.m. Eastern time
The invention of movable type changed the economics of information by making the process of copying a book by hand obsolete. More than 500 years later, XML seems to be doing the same to desktop publishing. But where movable type changed the economics of a mechanical process—creating printed 
copies—XML changes the economics of content authoring, formatting, and customization. This webinar takes a look at how publishing technologies revolutionize the way people consume information and how those technologies affect authors. Register
Each webinar is $20. 
During the sessions, you can interact with the presenter and other students through the chat interface or the audio connection. There is a question-and-answer session at the end of each webinar. The Q&A is not included in session recordings, which are available for download later. Participants in the sessions receive a free recording.
To register for these webcasts, or to purchase recordings of past webinars, go to our online store.

Read More
DITA

Automated trademarking in structured documents – DITA in particular

Unabashed plug warning: The following entry gives a conceptual overview of a solution Scriptorium has implemented for managing trademarks in structured tagging. And we’re proud of it.

You know the problem. According to your style standards, only the first instance of a given trademarked term should display the trademark symbol. Structured documentation allows you to re-use document parts (such as DITA topics) in just about any order you like. In Manual A, the first file containing the trademarked text is, say, Topic A; in Manual B the first file containing the trademarked text is Topic E, which is also used in Manual A. Where do you put your trademark markup, and how do you maintain it when running Manual A and Manual B at approximately the same time?

Maintaining the trademarks by hand adds a level of effort that becomes non-negligible when you start considering a large number of manuals. And the process becomes error prone – those darned human beings. Different writers might tag things different ways, trademarks might escape notice, or markup might be inserted in inappropriate places by accident.

Isn’t this one of those problems that automated documentation was supposed to solve, not create? I once had a professor who said that computers were supposed to handle the work that computers could solve so people could work on the problems that only people can solve.

More than one of Scriptorium’s customers has presented us with this problem, so we know it is not uncommon. We have found a way to deal with the problem in DITA, and we believe that the principle is sufficiently generic to use in non-DITA structures as well.

To begin with, forget conditional processing. It won’t help you with the problem of marking only the first instance of a term. In the example of Manual A, above, setting the condition “Manual A” would still display the trademark in Topic A and Topic E. This is not what your editor wants – and he or she will let you know it in spades if he or she is any kind of editor at all.

Scriptorium’s solution for DITA, in simple outline, is as follows:

  1. Using XSL, go through the ditamaps and remove all trademarking from the document files.

  2. Following a predefined list of trademarked and registered trademarked terms, go through the ditamaps and identify the files that contain each term. Create a temporary file that lists the relevant files in order of book occurrence. (This step prevents having to crawl through the ditamaps more than once.)

  3. Using Perl, iterate through the files listed for each term in the temporary file. Check the occurrence of each instance of the term, in text order, and evaluate whether it is a valid occurrence that requires trademarking. If so, wrap the appropriate trademark markup around it and go to the next trademark. If not, keep going through the text and the list of files until you find a valid occurrence of this trademark.

We possibly could have used XSL instead of Perl for the third step, but Perl’s text manipulation capability is much more robust than XSL’s, so we chose Perl.

In the implementation, the trademarking utility is coordinated by an Ant process. A user runs this utility just before the book is rendered for output. Being in Ant, the trademarking process could probably be integrated into the DITA Open Toolkit build system fairly easily to create a seamless, one-step production process.

There are a number of interesting problems that arise during implementation. For example, in step 3 the process has to evaluate whether the instance of a term is valid for trademarking. Some kinds of non-valid instances of a term in the text might be:

  • The term is in an indexterm tag.

  • The term is in an href attribute.

  • The term is in a title.

  • The term is in a codeblock tag.

You might also encounter a condition where a trademarked term could be both mixed case and all uppercase. Per your style guide, only the first instance of either should be marked, but not the first instance of both. That sort of requirement makes life just a little more interesting for a coder.

In general, the issue of trademarking first instances is not a simple problem to solve, and variations in style requirements will undoubtedly add complexity and challenges to the problem. But that’s what automated documentation is supposed to be good at, right? So we humans can get back to doing the more difficult problems that only people can solve.

I’m not sure – is that really such a good deal?

Read More