Table of contents

Abstract

What is structured authoring?

What is XML?

The impact of structured authoring on a publishing workflow

Workflow options

Roles and responsibilities

Developing a business case for structured authoring and XML

Does your organization need structure?

Implementing a structured workflow

Summary

What is structured authoring?

Structured authoring is a publishing workflow that lets you define and enforce consistent organization of information in documents, whether printed or online. In traditional publishing, content rules are captured in a style guide and enforced by (human) editors, who read the information and verify that it conforms to the approved style. A few simple examples of content rules are as follows:

In structured authoring, these content rules are captured in a definition file, which is either a document type definition (DTD) or a schema. Authors work in software that validates their documents; that is, the software verifies that the documents they create conform to the rules in the definition file.

Consider, for example, a simple structured document—a recipe. A typical recipe requires several components: a name, a list of ingredients, and instructions. The style guide for a particular cookbook states that the list of ingredients should always precede the instructions. In an unstructured authoring environment, the cookbook editor must review the recipes to ensure that the author has complied with the style guideline. In a structured environment, the recipe structure requires and enforces the specified organization.
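The ordering rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not how a DTD or schema validator works internally: the element names Name and Instructions, the REQUIRED_ORDER list, and the check_recipe function are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule: the children a Recipe must contain, in this order.
REQUIRED_ORDER = ["Name", "IngredientList", "Instructions"]


def check_recipe(xml_text):
    """Return a list of rule violations; an empty list means the recipe conforms."""
    root = ET.fromstring(xml_text)
    if root.tag != "Recipe":
        return [f"expected a Recipe root element, found {root.tag}"]
    found = [child.tag for child in root]
    if found != REQUIRED_ORDER:
        return [f"expected children {REQUIRED_ORDER}, found {found}"]
    return []


# Ingredients precede instructions, as the style guide requires.
good = "<Recipe><Name>Tzatziki</Name><IngredientList/><Instructions/></Recipe>"
# Instructions come first, which breaks the rule.
bad = "<Recipe><Name>Tzatziki</Name><Instructions/><IngredientList/></Recipe>"

print(check_recipe(good))
print(check_recipe(bad))
```

In a real structured workflow, the authoring software performs this kind of check against the DTD or schema, so the editor no longer has to review the order by hand.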

Elements and hierarchy

Structured authoring is based on elements. An element is a unit of content; it can contain text or other elements. Nesting elements inside other elements produces a hierarchy, which you can view as a tree of nodes and branches.

In a recipe, for example, the ingredient list can be broken down into ingredients, which in turn contain items, quantities, and preparation methods, as shown in Figure 1.

Figure 1: Recipe hierarchy

The element hierarchy allows you to associate related information explicitly. The structure specifies that the IngredientList element is a child of the Recipe element. The IngredientList element contains Ingredient elements, and each Ingredient element contains two or three child elements (Item, Quantity, and optionally Preparation). In an unstructured, formatted document, these relationships are implied by the typography, but unstructured publishing software (a word processor or desktop publishing application) does not capture the actual relationship.
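As a rough illustration, the recipe hierarchy can be written as XML and walked with Python's standard library. The element names Recipe, IngredientList, Ingredient, Item, Quantity, and Preparation come from the text; the Name element, the sample content, and the outline helper are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Sample recipe; the second Ingredient omits the optional Preparation element.
RECIPE_XML = """
<Recipe>
  <Name>Tzatziki</Name>
  <IngredientList>
    <Ingredient>
      <Item>cucumber</Item>
      <Quantity>1</Quantity>
      <Preparation>grated</Preparation>
    </Ingredient>
    <Ingredient>
      <Item>yogurt</Item>
      <Quantity>2 cups</Quantity>
    </Ingredient>
  </IngredientList>
</Recipe>
""".strip()


def outline(element, depth=0):
    """Return one indented line per element, listing parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(RECIPE_XML)
tree_lines = outline(root)
print("\n".join(tree_lines))

# The parent-child relationship is explicit: each Ingredient knows its children.
ingredients = root.findall("./IngredientList/Ingredient")
child_counts = [len(ingredient) for ingredient in ingredients]
print(child_counts)
```

The indentation in the printed outline mirrors the tree in Figure 1. Unlike typography, the nesting is stored in the document itself, so software can query it directly.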

In structured documents, the following terms denote hierarchical relationships:

Element attributes

You can store additional information about the elements in attributes. An attribute is a name-value pair that is associated with a particular element. In the recipe example, attributes might be used in the top-level Recipe element to provide additional information about the recipe, such as the author and cuisine type (Figure 2).

Figure 2: Attributes capture additional information about an element

Attributes provide a way of further classifying information. If each recipe has a cuisine assigned, you could easily locate all Greek recipes by searching for the attribute. Without attributes, this information would not be available in the document. To sort recipes by cuisine in an unstructured document, a cook would need to read each recipe.
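A search by attribute might look like the following sketch. The attribute names author and cuisine follow the example in the text, but their exact spelling, the Cookbook wrapper element, the sample recipes, and the recipes_by_cuisine function are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical cookbook: each Recipe carries author and cuisine attributes.
COOKBOOK_XML = """
<Cookbook>
  <Recipe author="A. Cook" cuisine="Greek"><Name>Tzatziki</Name></Recipe>
  <Recipe author="B. Baker" cuisine="French"><Name>Ratatouille</Name></Recipe>
  <Recipe author="A. Cook" cuisine="Greek"><Name>Spanakopita</Name></Recipe>
</Cookbook>
""".strip()


def recipes_by_cuisine(cookbook, cuisine):
    """Return the names of all recipes whose cuisine attribute matches."""
    matches = cookbook.findall(f"./Recipe[@cuisine='{cuisine}']")
    return [recipe.findtext("Name") for recipe in matches]


cookbook = ET.fromstring(COOKBOOK_XML)
greek = recipes_by_cuisine(cookbook, "Greek")
print(greek)
```

The query never reads the recipe text. It relies only on the attribute, which is why the same search is impractical in an unstructured document.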

Formatting structured documents

To format structured documents, you associate formatting with particular elements or element sequences. Such formatting is usually highly automated; once an author assigns elements to content, the formatting is applied automatically to create the final output files.
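One way to picture element-based formatting is a lookup table that maps element names to formatting rules. The sketch below is a simplified assumption, not a description of any particular publishing tool: the RULES table, the format_ingredient and render functions, the Name element, and the plain-text output style are all invented for this example.

```python
import xml.etree.ElementTree as ET

RECIPE_XML = """
<Recipe>
  <Name>Tzatziki</Name>
  <IngredientList>
    <Ingredient>
      <Item>cucumber</Item>
      <Quantity>1</Quantity>
      <Preparation>grated</Preparation>
    </Ingredient>
    <Ingredient>
      <Item>yogurt</Item>
      <Quantity>2 cups</Quantity>
    </Ingredient>
  </IngredientList>
</Recipe>
""".strip()


def format_ingredient(ingredient):
    """Format one Ingredient as a bullet: quantity, item, optional preparation."""
    line = f"* {ingredient.findtext('Quantity')} {ingredient.findtext('Item')}"
    preparation = ingredient.findtext("Preparation")
    if preparation:
        line += f", {preparation}"
    return line


# Hypothetical formatting rules, keyed by element name.
RULES = {
    "Name": lambda element: element.text.upper(),
    "Ingredient": format_ingredient,
}


def render(recipe):
    """Apply the rule for each element that has one, in document order."""
    lines = []
    for element in recipe.iter():
        rule = RULES.get(element.tag)
        if rule:
            lines.append(rule(element))
    return "\n".join(lines)


output = render(ET.fromstring(RECIPE_XML))
print(output)
```

Because the rules are attached to element names rather than to individual passages, changing one entry in the table reformats every matching element in every recipe.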

 


Scriptorium Publishing | Post Office Box 12761 Research Triangle Park, NC 27709 | (919) 481 2701 | info@scriptorium.com