What is structured authoring?
Structured authoring is a publishing workflow that lets you define and enforce consistent organization of information in documents, whether printed or online. In traditional publishing, content rules are captured in a style guide and enforced by (human) editors, who read the information and verify that it conforms to the approved style. A few simple examples of content rules are as follows:
- A heading must be followed by an introductory paragraph.
- A bulleted list must contain at least two items.
- A graphic must have a caption.
In structured authoring, a file, either a document type definition (DTD) or a schema, captures these content rules. Authors work in software that validates their documents; the software verifies that the documents they create conform to the rules in the definition file.
Consider, for example, a simple structured document—a recipe. A typical recipe requires several components: a name, a list of ingredients, and instructions. The style guide for a particular cookbook states that the list of ingredients should always precede the instructions. In an unstructured authoring environment, the cookbook editor must review the recipes to ensure that the author has complied with the style guideline. In a structured environment, the recipe structure requires and enforces the specified organization.
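As a rough illustration of what "the structure enforces the organization" means, the following Python sketch checks that a recipe's top-level components appear in the required order. It is a hand-written stand-in for real validation, which a structured authoring tool would perform against a DTD or schema; the rule list, the check_recipe function, and the sample recipes are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical content rule: a Recipe contains exactly these children, in this order.
REQUIRED_ORDER = ["Name", "IngredientList", "Instructions"]


def check_recipe(xml_text):
    """Return a list of rule violations; an empty list means the recipe conforms."""
    recipe = ET.fromstring(xml_text)
    if recipe.tag != "Recipe":
        return [f"root element is {recipe.tag}, expected Recipe"]
    children = [child.tag for child in recipe]
    if children != REQUIRED_ORDER:
        return [f"children are {children}, expected {REQUIRED_ORDER}"]
    return []


# Sample documents invented for this sketch.
good = "<Recipe><Name>Hummus</Name><IngredientList/><Instructions/></Recipe>"
bad = "<Recipe><Name>Hummus</Name><Instructions/><IngredientList/></Recipe>"

print(check_recipe(good))  # no violations
print(check_recipe(bad))   # reports that Instructions precedes IngredientList
```

In the second recipe the instructions come before the ingredient list, so the check reports a violation without an editor having to read the recipe.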
Elements and hierarchy
Structured authoring is based on elements. An element is a unit of content; it can contain text or other elements. You can view the hierarchy of elements inside other elements as a set of nodes and branches.
This nesting forms a hierarchical tree. In a recipe, the ingredient list can be broken down into ingredients, each of which contains an item, a quantity, and optionally a preparation method, as shown in Figure 1.

The element hierarchy allows you to associate related information explicitly. The structure specifies that the IngredientList element is a child of the Recipe element. The IngredientList element contains Ingredient elements, and each Ingredient element contains two or three child elements (Item, Quantity, and optionally Preparation). In an unstructured, formatted document, these relationships are implied by the typography, but unstructured publishing software (a word processor or desktop publishing application) does not capture the actual relationship.
In structured documents, the following terms denote hierarchical relationships:
- Tree—The hierarchical order of elements.
- Branch—A section of the hierarchical tree.
- Leaf—An element with no descendant elements. Name, for example, is a leaf element in Figure 1.
- Parent/child—A child element is one level lower in the hierarchy than its parent. In Figure 1, Name, IngredientList, and Instructions are all children of Recipe. Conversely, Recipe is the parent of Name, IngredientList, and Instructions.
- Sibling—Elements are siblings when they are at the same level in the hierarchy and have the same parent element. Item, Quantity, and Preparation are siblings.
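The terms above can be made concrete with a small Python sketch that loads the recipe hierarchy into a tree and reads off children, leaves, siblings, and a branch. The element names come from the recipe example; the recipe content (hummus, chickpeas, and so on) is invented for illustration, and Python's standard xml.etree.ElementTree module is just one convenient way to hold such a tree.

```python
import xml.etree.ElementTree as ET

# Sample recipe invented for this sketch; element names follow the recipe example.
RECIPE_XML = """
<Recipe>
  <Name>Hummus</Name>
  <IngredientList>
    <Ingredient>
      <Item>chickpeas</Item>
      <Quantity>2 cups</Quantity>
      <Preparation>drained</Preparation>
    </Ingredient>
    <Ingredient>
      <Item>tahini</Item>
      <Quantity>1/4 cup</Quantity>
    </Ingredient>
  </IngredientList>
  <Instructions>Blend until smooth.</Instructions>
</Recipe>
"""

recipe = ET.fromstring(RECIPE_XML)

# Parent/child: the direct children of the Recipe element.
children_of_recipe = [child.tag for child in recipe]

# Leaf: any element that has no child elements.
leaf_tags = sorted({el.tag for el in recipe.iter() if len(el) == 0})

# Sibling: elements that share the same parent (here, the first Ingredient).
first_ingredient = recipe.find("IngredientList/Ingredient")
siblings = [el.tag for el in first_ingredient]

# Branch: IngredientList together with everything below it, in document order.
branch_tags = [el.tag for el in recipe.find("IngredientList").iter()]

print(children_of_recipe)
print(leaf_tags)
print(siblings)
print(branch_tags)
```

The second Ingredient has no Preparation child, which reflects the optional status of that element.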
Element attributes
You can store additional information about the elements in attributes. An attribute is a name-value pair that is associated with a particular element. In the recipe example, attributes might be used in the top-level Recipe element to provide additional information about the recipe, such as the author and cuisine type (Figure 2).

Attributes provide a way of further classifying information. If each recipe has a cuisine assigned, you could easily locate all Greek recipes by searching for the attribute. Without attributes, this information would not be available in the document. To sort recipes by cuisine in an unstructured document, a cook would need to read each recipe.
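A minimal Python sketch of an attribute search follows. It assumes the recipes are collected under a hypothetical Cookbook wrapper element, which is not part of the recipe example, and the recipe names and authors are invented. The author and cuisine attributes on Recipe follow the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection of recipes; only the attributes matter for this sketch.
COOKBOOK_XML = """
<Cookbook>
  <Recipe author="A. Author" cuisine="Greek"><Name>Tzatziki</Name></Recipe>
  <Recipe author="B. Author" cuisine="Italian"><Name>Pesto</Name></Recipe>
  <Recipe author="A. Author" cuisine="Greek"><Name>Spanakopita</Name></Recipe>
</Cookbook>
"""

cookbook = ET.fromstring(COOKBOOK_XML)


def recipes_by_cuisine(root, cuisine):
    """Return the names of all recipes whose cuisine attribute matches."""
    # ElementTree supports the XPath-style predicate [@attribute='value'].
    matches = root.findall(f"Recipe[@cuisine='{cuisine}']")
    return [r.findtext("Name") for r in matches]


greek = recipes_by_cuisine(cookbook, "Greek")
print(greek)  # the two Greek recipes, in document order
```

The search never looks at the recipe text itself; it relies only on the cuisine attribute recorded on each Recipe element.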
Formatting structured documents
To format structured documents, you associate formatting with particular elements or element sequences. Such formatting is usually highly automated; once an author assigns elements to content, the formatting is implemented automatically to create the final output files.
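The idea of attaching formatting to elements can be sketched in Python as a lookup table from element names to formatting functions. This is only an illustration under simple assumptions: the FORMATS table, the render function, and the plain-text output style are all hypothetical, and publishing tools generally use stylesheets or templates rather than code like this.

```python
import xml.etree.ElementTree as ET

# Hypothetical formatting rules: element name -> function producing formatted text.
FORMATS = {
    "Name": lambda text: text.upper(),          # recipe title in capitals
    "Item": lambda text: f"  * {text}",         # each ingredient as a bullet
    "Instructions": lambda text: f"\n{text}",   # blank line before instructions
}


def render(element):
    """Walk the tree in document order and apply the rule for each element."""
    lines = []
    for el in element.iter():
        rule = FORMATS.get(el.tag)
        text = (el.text or "").strip()
        if rule and text:
            lines.append(rule(text))
    return "\n".join(lines)


# Sample recipe invented for this sketch.
recipe = ET.fromstring(
    "<Recipe><Name>Hummus</Name>"
    "<IngredientList><Ingredient><Item>chickpeas</Item></Ingredient>"
    "<Ingredient><Item>tahini</Item></Ingredient></IngredientList>"
    "<Instructions>Blend until smooth.</Instructions></Recipe>"
)

print(render(recipe))
```

The author never applies formatting directly. Changing one entry in the table changes the appearance of every matching element in every recipe.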
