Skip to main content
August 6, 2018

Learning to love nesting in structured content

Nested content is one of the biggest differences between structured and unstructured content.

Content in a word processor is generally flat. Although you express hierarchy with headings (Heading1, Heading2, and so on), the document itself is just a list of paragraph tags.

Screen shot of Word file with H1, H2, Normal tags applied in a column on the left side

In this simple example, you can figure out where each question/answer pair starts and stops, but with longer, multiparagraph answers, figuring out the starting and stopping points gets challenging.

In structured content, you use nesting to capture the relationships:

Same document with boxes indicating nested relationships

The FAQ tag indicates that the entire document is an FAQ. Each question and answer pair is contained in an FAQItem, and questions and answers are labeled with tags that indicate the meaning.

In XML code, this document looks something like this:

<Question>What is your name?</Question>
<Answer>My name is John Doe.</Answer>
<Question>What is your quest</Question>
<Answer>To seek the perfect content model.</Answer>
<Question>What is your favorite color?</Question>
<Answer>Blue. No, green!</Answer>

The XML FAQItem tag collects the question/answer pairs. Notice also that the “Question: ” prefix is eliminated. Given a specific Question tag (instead of a generic H2), we can add the prefix when formatting is applied instead of typing it into the text over and over again. If you need to translate your questions and answers, you can even change the text from “Question” to “Pregunta” or “Frage” automatically.

When content is nested, you can:

  • Easily identify a chunk of content and process it
  • Apply metadata to the chunk
  • Mix and remix content chunks

The transition from unstructured to structured content means that we need to re-examine hierarchy so that the nesting can capture content relationships. Otherwise, we just end up with a flat structure that matches the original word processor document.