March 20, 2023

Unpacking structured content, DITA, and UX content with Keith Anderson

In episode 139 of The Content Strategy Experts Podcast, Sarah O’Keefe and special guest Keith Anderson dive into their experiences with structured content, DITA, and user content.

“My definition of context is anything that affects the cognitive processing of information. […] So, whether you’re consuming information by reading or listening, there are so many factors that affect how you process the context of the content.”

Related links:



  • Floridi, Luciano. The Fourth Revolution: How the Infosphere Is Reshaping Human Reality. 1st edition. New York; Oxford: Oxford University Press, 2014.
  • Duranti, Alessandro, and Charles Goodwin. Rethinking Context: Language as an Interactive Phenomenon. Cambridge [England]; New York: Cambridge University Press, 1992.
  • Stein, Howard F. Euphemism, Spin, and the Crisis in Organizational Life. Westport, Conn.: Quorum Books, 1998.
  • Stein, Howard F. Nothing Personal, Just Business: A Guided Journey into Organizational Darkness. Westport, Conn.: Quorum Books, 2001.


Sarah O’Keefe: Welcome to the Content Strategy Experts Podcast, brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we talk about structured content, DITA, and user context. Hi, I’m Sarah O’Keefe, and I’m here with a special guest, Keith Anderson. Keith is a longtime friend and one of the very few people, I think, in the world who understands both DITA structured content and the world of UX content. So Keith, welcome aboard.

Keith Anderson: Hi. It’s good to be here.

SO: Thanks for coming.

KA: Of course.

SO: So first, give us a bit of a background on structured content and DITA and what your sort of experience is in that space.

KA: Oh, okay. So I go back to SGML days when I was working at a telecom company and we were doing structured content back then, and it was mainly in DocBook, but structured content lent itself really well to being repurposed or single sourced, like we used to call it. There was a point where we were actually single sourcing out the instruction sets for online help, for printed documentation, for instructional design, and we also used them for test scripting. So that’s kind of how I understood the power of structured content.

SO: I just want to note that we are still wrestling with single sourcing and learning content and technical content. So having somebody tell us we did this back in the day, pre-DITA is pretty encouraging.

KA: Yeah.

SO: So then digital transformation comes along, and I think you’ve said that there you can’t really apply DITA directly, but you came up with a way of making that work. What does that look like?

KA: Okay, so out of what I would call the mainstream content management systems, out of all of them, only Adobe Experience Manager actually natively supports DITA. And Adobe has DITAWORLD every year, but when you look at content management systems like SharePoint and Sitecore, they don’t support it. So I was brought on board to do a project a few years ago. It was an online help system, and when I did the content audit, it was like two and a half billion words, and they had been maintaining it in some old tool and then they were just porting it over to the online help system. But it was taking a lot of time. They were trying to move everything into Sitecore. And a few things that I noticed, one was they weren’t using some of the best Sitecore features, which are inheritance and repurposing content. That’s just built in.

The other thing that they weren’t doing was planning out content to be repurposed. So I got the bright idea that I would use DITA because when we did our design thinking sessions, we kept coming back to that, the fact that this was an online help system and DITA lends itself really well to that. So I came in and I ended up with my own little server and … Let me back up just a second. Sitecore, all it is, is a fancy interface for a bunch of XML schemas. And so I thought, well, theoretically it’s possible to enforce DITA on Sitecore, but DITA broke everything else. And I started doing research and I talked to a guy in the Netherlands who told me that the surest way to hell was to try to put DITA in Sitecore.

So what I did to circumvent this was I did content modeling and I came up with the idea of using DITA as a platform-independent model, meaning that we use it for terminology and we use it for reference, but we can’t technically implement it. So the platform is not dependent on any of the schemas in DITA. And we did that, and that actually helped quite a bit because it did provide us with structure. And then we were able to set up search hierarchies and things like that on the Solr server. Solr is the search engine that ships with Sitecore most times, and it worked out pretty well that way.

SO: So you’re saying that essentially you used the concept of a DITA reuse or something like that, but you implemented it without using the standard DITA [inaudible 00:04:39]?

KA: Right. But it was a really good place to refer to. So we used the DITA vocabulary, we used the idea of how DITA content is separated out into topics, and then I introduced topic-based writing to these authors who had been doing very verbose writing on things that didn’t need to be verbose. So we were able to cut out two thirds of the content just by going through and doing that.
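[Editor’s illustration, not part of the conversation: the idea of using DITA as a platform-independent model could be sketched roughly as below. The topic types borrow DITA’s vocabulary (concept, task, reference), but no DITA schema is enforced. The class names, the `CMS_TEMPLATES` mapping, and the field names are all hypothetical and are not taken from Sitecore or from the project Keith describes.]

```python
from dataclasses import dataclass, field
from enum import Enum


class TopicType(Enum):
    # Names borrowed from the DITA vocabulary; no DITA schema is enforced.
    CONCEPT = "concept"
    TASK = "task"
    REFERENCE = "reference"


@dataclass
class Topic:
    """A platform-independent topic: one idea, one type, one title."""
    topic_id: str
    topic_type: TopicType
    title: str
    body: str
    keywords: list[str] = field(default_factory=list)


# Hypothetical template names; a real CMS would define its own.
CMS_TEMPLATES = {
    TopicType.CONCEPT: "HelpConceptPage",
    TopicType.TASK: "HelpTaskPage",
    TopicType.REFERENCE: "HelpReferencePage",
}


def to_cms_item(topic: Topic) -> dict:
    """Translate a topic into a CMS-shaped record with a search facet."""
    return {
        "template": CMS_TEMPLATES[topic.topic_type],
        "name": topic.topic_id,
        "fields": {
            "Title": topic.title,
            "Body": topic.body,
            # The topic type doubles as a facet for a search hierarchy.
            "SearchFacet": topic.topic_type.value,
            "Keywords": ", ".join(topic.keywords),
        },
    }
```

[The point of the sketch is that the model stays separate from the platform: authors think in topic types, and only `to_cms_item` knows what the CMS expects.]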

SO: So that’s really interesting because it’s one of the big issues that our clients struggle with is this question of, okay, we have web content and we have DITA content, and how do we put the two together? Or how do we integrate them in some way? So in your work, in addition to looking at these sort of structured concepts and putting them in, even if you’re not strictly speaking using DITA or I guess even if you’re not using DITA period, you focused very much on context and the relationship of content and context. So I guess we have to start with the basics, which is what is context or what is your definition of context?

KA: My definition of context is anything that affects the cognitive processing of information. It’s an idea that context is three-dimensional and that, well, the author Luciano Floridi, he created a term called infosphere, and he essentially says that in today’s world, we are living in an infosphere. And it makes a lot of sense because if you imagine context is all around you. So whether you’re consuming information by reading it or you’re listening to it or whatever, there’s so many factors that affect how you process the context of the content. So for example, when I lived in the Chicago area and I took the train downtown every day, I was constantly reading, but was interrupted a lot just by train stops or noise or whatever until I learned to put on headphones just so I could read and focus on that instead of what was happening around me.

So context is very situational. Some things affect you, some things don’t. There’s many, many examples of when you have more context, it completely alters the way that you see something. One example that I could think of is controversial, but it’s Bill Cosby. With all the controversy that’s happened with him, does that for you as an individual, does that affect how you see his life’s work, which was comedy? And so there are factors where context utterly changes things over time. And some things you can control, some things you can’t. I think companies like Comcast who are notoriously hated by most consumers have trust issues regardless of the intent of content writers in the company. And that’s a context those writers cannot control.

SO: So they have no goodwill and that’s their context.

KA: Yeah. And the flip side of it is the context of creation. And back in the day when we were doing online help, you remember how we would talk about can you write good online help for bad software? I mean listen, we had late night drunken discussions about this at STC conferences, but I think the modern dilemma for content strategy is can you write good content for a bad corporation or for a bad organization? I think it’s a philosophical issue. How do you build trust? How do you be authentic without engineering authenticity? All of those things are contextual and people pick up on it. It’s like magic. You can tell if somebody has written something under pressure versus they’ve taken their time and they’ve crafted prose. Readers know this and they know it intuitively just because of the way our brains are wired.

SO: So I guess this is really interesting because the canonical example of context is always location. If you’re at this location, you get different kinds of information, or if you look up weather, if you look up weather that corresponds to your current location and there’s a tornado warning or something like that, it will give you a very different experience than if your phone knows where you are but you’re looking up a tornado warning hundreds of miles away. And it’s just like, hey, by the way, there’s a tornado warning, maybe traffic, but if it’s right on top of you, it’s going to give you a different kind of experience because the context matters. Obviously I’m concerned about the tornado no matter what, but if it’s on top of me, I’ve got an immediate, “I need to stay alive” problem as opposed to a sort of more, I guess, academic distant interest. So what does it look like to have DITA or generally what you were describing, DITA like structured content and context? How does that work?

KA: Well, there’s a couple of things that I’ve noticed with it. So context can end up being synonymous with metadata, and that works out really well because then you can have contextual cues built into the metadata for people who want to dig deeper. But when you’re writing agnostic content, so when you’re chunking and you’re putting things in structure and you’re writing agnostic content, that content usually gets assembled almost like a stack of Jenga pieces and it’s put together. And so if you repurpose my instructions, and you repurpose a concept topic like in DITA, and you put concept and procedures together, they could be written by two different authors, so the style of the prose and all of that needs to be under really strict editorial control for consistency purposes. But with some of the projects that I’ve seen lately, what Microsoft is doing with Microsoft Viva, another good example is Notion. I don’t know if you’re familiar with Notion, but you notice these building blocks and you build things on top of each other and you can have different contributors all building onto the same thing.

All of that stuff taken as a whole is how readers actually take in the information. So inconsistencies in those building blocks will be evident. So one way to handle that is definitely having strict editorial guidelines and following a way to do it. But the other thing too is to have metadata and have enough content to orient the users to the whole piece of what they’re about to read. It’s the “every page is page one” idea of producing content.

The other thing that I’ve noticed is that when you take agnostic content and you don’t really give it a lot of thought, sentence construction starts to fail because good writing is like you write one sentence after the other, but each one is building in anticipation of what the reader is looking for. And so you’re trying to build the anticipation and then you’re trying to reward the reader for continuing to read. Very hard to do whenever you have chunks and different people are working on chunks.

SO: Yeah, it’s interesting because I don’t think I’ve ever thought about the … We think about the emotional state of our readers, but I don’t know if we’ve connected that to the idea of context. But certainly in technical communication, the generalized assumption is that somebody who is looking something up in the docs or for that matter in the knowledge base is annoyed or frustrated or angry because they’re blocked. The only reason they’re looking in the docs is because they’re trying to do a thing and they can’t do the thing and they need help. So they are somewhere on the continuum from annoyed to having a tantrum. And it makes for a very difficult writing challenge because as you said, they’re not going to give you the benefit of the doubt. So here we are. So what does that look like? I mean, what does it look like to integrate the ideas around context into your overall content strategy?

KA: Well, what I’ve been working on, on the side is developing a universal context model that should be conjoined with standards like DocBook and standards like DITA. And the context model would help drive or maybe not drive, but guide authors as they’re writing as to what should happen next. I’ll think of a completely non-technical example, but something that everybody probably understands is with all of the police shootings and things that have happened in recent years, I don’t know if you’ve ever seen where the police reports get changed, and then they get released again, and then they’ll update them again. And a lot of this has to do with officer trauma, it has to do with different witnesses, everybody’s on an adrenaline rush when they’re trying to get the paperwork started, then people remember things later. The problem with that is that a lot of police reports are free form narratives. They’re not scripted.

So in some ways, the old school green screens, like call centers used to use with scripting, worked a lot better because it guided somebody down to where they needed to be to get something done. So having a context model that kind of underlies the content and it helps drive form fields and things like that, I think that’s critical for the content of the future because as artificial intelligence is growing and language learning models are expanding, they still need guidance and they need human interaction. And I almost think that it’s better that the machine learning happens within a more closed system as opposed to learning like what’s happening with ChatGPT, where it’s just all of the internet ever is what the chatbots are learning. And I don’t think that’s doing anybody any good. I’ve seen all kinds of horror stories about it already, and I think Microsoft just released their demos for Bing just a few weeks ago. And the horror stories are, I see one in the news just about every day.

SO: So what kind of challenges do you see lying ahead? What are you trying to achieve with connecting context into content strategy? And what does that look like? What kind of interesting challenges do you foresee coming?

KA: I think it’s a way of building trust. So let’s take journalism. So if you look at really good reporting that you see, you realize that there is institutional knowledge that larger publications, The New York Times, The Washington Post, they all have that institutional knowledge, and we make assumptions based on their reputation that they have an editorial process. But because of the way politics have kind of become so divisive, a lot of the articles and things get picked apart.

A context model, on the other hand, might have reporter notes, might have direct quotes from anonymous sources, and then you might have editors who sign off on it, and it’s all part of the metadata that, if you want to know more about the story that you just read, you could actually access. And I don’t think there’s anything wrong with even tying a context model to blockchain for trust purposes. This editor who works for this organization has this many years in, and it’s almost like having a reputation server to help provide trust. So that way you’re able, as a reader, to weigh how much you trust the news source based on the metadata rather than just taking the article at face value.
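[Editor’s illustration, not part of the conversation: the kind of context metadata described above could be sketched as a small data structure attached to an article. Every name here (`SignOff`, `ContextRecord`, `context_summary`) is hypothetical. It does not represent Keith’s context model, and the blockchain or reputation-server part is left out entirely.]

```python
from dataclasses import dataclass, field


@dataclass
class SignOff:
    """An editor's approval, with the details a reader might weigh."""
    editor: str
    organization: str
    years_experience: int


@dataclass
class ContextRecord:
    """Context metadata that travels with the article."""
    reporter_notes: list[str] = field(default_factory=list)
    source_quotes: list[str] = field(default_factory=list)
    sign_offs: list[SignOff] = field(default_factory=list)


@dataclass
class Article:
    headline: str
    body: str
    context: ContextRecord = field(default_factory=ContextRecord)


def context_summary(article: Article) -> dict:
    """Summarize the context a reader could inspect behind a story."""
    ctx = article.context
    return {
        "headline": article.headline,
        "notes": len(ctx.reporter_notes),
        "quotes": len(ctx.source_quotes),
        "signed_off_by": [
            f"{s.editor} ({s.organization}, {s.years_experience} yrs)"
            for s in ctx.sign_offs
        ],
    }
```

[A reader-facing interface could show the summary next to the article and let the reader open the underlying notes and quotes on request.]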

SO: Well, you’ve given us a lot to think about because, and I suspect we could go on for another 20 minutes or much, much, much longer, but I think we’ll leave it there for now. Keith, thank you. This was really, really interesting.

KA: I’m glad to be here.

SO: Yeah, a whole bunch of new ideas and we’ll leave some additional resources in the show notes, including I believe Keith’s website and some other bits and bobs that should be useful to people listening to this podcast. And with that, thank you for listening to the Content Strategy Experts Podcast, brought to you by Scriptorium. For more information, visit or check the show notes for relevant links.