DITA: The next generation (podcast)
In episode 83 of The Content Strategy Experts podcast, Gretyl Kinsey and Jake Campbell talk about the next generation of DITA. What happens when you need to update your existing DITA structure?
“When you’re building everything out the first time around, you can do as much user acceptance testing as you want—but the best user acceptance testing is going to be live testing.”
GK: Welcome to the Content Strategy Experts podcast brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize and distribute content in an efficient way. In this episode, we talk about the next generation of DITA: what happens when you need to update your existing DITA structure? Hello everyone and welcome. I’m Gretyl Kinsey.
JC: And I’m Jake Campbell.
GK: And we’re going to be talking about updating your DITA content structure today so I think we want to start by just briefly talking about DITA itself and the kind of different generations or versions that it goes through for those who are unfamiliar. Jake, can you just give us a little overview of that?
JC: The earliest version of DITA that I’m familiar working with actually started back in 2006, DITA 1.1, and that lacked a lot of the modern conveniences that we’ve become accustomed to in DITA today, particularly when it comes to actually customizing the DITA structure. You weren’t able to do things like specialize attributes, and some reuse capabilities, I think, were kind of limited. And now we’ve got a lot of very specialized topic types. We have a broad suite of specializations available out of the box for some purpose-built usage, things like the troubleshooting domain or some of the more specialized elements, like hazard statement for when a standard note won’t do.
GK: Right. We’ve gone, as you said, from that earliest 1.1 to 1.2, and now we’re in 1.3, and I think one of the big driving forces behind the differences between each version has just been this idea of seeing how people use DITA, how people need to use DITA, and what changes need to be made to that out-of-the-box content model to help make sure that all those features are available. There’s definitely been, I think, a few major evolutions that we’ve seen.
JC: Yeah, definitely. And a lot of what’s available now is kind of in response to what people have needed. And we’ve actually seen with some clients who need to move their content model from DITA 1.2 to DITA 1.3, or in some cases DITA 1.1 to DITA 1.3, there are some things that they had specialized or that they had built specific semantic structures around that are now actually part of the base DITA model as of 1.3.
GK: That’s actually a really good segue into the next question I wanted to ask, which is what are some of the reasons that you might want to update your DITA content model? And I think you already kind of touched on that with the idea of being able to include features that you couldn’t before and then suddenly those start to become available in the latest version of DITA.
JC: Yeah. And the most sweeping way that you realize that’s happening when you move into a new version of DITA is some of the topic types that become available. I remember when the DITA 1.3 specification was just starting to come out and there were some rumblings about it being released, there was some discussion around the troubleshooting topic type, which is a further specialization of the task type. And there were a lot of people talking about how that was really important because there were semantic structures in that that specifically said, “These are the problems you’re seeing. This is what could cause these problems. This is a way to solve that problem.” Whereas before you would have had to specialize a task structure or create specific semantic structures using out of the box components in order to contain that kind of information.
GK: Yeah. If you have a case where you need to support some kind of structures and then you see that those are becoming available in the next version of DITA, that’s a really good time to kind of evaluate what you’ve got now and think about when it’s going to be the best time to make this move over into the new version and kind of clean up some things that had to be specialized before. Because I think one thing that we try to recommend is to only specialize as much as you need to and to use the out of the box features. Keeping an eye on what becomes available out of the box over time is a really smart thing to do and can definitely be the case for making some tweaks and updates to your existing DITA structure.
GK: Another thing that can kind of guide you along that path is if you get into a situation where you start to change your technology, so maybe you’re looking at a different content management system, you’re looking at maybe some different publishing outputs, you’re looking at new authoring tools, any or all of the above. And you have developed your existing DITA content model in ways that sort of aligned with the current tool set you have now. But now that you’re looking to change, then that’s a place to start evaluating is there anything in the DITA model that needs to change too, as we start to change software and technology?
JC: Yeah, I’m sure we’ve touched on this on the podcast in the past, but once you start looking at a proprietary tool, they probably handle things in a very specific way in order to achieve their goals. And that usually means that there may be some compromises or accommodations that need to be made in order to actually make that work. Some CCMSs will use more of a database model for containing all of the different information that you want to have there. It kind of treats these individual elements and files as objects within a database. There may be something on the CCMS side that is equating with the ID attribute that you need on your DITA topics, but isn’t actually using that ID attribute that’s on those DITA topics. You may need to take a look and see if that might be something that’s locking you into that particular technology depending on what kind of move you want to make from there.
GK: Yeah, absolutely. And I think, especially if you did some sort of a specialization, if you did some sort of workarounds on your content model that were designed to sort of accommodate the current authoring workflow and content management and publishing workflow that you have and you realized that that has created a little bit of lock in with your current tools and you need to change, then I think that really presents a good opportunity to say, “Well, if we are going to make this change anyway, we really need to look at the DITA itself and figure out how that has to change too.” And then whether the latest and greatest version of DITA that’s out there, if you’re currently in 1.1 or 1.2, if going to 1.3 can kind of help make that change easier.
JC: Yeah. And when you’re looking at moving your content model over as well, you might want to take a look at, do you have any customized output, any custom DITA transforms that you’ve built around your particular specialization or any changes you’ve made to your content at the same time?
GK: Definitely. And there’s one other thing I want to touch on here too, which is that if you have any sort of new requirements that come up, so let’s say you have a new product that you need content to support or let’s say that you are extending something about the particular information or metadata that you capture around your existing products and you need to extend your DITA content model, especially if you have specialized it, then that can also be kind of a driving force to look at well, does the current version of DITA that we’re in support that? Or should we also just look at moving to the latest and greatest? Looking at going to DITA 1.3 at the same time? Will that help make anything easier if we’re kind of touching up an existing specialization?
JC: Yeah. And when you’re thinking about that as well, it’s important to identify what kind of gaps you’re currently seeing when you’re thinking about making that kind of move, because you should always start with this kind of gap analysis for want of a better phrase, to say, “What needs do we have that aren’t being served? And how can we rectify that? Can we make any of those kinds of changes of what we have now? Or do we need to move somewhere else?” And I feel like that’s really going to be a driving factor in, do we make this kind of big jump into a new version?
GK: Absolutely. What are some things to consider when you’re approaching a DITA remodel?
JC: I’ve always kind of said that when you’re building everything out the first time around, you can do as much user acceptance testing as you want, but the best user acceptance testing is going to be live testing. Even when you’re in production and you’re happy with what you’ve got and it’s working and it’s not posing a significant problem, it’s still a good idea to kind of keep taking the temperature on these things, to see, do we have any content authors who are running into problems? Are we running into any weird corner case issues with some of our broader content now that we’re actually out in production with this? Definitely, see if you aren’t being served by your content and what can you do to make sure that you’re getting everything that you need out of your content?
GK: Yeah, absolutely. I think that leads into a lot of the steps that we try to take with our clients when they come to us and say, “We need to restructure our DITA. We know that our content model isn’t working, but we’re not quite sure how to go about making that change.” One thing that we do and it kind of really helps to have those metrics, Jake, that you were talking about is, suggesting that the company evaluate which parts of the DITA structure still work? What should you keep? Which parts of the model are still going to be functional for you after you change? And then which parts are not serving you so well? And then you have the roadmap you need to start making a plan for how you’re going to change those things that aren’t working. And when you know that, that sort of makes it manageable because you’re not just kind of going, “Oh my gosh, we have to change everything.” You have a really specific plan for how you’re going to tackle that.
JC: Yeah. It’s not that unusual to kind of look back on the initial development process that you did with your specialized content or with the current model that you have and just kind of compare it to what you’re actually getting out of it. If you’ve been through this once before, you probably already have some sort of roadmap that says, “This is what we have done in the past,” and you can kind of use that to measure up your current state with where you thought you would have been.
GK: Yeah, absolutely. I think it is really important to learn from those lessons of the past if you have been through this before, because if you’re in DITA now, you did initially go through some path to get there, whether it was just starting in DITA or going from some sort of unstructured content to DITA. You do understand kind of what it takes to develop a content model in DITA, what it takes to understand the structural needs that you have and then how to take that forward. It does give you a baseline of lessons learned for what to do when you do this remodel.
GK: And I think one thing to really think about and be cautious about, which is something, Jake, that you touched on a little bit earlier is when we’re talking about designing specializations and content models around your tools, around your content development workflow, it’s really important to be careful about doing any sort of work arounds or specializations that are specific to a particular tool, because that does lead to a certain degree of lock-in and that’s something I think that you could avoid going forward if you’ve already done that once.
JC: And it’s also important to try and think about where some of this information is being stored. I know that metadata is kind of a weird squishy concept because metadata is information about data. It doesn’t have a lot of inherent meaning sometimes. It can be kind of hard to think about and it’s not unusual for some parts of metadata to be stored within the CCMS rather than stored within the source. Thinking about what you are trying to do with your metadata structure when you build it out, where you are going to store it, how it is going to be used, and where it is going to be available is all really important when you’re thinking about tool selection and when you’re thinking about how to model your content.
GK: Yeah, absolutely. And I think I’ve seen in many instances with clients I’ve worked with that there’s kind of a hybrid where there will be some metadata that’s stored in the DITA content itself and other metadata that is managed and stored by the CCMS and by the tools. And so there’s kind of that balance there that you have to think about. And if you are doing a DITA remodel, that gives you an opportunity to revisit your taxonomy and to think about metadata not just in terms of what’s in the content itself, but how it’s used overall. And that’s where we get back to this idea of gathering those metrics from your customers about how they’re using your content. You gather metrics from your authors about how they’re creating content and what some of the roadblocks are that metadata can help solve. And that can give you a lot of good information about how to approach metadata and how you might want to remodel that as part of your overall DITA restructure.
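As a rough illustration of the hybrid situation described here, the sketch below lists the metadata that lives in a DITA topic's own prolog and compares it with a set of fields held by the CCMS. It assumes the source metadata is carried in `othermeta` elements. The sample topic, the field names, and the `ccms_fields` set are all hypothetical, and a real CCMS would expose its fields through its own interface rather than a hand-typed set.

```python
import xml.etree.ElementTree as ET

def source_metadata(dita_xml):
    """Collect name/content pairs from every <othermeta> element in one topic."""
    root = ET.fromstring(dita_xml)
    return {m.get("name"): m.get("content") for m in root.iter("othermeta")}

# Hypothetical topic; the metadata names are invented for illustration.
SAMPLE_TOPIC = """<concept id="c1">
  <title>Overview</title>
  <prolog>
    <metadata>
      <othermeta name="product-line" content="widgets"/>
      <othermeta name="audience-level" content="advanced"/>
    </metadata>
  </prolog>
  <conbody><p>Text.</p></conbody>
</concept>"""

# Hypothetical list of fields that are tracked only on the CCMS side.
ccms_fields = {"audience-level", "review-status"}

in_source = source_metadata(SAMPLE_TOPIC)
print(sorted(set(in_source) - ccms_fields))   # stored only in the DITA source
print(sorted(ccms_fields - set(in_source)))   # stored only in the CCMS
```

Fields that appear only on the CCMS side are the ones most likely to be lost in a tool change, so a list like this could feed the taxonomy review.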
GK: Another thing to think about is the migration process. How is content going to be migrated from the current DITA structure you’re in to your new one? And are there any concerns around scripting and automation that can be addressed on the content model side to make that easier when you’re having to go through and rework the way that all of your DITA content is tagged?
JC: Yeah, it’s tricky when you’re looking at migrating into a newer version of DITA from an older version. By default, DITA is backwards compatible. Theoretically speaking, you could open up any file that was created in DITA 1.1 in something that’s using the DITA 1.3 definitions and it should open up just fine. And it most likely will. It just won’t be as fully featured. When you’re looking at moving from an older version of DITA to a newer version, just the baseline, the biggest thing you should be looking at is, what are we looking to get out of this migration? Is it just to get us to a new starting point so that moving forward, our content can be richer and take advantage of the features that are afforded by this new environment? Or are we looking to try and leverage some of those new features in existing content? In which case you really need to do an analysis of why you’re moving and put together a plan for how you can fill those gaps.
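One possible starting point for that kind of analysis is a plain inventory of which elements the existing content actually uses. The sketch below is a minimal example under stated assumptions: the topics are passed in as strings, and the two sample topics are hypothetical. A real run would read files from disk or export them from the CCMS.

```python
from collections import Counter
import xml.etree.ElementTree as ET

def count_elements(dita_documents):
    """Count how often each element name appears across a set of DITA topics."""
    counts = Counter()
    for doc in dita_documents:
        root = ET.fromstring(doc)
        counts.update(elem.tag for elem in root.iter())
    return counts

# Hypothetical sample content.
SAMPLE_DOCS = [
    "<task id='t1'><title>Reset</title><taskbody><steps>"
    "<step><cmd>Press <uicontrol>Reset</uicontrol>.</cmd></step>"
    "<step><cmd>Wait.</cmd></step>"
    "</steps></taskbody></task>",
    "<concept id='c1'><title>About</title><conbody><p>Text.</p></conbody></concept>",
]

usage = count_elements(SAMPLE_DOCS)
print(usage["step"])   # 2
print(usage["title"])  # 2
```

Elements that turn up rarely, or custom elements that turn up often, are reasonable places to begin asking what the migration is supposed to achieve.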
GK: Yeah. And I think that’s especially important if you have done any sort of specialization that may not carry over so well to the latest version of DITA, or if it needs to be kind of completely reconfigured or restructured because DITA 1.3, for example, supports something that you built a specialization for before, when that didn’t exist in DITA by default. That’s something to build into your analysis and into your plan, not just thinking about what does the new structure look like? But how are we going to move our content over? And sort of what are the priorities? What content needs to be re-tagged and restructured first?
JC: Yeah. And to bring it back just for a second to, we’ve specialized and we’re moving, did the newer version of DITA actually implement the thing you specialized already? We’ve actually seen in the past instances where there has been specialization and in migration to a new version of DITA found that not only did that new structure exist, it was also named the same. You need to kind of take a look at that and make sure if there is something new that exists that fills the role that you wanted to fill, would it be better to keep what you already have with your specialization? Or migrate your existing content over to that newer specialization?
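The name clash described here can be checked for mechanically. The sketch below compares the element names from a specialization against a list of standard element names. The `DITA_13_SAMPLE` set is only a small hand-typed sample of troubleshooting elements from DITA 1.3, and the `custom` list is a hypothetical specialization. A real check would build the standard list from the DITA 1.3 grammar files.

```python
# Partial, hand-typed sample of element names from the DITA 1.3 troubleshooting topic.
# This is an illustration only, not the full DITA 1.3 element list.
DITA_13_SAMPLE = {"troubleshooting", "troublebody", "condition", "cause", "remedy"}

def find_name_collisions(custom_elements, standard_elements):
    """Return the custom element names that also exist in the standard set, sorted."""
    return sorted(set(custom_elements) & set(standard_elements))

# Hypothetical element names from an older, in-house specialization.
custom = ["cause", "remedy", "symptom", "part-number"]

print(find_name_collisions(custom, DITA_13_SAMPLE))  # ['cause', 'remedy']
```

A matching name only flags a candidate. Whether the standard element fills the same role as the specialized one still needs a human decision.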
GK: Yeah, absolutely. When you are developing your very first DITA content model, what are some safeguards that you can build in to avoid headaches if you do have to update it in the future? I know some companies are able to kind of stay with the same content model for a long time, but I do think it does become sort of inevitable that after years and years, you will probably need to or at least want to switch to whatever the latest and greatest DITA version is. And so, how can you set up your very first, your initial DITA content model to make that as smooth as possible and to kind of future proof it for later versions?
JC: The best advice that I could give is kind of what I’ve been hitting on as we’ve been going throughout is figuring out why you need to get this set up. And if you have a really good understanding of why you’re doing something, you’ll be able to better define what you can use out of the box or to identify the places where you’ll need to specialize. And if you have a really good understanding of the reasons that you’re doing this, you will most likely have an easier time of reacting if that needs to change later.
GK: Yeah, absolutely. There’s something that we’ve said in many of our podcasts before, but the more upfront planning that you do and the more you kind of analyze your specific needs around why your content model should be a certain way, that will really, really help make sure that you make the right decisions and that you kind of avoid things that will become pain points down the road. And I think in particular, when you’re looking at sort of broader information architecture decisions, things like your taxonomy and metadata, things like how your content is organized, how it’s structured, how it’s broken up into the different DITA topics and maps that you have, how reuse is set up, all of that. The more planning that you do upfront, the less chance that you’ll probably have of having to do major reorganization on that in addition to just updating your DITA version.
JC: Yeah. And also, just to get into the specifics of it real quick, I deal a lot with actual DITA transformations, taking your DITA and turning it into something different. And the most compelling reasons that I’ve found for specialization, on that side of things, boil down to two. It’s, we need to make sure that our content gets treated in a specific way once it gets turned into a PDF or into HTML or whatever, so you need a specific semantic structure to key off of in the transform so that it gets treated a particular way. Or, oh no, our SEO is bad. We need to make sure that our metadata is being handled properly. It’s not only knowing what you want out of your content model, but how your content model is going to deliver that for you once you actually start generating your content with it.
GK: Yeah. And that actually brings up a couple of points too that are sort of general advice. They may not apply to every company, but they are things that we tend to advise people to consider at least when it comes to sort of designing your content model. When you’re talking about the concerns around your output, one thing that we do really stress is to focus on semantic based tagging and not on tagging that is going to kind of help you get a lot of specific formatting based edge cases into your outputs. And again, that’s just because the entire point of DITA is to have your formatting separated from your content itself. Building things into the actual structure, building a bunch of specializations that are there just to address formatting concerns is generally not the greatest idea, especially if you think about how in the future things with your outputs may need to change alongside of your DITA version itself. That’s one thing that we really caution people about.
GK: And another thing is with regard to specialization itself, we tend to advise sticking to the DITA standard out of the box, as much as you can and only specializing when necessary. And again, that’s just because when you do specialize, if you need to make a change later, that’s always a little bit more difficult than just going kind of from an out of the box structure to another out of the box structure. That’s kind of just a little bit of advice we tend to give. Of course, there are always going to be exceptions. There are always going to be some companies that really do need heavy specialization, but we just advise that you try to keep that to semantic reasons rather than just doing it because you can or because you have some concerns around formatting.
JC: Yeah. When you’re looking at specializing like that, you really want it to be about making sure that your content is semantically rich. If you have a product name that’s italicized because your style guide says, “Product names are italicized,” you don’t want to be wrapping that in just an i tag, because that just means it’s italics. It doesn’t mean it’s that kind of content. And you want to try and find ways to enrich your content, not just make the content fit a guide.
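To make the italics example concrete, the sketch below scans a topic for `i` elements whose text is a known product name, which are the places where formatting is standing in for meaning. The product name `WidgetPro`, the `PRODUCT_NAMES` set, and the sample topic are hypothetical. Which semantic element those hits should be retagged to is a content model decision that the sketch does not make.

```python
import xml.etree.ElementTree as ET

# Hypothetical product list; a real one might come from a taxonomy or a keys map.
PRODUCT_NAMES = {"WidgetPro"}

def find_italic_product_names(dita_xml, product_names):
    """Return the text of each <i> element that exactly matches a known product name."""
    root = ET.fromstring(dita_xml)
    hits = []
    for elem in root.iter("i"):
        text = (elem.text or "").strip()
        if text in product_names:
            hits.append(text)
    return hits

# Hypothetical topic with one product name and one ordinary use of italics.
SAMPLE_TOPIC = """<concept id="intro">
  <title>Intro</title>
  <conbody>
    <p>Install <i>WidgetPro</i> before you <i>really</i> need it.</p>
  </conbody>
</concept>"""

print(find_italic_product_names(SAMPLE_TOPIC, PRODUCT_NAMES))  # ['WidgetPro']
```

The ordinary emphasis on "really" is left alone, which is the distinction being drawn: italics for emphasis is formatting, while a product name is a kind of content.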
GK: Yeah, absolutely. I think the biggest things to remember are just put semantics first, put structure first and do as much of that upfront planning as you can around the semantic needs that you have so that one day when you do have to go into the next generation of DITA, that can be done as seamlessly as possible. With that I think we’re going to wrap things up. Thank you so much, Jake.
JC: Yeah. Thanks for having me. It’s great to be here with you.
GK: And thank you for listening to the Content Strategy Experts podcast, brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.