In episode 77 of The Content Strategy Experts podcast, Alan Pringle talks with Chris Hill of DCL about content reuse and what it looks like across different industries.
“You really have to start seeing content creation as a collaboration and build trust between the people who create content.”
Alan Pringle: Welcome to The Content Strategy Experts podcast brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we take a look at content reuse with special guest Chris Hill of DCL. Hi everybody. I am Alan Pringle. And today we have a guest on the podcast. It’s Chris Hill from DCL. Hi Chris.
Chris Hill: Hi Alan, good to talk to you.
AP: Yeah, it’s good to talk to you as well. Today, we are going to talk about content reuse and what that looks like across different industries. And the first thing I want to ask you, Chris, is why should people even care about reuse, from, say, the executive who has departments that create and distribute content to the content creators themselves?
CH: Yeah, that’s a good question. And it’s one that’s evolved quite a lot over the last 20 years as we’ve moved more and more content to formats that support reuse. Really, the critical thing about content is that there’s a cost to managing it regardless of how you do it, and you can think of every piece of content as an expense. As you build up more and more content, the expense rises because you have more cost to manage it, to find it, to dig through it, to decide what’s relevant. And it slowly will build up to the point where it becomes daunting to deal with larger and larger volumes of content. So content reuse really came about to help control that.
CH: And when we see documentation that maybe has similar procedures or similar warnings or similar boilerplate text, like a copyright statement, you need to keep these things consistent. And so your users, the consumers of your content, benefit from reuse in that you create a consistency in the content that’s reliable, and that will not lead to confusion about what you’re trying to say. The creators themselves are often responsible for trying to deliver that quality, consistent content to the users. And so a reuse-oriented approach does a great deal to help control that content and make sure it is consistent and accurate.
CH: If you have a lot of duplicated content and I find out that there’s a problem with that piece of content, or maybe something needs to be updated in that content, I suddenly am faced with a huge search task of digging through everything to find where that content was used. If I’m using a real reuse strategy, that content should only appear once. And so if I need to update it, it can be done accurately by just going to the single source and knowing that the change is reflected in all of the places where that content might appear. So that’s from like a user and maybe a creator level. Now, sometimes management might say to themselves, “Well, I don’t really care. I’ll pay someone to do that work. It costs a lot maybe to move my content to a content management system. Why should I do that? I’ll just hire another person to do searches.” And that is an approach that a lot of people take.
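The single-source idea Chris describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular content management system works: the component IDs, the document names, and the `render` and `where_used` helpers are all hypothetical.

```python
# Hypothetical component library: each reusable piece of content lives in one place.
components = {
    "copyright": "Copyright Example Corp. All rights reserved.",
    "lockout-warning": "WARNING: Disconnect power before servicing the belt.",
}

# Documents hold references (component IDs), not copies of the text.
documents = {
    "straight-belt-manual": ["copyright", "lockout-warning"],
    "curved-belt-manual": ["copyright", "lockout-warning"],
}


def render(doc_id):
    """Assemble a document by resolving each reference at publish time."""
    return "\n".join(components[ref] for ref in documents[doc_id])


def where_used(component_id):
    """List every document that references a component."""
    return sorted(d for d, refs in documents.items() if component_id in refs)


# One edit to the single source...
components["lockout-warning"] = (
    "WARNING: Disconnect and lock out power before servicing the belt."
)

# ...shows up in every document that references it, with no searching.
for doc_id in where_used("lockout-warning"):
    print(doc_id)
    print(render(doc_id))
```

Because the manuals store references rather than copies, the "huge search task" reduces to a lookup: `where_used` answers where the content appears, and the update happens in exactly one place.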
AP: But that’s almost like it’s the inverse of death of a thousand cuts. It’s this cumulative effect of all of this layer, upon layer, upon layer that you just keep throwing people at something where maybe technology might be a better solution.
CH: Exactly. And it might be fine to throw people at it for the first few years, but if you become successful or your product family grows, if you’re a product company, or if you’re offering a service, maybe you expand your services. It’s slowly, like you said, that death by a thousand cuts; it slowly builds to this level where suddenly you’re overwhelmed with any kind of content update. And you can usually see that in organizations, because what you’ll find is that the content is proving a drag on the agility of your organization. So you say, “Okay, we’re going to release a new product or a new version of our product, but when will the user guide be updated?” And if you’re finding that that’s always way down the line or always a drag, there’s a good chance that there are some things going wrong in there that reuse might be a part of the solution for.
AP: You mentioned the word control a little bit earlier, and that kind of stuck in my head because I have heard in the past from content creators, something along the lines of, “Well, my version of this stuff is better. So I’m just going to use my stuff.” How do you deal with that kind of mindset when you’re talking about a bigger picture reuse strategy?
CH: Yeah. That’s always a challenge. I think it applies to just about anyone who has a lot of pride in their work, whether you’re a writer or a programmer. I used to be a programmer, and when somebody would say somebody’s already written this piece of code, my initial instinct was, “Well, I don’t know how that code is. I think I’ll rewrite it.” Right?
AP: Exactly. Yeah, exactly.
CH: And I think content creators have similar pride in their work. And what’s important I think there is, you’ve got a couple of things that you have to address at the organizational level. You really have to start seeing content creation as a collaboration and build trust between the people who create content and make sure that they understand each other and what they can do for each other, because rewriting a piece of content that’s perfectly acceptable really doesn’t benefit the user in a meaningful way. A lot of times we might think we wrote it better the second time, but if there is a problem with the existing content, wouldn’t it be an even better solution if I rewrote or updated that existing content so that all of the documents and all of the content that I produce could reflect that improvement? Rewriting it myself for my own manual might make my own manual a little better than someone else’s if I’m writing manuals, but at the end of the day, it really pays from an organizational perspective to make sure that everything is written to the best level we can.
AP: Sure. Now I know DCL works with a lot of different industries. Do you see kind of similar or different pain point struggles that organizations have based on a particular industry type when it comes to reuse?
CH: Really, a lot of it overlaps. I look at a lot of different industries’ content and the errors are all the same in a general sense. For example, one of my customers makes conveyor belts, right? For baggage handling and, I don’t know, stuff like that. And when I look at their manuals, I don’t know what half of it’s about, but I do see the same exact errors and the same exact inconsistencies in their content as with, say, somebody who’s writing a journal, maybe a medical journal or something. You’ll see inconsistent phrases. You’ll see maybe somebody refers to their product in a certain way, and another writer refers to it in a slightly different way. Both of those ways may be valid, but could lead to confusion on the part of the user when those pieces come together, whether that’s a product name or whether that’s a disease. I’ve seen that where medical journals refer to a disease with two different names in the same article sometimes. You wonder about that, and those are things you need to look for because they’re areas where somebody less knowledgeable about the topic might be confused.
AP: Yeah, regardless of industry or content type, consistency, I’m assuming, is something you really want to strive for, regardless of where the content is coming from.
CH: Yeah, it’s all about clarity to the end user and whoever consumes the content. We always have to think, the person reading my content is generally not going to be as knowledgeable as me about the subject. So as I write that, I have to really think in terms of somebody who’s just coming to this content or this subject for the first time. They need that consistency to help remove some of those hurdles in mastering the information. If you have a lot of inconsistency in the way you talk about or refer to things, I think that’s just one more hurdle in the way of me really understanding what you’re trying to tell me.
AP: And I think your point about thinking about the person who’s consuming this content also addresses some of the ownership issues we were talking about earlier. I don’t know if selfishness is the right word, but there’s this idea that this content is mine. It’s really not yours. It belongs to the people who are reading it.
CH: That’s a great attitude to take. I think it’s a tough one sometimes.
AP: Oh absolutely, I agree.
CH: But it’s a great attitude. If you can get your organization there, it’s really so much the better. I used to work in some more content creation jobs, and one of the things I always tried to do in a meeting when we’d have a disagreement over content or how to write something or what to write about was to focus the discussion on the users. If you keep your focus there, I think that should be your North Star as you’re trying to work through these issues.
AP: Sure, and I can see it can also defuse some tension when you’re talking about content that will eventually be shared or should be shared.
CH: True. That can sort of be the bridge between you and somebody that you may have some disagreements with.
AP: Well, speaking of disagreements, what are some of the horror stories you can share about reuse and going into an organization? You don’t have to name names, of course, but what are some of the really kind of horrifying things that you’ve seen that you were able to help your clients fix?
CH: It’s really interesting when you go look at a lot of content and especially I think because I often am coming into it with not a great deal of subject matter expertise, because again, we work with so many different industries.
CH: I mean, what do I know about luggage conveyor belts? Or what do I know about medical procedures? Not a lot is the answer. But when I look at their content, I can really see things that oftentimes they’ll miss. They’re often surprised. I will come in and I’ll look and I’ll say, “Well, you obviously copied this manual from this manual and then to this manual.” Because I can almost tell from the changes that that’s how they’re operating. And that’s often the case. What you’ll see, especially in a lot of manufacturing-type companies, is they do a lot of, “We’re going to start by copying an existing close thing and then we’re just going to edit the parts we need to edit.”
AP: I hear that all the time, all the time.
CH: Yeah, it seems to be an easy way to work. I totally understand why you want to do that.
CH: And in the old days, that’s all you could do. I mean, you didn’t have a lot of reuse options back in the eighties and nineties, unless you were IBM or something.
CH: So it’s totally understandable that that’s how you’ll work. And also that’s how we learned to work in our personal lives. None of us have set up content management systems in our home, as far as I know.
AP: I hope not.
CH: I don’t have my IT department downstairs, maintaining my files for me. So how do I work? Well, that’s literally how I work. I mean, I’m a reuse person and yet in my personal life, I’m not afraid to say that I will take a document that I’ve already written and revise it a little bit for some other purpose.
CH: But what happens if you do that at an organization level is that those two duplicates then have a life of their own. They’re really a split in the road. And if we find out there’s a problem with the content, for most of the content you and I deal with personally, it doesn’t matter too much if it’s out of sync a little bit. It’s like, “Okay, well, we’ll get over it.” But if I’m writing a space shuttle manual, or even a luggage conveyor belt manual, there are safety issues that come in. If I find out that there’s a safety problem and I’ve got to revise part of the documentation, if that documentation has been duplicated in dozens of places that I don’t know about, and I’m not very good at doing an exhaustive search, I may continue to expose my users to those incorrect or inconsistent pieces of information, and that could become a real liability.
AP: A legal, costly, financial liability.
CH: Absolutely. The other area where this will come up a lot of times, and I don’t know if this is a horror story, is translation. So companies will start maybe in the U.S., or sometimes a Canadian company will start in a couple of languages like French and English, but they’ll start with a very narrow band of their user base. And if they achieve success, their user base expands. So if I take my company global, all of a sudden I’ve got all kinds of other issues about my content, and even if the content is in the product, for any documentation I provide, there are laws in every country about how that gets delivered and what languages it gets delivered in.
CH: So I might be perfectly content using my copy and paste and starting from an existing document for my English speaking content, but suddenly I move into France and I have to add French to the mix, or I move into Germany and now Germany’s on the table and German language is on the table. And as I keep doing this, I think it quickly becomes evident that you can’t hope to manage not only copies of content, but then also the language variations of content very easily using that copying process as you go.
AP: Sure. Because once again, you have layers that are just exponentially increasing every time you do a new version of whatever or add a new language to the mix. So very rapidly, I can see it getting out of control.
CH: Yeah, it does. And that’s a big horror story in a lot of companies, a lot of companies will come to us to talk reuse because they are going international or they have gone international. And suddenly they’ve got this nightmare of stuff. As far as translation goes, it’s very expensive to do translation. And so if I have a manual, the first time I get it translated, there’s just kind of a fixed cost. I mean, all those words have to be turned into German or whatever. The next time I go back to that manual, if I have a way of doing reuse, I can break that manual up into parts and just keep track of what parts have changed instead of retranslating the whole manual a second time.
CH: And that can have a really dramatic effect on the cost and the velocity with which you can produce content internationally. If you can track and have a reuse strategy where only the reused components that get changed have to be retranslated, that can often be very significant to an organization. So this is where the management starts to perk their ears up, because they’ll start saying, “How much money can we save on translation? Or how much faster can we get those translations done if we use this approach?” And those are the areas that are often the real big pain points that an organization will come to us with.
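As a rough illustration of tracking which parts of a manual have changed since the last translation, here is a minimal Python sketch. It assumes the manual has already been split into components, and it uses a content hash as the change detector; the component IDs, the sample text, and the `fingerprint` and `needs_translation` helpers are hypothetical and do not describe any specific translation workflow or tool.

```python
import hashlib


def fingerprint(text):
    """Hash normalized text so whitespace-only edits do not count as changes."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def needs_translation(components, translated_fingerprints):
    """Return IDs of components that are new or changed since the last translation."""
    return sorted(
        cid
        for cid, text in components.items()
        if translated_fingerprints.get(cid) != fingerprint(text)
    )


# Version 1 of a hypothetical manual, split into components.
v1 = {
    "intro": "This manual describes the straight conveyor belt.",
    "safety": "Disconnect power before servicing.",
    "maintenance": "Inspect the belt monthly.",
}

# Record what was sent to translation the first time.
sent = {cid: fingerprint(text) for cid, text in v1.items()}

# Version 2 changes only the safety component.
v2 = dict(v1, safety="Disconnect and lock out power before servicing.")

changed = needs_translation(v2, sent)
words = sum(len(v2[cid].split()) for cid in changed)
total = sum(len(text.split()) for text in v2.values())
print(changed, f"{words} of {total} words to retranslate")
```

The word counts give a first, very rough input for the cost question management asks: only the words in changed components go back to the translator, rather than the whole manual.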
AP: I know from past experience doing copying and pasting of my own (yes, I have done it; it’s been a long time, but I have done it) that it’s easy to get content that is sort of a near match. It’s almost the same content, but a word or two is different. So from a reuse point of view, I mean, what kind of different matches are there? Because there’s got to be some variety in how you can identify and track them, all the way from absolutely identical to fuzzy, kind of the same.
CH: Yeah, and that’s really where it gets very complicated if you are using that copy-paste strategy. So if I take an existing manual and maybe I don’t like just the order of some of the phrases in the introduction, I might move a couple of sentences around. Maybe I’m not really changing the meaning. I haven’t really changed it much. I’m just aesthetically making some modifications because I like it better that way.
CH: Well, suddenly it’s very hard to do searches to find that stuff. If there was an error in, say, a paragraph and I need to go look for that paragraph everywhere that it’s been duplicated, it can be incredibly difficult to find that stuff. And so fuzzy matching is something that is very hard to do in a traditional tool. You can do wildcard searches, say in Windows, if you’re looking at a shared directory, or in most content management systems, but they really have a hard time if maybe the meaning’s mostly the same, and maybe a lot of the words are the same, but they might be in different orders. It’s almost impossible for a regular person to write at an… You have to really get into regular expression writing. And even the experts on that can’t really address those fuzzy matches very well because there are just so many variations.
AP: Right. And the people you’re talking about a lot of the time are content creators. They are not programmers.
AP: So they may not have that in depth knowledge of how to do regular expressions and other kinds of searches to really find that stuff.
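To make the reordering problem concrete, here is a small Python sketch comparing an exact match, an order-sensitive similarity score, and an order-insensitive one. It is a deliberately crude illustration, not a description of how any commercial tool finds near matches: the sample sentences and helper names are made up, and comparing word sets (Jaccard similarity) is just one simple technique chosen for the example.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def order_sensitive(a, b):
    """Similarity from difflib's SequenceMatcher, which is affected by word order."""
    return SequenceMatcher(None, words(a), words(b)).ratio()


def order_insensitive(a, b):
    """Jaccard similarity of the two word sets, which ignores word order."""
    set_a, set_b = set(words(a)), set(words(b))
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


original = "Disconnect power before servicing. Wear gloves when handling the belt."
reordered = "Wear gloves when handling the belt. Disconnect power before servicing."
unrelated = "The copyright statement appears on the first page."

print("exact match:", original == reordered)
print("order-sensitive:", order_sensitive(original, reordered))
print("order-insensitive:", order_insensitive(original, reordered))
print("unrelated:", order_insensitive(original, unrelated))
```

An exact search misses the reordered pair entirely, and the order-sensitive score drops because the sentences moved, while the word-set comparison still scores the pair as identical. Real near matches also involve changed words and rephrasing, which is where a simple approach like this runs out and more sophisticated matching is needed.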
CH: Yeah. And a lot of times what you see then to go back to our horror stories is… I’m amazed at how many organizations rely on, I always say one old guy, but it could be one person who is just intimately familiar with everything and that people go to and go, “We have to fix this.” And they’ll go, “Oh, this is in this, this and this manual.” And, “Oh, did you look there? Because it’s probably in there.” And that kind of reliance is very dangerous to an organization.
AP: I have seen exactly what you were talking about in a manufacturing firm in particular. Yes. I know exactly the type of person you’re talking about. He or she has been there forever and knows where everything is. It’s this huge, vast domain knowledge that they’ve got tucked away in their heads, but they are usually approaching retirement age. Very dangerous, indeed.
AP: So let’s kind of move beyond that. We know the horror stories, we’ve got some ideas of how to fix them, but once you know that you’ve got reuse and you’ve identified it, what kind of things do you have to do to really get a return on investment? Because merely identifying that reuse is probably not enough.
CH: Right. So the steps that follow generally, you’re going to have to find a framework or a platform on which to build a reuse strategy. So it generally is not possible or sufficient to just say, “I’m going to try to make reusable components on a file system.” There’s just too many limitations. So that’s when you start to get into the area of content management. And we’re kind of lucky today compared to say 15 years ago in that there are lots and lots of content management solutions out there that can support reuse and they’re better than they’ve been ever and there’s more options than ever. Some of them are cloud based and you can just get into them at a pretty reasonable monthly fee to start with and then build your way up if you need to. Or some of them are deployed content management systems that you bring into your organization and your IT department can manage if that’s your approach.
CH: But usually once you’ve identified the need for reuse, that’s the next stage of the conversation. And really the reason why you need to do the reuse analysis first, generally, is these content management systems are not free. Some of them are quite expensive and depending on your needs, it may be worth making an investment like that. But to make that case, you really have to look at all of the ways it’s going to improve the organization. And at the heart of that tends to be reuse in a lot of the work I run into. So to be able to go to your management and say, “I want this much a month in licenses,” or, “I want this much to deploy some software solution,” you really have to come with some metrics. And knowing where all the duplication is in your existing content is one of the things that can really help you put together those metrics.
CH: You can put some estimated hourly or dollar figure costs on each piece of content and the changes. You can talk about the time it took you to produce the next version of a manual or an updated version of a manual, and sort of come up with some ballpark figures to work with as far as cost savings and efficiency improvements that these tools might bring.
AP: I think that’s very important what you’ve pointed out, you are not going to have a lot of luck going up the management chain saying, “I need this.” Without showing ROI, you’ve got to have something that shows how that system is going to be paid for, it may be over a period of time, but you have to show how your return on investment is going to pay for this new technology. Otherwise, what’s the point?
CH: Exactly. Yeah. I’ve seen teams and I’ve been part of teams where we went to the management to ask for something, and the only real thing we had at the end of the day, if you summed up our argument, was, “It’ll make our lives easier.” And I learned very early in my career, management doesn’t always care about making my life easier. They might even say, “I’d rather just give you a raise than make your life easier.” Or, “I’ll hire you some help.” Or whatever, because these content management systems can seem daunting. And it’s like, “Why should we change everything that we’ve been doing for the last 40 years? We’ve been doing it this way. I don’t understand why we need to change it. We’ll just get you some help.” So making that bigger case and talking about some of these reuse issues and talking about content cost and velocity, those are all things you can start to put numbers on so that you’re not just going to management with “make my life easier.”
CH: Right. You’ve got something fairly objective instead of all about you and how it will make your life easier.
AP: Yep. Yep. So for people who are thinking about reuse, are there some common places to start to look for duplicated information, kind of low-hanging fruit, if you will?
CH: Well, I mean, almost always you have copyright statements, right? And how many times do you find a copyright statement that’s out of date or inconsistent? It’s a lot, and I can look at those usually and very quickly see, okay, there’s a reuse issue here. Depending on what industry you’re in, we see a lot of background information and introductory paragraphs; those kinds of things often have a lot of overlapping subject matter. And again, you’re not looking for exact duplication all the time. Oftentimes you’re looking for the same objective for the piece of content. So maybe this content is to familiarize you with some process that our equipment performs. And so the background information might show up in several places, and those are areas where that’s really easy to find.
CH: Another thing that’s easy to find is if you have product variations, you might have in the example of the conveyor belt company, there’s a straight conveyor belt, there’s a curved conveyor belt. There’s a conveyor belt that moves at a different speed. Those may have a lot of the same parts. They might almost be exactly the same, but they have different manuals. Usually, you know from your products where those things occur.
CH: One of the things we did, and this is really my role at DCL, is I was hired on as product manager for one of the products we actually sell that looks for duplicated content. And it wasn’t originally a product. Well before I joined, it was a part of the conversion process. When someone would come to DCL and say, “We need to move our content out of Word or out of FrameMaker or out of whatever tool we’re using into this format, say DITA or XML, that our content management system can use,” one of the first steps is to say, “Well, where are those reusable pieces?” And we used to do that analysis by hand and throw armies of people at it.
CH: But over time they evolved a product to do that, and that’s the product we call Harmonizer that I manage. That product has improved over the years because of all the breakthroughs in natural language processing and artificial intelligence and machine learning. All those fields have given us a lot of algorithms and a lot of approaches to find fuzzy matches, those near matches or even kind-of matches of content that you’d never find by hand, that a person would never see. I constantly see, when we run a bunch of documents through this tool, that it’ll find things where people have rearranged entire phrases into different orders and moved the sentences around, and it still picks it up as a near match to something else.
CH: And when you first look at it, if you were just scanning the page, you’d miss it because it doesn’t look anything like it at first glance and then when you read it, you’re like, “Oh, those are the same.” Somebody really rewrote this in maybe a better way, but didn’t rewrite the original one. So that’s where there are some tools now emerging that can help you do some of this stuff.
AP: Well, I think those sound very helpful and we’ll be sure and include a link in the show notes to the Harmonizer tool so people can learn more about it. And I think with those recommendations, we’re going to leave it at that. So, Chris, thank you very much. This was a great conversation.
CH: Yeah, I really enjoyed it. Thanks Alan.
AP: Thank you. Thank you for listening to The Content Strategy Experts podcast brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.