AI needs content operations, too (webinar)
In this episode of our Let’s talk ContentOps! webinar series, Scriptorium CEO Sarah O’Keefe and special guest Megan Gilhooly, Sr. Director Self-Help and Content Strategy at Reltio, explore how to successfully integrate AI into your content operations. They discuss how to use AI as a tool, how to create content that an AI can successfully consume, and how the role of the writer will shift in a GenAI world.
In this webinar, you’ll learn
- The role of content in training the AI
- How semantic content drives chatbots
- How AI may change the way you write
- How to adapt in a GenAI world
Related links
- What’s the problem? Megan Gilhooly reframes the problem-solving questions we must ask to address content strategy issues.
- AI in the content lifecycle. Sarah O’Keefe shares expert insights on the predicted impact of AI in content operations and beyond.
- Our book, Content transformation, 2nd edition. Building an effective content strategy is no small task. This is your guidebook for getting started.
- Let’s talk ContentOps! webinar series on YouTube
Transcript
Christine Cuellar: Hey there, and welcome to our webinar, AI Needs Content Operations. This show is part of our Let’s Talk ContentOps webinar series, hosted by Sarah O’Keefe, the founder and CEO of Scriptorium. And today, our special guest is Megan Gilhooly of Reltio.
And we’re Scriptorium. We are content consultants, and we help you build strategies that make scalable, global and efficient content operations. So without further ado, let’s talk about AI and content operations. Sarah, I’m going to pass it over to you.
Sarah O’Keefe: Thanks, Christine. And Megan, welcome. Glad to have you here.
Megan Gilhooly: Thank you, thanks for having me. It’s good to see you both.
SO: Yeah, you too. For those of you who don’t know Megan, the key thing that you need to know about her is that, in addition to being a really interesting and really smart leader in this space, she is actually doing the work.
So a lot of people are talking about AI, and, “Blah, blah.” And, “This is what you should and should not do.” And et cetera. But Megan is going to actually talk to us about an AI enabled system in her organization at Reltio that has gone live in the last week, right?
MG: It went live on the doc portal in the last week [inaudible 00:03:14]
SO: It went live on the doc portal, which has a bunch of cool AI stuff going on. And so, she’s going to talk a little bit, hopefully a lot, about what that means, and what that looks like, and how it all works. So as I said, Megan is over at Reltio, where she’s covering technical product content and self-service for a data platform. She was at Zoomin as VP of customer experience at one point. That was also a content delivery platform, and a whole bunch of other stuff.
I’ve got this really great bio, and I’m sorry, I’m just accelerating right past it, because I’m so excited to get to the system that you wanted to talk about. And I wanted to start off by asking you about something that you said six months ago, give or take, and everybody was like, “Yeah, yeah, whatever. No, AI is great.” Six months ago, Megan says, “You know, data dump isn’t going to work with AI.” So tell us about that, because right now it looks as though you’re extremely ahead of the curve there with that comment.
MG: Right. So I think you and I had some very down and dirty conversations about what we foresaw in the future, but the idea that you can just dump a bunch of data into AI, and have it be accurate, precise, concise, helpful, is just kind of silly. As a content person, I recognize that. As a linguist, I recognize that. But I think a lot of people are starting to see it, and I’ve done a lot of learning over time. So the first thing I’d like to say is, when I made that comment, I was hoping that I would prove myself wrong.
I did not prove myself wrong. I think there’s a lot of learning that we’ve done, even this week. We’ve seen that decisions that we made in how we post our content have had negative consequences on how our AI responds to certain questions. And some of them are, we just haven’t updated the content, but others, we made very specific decisions about what we think a human would understand, but the AI sort of took a different angle, and so it’s changed the way that we have to do it.
So I think there should be nobody today that thinks you can just take a bunch of data, dump it into AI, not train it, not babysit it, and sort of move forward.
SO: But that’s exactly what people want to do, because it’s going to be free, and give me my easy button, and I’m just going to buy eight pounds of AI, and then I’ll never have to pay anybody again to do anything. It’s going to be great. So, no?
MG: No. Honestly, I have a group that I call the Self-Help and Content Leadership Huddle, and it’s a group of incredibly brilliant content and self-help leaders from various organizations, all the way from tiny little ones that you’ve never heard of, all the way up to Google and Meta.
And so, we have leaders, directors and above that get together. We have been talking about nothing but AI for the last six months, if not more, probably more like a year, and I’ve learned so much from that group as well. So I think there are people that understand it.
Certainly, there are people that think the opposite, which is just that we’ll just get an AI and get rid of the writers, which, obviously that’s not going to work. Where is the content going to come from? The AI can’t create it if it doesn’t know what to learn from. So there are so many things that we need to look at when thinking about AI, and content operations is definitely one of them.
SO: So it sounds as though, looking at these poll results, we asked people, “Is your organization using AI to support content creation?” And 7% said yes. So Megan, that’s your peer group right now, and there are a couple of variations of, “We don’t know how, we can’t get support.” Those two combined are about 20%. Nobody actually said, “No, we don’t want to.” Which I think is fascinating, but 71% of our poll respondents said, “We are working on it. We are working on using AI to support content creation.” So, given that you’re the 7% that has a working AI implementation out there, what is out there? What are some of the best practices that you have to employ in your content, and in your content operations, to enable this AI going forward?
MG: Sure. So I think that question specifically was about content creation. And so there’s two different ways that we’re using AI. One is, we’re using it to help us create content, as the poll question asks. The other is, we’re using it as a mechanism to push content out, so that our customers can consume our content more easily.
So when we’re talking about the content creation, right now, we use Heretto. And inside Heretto, they have this AI, and I don’t know if I’m supposed to talk about it yet, but I’m going to, so forgive me, Heretto, if I speak out of turn. We’re in a beta program right now. So we use their little AI called Etto, a super cute little dog that you click on, and it can help you do things like structure your content, double check the writing, the level of writing, the style of your writing, to a certain degree.
It’s not like you feed your style guide in, but it can tell you if it’s too wordy, or if there’s a better way to do it. It can tell you how to change a task topic to a concept topic or vice versa, things like that. So we are using an AI inside of our content creation tool that has become very, very helpful, and I’ll be sad when the beta ends. So we’ll hopefully keep on using that.
And then, in terms of how we output the content, we output it to what we call our Reltio intelligent assistant. That Reltio intelligent assistant sits in two places: it sits inside our product, and it also now sits on the doc portal. We call it Ria, and right now, all that Ria does is index the doc portal and provide answers based on what we have in documentation.
There are big plans to add to that very quickly. We'll pull in the knowledge base articles, we'll pull in community articles. For the Ria that's inside the product, it will go way beyond that, to do things that bring in the data that's sitting in a customer's tenant, to give them very personalized and very customized information.
The documentation portal won’t do that. We don’t have a login to our doc portal. Anyone listening to this right now can go to docs.reltio.com and they can see how Ria works. So yeah, so those are sort of the two ways we’re using AI. And I think, do we have a poll question also, more on the side of whether they’re using it for consumption for customers?
SO: You’re muted.
MG: You’re on mute.
Christine Cuellar: I apologize. Yeah, we have a question on if they have the support that they need for AI initiatives. And that poll question-
MG: Oh, so it’s kind of different. Okay. All right. So yeah. So when it comes to content creation, AI is super important, but it’s important to know that you can’t just hand it off to AI, and be like, “Okay, AI, do this, and then publish it.” Because it won’t get it right.
I was talking to one of my buddies here at Reltio, who is a super geek when it comes to ML. He’s got advanced degrees in computer science and linguistics, and he is just such an academic, and it’s fun to have really geeky discussions with him. And one of the things that he said, that I think was really powerful, is, “If anybody believes that AI is going to get it 100% right, they’re dreaming, it never will get 100%. Could we eventually get to 99.9%? Maybe. Are we there today? No, not even close.”
So I think that’s one of the big learnings, is that even though you can say that, and people logically understand, “Okay, it’s AI, it’s not going to get 100% right.” As soon as you push content out, if it’s not right, you’ll get an onslaught of feedback. “This isn’t right, this isn’t right, this isn’t right.” And people are really, really upset by it. So I think there’s sort of a sociological aspect, or a psychological aspect, that we also need to discuss when moving to AI.
SO: Well, when somebody tells you that this technology is far better than you are as a human, and it’s coming for your job, then it seems like the immediate response to that is, “But this is crap.” I mean, when it legitimately outputs not so good information.
So tell me a little bit about what does it take on the back end? What does it look like to write content, to create content that is going to be… I’m going to say AI compatible, that is going to successfully… That’s going to be fed into the AI machine and result in success for, in this case, your AI assistant, your chatbot that you’ve posted.
MG: Yeah, I think there’s some basics that I know, and then I’ll be the first to say that I’m learning every day, and so what I know today will be different than what I know in a week. I’ve learned three things in the last week that I didn’t know two weeks ago.
So I think one thing is, the more structured your content is, the better. Not from the standpoint of the AI, per se, but when your content is structured, there’s a discipline to it that makes it more likely that you’re going to catch these sort of weird connections or relationships that you didn’t intend to make.
So I think structure is one thing to look at. Does that mean that if your content is not structured, AI won’t work? No. Maybe you’re very disciplined but happen to not have structured content, that could be the case. But what I find, more often than not, is that bringing in structured content, or having really structured content, adds a discipline that AI loves and reacts well to.
Simple English. Using simple language obviously is better for humans, but it also is better for AI because, again, there’s fewer opportunities to sort of confuse the logic of the AI. And AI is very, very logical. These models, these large language models, are really just looking at probabilities of what’s the next right answer.
So my friend here at Reltio, the ML guy, Rob, uses this example: take the sentence, "I went to the store to buy a carton of blank." An LLM will take that and make an assumption about the probability of the next word being eggs, or the next word being milk, or the next word being broccoli.
So if you say "a carton of," you're not going to say broccoli. An LLM figures that out based on probabilities. So if you're feeding unintelligible content into your large language model, then you could see how it could mess up, because if you're writing about cartons of broccoli, now all of a sudden your LLM is like, "Oh, well, it's probably going to be broccoli." So that's kind of how things tend to mess up.
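To make the carton example concrete, here's a toy sketch with made-up numbers; a real LLM derives these probabilities from its training data rather than from a hard-coded table:

```python
# Toy illustration of next-word prediction; the probabilities are hard-coded here,
# but an LLM learns them from everything it has seen during training.
next_word_probs = {
    "eggs": 0.55,      # "a carton of eggs" is common, so it gets a high probability
    "milk": 0.40,
    "broccoli": 0.05,  # unlikely -- unless your own content keeps saying "carton of broccoli"
}

context = "I went to the store to buy a carton of"
prediction = max(next_word_probs, key=next_word_probs.get)  # pick the most probable word
print(f"{context} {prediction}")
```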
So I think simple language, clear and concise, structured content, these are all really good things that I think we’ve known for a long, long, long time.
SO: Yeah, that just sounds like best practice.
MG: Exactly. And these are things that, in tech writing, we’ve been doing for decades. So this is nothing new to tech writers, and it’s why I think documentation portals are really primed and ready to support AI, because they have some of these things.
Now, some of the additional learnings that I've had, I'm trying to think of all of them, because some of them are fairly nuanced. So for example, just yesterday, we recognized that where we said something about an update, the large language model converted that to "yes, you can upgrade from one version to another." So when somebody asked, "How do I upgrade?" the answer should have been, "You cannot upgrade," and we have it clearly spelled out on other pages that you can't upgrade from one to the other. But because of the way they asked it, the model looked and said, "Well, the word update is there, so we're going to just make up 'here's how you upgrade.'"
And it completely made it up. There was nothing accurate about it. So there are little things like that, where you just don’t know which words are going to make sense to a human, but because they’re synonyms but they’re not quite the same, and contextually, I think humans understand it, the AI is not necessarily going to put that context around it, and it can start to make stuff up, based on exact words.
So we’re still trying to figure out, how do we teach this AI that update does not mean you naturally have an upgrade path? So there’s little things like that that I learn every week.
SO: And so, that sounds like there’s a huge amount of work to be done here, which, there’s a question from the audience that is, “When this does get to 99.9%, it will likely affect our jobs as tech writers and knowledge managers. So if we have a team of 10 now, will we still need all 10 later?” What’s your response to that?
MG: I would say you’ll still need 10 people. Whether or not we as tech writers will be doing exactly what we do today, that’s probably not the case. Same thing happened when we moved from books to HTML, or PDF to HTML. So we need to think about our jobs. Our jobs are still very, very important. What may go away, in my vision, what may go away is the channel, which today is the documentation portal. Let’s just say the documentation portal goes away, but we still need the writers to write the great content to feed the AI, so that the AI can spit out the right information.
Now, to be clear, I don't think documentation portals are going away anytime soon. I'm just saying that we need to change the way we think. We need to push our thinking to not assume that five years from today we're doing exactly the same job as we are doing today. We didn't do the exact same job 10 years ago, or 15 years ago. Every five years, we change what we have to do.
So I understand the concern of, “Oh my gosh, it’s going to take my job.” One thing I’ve always told my teams, and I told my teams at Amazon this all the time, “If you work hard to work yourself out of a job, you’ll never be out of a job.” And I think that’s still an even more powerful statement today.
Because if you can figure out how to use AI in order to take away the sort of mundane parts of your job, to avoid having to hire 20 more writers as your company grows, that’s your bread and butter, that’s how you’re going to sort of move up in your organization. That’s how you’re going to go out and get the next best job.
SO: And so you've mentioned your AI and your ML people a couple of times, and based on this poll, where we asked people, "Do you have the support you need for AI initiatives?", I think it's fair to say that the answer is a resounding no, because basically 40% or thereabouts said, "We have an AI team, but we need help," and another 40% said, "Not even close."
MG: Not even close. Yeah.
SO: And 23% said, "We're relying on vendors to tell us what to do." Yeah. Can you talk a little bit about your relationship with your AI team? It sounds as though you've got some great support from them. What does that look like?
MG: Yes, absolutely. So we have an ML team that has grown a lot, because we have ML not only for content but to use ML within our product. And so, I feel very lucky to have some of the amazingly intelligent ML people that we have. The one in particular that I've spoken about, he has done some amazing work in ML. He will be the first to tell you he doesn't have all the answers, and so, even having an ML team, he's having to do the research, to look and see what's going to work best at any given point.
I really think if you’re relying on vendors to tell you what to do, that can be a little scary, depending on the vendor, right? One thing I know, we rely on Heretto for the content creation side, and I know them very well, and so I trust them. They’re very sort of scrappy and innovative, and ready to kind of try anything, and so I rely on them as a partner, not so much just a vendor. But when I get emails, let’s say from someone who touts having the best AI ever, it’s kind of hard to believe that they have the best AI ever, because all of these guys use the same technologies.
And so, you’re going to have the same problems, no matter which way you go. That doesn’t mean don’t work with a vendor, but vet your vendors, make sure that you actually understand the difference between vendor A and vendor B if they’re just an AI vendor. What I would suggest, instead of going with an AI vendor, go with a vendor that can solve a problem. So I think the main purpose of AI right now should be around solving very specific problems.
So if your very specific problem is, let’s say it takes too long to do editorial reviews, and we only have one managing editor, and that person is a blocker, or a bottleneck for us getting content out the door. That is a very specific problem you could throw AI at, and you could probably pretty easily solve it. You would go with a very specific vendor on that. You wouldn’t necessarily just go with some AI vendor.
If the problem you're trying to solve is that your search is good but not sufficient, which is part of what we have experienced (search the old way was good two years ago, but now it is just no longer sufficient), then bringing in AI to help your customers find exactly the right answer, or find the right content, that's a problem that you can solve using AI.
If you just say, “I want to use AI.” And then you go out, that’s a solution waiting for a problem, right? You’re not going to be successful because you don’t know what the problem is you’re trying to solve. So I think having support within the organization is great. If you don’t have support within the organization and you have to go externally, it’s even more important to understand the problem you’re trying to solve, and then make sure you go with a vendor that can specifically solve that problem.
SO: Great. And I've got a couple of… I do want to talk about Ria, and what's going on in there, but before we go there, we've got a couple of pretty specific questions that tie into what you've been talking about, so I'm going to throw those over to you. One here is, "It sounds more as if AI would maybe replace an editor rather than a writer. Do you agree with that?"
MG: My answer to that is, I don't know. It depends on the situation. Could it be the case at your organization? 100%, absolutely. That could be a thing, but the problem statement could be anything. It could be that you're having a hard time structuring your content into appropriate DITA, in which case, you need the AI to actually work on structure, not necessarily editorial.
It could be that you need an AI to go in and change the product names of all of your products that recently changed names, or whatever. That’s why I say, find the problem statement, and determine how AI fits into that, as opposed to just saying, “Here’s what AI is going to do for us.”
SO: So, related to that, somebody is asking about the use of metadata. You talked a little bit about how to organize information and tag it, and make it more semantic and better, such that the AI can process it. And the question here is, “Would you say that metadata also helps the AI process your content?”
MG: The jury is out. So yes, I would say in general, you would think logically that metadata would help. Now, if we're talking about metadata that's put into DITA content, but your AI is reading off of HTML, then one of the problems I've seen is that when your AI is consuming the HTML, the metadata that came in as XML is no longer read.
So if you need metadata to help AI, then you need to set it up in a way where metadata will impact the AI. Some of that goes way beyond the technical skills that I have, but I can tell you that Rob, my buddy, Rob, would give you a dissertation on how to make this work, and what’s important and what’s not.
So yeah, I might not be the best person if they have really detailed technical questions about metadata, but I think just at a high level, if you need metadata to be consumed by the AI, make sure that you’re actually consuming the metadata by the AI.
SO: And not just putting it in and throwing it away. That just sounds sad. Okay. There’s a question here about your AI portal, essentially. “If your intelligence assistant is able to fetch the required information from the docs, then why is traditional search needed as well?”
MG: Yeah, so, you know what? That's a great question, and I think there are two schools of thought on this. Some would say that AI replaces search. I've had people internally at Reltio say, "Oh, well, the goal here is to replace search." And that might be the case, I would say, in five years. But at that point, why even have a doc portal? And that's why I kind of go to that North Star: it might be that all we have is an app that answers questions.
Having said that, there are a certain number of people that… You know those people, when we left PDF behind, and they were like, “No, I want my PDF.” You’re still going to have those people. So right now, I don’t think you can completely replace it. And I think search does something that AI doesn’t, which is it gives you a bunch of different responses. It can give you that one best answer, which will be similar to the AI answer, and then it will give you a list of potential places you can look.
And so, I think depending on the scenario, there may be times when that makes more sense. Now, if you’re in retail, and your end users are consumers, and they’re trying to figure out how to, I don’t know, buy the right shoe, they probably don’t want information overload. But if you’re in an enterprise high-tech place, and your users are developers, they oftentimes will want to see all the potential options.
So I think you need to understand your user, and understand, “Can you get rid of search? Or is this something that you need to have both?” And we need to figure out the user experience that supports both.
SO: So, kind of an admin note on the polling. Right now, we’re asking people, “Does your organization have semantic content?” And a decent number have replied, “What is semantic content?” So maybe we can clean that up while the poll’s still open. So what’s semantic content?
MG: Yeah, so semantic content is really highly structured content that's rich in tags, typically, I would say, done in XML, using DITA as the sort of format or language. But semantic content is really breaking down your content into the semantics of the whole. And so honestly, you probably have a better-
SO: Labels that have meaning, right?
MG: What’s that?
SO: Labels that have meaning. Instead of labeling something with, say, "Font size equals 12," or, "Font size equals 18," which is a formatting instruction, you label it with, "Title," or, "Heading one," or-
MG: Or UI control.
SO: Or UI control.
MG: Yeah.
SO: So it is labels that tell people, when they’re looking at the content, what it is. Now, HTML can be somewhat semantic, but usually in HTML, we fall back on just sort of format labeling everything. So you have a button blue, not a-
MG: Or it's italicized, if you need to change it. So an example would be, if you have, let's say, product names. And you want all of your product names to be in bold, font size 12, which might be different than the rest of your font. So anytime that you have a product name… Now, going back into the old Word world, we used to just go through and mark it, and then either give it… I forget what we even called it, but you can give it an attribute, and then it will change.
But most often, what we did is we just bolded it, because it was a WYSIWYG, and we just went, “Oh, just bold it.” Today, we want to make sure that if that product name, all of a sudden we decide, no, we want it to be purple and flashing. We want to be able to very easily, on the output, say, “When you output this, make sure that anything tagged as product name is purple and flashing.”
We did that with API names. So we have a little thing at the end that actually says API. So we mark off, we tag API terms, so that it puts this little API notation on it. If we used it in a different setting, and we didn’t want that API notation, we could easily just change the output so that API was just like any normal text.
SO: And so semantic content, I mean, if you think about this from the AI’s point of view, from the machine’s point of view, if every time you refer to an API command it’s tagged with something like, “Hello, I am an API command,” then that helps the AI to go through there and distinguish all of those things which are blue and bold and whatever, from all the other things that are blue and bold. If all you do is make them blue and bold, then it may or may not have a way of distinguishing them.
MG: Right. Although keep in mind, this comes back to my comment on metadata, which is, if you’re training the AI on the HTML, and the HTML hasn’t brought in those tags of product title or API, or whatever it is, then you’re sort of missing your opportunity to utilize those.
So I will say, the easiest way to do AI is to just index the HTML, but then you lose a lot of that great tagging. So we’re looking at how to handle that right now, actually.
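As an aside, here's a minimal sketch of the difference in practice. The sentence and the product term are hypothetical; the point is that a semantic element such as DITA's apiname survives as machine-readable meaning, while bold-only markup tells an indexing pipeline nothing:

```python
from xml.etree import ElementTree as ET

# The same sentence marked up two ways. The semantic tag says what the span *is*;
# the bold tag only says how it should look.
semantic = "<p>Use the <apiname>Merge</apiname> API to combine profiles.</p>"
formatting_only = "<p>Use the <b>Merge</b> API to combine profiles.</p>"

def tagged_api_names(fragment: str) -> list[str]:
    """Return the spans a pipeline can positively identify as API names."""
    return [el.text for el in ET.fromstring(fragment).iter("apiname")]

print(tagged_api_names(semantic))         # ['Merge']
print(tagged_api_names(formatting_only))  # [] -- the meaning was never captured, which is
                                          # what gets lost when you index flattened HTML
```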
SO: Pretty interesting breakdown on this poll. I mean, basically a third are saying, "Yes, we have semantic content." The other two-thirds break down as 27% saying no, 21% saying they're not sure, and 18% still on, "What's semantic content?" So my takeaway here is that two-thirds probably do not have semantic content, or at least are maybe not aware of it. So, here we are.
MG: Or they may call it something different. I mean, I always think of it as highly structured content. You can say DITA content; if you use DocBook, you're probably using it. There's not a lot of semantic content outside of the sort of structured XML world, I would say. You may be able to say otherwise, but I'm trying to think of what that would look like, if it was semantic content but not DITA or DocBook, or some flavor thereof. Have you ever seen that?
SO: There are some other things out there. Particularly, you start seeing content that’s structured in something like a knowledge graph in order to render it through a headless CMS, or in order for it to be controlled through a headless CMS, and then put a rendering layer on top of that. So there’s an entire world of knowledge graphs, which I’m also pretty uncomfortable with. So we’ll put that in the bucket.
MG: I see knowledge graph as the opposite of structure. To me, a knowledge graph… When I think about structured data, relational data versus knowledge graph, the whole purpose of knowledge graph is to take unstructured data and create relationships. So I guess that does give semantic meaning without being structured, to a certain degree.
SO: Yeah, it’s a different approach. Okay, one more thing before we pop over to talk about your actual live implementation, and this is a question I have not seen previously, so I’ll be interested to see what your take on this is. There’s a question here about deprecating information.
“I suspect there will be some challenges, that’s like the understatement of the AI era, about deprecating information in an LLM. Have you come across scenarios that highlight this?” And I also want to thank the person who left this question, because I love it.
MG: Yes. Well, this is Michael, and when I talked about that content leadership huddle, he is one of the leaders on that. So this is a very profound question, and I don't remember if he was on the last one, but I'm pretty sure we talked about this on our last one.
So yes, I have a great example of that. One of the things I've come to realize is that we have had these content refresh projects, as we call them, in the works for a year and a half. We had roughly 20 content refresh projects, and we've been able to finish about four, just given our capacity and all the other needs.
So we now have 16 content refresh projects that will include updating content, deprecating content that's no longer valid, and just making sure that everything is fresh and accurate, all the things that we have not done. We used to say, "Well, only X percent of people ever really hit it. So if nobody's looking at it, we can just let it sit."
Now, AI is putting a spotlight on it, because no matter what it's about, somebody could ask a question, and the AI could have trained on old content that needed to be deprecated, and now it's giving an inaccurate response. So I think AI really is putting this huge spotlight on the importance of keeping your content fresh, making sure that you are changing the things that are inaccurate, making sure that you are not creating relationships about things that are inaccurate.
So there are just all kinds of things that we used to sort of say, "Well, let's prioritize deprecation a little bit lower, because nobody's really looking at it anyway." And now it becomes a, "Oh my gosh, we have to take care of that content." So I think it really does support this need for more writers, not fewer. More people that can really validate the content, more people that can go through and refresh the content, to make sure that you have the right freshness, and you're getting rid of the stale content. So really good question, Michael, thank you.
SO: And I have some big concerns about this in the context of moving people into semantic or structured content, because it is super common for us to look at migration, and say, “You know what? Everything that’s older than X amount of time, we’re not going to convert. We’re just going to take the existing probably PDFs, and leave them there, and not bother with the sort of uplift effort for that older content.”
But people still need it, and they do, which is why we're keeping the PDFs, right? We're not throwing them away. But that content is not going to be equally available to the LLMs, or to the processing, because it hasn't been turned into semantic content. It's just going to be sitting over here in a dumb PDF bucket.
Then what happens when I go into the AI and ask it questions about the older stuff, if I’ve sort of stratified my content into new things that I care about, and older things that I don’t need to care about as much, which was legitimate until about a year ago, now what?
MG: Yeah, and I know that there are ways that you can train your AI to not look at content that's more than a certain amount… stale, let's say. So for example, I could say, "Don't index any of the content that is more than a year old." That can lead to other problems. Now, in the case of PDFs, you could have those PDFs indexed, or you could decide not to, depending on how stale they are.
If you’re telling your AI not to look at anything beyond a certain date, the problem becomes, let’s say you have stale content, and then you find out that there’s a misspelling. So someone goes in and changes one word, now it’s considered “fresh”, even though it’s not technically fresh. So all of this needs to be thought about in your strategy. How important is AI to your content strategy? Because ultimately, that’s where this change occurs.
You’ve always had a content strategy where you’ve made assumptions, like if only three people per year are viewing this content, I’m not going to worry about it. Now this strategy changes, right? Oh, only three people are viewing it a year? Let’s get rid of it. That’s a new sort of goal that you’re going to have as part of your content strategy. So I think, really, AI is changing the way that we create our content strategy.
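Here's a minimal sketch of the freshness filter described above, assuming (and this is the key assumption) that each topic carries a human-set last-reviewed date rather than a raw file timestamp, so a one-word typo fix doesn't make stale content look fresh:

```python
from datetime import date, timedelta

# Hypothetical doc records; "last_reviewed" is set by a reviewer, not by a file timestamp.
docs = [
    {"title": "Upgrade guide (legacy)", "last_reviewed": date(2021, 3, 1)},
    {"title": "Match rules overview",   "last_reviewed": date(2024, 2, 10)},
]

cutoff = date.today() - timedelta(days=365)  # "don't index anything more than a year old"

to_index  = [d["title"] for d in docs if d["last_reviewed"] >= cutoff]
to_review = [d["title"] for d in docs if d["last_reviewed"] < cutoff]

print("Index for the assistant:", to_index)
print("Flag for refresh or deprecation:", to_review)
```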
SO: Okay, so let’s talk about your portal, which I think is really the coolest thing going on here. Can you talk a little bit about what you did, and maybe the tech stack, if you’re comfortable with some of that? What does this thing look like? What is it?
MG: Yes. So there are a couple of things that we need to know. When I came to Reltio, we had a completely different tech stack, and that content would not have been ready for AI. So thankfully, we did the hard work upfront, which was, we went through the content and we brought over to a brand-new portal, with Heretto, the right content, theoretically. The right content, the stuff we thought was the right content at the time. And so, we had a lot of our content that used to be in what I call fuzzy DITA or squishy DITA, and now it's in real DITA.
And so, I think we did a lot of work on the content itself, ahead of all of this happening. So timing wise, that worked out really, really well. Last May, I had five different people from the organization, from different parts of the company, come to me and say, “Megan, I want to get access to your doc portal so we can start an AI, like a gen AI chatbot.”
And I was like, “Okay.” So after the first one, I was like, “Let’s think about this.” After the fifth one, I went, “Whoa, whoa, whoa. Okay, let’s come together.” Because if we have five different organizations within our company doing a similar thing in a different way, that’s just going to add complexity, it’s going to add inconsistency, it’s not going to be a good user experience.
So I brought that group of people together very organically. I just said, “Hey, guys. Come together. I don’t want to stop your innovation. That’s, I think, the main point. We never want to stop the innovation that’s going on within the company. At the same time, if we’re all doing a similar thing, let’s get together and do it once, and do it right.”
So we brought this group together. Rob was on that, and then a number of other people from either ML, support, training, docs, UX, product. We kind of had almost every single… I think we had every single function within the company represented at one point in time. That was a very chaotic group. We came together without knowing what we were going to do, without really understanding the problem statement.
So from that group, we developed the problem statement. We started to think bigger about the opportunities, and then from that, I wrote a PR FAQ, and a PR FAQ is a forward-looking press release. It’s sort of a fun way to show what you’re going to deliver in the future before you actually even start working on it. And so I wrote this PR FAQ that I then took to the product team. The product team added their sort of “think big” to it at one of our offsites, and then it kind of blew up from there.
We had a hackathon that added skills to it. So what started out as this, “Let’s just comb through the doc portal,” ended up becoming a plan to have AI inside the product that would both comb through the doc portal as well as do all of these things with data that currently take a data steward a long time, for example. So it sort of grew from there, but it took that sort of first vision to really get it out there. And so that’s why I think it’s so, so important to start with a vision, and think about the problems that you’re trying to solve.
Write those up. You can do a PR FAQ if you’re good at that. If you’re not good at that, honestly, I think you can go on to… you could probably go to ChatGPT and ask it, “Here’s all my notes. Write me a PR FAQ,” and it’ll probably do a pretty good first version. So that was sort of where it started.
We actually launched it into the product. When was that? In February. And so it was available to customers inside our product. So the tech stack that we use: we obviously are writing in Heretto. Our doc portal is also in Heretto, so we've had to work with Heretto, and we also use Dialogflow from Google. And so the two of those things sort of work together in order to bring up the responses.
It can be both good and bad to have separate vendors. And this is where I lean on my ML team, to say, "Okay, if this is happening, is that on Heretto or is that on Google? Or is that on our content?" And so, we have a lot of discussions about what is the cause. But yeah, so we have it inside the product, and now we've launched the exact same thing, pulling from the exact same Google Dialogflow project, into the documentation portal.
So no matter where we’re accessing it, if we get issues, we can solve them, we solve them once, and it solves it in both places. Does that kind of cover it? I feel like I missed a part.
SO: I think so. You mentioned when we were planning this out, you mentioned that you limited the portal, or you limited the AI functionality intentionally. Can you talk about that a little bit?
MG: Yes. So when we think about AI, we’ve already talked about you’re not going to get 100% right, but if you try to boil the ocean, it’s going to be really hard to peel back the onion and figure out where it’s going awry. So we wanted a couple of things. First of all, we wanted the output of the AI to be very specific to Reltio. We don’t want it bringing in outside information that may or may not be true at Reltio. We don’t want it bringing in competitive information. We really wanted it to be trained on our corpus of content. So that was very, very important.
And we started with just the documentation portal. And, as I sort of alluded to earlier, even though you tell executives and stakeholders that, “You know what? It’s not going to be 100% right.” The minute something is wrong that doesn’t sit well with them, it’s like red alert.
In fact, I got a Slack message from a higher up in our organization this week, “Red alert, it gave this answer when it should have said no.” And I was like, “Okay, how is this a red alert? This is AI.” But it is, it’s that important for executives to see the right answers coming out. And so you have to be prepared for that. If you have multiple places where the content is coming from, it’s going to be hard to peel back what the cause of that is.
So I think it’s important to start small, get it right, and then you can add more and more and more, once you sort of know what you’re dealing with.
SO: Okay, so a couple of… I mean, I'm looking at my question list here, and these two are actually paired together. Oh, actually, sorry, let me start with a different one. There's a question here about what part of the stack is connected to the LLM. Is it Heretto, Google Dialogflow, or both? Let's start with that.
MG: Google Dialogflow is the part that really is serving as our LLM. So we have a stack inside Google that the ML team uses, but Google Dialogflow is the thing that’s sort of parsing through our doc portal and spitting out the answers, according to what it is learning.
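For readers who want the general shape of what's being described here, the pattern of indexing a doc portal, retrieving the relevant topics, and having the model answer only from them is commonly called retrieval-augmented generation. Below is a vendor-neutral sketch with toy keyword retrieval; the topic titles and helper functions are invented for illustration and are not Heretto or Google Dialogflow APIs:

```python
# Simplified retrieve-then-answer sketch; everything here is a stand-in, not a real API.
DOC_PORTAL = {
    "Configure match rules": "Match rules are defined in the tenant configuration ...",
    "Export a data model":   "Use the export option on the data model page ...",
}

def search_doc_portal(question: str, top_k: int = 1) -> list[str]:
    """Naive keyword retrieval: score each topic by words shared with the question."""
    words = set(question.lower().split())
    scored = sorted(
        DOC_PORTAL.items(),
        key=lambda item: len(words & set(item[0].lower().split())),
        reverse=True,
    )
    return [body for _, body in scored[:top_k]]

def build_prompt(question: str) -> str:
    """Ground the model in retrieved passages; the actual LLM call is omitted here."""
    passages = search_doc_portal(question)
    return (
        "Answer using ONLY the passages below. If they don't contain the answer, say so.\n\n"
        + "\n\n".join(passages)
        + f"\n\nQuestion: {question}"
    )

print(build_prompt("How do I configure match rules?"))
```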
SO: And now that I look at this, I have three questions here that all boil down to, "What is the role of DITA in an approach like this?" And let me run through them, and then I'll let you address this. So one is, essentially, we're shifting to writing for the AI system. If that system doesn't require content formatted in DITA, will that skill still be necessary in the future? That's one.
The second one says, “We’ve created and proven that knowledge graphs are much easier from structured DITA content than from unstructured content, and then the knowledge is used to do some other things downstream, so they’re not mutually exclusive.”
There’s a question here from a freelancer about, “At what point is it good for companies to more seriously look into DITA or semantic content?” So it sounds as though, collectively, what people are saying is, “Okay, your system is built on a DITA foundation, and is that a requirement? Is that going to go away?” What do you think? What’s the role of DITA in this AI world?
MG: Yeah, so kind of going back to what I said earlier: structure leads to discipline, and that, I think, is really important. So my friend Rob, he sort of agrees, and he understands that, technically, you don't need DITA in order for AI to go in and find information, but he also agrees with me that there's a certain discipline within DITA and other structured content that will help to ensure that you're producing the right information.
If you always have the same level for a topic title, and you always have, I don’t know, three levels of headings, and the heading three is always of a certain nature, it’s just you’re less likely to get it wrong in a way that the AI will get it wrong. So I think there’s a discipline to DITA that is still super, super important. Having said that, five years from now, I don’t know if DITA content will be any better than any other content. I mean, we have no idea. Anyone who says they know is just making stuff up or hoping, because nobody knows.
I think there are a lot of different ways to skin the cat today. So Michael brought up knowledge graphs, and that's one of his favorite things to talk about. He's also on my content huddle, so that's awesome. I've got two content huddlers here. Yay. And so, he talks a lot about knowledge graphs, and I think that there's a lot of logic that comes from what Michael is working on. So I'm not going to say that the way I think it's going to happen is 100% the way it's going to happen. I do think that we all need to put on our vocally self-critical hats. We need to disconfirm our own beliefs. We need to almost start over in some of the assumptions that we make.
So we can’t just make the same assumptions five years from now that we made five years ago. So I think there will be shifts and there will be changes, and we need to be open to those. Having said that, I don’t think we need to jump really quick, and say, “Oh my God, DITA is not needed anymore.” Because there definitely is a benefit to DITA that you’re not going to get from anything semi-structured or unstructured.
SO: Yeah, so I was in a call on Monday with somebody, and he made the point that your unstructured content, and when we say unstructured, we're talking about Word files and HTML files, and generally these kinds of traditional documents. He made the point that the AI is actually really, really good at picking out the implied structure from those documents.
And so, if your content is "unstructured," which is to say in a Word file, let's say, or in HTML, but that content is actually structured implicitly, the AI can deal with that. Which leads us to the point that the reason we're putting DITA in place is that when we can get away with not being structured, we are… Okay, speaking for myself, I'm not going to do the work if I'm not forced to do it. I'm just going to do literally the bare minimum.
And so, what DITA forces is a level of structure that's not impossible to achieve in unstructured content, it's just really rare. So I thought that was an interesting point of view. But to your point, we're just not disciplined when we're not forced.
MG: Yes, exactly. Exactly. And you see that. I mean, when I show up in teams that have squishy DITA, or don't have DITA at all, it always comes down to the fact that they just don't have the discipline that's necessary to be able to think through all of the various content types that they need. That's really what it comes down to.
Now, even today, because DITA is an open standard, you could have AI actually structure all of your content. Keep in mind that it's going to get some portion of it wrong, because you're putting in unstructured stuff, and you're missing stuff because of that, so it's going to fill in the blanks, and it's going to just make up stuff for that.
So you have to go through it with a fine tooth comb. You have to understand, “Oh, okay, I see why it put that in.” So you kind of have to understand DITA to a certain degree, but five years from now, maybe you don’t need to understand DITA, you just need to understand that you have an AI that will structure your content in DITA for you. I don’t know, give me anything. Right?
SO: We see a lot of it in design files. A lot of people, they’re InDesign to PDF, and they’ve decided it’s time to move that content into a structured environment. And about 80% of the time, people say, “Yes, we have InDesign, but we have a template and we follow it. And our InDesign files are actually very organized and very structured, and why are you laughing at me? And very templatized.”
And so, okay, Megan, would you like to take a guess at what percentage of the time we have received InDesign files that are actually pretty structured and highly templatized?
MG: 2%.
SO: No, too high. So yeah, never. Never. People say, “Oh yeah, they’re pretty organized.” And then they’re like, “Oh, right, but oh, that file, that was Joe. Joe didn’t like templates. So Joe did his own special thing.” And it’s like, “Well, okay, but that counts.”
Okay, so we have six minutes. Tell us what we need to know in order to get into that elite 7% of “we are doing AI things”. Faced with what you were faced with six or eight months ago, give or take, somebody who’s being told, “You need to implement AI support.” Whether on the backend for authoring or the front end for delivery, what would be your key piece of advice to that person?
MG: So I think there are a few pieces. The first is, understand the problem statement. What are you trying to solve? If you're just using AI to use AI and to be cool, then you're going to fail, because you won't know what success looks like. So understand what you're trying to solve, have the data around it, and move forward based on that.
Start small. Don't try to boil the ocean. This content huddle that I have has been invaluable to me, just to throw ideas around. Michael and I, we push each other to think differently. And so, I really appreciate having that group of professionals that's sort of at my level, that is thinking about the same things, and can ask the right questions, and can really help me to learn. And so, start a content huddle of your own, right? Ours is closed, so don't write to me and say, "I want to be part of it," but start one of your own.
Find professionals that you trust, and that you can have really candid conversations with, and then go have those conversations. This group is actually putting down some of the learnings that we’ve had over the last six months, and so hopefully, before too long, you’ll start to see a book or a white paper, or whatever format it takes, come out from this group that I think will be helpful.
But keep in mind that anything you read today, three months from now, could be outdated. So just keep thinking about it, keep talking about it everywhere you go. Have a conversation. If you know anyone in content, have conversations about AI. If you know any ML people, have conversations about AI. What’s possible, what’s not possible? How does it work?
There are a ton of videos, especially ones from Google, and then there are a couple others that will randomly come up if you start watching AI tutorials. But they’re very small snippets of information where you can learn about the temperature of an LLM, the throttling, the creativity, and looking at… I don’t know, they have all kinds of things, like what are the LLMs? What are the ones out there? What do they do? How are they different?
So just really consume a ton of information so that you go into it eyes wide open, and then set the expectations right off the bat. This thing is not going to be 100% accurate. So if anyone ever thinks it’s going to be 100% accurate, they’ll fail.
SO: Cool. Well, Megan, thank you so much. I really appreciate your time and your insights, and I think that, from what we can tell from the audience, I mean, people are struggling with this. And so hearing you talk about, “Hey, I did it, and I actually made it happen,” is great.
MG: And I still don’t know what I’m doing.
SO: That’s also reassuring.
MG: There you go. The more you know, the more you don’t know, right?
SO: Yeah. So I’m going to throw it back to Christine and thanks, and it’s great to see you.
MG: Thanks so much.
CC: And thank you all so much for joining today’s webinar. Please go ahead and give us a rating and some feedback in the menu below your screen. That would be really helpful for us. Be sure to also save the date for May 15th. That’s going to be our next webinar at 11:00 AM Eastern with Pam Noreault. And thank you so much for joining today. We really appreciate being here, and have a great rest of your day.