January 21, 2025

Powering Conversational AI With Structured Content (webinar)

In this episode of our Let’s Talk ContentOps! webinar series, special guest Rahel Bailie, Content Solutions Director of Technically Write IT, and host Sarah O’Keefe, Founder & CEO of Scriptorium, discuss how organizations can leverage the unlikely connection between structured content and conversational AI.

In this webinar, attendees learn:

  • What structured content is, and how it fuels reliable conversational AI responses
  • How technical writers and conversation designers can collaborate for optimal output
  • Where to get started with structured content and conversational AI

Resources

LinkedIn

Transcript: 

Christine Cuellar: Hey, there, and welcome to the next episode of our Let’s Talk ContentOps webinar series. Today we’re going to be talking about powering conversational AI with structured content. This show is hosted by Sarah O’Keefe, the founder and CEO of Scriptorium. And today our special guest is Rahel Bailie, who’s the Content Solutions Director of Technically Write IT. Before I pass things over to Sarah and Rahel, I’m going to go through a few details about the BrightTALK platform just in case this is your first time. First things first, don’t worry. We don’t have access to your camera or your microphone. So we can’t see or hear you. Also, we are recording this show. And if you want to watch that recording later, you can do that at our YouTube channel at Scriptorium Publishing, or you can stay on this same URL and the recording will be showing up there later in BrightTALK. Also, a little bit about that menu below your viewing screen. On the left-hand side, there’s an ask a question tab. Do use that to ask your questions throughout the show, and we’re going to do our best to try to get to all of them, but I do recommend getting them in early just for time’s sake. Also, we do have a lot of other resources about today’s topic in the attachments section. So be sure you check that out before you go. That also has Rahel’s contact information on LinkedIn. So a lot of good resources there. Also, we do have a poll feature here in BrightTALK, and I’m going to go ahead and get our first poll question started right now. So if you can head to that tab, that would be awesome. Just keep an eye on that throughout the show because we will be asking questions and we’d love your feedback. And speaking of feedback, at the end of the show, I’m going to ask you for your feedback. You can give a star rating, you can leave a comment with what you think about how the show went or other topics you want to hear about. We really appreciate that. 
Also, we want to say a special thanks to our sponsor, Heretto. Heretto is an AI-enabled CCMS platform for deploying docs and dev portals. And so thank you, Heretto, for sponsoring the show. Lastly, I’m Christine. I’m the marketing coordinator for Scriptorium. And Scriptorium, we are content consultants who build strategies that help you build scalable, global, and efficient content operations. So speaking of content operations, without further ado, I’m going to pass things over to our presenters, Sarah and Rahel. Sarah, over to you.

Sarah O’Keefe: Thanks, Christine, and welcome aboard, everybody. Hey, Rahel. It’s great to see you.

Rahel Bailie: Hi. Good to see you too.

SO: Yes. So let’s jump in here. The first thing we wanted to start with was the question of… We’re going to talk about conversational AI and structured content. So I think what we’re going to have to do is define those two terms and then talk about how they interact. So step one, conversational AI. Rahel, please explain what is conversational AI?

RB: Sure. Conversational AI is the field of writing for chatbots, if you will. So it’s writing in a conversational way. And when you have a chatbot that is AI-enabled, of course, you have to take certain things into account. So conversational AI has become a subset of conversation design, which is writing for chatbots. So that’s it in a nutshell.

SO: Okay. And so then let’s make sure that we define our terms here. So structured content, what are we dealing with there? How does conversational AI tie into that?

RB: Okay. I think those are two separate questions. So the first question I’m going to talk about is what is structured content? So when people are talking about it on the editorial side, they think of putting things in a certain order like who, where, why, and how. For them, that’s structure. But when we’re talking about structured content, we’re talking about the structure that makes content processable and understandable by machines. So we’re talking semantically structured content. And this can come in a number of flavors, and I’m going to break it down into three general buckets. And so one is unstructured content. So unstructured content would be when you have, let’s say, content in Salesforce that the customer service people use. So they put some notes. And maybe those notes are very valuable, but they just put some notes in three paragraphs and it’s in a database cell and there’s no real structure to it. So that would be, to me, unstructured content. It’s readable, it’s usable, but you can’t really do much with the processing. Then there’s lightly structured content or semi-structured content. And think of it as lightly structured because this would be something like Microsoft Word, Google Docs, PowerPoint, where you have certain attributes that go on there. So you have things like H1, H2, or title and two-column text. So it does give you a certain amount of structure. So if you’ve ever dragged some PowerPoint slides into a new format: if the person who created the original document had put enough thought into using the title field for the title and the two-column text actually for two columns, then when you drag it over, it takes on the new formatting seamlessly. So that would be lightly structured. But H1 and H2 and H3 don’t really mean anything to a machine because you could theoretically flip those around. And I’ve seen people do this where they say, “I want a call-out. I like that look, the size of the text and the color of the text. 
So I’m going to use it for a call-out.” But it’s actually for an H1. And so now you have random H1s all over the page because those are for call-outs, actually. And if you don’t use them well, you can’t get the machine to process it properly. So that’s structured, but not semantically structured. And then you have highly structured content. So that usually refers to something with some semantics built in. So not only would it tell you that, “This is the title,” or an H1, but it’s an H1 of what kind of a topic. Is it a product? Is it a task? Is it a venue? What is it? So it tells you context or intent. And then it might tell you also, “This is the title for an instruction on an iPhone version 11.” So there are levels of semantics that you can add to make it quite specific. So when somebody does a search, they get that exact piece of content even though there may be mountains of content on that same topic elsewhere.

SO: Okay. And so when you make that distinction between lightly structured and highly structured, one of the things that we always fall back on, and this is a bit more of, I guess, a technology lens, additional to what you’re saying is that the highly structured tech stack allows you to enforce things. So to your example about H1s being scattered all over the place, in highly structured content, I can have an object or call it figure, and I can say, “If you put text inside this as a caption, it has to have a caption tag,” and I’m going to disallow the heading one tag or the H1 tag inside the figure. That’s just not a thing you’re allowed to do. And so that enforcement mechanism, which then means you have more predictable content, is, to me, also a big part of the structured content. So we have highly, lightly, and not at all structured content. And then how do you connect that to the chatbot conversational AI conversation?
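Sarah’s enforcement point can be sketched in a few lines. This is only an illustration of the idea, with made-up element names (`figure`, `caption`, `h1`); a real structured-content system would enforce nesting rules through a schema (a DTD, XSD, or similar), not hand-written code.

```python
# Sketch of schema-style nesting enforcement (hypothetical element names).
# A real CCMS would enforce this via its schema; this just shows the idea.

ALLOWED_CHILDREN = {
    "figure": {"image", "caption"},  # h1 is deliberately not allowed here
    "topic": {"h1", "p", "figure"},
}

def validate(element, children):
    """Return the child tags that violate the nesting rules."""
    allowed = ALLOWED_CHILDREN.get(element, set())
    return [c for c in children if c not in allowed]

print(validate("figure", ["image", "caption"]))  # [] -> valid
print(validate("figure", ["h1", "caption"]))     # ['h1'] -> rejected
```

Because the rule is enforced at authoring time, an H1 can simply never end up inside a figure, which is what makes the content predictable downstream.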

RB: Okay. This is where it gets interesting. So we know that, just from our discussion just now, I can infer that the more context you have on your content, the more that machines can process it with accuracy and speed and so on, right? Now, if you think about how chatbots are used, in a lot of cases there’s a big body of content. So let’s say you’ve got all of your product content. I have a whole bunch of mobile phones, and I’m not going to name a brand because, I mean, it applies to any range of cell phones, and we have a whole bunch of models, and we have this year’s model and last year’s model and the year before’s model. And now somebody wants to look up a piece of content from there. So like, “How do I do something with the camera or how do I do something with the setting on Bluetooth?” Or whatever it may be. And now you have this chatbot on the front that asks the question like, “What do you want to do?” And somebody puts in their query. And then there’s this mechanism by which it reaches into the repository and pulls out the content. Now, if you don’t have enough structure on your content, then it may be that what you get out is inaccurate or the chatbot doesn’t know which piece of content to pull out. So it pulls out as many as it thinks are relevant or it looks for a specific keyword. And if you don’t happen to use that keyword or a synonym that you’ve defined as a keyword, then it doesn’t know what to do with that. So a conversation designer can be trying to create a very good experience, but they don’t have the raw materials to work with. And the raw materials would be highly structured content. And that’s not to say that you have to structure all of your content. I’m not suggesting that you look at this mountain. It’s like your landfill of recycled materials that you’ve sent off to some other country. 
This is more a question of what’s important and what’s less important: make a triage, and then decide, “Is it worth structuring some of the content so that we meet our business goals?” And a lot of times, the business goals are reducing calls to the call center. So if you’re looking at, “We need to reduce that,” then you say, “What are the top 100 questions? Let’s structure that content, because that structured content means we’re reducing the number of queries that can’t find the content, which is what sends people to the call center.” 
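The keyword-and-synonym failure described above can be sketched very simply. The synonym table and article text here are made up for illustration; real systems use far richer matching, but the failure mode is the same.

```python
# Naive keyword retrieval: a query only finds the content if it uses the
# exact keyword or a synonym someone has defined. Data is illustrative.
SYNONYMS = {"cellphone": "mobile phone", "mobile": "mobile phone"}

articles = {
    "mobile phone": "How to pair Bluetooth on your mobile phone",
}

def lookup(query):
    term = SYNONYMS.get(query, query)
    return articles.get(term, "Sorry, no answer found. Please call the call center.")

print(lookup("mobile phone"))  # finds the article
print(lookup("cellphone"))     # synonym defined, so it also finds it
print(lookup("handset"))       # no synonym defined -> falls through to the call center
```

Every query that falls through the last branch is a potential call-center call, which is why structuring the content behind the top questions pays off.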

SO: Right. But the triage model is interesting. And I’m totally stealing the content landfill because we’ve seen so much of that. But we talk a lot about a puddle of content, right? Or a lake. There’s just all this stuff there, and you have no really logical way of pulling things out. 

RB: Sarah, I’m from Southern Ontario. We have inland seas.

SO: Okay.

RB: So Lake Ontario, Lake Erie, those are inland oceans, and that’s what we see.

SO: So when the lake freezes, if you freeze-

RB: Yes.

SO: … the lake, you can pull chunks of ice out of it that are pretty organized. But if you try and go in there with one of those ice gripper thingies, that-

RB: Yes.

SO: … only works if it’s frozen. And this analogy is going in a bad place. But I did want to touch on the poll because I think these results are a little bit surprising to me. So we asked, “Does your organization use conversational AI?” And the answer is about 40% said yes.

RB: Nice.

SO: About 25% said not yet, but soon. So that’s 65%. That’s two-thirds.

RB: That’s not surprising to me.

SO: The remainder is no, 28%, and, “I’m not sure,” it’s 7%. So you’re not surprised by this?

RB: I’m not surprised by this because we went through those stages where… So I started in content before there was the web. So the answer to everything was, “Let’s have a brochure.” And then once we moved to the web, then it was like, “Let’s have a website.” And then it was, “Let’s have a…,” whatever it might be. And then, at some stage, it was, “Let’s have a chatbot.” So everybody has a chatbot. Whether the chatbot works or not, that’s a whole other story. But everyone’s got a chatbot somewhere. If you are any size of organization, you’ve got a chatbot somewhere, and even smaller organizations. And I have stories, but I don’t know if they’re particularly pertinent to this one, but the one that is pertinent was when I was going to Reykjavik. And I had two bookings with this company, and you could only get in touch with them through their chatbot. That was like the first point of contact. And then if they couldn’t answer your question, you could go through to a person. And so, excuse me, I couldn’t remember how to spell Reykjavik. So I just put in Iceland because that was one of my reservations, and it said, “I’m sorry. You’re not allowed to swear on this platform.” “What?” So, of course, being the nosy parker that I am, I went and looked up Urban Dictionary “Iceland” and don’t go there. So even smaller organizations have some sort of a chatbot. So there’s some sort of conversational AI somewhere. But at the other end, there are places like, and I’m going to name these folks because they speak at conferences and so on, Lloyds Bank, where they have millions of queries a year, and they have a team of 100 people working on their chatbots. And I say chatbots because, well, it’s like one point of entry, but it branches out. And that doesn’t mean they’ve got 100 conversation designers; they’ve got data scientists and engineers and software developers and designers and UX folks and so on working on it, but 100 people. 
And they went from… It’s not that great in their first iteration. And then they jumped by 10 points and 20 points of accuracy because they keep working on it and they keep iterating. And they are attacking the problem from all different sides. So it’s not just structure, but it’s also the taxonomy and the knowledge graph and the RAG model and all of these things that they’re doing to come together to improve the accuracy of their results. And they know that you can never get to 100% because people have complicated queries sometimes that just aren’t going to work through being answered by a chatbot, but they’re going to try to get as high as they can. And so one of the ways is looking at how AI-ready your content is. And it’s like a combination of editorial and technical factors. So that’s interesting because if you think about… And they don’t release numbers into how many millions of queries, but I’m going to pick a random number, 10 million. If those 10 million… 10 million is a lot. You have to have a lot of agents working at it to answer 10 million questions. So even if they can only answer 8 million out of the 10 million, that’s a lot of self-serve, instant answers and so on. And they do this ranking by, “Can you get it answered on the first go? Do you have to come back and attack it a couple of times? How many times do you have to take a run at the chatbot before you get an answer? Or can you not get it answered?” And so that measure, the way they look at the metrics is interesting too because it’s like how many people can just get it done, right? Get it done first time, go in, put in your question, it gives you the right answer and you can say, “Thank you very much,” and walk away satisfied?
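The metrics described above reduce to simple ratios. The 10 million and 8 million figures are the illustrative numbers from the discussion (Lloyds doesn’t publish theirs), and the first-time-resolution count below is an additional assumption for the sake of the arithmetic.

```python
# Illustrative chatbot metrics. All volumes are hypothetical, per the
# discussion; 6.5M first-time resolutions is an assumed figure.
total_queries = 10_000_000
self_served   = 8_000_000   # answered by the chatbot, no agent needed
first_time    = 6_500_000   # assumption: resolved on the first attempt

containment = self_served / total_queries
fcr = first_time / self_served  # first-contact resolution among self-served

print(f"Containment rate: {containment:.0%}")
print(f"First-contact resolution: {fcr:.0%}")
```

Even at 80% containment, the remaining 2 million queries still need agents, which is why teams keep iterating on structure, taxonomy, and the retrieval model to push both numbers up.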

SO: Yeah. And I mean, we talked about this last week, but I ran into a situation where I ordered something online. And I placed one order, it had three items. Well, two of the items showed up, obviously, in separate packages as they do, right? But the first two showed up, the third one didn’t show up. And so eventually, after waiting several weeks, I reached out via their chatbot and said, “Essentially, where’s my item number three?” But where’s item number three? Or I’m missing part of my order was not actually a choice that they gave me. They had one of these, you could only click on things like my order is missing, I want a refund, this thing is defective, that type of thing. 

RB: Yeah.

SO: There was no I got two out of three or I got a partial shipment, which, given that they’re shipping everything apparently separately, seems like a use case that they would be concerned with. But in any event, I had to get myself out of the automation system and to an agent who then said, “Clearly, you’re missing this one item. It probably got lost on the floor somewhere. We will ship you another one.” Great. Now, what you don’t know, Rahel, is that this story has a part two, which is that they reordered the thing for me and shipped it to me and it arrived. And then two days later, I got a second one because apparently, they found it on the floor. And now I have two of the thing that I actually only need one of, and I still haven’t quite figured out what to do with the second one.

RB: That has happened to me.

SO: But call deflection is interesting at scale. Now, you mentioned three other things that go into this that are, I think, related to, but not core structured content. And I want to touch on those. But before I do, I’ll tell you that the second poll is in there. We’re asking about semantic content, “Do you have semantically structured content?” And it looks like 55% said yes, 22% said no, and then the rest are either, “What’s semantically structured content?” Or, “I don’t know.” So 20 or 21% are saying some variant of, “I don’t know what you’re talking about.” So you said that in order to make a chatbot work, semantically structured content, this stuff that is tagged and marked up and consistent, is a necessity. And then you said taxonomy. So talk a little bit about taxonomy. You said actually taxonomy, knowledge graphs, and RAG. So I’m going to make you go through all of them. What is taxonomy? And why do we care in the context of conversational AI?

RB: Okay. So I’m going to use recipes as an example because everyone understands recipes and we’ve all cooked. Or if we don’t like to cook, we’ve cooked at some point in our life. So we know the kind of pain that goes with searching for these things. Now, if you want to say, “I want to make a Christmas dinner,” and maybe you’re new to this country or new to this culture and you want to make a Christmas dinner, so what goes into a Christmas dinner? So you can search for recipes with the word Christmas in it, and you’ll get Christmas pudding. But you won’t get roast turkey, roast ham, Brussels sprouts, mashed potatoes, all the usual things, right? So how do you make that appear when you are looking for Christmas recipes and you’re not using a full-text search? Well, you categorize things. So you categorize things, it all comes down to metadata, right? So you don’t see the tag, but there’s a tag in there somewhere. And we’re all familiar with hashtags whether it’s on Instagram or we used to use them on Twitter. That statement, I realize, is very loaded. So we know about hashtags. So think of metadata or taxonomy as invisible hashtags. It’s a categorization. So think of a folder structure where you’ve got subfolders and sub-subfolders. So it’s very organized. And so you can have a taxonomy of people, a taxonomy of foods, a taxonomy of anything. So in recipes, you might have a taxonomy that says, “This particular one, it’s a breakfast food or it’s a soup or it’s a dinner food and it’s a soup. And the soup is an appetizer, and we usually eat this in the fall. Or this is a recipe that’s good for bulk cooking, or you can cook it on the stove or in the oven.” So you can layer all these categories on there and then people can search by one or more categories. So a taxonomy is really just categorizing things and then attaching those tags to particular pieces of content.
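The recipe example above can be sketched as faceted search over metadata. The category names and recipes here are illustrative; the point is that the match happens on tags, not on words in the title.

```python
# Faceted search over a tiny recipe taxonomy (categories are illustrative).
recipes = [
    {"title": "Christmas pudding", "holiday": "Christmas", "course": "dessert"},
    {"title": "Roast turkey",      "holiday": "Christmas", "course": "main"},
    {"title": "Pumpkin soup",      "season": "fall",       "course": "appetizer"},
]

def search(**facets):
    """Return recipes matching every requested facet (category) value."""
    return [r["title"] for r in recipes
            if all(r.get(k) == v for k, v in facets.items())]

print(search(holiday="Christmas"))
# ['Christmas pudding', 'Roast turkey'] -- the turkey never mentions "Christmas"
print(search(holiday="Christmas", course="main"))
# ['Roast turkey']
```

A full-text search for “Christmas” would only find the pudding; the metadata tag surfaces the turkey too, which is exactly the value of the invisible hashtag.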

SO: Yeah. And so to your point about Christmas dinners, that might be in a category called holidays. So you could search on holiday and get all sorts of different holidays, or you could search specifically on Christmas, which is like a subset of holiday. For most of us, I think the taxonomy that we’re most familiar with is from high school, probably biology, where you learned about classes and orders and family and kingdom and phylum. And I’ve got them in the wrong order, right? But that is a formalized taxonomy with that sort of hierarchy of classifications that go from pretty broad to more and more and more and more and more specific. So that’s taxonomy. Now, let’s talk about knowledge graphs.

RB: Okay. So knowledge graph is… And if any of my semantic professionals are listening, they’re going to probably cringe at this explanation.

SO: Yeah. Just close your ears. 

RB: Don’t come at me. So there’s ontology. So ontologies are multiple views of a taxonomy. And Theresa Wrigley once explained it beautifully. She said, “If you have a taxonomy of foods and you have lettuce in there, and then you have a taxonomy of growing conditions and there’s the growing condition for lettuce, it’s not like lettuce is two separate things. It’s one thing.” So lettuce becomes the pivot point for those two things. And so you’ve got an ontology, and a knowledge graph is an instance of an ontology. So it’s all the relationships, it’s all the categorization, but then relationships to each other. So you talked about holidays. So you could have holidays and bulk cooking, but sometimes bulk cooking isn’t for holidays. So it’s a way of disambiguating and it’s a way of making… We think of it as enrichment, but at the same time disambiguation. So one example is that there are three people named David Platt in the public eye. And one is the UK football player, one is an American software developer and author and he wrote the book, Why Software Sucks…and What You Can Do About It, and then the third one is a fictional character on Coronation Street. So if you put in David Platt, you’re going to get all three results. If you put in David Platt US, then you’ll get the software author. But if you put in David Platt UK, you still could get one of two. So if there’s some reference to sports or some reference to soap operas, you’re going to get the right David Platt because they know there’s some sort of a graph in the back, this is why we call it a knowledge graph, that connects things up. So think of it as a mind map almost, but a very complicated one. So we’ve got that same concept in just about anything we do in business. And if you’re in a relatively large organization, you’re probably going to have multiple products and different aspects of products. And is it a troubleshooting guide or a release note or who knows what, a maintenance guide? 
And it’s going to be for various products and different versions of products and maybe products that are available in certain countries, and maybe it’s in a different language, and so on. So it can get quite complicated.
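The David Platt disambiguation above can be sketched as a tiny graph of subject–predicate–object triples. The entity names and predicates are toy data modeled on the example; real knowledge graphs use standards like RDF and far richer relationships.

```python
# A knowledge graph as subject-predicate-object triples (toy data based on
# the David Platt example from the discussion).
triples = [
    ("David Platt (footballer)", "occupation", "football player"),
    ("David Platt (footballer)", "country",    "UK"),
    ("David Platt (author)",     "occupation", "software developer"),
    ("David Platt (author)",     "country",    "US"),
    ("David Platt (character)",  "appears_in", "Coronation Street"),
    ("David Platt (character)",  "country",    "UK"),
]

def disambiguate(name, **context):
    """Keep only the entities whose graph facts match the query context."""
    candidates = {s for s, _, _ in triples if s.startswith(name)}
    for pred, obj in context.items():
        candidates = {s for s in candidates if (s, pred, obj) in triples}
    return sorted(candidates)

print(disambiguate("David Platt", country="US"))
# ['David Platt (author)'] -- the US context resolves the ambiguity
print(disambiguate("David Platt", country="UK"))
# two candidates remain; a soap-opera or sports signal would narrow it further
```

Adding one more context fact, such as `appears_in="Coronation Street"`, collapses the UK case to a single entity, which is the enrichment-plus-disambiguation effect described above.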

SO: And we do have a basic, basic article on knowledge graphs, which we’ll include in the footnotes, and which I would also encourage the ontologists not to read. So then you said RAG, retrieval-augmented generation.

RB: Yes. So retrieval-augmented generation is… So there are three words there, and they each mean something. And once you string them together, they mean something bigger. So generation is generating a query, so a query response in the chat interface. So if you think about the generative AI, it mimics human language. It’s being used as a search engine, but it’s not really a search engine. It’s a way to mimic human language. So somebody says, “How do I fix my glasses?” And then it goes and it finds a response and it’ll be like, “I understand you want to fix your glasses. Which part of your glasses are broken?” Something like that. So it’s this query response in the chat interface. Then the augmentation is the pointer to some sort of restricted source. So like a particular repository or a particular source of content. And it doesn’t have to be one source. It can be multiple sources. But basically, you’re restricting that source. So you’re doing this… And I don’t know why they call it augmentation, but it’s this way. I think the augmentation is the knowledge graph. So it uses a combination of the source content plus the knowledge graph to find the right piece of content. And then the retrieval is it pulls it out and it presents it to you. “So what is my baggage allowance on this airline?” So it’s only going to look at its own baggage rules of all the airlines. It’ll be, “This is our knowledge base. This is where our information is.” The RAG model will point not only to there, but it will know what you’re talking about because you’ve said the word baggage. And they might assume that you mean carry-on. So it goes into carry-on or check bags. It goes in and finds the right article and then it presents it to you. So that’s RAG, and that’s very basic. There have been some articles. If you follow Michael Iantosca on LinkedIn, he writes about this stuff extensively. And there are various people. There is also Teodora Petkova who writes about all things semantic. 
So those two folks can give you a post-graduate certificate in that topic right there.
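The retrieve-then-generate flow described above can be sketched end to end in a few lines. Everything here is a toy: the knowledge base is two made-up baggage articles, retrieval is crude word overlap, and the “generator” is a stub where a real system would have an LLM phrase the grounded answer.

```python
# Toy RAG pipeline: Retrieve from a restricted source, then Generate an
# answer grounded in what was retrieved. All data is illustrative.
knowledge_base = {
    "carry-on baggage": "One carry-on bag up to 10 kg is included.",
    "checked baggage":  "Checked bags up to 23 kg cost extra on basic fares.",
}

def retrieve(query):
    """Return articles whose key shares a word with the query (toy scoring).
    Note: 'baggage' matches both articles; a real retriever would rank them."""
    words = set(query.lower().split())
    return [text for key, text in knowledge_base.items()
            if words & set(key.split())]

def generate(query, passages):
    """Stub generator: a real system would hand the passages to an LLM."""
    if not passages:
        return "I couldn't find that in our policies."
    return f"About your question '{query}': " + " ".join(passages)

question = "What is my carry-on baggage allowance?"
print(generate(question, retrieve(question)))
```

The key property is the restriction: the generator only ever sees passages pulled from the airline’s own knowledge base, which is what keeps the answer grounded instead of improvised.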

SO:  Okay. So we’ve talked about conversational AI and the idea that we can feed it content and that we’re going to get better results if we feed it semantically structured content. And now what I hear you saying, and I’m not saying I disagree, right? But now what I hear you saying is, “And you also need a classification system, a taxonomy. You need knowledge graphs underlying all of this, and you need retrieval-augmented generation to essentially provide the guardrails so that the generated content doesn’t just go off into some really incorrect and problematic things.” But, Rahel, this sounds very expensive, and everybody’s running into AI because their position is more or less the AI can do it, and I don’t have to do any work. So what you’re describing sounds like work. So why can’t the AI just do it?

RB: Yeah. So there’s that idea that you sprinkle a little bit of AI magic fairy dust on your content and it’s going to magically do everything and you can sit back. And CEOs love this because they just salivate at the idea of firing all the writers. And we’re already seeing some walkbacks on that where they had laid off all their content designers and now are bringing them back. So it’s as expensive as you need it to be to get the results you want. So you have to do a cost-benefit analysis. If you’re going to invest $100,000 in doing X, Y, Z with the AI and structured content, and you’re only going to see $30,000 worth of improvement, you’d have a hard time selling it to your management. But if you are looking at, “Hey, we’re going to do some sort of an analysis and we are going to really dig deep and we’re going to find out what can we do with our existing content.” And the existing content could be already lightly structured and you could say, “Let’s run some experiments and let’s figure out if our content is… Let’s call it AI readiness because that’s what our company is looking at in terms of what we offer to clients is, ‘Let’s help get your content AI-ready.'” And so AI-ready could mean a lot of things depending on what you want to do and the results that you need to get. So if you are in a regulated industry, you’re probably going to want to lean towards the more conservative side, say, “We’re going to make that investment. We’re going to structure this because it’s really important that we get out exactly the right thing.” And then there are going to be others where they go, “You know what? If it gets it right most of the time, it’s not going to-“

SO: Make or break.

RB: Yeah. “It won’t make or break. Nobody’s going to die.” It might mean that… And I’m thinking of like a hotel rental or Airbnb, that kind of thing where it’s like, “So it’s going to overlook a few rooms, and it’s not quite the business result we want to get, but nobody’s going to die.” Whereas if you’re a medical device company, you might go, “We really want to make sure that there’s accuracy around things like sterilization and maintenance of the machine and things that could cause patient danger.” So on this continuum, you have to do that analysis and then you say, “Actually, the content, the way it is, is just fine.” Or, “We are getting good results over here, but not over here. What would it take to structure it? Can we structure it at authoring? Right? Can we do some bulk structuring, like run it through a data conversion process and get the 80/20 rule and clean up the other 20% and then that’s done? Or can we do the structuring on the fly using some sort of AI that chunks the content, and so on? But that has some limitations to it.” So it’s a case-by-case basis. You have to figure out what’s going to work best. Now, I’m not going to talk about this organization. I’ll just say that they’ve got thousands of SharePoint sites.

SO: Yes.

RB: And so if you take… And I’m going to do a hypothetical. You’re onboarding a new salesperson and you say, “Go look in the folder where all the sales presentations are, and you’ll see our typical sales presentation and there’s a template there.” Now, what will have happened over the years will be they take the template, they add a few things, they change the client logo, and then they save it as another version. And then this happens 200, 300 times. So when the person goes to search, “I want to see a sales presentation,” they will get 200 correct results. Well, that’s not really helpful, right? So how do you do that? All the structure in the world isn’t going to help your accuracy unless you start curating. So there’s the curation part on the editorial side. And do you need to keep all of those? Or can you get rid of them? Can you archive them? Can you exclude them from the indexing? Can you use AI to choose either the latest one or the one with the most word count or whatever you’re going to look at? So you have to have some sort of criteria on how you’re going to go about getting the results you want and making it worth your while to get the business goals you want. So if you say, “I’ve calculated that we have 300 salespeople and they waste 15 minutes a day or an hour a day. And so now let’s multiply this out to a year.” You can come up with some shocking results and say, “Actually, if we don’t have to increase the number of salespeople, or they have more time to actually be selling instead of rooting around through SharePoint for the right thing, the right sales deck, then it’s worth it.” Right? So really, you have to do a cost-benefit analysis, I think, is the bottom line. That was a long-winded way of explaining.
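The multiply-it-out calculation above looks like this. The 300 salespeople and 15 minutes a day come from the discussion; the working days per year and loaded hourly cost are assumptions, so substitute your own figures.

```python
# Back-of-the-envelope cost of content friction. Headcount and minutes
# are from the discussion; work_days and hourly_cost are assumptions.
salespeople  = 300
minutes_lost = 15      # per person per day, searching SharePoint
work_days    = 230     # assumption
hourly_cost  = 100     # assumption: loaded cost per hour, in dollars

hours_per_year = salespeople * minutes_lost / 60 * work_days
annual_cost = hours_per_year * hourly_cost
print(f"{hours_per_year:,.0f} hours/year, roughly ${annual_cost:,.0f}")
```

Even at the conservative 15-minute figure, the annual number is large enough to anchor a business case for curation and structure.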

SO: You need a business case. And the AI can’t… I mean, the thing is people now are saying, “Well, just wait and AI will do it,” right? And I think-

RB: Maybe.

SO: And maybe will.

RB: Maybe five years from now. And do you want to wait five years?

SO: That is the question, right? Can you wait for it to get better? So first of all, for those of you on the call, if you have questions, start dropping those in because we will try and take some questions towards the end of this show. Second, we will not be providing the Urban Dictionary definitions of anything that Rahel has referred to. But if you want to go there, you are on your own. And then I wanted to talk about requirements: what does it look like to do a successful conversational AI project? You’ve talked already a little bit about curation and some of the other technologies that you can attach to that, like taxonomy and retrieval-augmented generation and knowledge graphs. What does it look like to build one of these? And what does it look like to look at the content itself and start to think about how to make it successful in a context where we’re going to use AI to retrieve the content?

RB: Okay. So if you are going to work on this, there are four stages. So one is you design and build the conversational AI. And that’s like building the foundation of the system, the structure, the UX, language capabilities for global markets, and so on. Then you need to do the testing. So you have to test it for accuracy, for efficiency, for the appropriateness across the use cases. So we didn’t really talk about use cases, but we’ve all needed to do them for various things. So just apply that to this scenario. You test the structures, the languages, and your domains, then you deploy it. So you’re deploying it once you launch, and then you look at how you integrate it with various systems and then you refine it. So I think refining is a continuous activity. So it’s never a one and done, right? So there’s always something that you have to keep looking at and keep refining. And for this, it means that you need this strong collaboration across skill sets. So if you’re going to do structured content, you’ll need some technical writers who understand how to author and curate content to be semantically structured. You’re going to need some sort of a knowledge graph engineer and they’re going to develop the knowledge graph and probably the RAG model. You’ll need probably some data scientists and analysts and they’re going to do the modeling, building, and testing of the AI software. And they might double as the person who works on the knowledge graph. Don’t know. Then you’ll have conversation designers and they’re going to create the access to the chatbot and they’re going to be in charge of the whole overall UX of the chatbot. And then you’ll have some sort of technical solutions architect and they’re going to train and fine-tune the LLM. And then you need ethics and compliance officers because you have to validate that the content complies with regulations. And that’s very important this year, particularly with the EU AI Act. 
And there are other acts that we can talk about and directives and so on. And then you’ll need some sort of a project manager who’s going to coordinate these cross-discipline teams and schedules and so on. So I would say those are the core skills that you need to work together to make this happen.
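[Editor’s note: a very simplified sketch of the retrieval piece of the pipeline Rahel describes, where semantically structured, metadata-tagged content is filtered and ranked before being handed to an LLM as grounding context. All names and data here are illustrative, not from any real system.]

```python
# Minimal sketch of retrieval over semantically structured content,
# one piece of the "design and build" stage described above.
# Chunk types, tags, and the corpus are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class Chunk:
    """A semantically structured content unit with taxonomy metadata."""
    text: str
    topic_type: str              # e.g. "task", "concept", "troubleshooting"
    tags: set = field(default_factory=set)

def retrieve(chunks, query_terms, topic_type=None, top_k=2):
    """Rank chunks by naive term overlap, optionally filtered by type."""
    candidates = [c for c in chunks if topic_type is None or c.topic_type == topic_type]
    scored = sorted(
        candidates,
        key=lambda c: len(query_terms & set(c.text.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

corpus = [
    Chunk("To reset the device, hold the power button for ten seconds.",
          "task", {"reset"}),
    Chunk("The device stores diagnostic logs for later review.",
          "concept", {"logs"}),
    Chunk("If the device will not reset, check the battery connection.",
          "troubleshooting", {"reset"}),
]

hits = retrieve(corpus, {"reset", "device"}, topic_type="troubleshooting")
context = "\n".join(c.text for c in hits)  # would be passed to the LLM as grounding
```

A production RAG system would use embeddings and a knowledge graph rather than term overlap, but the shape is the same: structure and metadata narrow the search space before the LLM ever sees the content.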

SO: Yeah. And I did want to touch on… You’re based in the EU. I’m based in the US. What is going on in the European Union with regard to AI regulation?

RB: Okay. So there are five sets of regulations that, I think, really apply in this case. And even though I am talking about the EU regulations, there are similar regulations either in force or coming into force in Canada, the US, Australia. So I only looked at the English-speaking countries because I speak English, basically. So basically, every country is starting to work on this. So there’s the EU AI Act. And that Act says that your AI has to fit certain risk levels. And there’s a high risk and a medium high risk. So let me just-

SO: Well, I know anything related to medical is considered high risk or humans. And-

RB: Yes. So the EU AI Act is saying that AI has to be safe, transparent, traceable, non-discriminatory, environmentally friendly, and overseen by people to prevent harm. So unacceptable risk is behavior manipulation, social scoring, social profiling, or collection of biometrics. You’re not allowed to do that at all. And then there’s high risk, and that includes products under various EU safety regulations or AI systems in specific areas like education, employment, law enforcement, migration, law, and so on. So that’s one side of it. And the other side is that you have to declare that AI is being used. And there have already been a couple of lawsuits in the US where they didn’t declare that it was an AI system people were interacting with. They pretended it was a human, and they lost that lawsuit. But also, you have to document anytime you’re using AI. So you have to document the AI, and even if you’re not creating the AI, you have to document that you’re using it. And so there’s a lot of documentation that nobody’s ever really paid attention to, because with Agile, it was all, “We don’t need documentation.” That was the interpretation of it, anyway. Now, it’s like, “No, you have to document it.” So in effect, we’re all affected by this regulation, because if you think about it, everything now has AI built in, right? There’s Microsoft Copilot. You might use Grammarly. Those all have AI. You use Otter.ai. You use AI within Teams. So if you’re producing a product and there’s AI involved anywhere along the line, you have to think about this. Do you comply? So that’s the EU AI Act. And other countries are developing their own versions. They don’t have them in place yet, but they’re working on them. So you have to look at your local government and see where that stands. And even if you’re in another country, if your product is used in the EU, then it affects you.
So that’s another thing to keep in mind. The second set of regulations that’s going to make this interesting is the Right to Repair Directive. The Right to Repair Directive says consumer goods have to be repairable even after the warranty has ended. So the manufacturer has to provide access to repair information, to tools and spare parts, and it’s encouraging people to repair what they have instead of throwing it away. Apple has been one of the worst offenders in that they have done everything they can to not let people repair. In fact, they created a particular type of screw that there was no screwdriver for, so that you couldn’t remove the screw from the phone. Or it was the laptop. I can’t remember. And then there’s… Kyle Wiens, what’s his…

SO: iFixit.

RB: iFixit, yeah. So they went out and manufactured a screwdriver so that people could do that. So it’s just this ongoing thing. So you have to do this for 10 years. So you have to keep 10 years’ worth of maintenance and repair and troubleshooting information for people to be able to repair their stuff. So you can imagine, after a few years, how much content you’re going to have. And if you want to serve that up automatically through a chatbot, it’s going to be like going through this landfill, right? There’s a similar thing for medical, and it’s called the MDR, the Medical Device Regulation. And basically, it’s the same, but for 15 years, and it’s for medical devices. So I think this was intended so that… You know how companies are going out of business, and then people are finding that they have these now deteriorating bits of metal in their bodies. So now you have to be able to repair them for a period of 15 years. So that’s another thing to keep in mind. So there’s that one. And then you’ve got the EU Accessibility Act and the EU plain language regulations. And we know what those are, like the Accessibility Act. You have to make your information accessible to all people, not just a subset of people. And this includes people with intellectual disabilities. So if you have government, not-for-profit services and consumer goods, then everyone has to be able to understand. You can’t hide contractual loopholes by inflating the language and making it obscure. So you’ve got that. And then plain language, again, that goes hand in hand, because that means keeping the language very clear and plain and making it accessible to people. So when you take those into account, it really does cover a lot of organizations no matter where you are in the world.

SO: Yeah. It does seem as though a lot of the regulations are in direct conflict with the sort of YOLO, just-throw-AI-at-it approach that a lot of large, well-known organizations are taking in their AI strategy.

RB: I was at a conference last year and I heard this VP of… I think he was knowledge management, and he was like, “We fired all of our translators and AI is doing it all.” And I think they were a pharmaceutical company, actually. And I just went, “They’re in the FO stage and now they’re going to FA… No, they’re in the FA stage-“

SO: No, the other way.

RB: “… and soon they’re going to FO and I’m going to be there with popcorn on the side because when it comes to pharmaceuticals, you’re not supposed to mess around.”

SO: We’re just full of Urban Dictionary references today. So a couple of questions in our very, very small amount of time. One is, are there any studies… And I sort of think the answer to this is no, but maybe you have a better idea. Are there any studies that show how much better or improved chatbot queries are when using an unstructured content repository versus a structured content repository?

RB: I don’t know of any academic papers that have been done yet on it. So everything that I’ve seen has been presentations at a conference. And that’s not necessarily academic. But I do have access to academic databases, so I can look around and see if I can find any. I think it’s still early on, but-

SO: My sense is that people are doing this work and doing the studies, but they’re not publishing. So Rahel’s example of the millions and millions of queries, they’re definitely looking at that and I think they have internal metrics.

RB: Yep.

SO: I’ve spoken to a couple of people on our podcast and also on this series who did have some in-industry information about the investments they’re making and how they’re justifying them, but I don’t think we have exactly what this question is looking for. It’s unfair, but can you touch very briefly on bias and discrimination in AI? And then I want to ask you about jobs, because that’s the thing people really care about. But bias. Say-

RB: So, bias. This is one area that is near and dear to me. Your LLM, or large language model, which is the basis of your AI, is only as good as the data it has been trained on. So we know that there’s a lot of, for example, sexism, where if you ask for a picture of a doctor, it will always show you a male, or it will always talk about doctors as males and nurses as females. And somebody tried to generate an image of a woman doctor and it gave them a male doctor with breasts. So there’s quite a strong bias. There’s also a racial bias, and that’s because it’s been trained on biased data. And there are job biases and educational biases. And somebody even said they ran an experiment where if you’re on a Zoom call with a recruiter and you have a bookshelf behind you, then you get ranked higher than if you have a plain wall behind you. So there are lots of things that we are just oblivious to because we don’t know that they exist, but they’re there. So you have to always check. And this goes into ethics. So I think AI ethics is so huge, and nobody wants to spend that money because it’s just ethics. But it’s so important because that’s what is going to trip somebody up and get them sued, right? So if your organization is all worried about risk management, then you have to think about not just where the biases might be, but then the ethics of doing things in a certain way and how to correct the bias. So that’s what I would say is my very short answer.
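[Editor’s note: the “always check” advice Rahel gives can start very simply. Here is a toy sketch of one such check, tallying gendered terms in a batch of model outputs for a role prompt. The sample completions are hard-coded stand-ins, not real LLM output.]

```python
# Toy bias check: count gendered words across generated completions
# for a role prompt such as "Describe a doctor at work."
# Word lists and sample outputs are illustrative only.
from collections import Counter

MALE = {"he", "him", "his", "man", "male"}
FEMALE = {"she", "her", "hers", "woman", "female"}

def gender_counts(outputs):
    """Tally gendered words across a batch of generated texts."""
    counts = Counter()
    for text in outputs:
        for word in text.lower().split():
            if word in MALE:
                counts["male"] += 1
            elif word in FEMALE:
                counts["female"] += 1
    return counts

# Stand-in completions for the prompt "Describe a doctor at work."
sample_outputs = [
    "He reviewed his patient charts before rounds.",
    "The doctor said he would order more tests.",
    "She explained the diagnosis to her patient.",
]

counts = gender_counts(sample_outputs)
skew = counts["male"] - counts["female"]  # a large positive skew flags male bias
```

Real bias audits use much larger prompt sets and statistical tests, but even a crude tally like this makes a skew visible instead of leaving it to intuition.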

SO: Yeah. So maybe that’s an entire other hour-long discussion. I did want to wrap up with one last question, which is around jobs and careers. What’s your sort of big picture advice for people that are maybe just coming into content and content creation in the content industry as we’re dealing with AI coming in and being this new transformative thing? I mean, what people really want to know is, are they going to lose their jobs? “Am I going to lose my job? What is my job going to look like?” What do you think?

RB: Oh, goodness. That’s such a loaded question because, number one, everybody’s trying to get rid of headcount. I just read yesterday that there are a couple of big companies, very, very big companies, I can’t remember which ones, but they’re saying that they’re no longer going to hire mid-level software developers because AI is going to do a lot of their job. So it’s like, “So how do you get to be a senior if you can never be a mid-level developer?” And we’ve been seeing this already in content. And as I said, there were these mass layoffs in content design and in writing because AI was going to do it. And then they discovered, “AI does a really terrible job. So we have to start bringing people back on board.” I think that the people who are informed, who understand the technical side of content or the semantic side of content as well as the editorial side, are going to definitely be at an advantage. I think if you are, I’m going to say, the old-fashioned type of copywriter who just thinks about the beauty of the words or the crafting of the message, then you’re going to be at a disadvantage. But the more you can understand about how to put metadata on your content, how to write with AI in mind, how to take into account writing that won’t feed into an LLM’s bias, and so on, that’s going to give you an advantage. And I think it’s a moving target. So ask me again next year; we might have a different answer.

SO: Yeah. So it’ll be interesting. So we’ll do this again next year and see where we are. I’m going to wrap it up. Rahel, thank you so much for a whole bunch of really interesting comments and a bunch of things to think about as we go forward. And, Christine, I’m going to throw it back to you.

CC: Hey, everyone. Thank you so much for being here on the show. And, Rahel, excuse me, sorry about that, thank you so much for joining us today. For all the attendees watching this webinar, if you can rate and provide feedback, again, that’s really helpful for us to know what other topics you’re looking for and interested in. It’s really helpful for us. Also, our next show is March 12th. That’s going to be featuring Scott Abel, who is the owner of The Content Wrangler. He’s going to be talking about transforming the future of content ops in the age of AI. So, again, that is March 12th. So be sure to save the date for that. And thank you so much. We’ll see you next time.