Machine experience (MX): Making content work for humans and machines
Subscribe: Apple Podcasts | Spotify | Amazon Music | Email | TuneIn | RSS
Your website may look great to humans, but can machines understand it? In this episode, Sarah O’Keefe (Scriptorium) and Tom Cranstoun (Digital Domain Technologies) explore the emerging discipline of machine experience (MX). Sarah and Tom discuss what AI agents actually encounter when they visit your web pages, why microdata and metadata are critical, and what content creators must do to ensure content is consumable for both human and machine audiences.
Tom Cranstoun: Humans are looking for pictures, they’re looking for text, and they can infer. You may think, “Well, we’ve already added information on the page,” but by putting it in as microdata, it doesn’t appear on the page for the humans. It appears on the page for the machine. I think that that’s a critical distinction. We are trying to design for both. We don’t want to overload a human with information, but we do want to give the machine as much information as it can take.
Related links:
LinkedIn:
- Host: Sarah O’Keefe
- Guest: Tom Cranstoun
Transcript:
Disclaimer: This is a machine-generated transcript with edits.
Introduction with ambient background music
Christine Cuellar: From Scriptorium, this is Content Operations, a show that delivers industry-leading insights for global organizations.
Bill Swallow: In the end, you have a unified experience so that people aren’t relearning how to engage with your content in every context you produce it.
Sarah O’Keefe: Change is perceived as being risky; you have to convince me that making the change is less risky than not making the change.
Alan Pringle: And at some point, you are going to have tools, technology, and processes that no longer support your needs, so if you think about that ahead of time, you’re going to be much better off.
End of introduction
Sarah O’Keefe: Hey, everyone. I’m Sarah O’Keefe. Today, our guest is Tom Cranstoun, who is founder of a machine experience, or MX, community called The Gathering. He has a couple of books on MX and is currently a consultant operating as Digital Domain Technologies. After 53 years in the business, including AEM work at very, very large companies and a huge project at Nissan, Tom has turned his attention to the question of how machines, which is to say AI agents, interoperate with the current public-facing web. So today, Tom, I’m delighted to have you on to talk about machine experience, or MX, and what it all means as we move forward in this brave new AI world. So welcome.
Tom Cranstoun: Thank you, Sarah. I’m very pleased to be with you today.
SO: I am delighted to have you. So I guess we’ll start with the extreme basics here, which is what is machine experience, or MX?
TC: Yeah. MX, well, by my definition, machine experience is like user experience, but for machines. Machines cannot ask a friend for help if something goes wrong when they’re browsing a website. They can’t turn to a partner and say, “What do you think this means?” They can’t retry a failing form input; they just go through the same mechanical patterns and try to carry on through their web journeys. Machine experience, therefore, is thinking about what elements you must put on a webpage to help a machine understand and action the final goal of that page, whether that’s a CTA that lets you purchase something, an information document that tells you about a government policy, or a charitable cause, whatever the author of the page is trying to get across to the audience.
SO: And so at a high level, what does it look like to build out machine experience? What are some examples of things that you need to put onto a webpage to accommodate the machine that’s reading it?
TC: Well, the very first level is the disabilities angle: the Americans with Disabilities Act, WCAG (W-C-A-G), the accessibility work. The more accessibility information is on the page, the more the machine can understand the background of the page. So machine experience and accessibility are, at the top level, pretty much the same sort of thing. If you put in JSON-LD and microdata, and you enrich your pages with the things that the Americans with Disabilities Act asks for, you’re actually helping a machine understand the page. So that is the top-level constraint. When you go below that level, you need to give the machine lots of information about your product, not just the thing a human wants when they’re glancing at the page now; as you go through the journeys, things get added on. Humans can only take in two or three items at a time, so we design pages to reveal information step by step. You go from a catalog, to a product, to a variation, to a purchase: four different steps, and each step introduces different pricing and concepts. It’s best to feed the machine, on the page that the machine lands on, with all of the information that it needs. This may not necessarily be surfaced to the human reading the page, but it’s there for the machine. This helps the machine when it arrives at your webpage.
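To make that layering concrete, here is a minimal sketch; the product, prices, and attribute values are hypothetical, but the pattern is standard HTML with schema.org microdata. The visible text, alt text, and ARIA labeling serve humans and assistive technology, while the itemprop values serve the machine without changing what a human sees.

    <!-- A hypothetical product block. The heading, alt text, and visible -->
    <!-- text serve humans and assistive technology; the itemprop values -->
    <!-- serve the machine, without changing what a human sees. -->
    <section itemscope itemtype="https://schema.org/Product" aria-labelledby="prod-title">
      <h2 id="prod-title" itemprop="name">Trailhead 40L Backpack</h2>
      <img src="backpack.jpg" itemprop="image"
           alt="Gray 40-liter hiking backpack with a padded hip belt">
      <p itemprop="description">A lightweight pack for multi-day hikes.</p>
      <div itemprop="offers" itemscope itemtype="https://schema.org/Offer">
        <span>$180</span>
        <meta itemprop="price" content="180.00">
        <meta itemprop="priceCurrency" content="USD">
      </div>
    </section>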
SO: So I’m really enjoying this concept that a properly organized page, with proper accessibility (WCAG or ADA) compliance and support, results in the machine being better able to parse the page for essentially the same reason, right? It’s properly structured, it’s predictable, and the things that are labeled are labeled correctly. I don’t know that we should be driving accessibility in order to enable AI, but on the other hand, if it gets us more accessible pages, then let’s certainly do that. Can you give some examples of what happens when pages are not machine-compatible? What are the kinds of problems that people run… Or not people. What are the kinds of problems that the AIs run into when they try to parse a page that has not been labeled or encoded properly?
TC: Yeah, I collect these examples from real life. Whenever I use the web as a normal person, I ask, “Well, how would a machine interpret this?” Recently, I was looking for a holiday, and I asked an LLM to give me a list of five companies that offer cruises up the Mekong Delta. The machine came back with one offer at $200,000 for a week’s holiday, and the rest of them at $2,000 for a week’s holiday. What had happened was that the machine had found a European website. Europeans use the comma and the dot in monetary amounts differently from Anglo-Americans. We use a comma as the thousands separator and a full stop before the fraction. Europeans put the full stop as the thousands separator and a comma before the fraction. This meant that when the LLM built a table of holiday prices, it didn’t understand the distinction, and it tripped up. The agent hadn’t been instructed to compare prices and make sure they were all within the same range and were reasonable. It just presented them as fact: “Here’s a holiday for you. One of them is $200,000. The rest of them are 2,000.” There was no knowledge, no information, that could tell the agent what was happening. If those pages had been decorated with currency microdata (which has a fixed convention: the full stop is the fractional separator), the machine wouldn’t have tripped up. A human reading the page would have seen the locale cues on the page and understood what was going on. So that’s a typical trip-up from an undecorated page.
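For illustration, here is a sketch of price markup that would have avoided the mixup; the offer and amounts are invented. The human sees the European format, while hidden meta values carry the machine-readable amount in the form schema.org expects, with a full stop as the decimal separator and no grouping symbol.

    <!-- The visible price keeps the European format for the human reader. -->
    <!-- The meta values carry the unambiguous machine-readable amount. -->
    <p itemscope itemtype="https://schema.org/Offer">
      Mekong cruise, one week: <span>€2.000,00</span>
      <meta itemprop="price" content="2000.00">
      <meta itemprop="priceCurrency" content="EUR">
    </p>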
SO: And so essentially, the presentational component that says, because I’m serving this page to somebody in, for example, Germany, they are expecting a comma separator between the whole euro amount and the euro cents. But that comma is essentially formatting, as opposed to data, and so here we are.
TC: Yes, correct. And the microdata has the value in a proper machine-readable form. The other thing that always causes problems is English and American date formats: we swap the day and month around in the short form. The machine-readable version uses ISO dates, and an ISO date put in as microdata tells the machine categorically: it doesn’t matter what the locale is, this is the date and time.
SO: Yeah. And so the expression of the date, whether April 1st is 1-4 or 4-1, is essentially a formatting problem.
TC: Correct. And these are not visibility problems; these are machine experience problems. So it’s layering up: you start by fixing accessibility, and then you fix the locale and the community values, the human factors, the display factors.
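A quick sketch of the date case, with invented dates: HTML’s time element lets the visible text follow either locale while the datetime attribute carries the ISO 8601 value.

    <!-- Two locale-specific displays of the same date. The datetime -->
    <!-- attribute carries the ISO 8601 value, so the machine reads -->
    <!-- 1 April 2026 in both cases, whatever the visible format. -->
    <p>UK display: offer ends <time datetime="2026-04-01">01/04/2026</time></p>
    <p>US display: offer ends <time datetime="2026-04-01">04/01/2026</time></p>

Either way, the machine reads the same unambiguous date.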
SO: And so I think we’re all familiar with the concept of a customer journey, but you’re now talking about a machine journey, an MX journey. What does that look like? I mean, how does a machine process a website? How do you explore that journey and what it looks like?
TC: The machines will not discover your website, come in through your landing page, and then look for offers or products. A machine will have an idea of where it wants to go and will land straight in at a page. It will arrive five pages into your journey and read the webpage as it is. The owner of the website has lost all of the signals: what the dwell time was on each page, how the reader arrived at the end location, whether they went sideways and looked at other things. Those things don’t happen with machines. They go straight in and see if they can get what they need. If they can, they will action it. If they can’t, they will move on to another page or another person’s website and do exactly the same there.
So when a machine arrives at your webpage, it will not give you any referral details. It will not tell you what its journey was, and it won’t tell you what else it’s interested in. You just get a cold caller who arrives and disappears. I call them invisible users. They’re invisible to your analytics, they’re invisible to your tracking, and they’re invisible to your future. You cannot tickle them and say, “Hey, you left something in the basket.” You cannot use those parts of the journey. A machine comes in, gets what it wants or it doesn’t, and goes. So you must front-load as much information as possible on any and every page that a machine may land on.
SO: So then coming at this from the perspective of structured content people, because a lot of what you’re talking about is web experience: the end-state result of the content that we’re creating. So if I have an enormous DITA CCMS full of stuff and then I output it to some semblance of a website, your focus is on what needs to be on that website so that it describes itself in such a way that the machine, an AI or a crawler, can go in there, pick up what it needs, and process it accurately, and not offer you a vacation for $200,000. I assume you did not pick that one. So what are the opportunities? When you look at MX with DITA as a backend, what kinds of opportunities do you see to map those things across and take advantage of the structure that is perhaps already in the XML and/or structured content systems?
TC: Yeah, I see the backend is full of good content operations stuff. Everybody has details about pricing and dates and frequency, and there’s lots of backend information, which often doesn’t make it into the front end for people. Humans are looking for pictures, they’re looking for text, and they can infer. They can infer when two prices are on a page and it says, “Was $200, Now $180.” A human understands that. A machine? Well, it depends on the quality of the machine, whether it can read and infer those things. So the backend information has to be made more visible, and in a redundant manner. You may think, well, we’ve shown this on the page before, and we’ll show it on the page after. But by putting it in as microdata, it doesn’t appear on the page for the humans; it appears on the page for the machine. I think that’s a critical distinction. We are trying to design for both. We don’t want to overload a human with information, but we do want to give the machine as much information as it can take. We don’t necessarily have to surface all of the information within the page, but we have to carry it with the page. So a page taken in isolation contains the entire story, not just the fraction that a human is looking at. That does mean pushing a lot more of the backend data to the front end. Some people will think that’s a waste of time, but I don’t: giving that extra material to the machine is what makes the journey successful for the machine.
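As a rough sketch of carrying the whole story with the page (the product and prices are hypothetical): the human sees the Was/Now pair, and a JSON-LD block spells out which number is the price you would actually pay, so the machine doesn’t have to infer it.

    <!-- The human-facing text keeps the "Was/Now" framing; the JSON-LD -->
    <!-- states unambiguously which number is the current price. -->
    <p>Was $200, Now $180</p>
    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Product",
      "name": "Trailhead 40L Backpack",
      "offers": {
        "@type": "Offer",
        "price": "180.00",
        "priceCurrency": "USD"
      }
    }
    </script>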
SO: I’ve been to a lot of conferences in the past couple of weeks, and the conversation around what is needed for successful LLM processing, crawling, or ingestion, or for agents for that matter, versus what is already provided in a metadata-rich structured content system, is sort of: well, we have all of this. Now, what do we do with it, where do we put it, and how do we make sure that this all works? So it seems like this discussion around machine experience is going to help close that gap and connect the pieces so that we can do this successfully.
And so as we move into this, I know that you have some material out there, but also a community. Can you talk a little bit about the MX community, and what you’re looking for there, and what it’s called? We will put all of the links in the show notes. But what does it look like to participate in that community, and what sort of participants are you looking for?
TC: Yeah, we are looking for content creators. We are looking for business owners. We are looking for technical writers. It’s called The Gathering, a gathering being a Scottish term for the gathering of the clans: we all get together to do something that’s good for the combined grouping, and after we’ve created whatever we’re going to create, we go away and do our own things. The Gathering is at tg.community. That’s https://tg.community. We are building a set of community-led standards to try and make it easier for machines to understand documents. The Gathering is not just interested in HTML. We’re talking about documents of all types, and we’re talking about keeping the metadata that you have in the backend of the content creation systems, whether that be DITA or other content creation systems, and passing it through into the end documents.
You have metadata in PowerPoint slides. You have metadata in Word documents. You have metadata in JPEGs. These, too, deserve the machine experience. If you can tell the machine details about an image inside a JPEG, then the machine doesn’t have to try to scan and interpret the image to find out what it is. It makes things so much better. The Gathering is a community that is trying to build these as open, community-led standards. The community just launched on the 2nd of April, 2026, by the way, so it’s very young, and we hope to build at the speed of LLMs. We need to work fast.
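Picking up the image point: on the web side, the equivalent might look like this sketch, with an invented image and description. Writing the same description into the JPEG’s own EXIF or XMP fields, with a tool such as exiftool, extends the idea to the file itself.

    <!-- A hypothetical figure. The alt text and ImageObject metadata tell -->
    <!-- the machine what the picture shows, so it doesn't have to scan -->
    <!-- and interpret the pixels. -->
    <figure itemscope itemtype="https://schema.org/ImageObject">
      <img src="weld-line.jpg" itemprop="contentUrl"
           alt="Robotic arms welding a car chassis on an assembly line">
      <figcaption itemprop="description">
        Robotic arms welding a car chassis on the assembly line.
      </figcaption>
    </figure>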
One of the first things I am proposing for the community concerns the key thing that helps LLMs understand your website: llms.txt, which people don’t really understand and machines don’t really use. It’s a standard for describing your website in a way that helps a machine know what’s going on without reading your sitemap. It is not used by the machines because, one, it’s not served as HTML, and, two, it’s not in your sitemap. Therefore, the crawlers that build training material do not pick it up and do not ingest it. I am suggesting, and I talk about this in my books, that if you wrap the llms.txt in HTML, serve it as HTML, and put it in your sitemap, then you will get a better response at both the training stage and the inference stage. You are seeding the machines with information about your website, something that is currently missing from the world. That’s step one. There are five steps you’ve got to go through before you can have a successful e-commerce position: feed the machine, get noticed, be descriptive, be MX-aware, and be citable. MX lets all of those things happen.
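As a hedged sketch of the wrapping Tom proposes, with a placeholder site and wording: serve the llms.txt description as an ordinary HTML page, then list that page in the sitemap so the crawlers actually fetch it.

    <!-- llms.html: the llms.txt content, wrapped in HTML so crawlers -->
    <!-- that skip plain text files will fetch and ingest it. -->
    <!DOCTYPE html>
    <html lang="en">
      <head><title>example.com for AI agents</title></head>
      <body>
        <h1>example.com</h1>
        <p>What this site offers, and where the key pages live.</p>
        <ul>
          <li><a href="https://example.com/products/">Products</a>: full catalog with prices</li>
          <li><a href="https://example.com/docs/">Docs</a>: product documentation</li>
        </ul>
      </body>
    </html>

    <!-- And the matching entry inside sitemap.xml's <urlset>, so the -->
    <!-- crawlers actually find the page: -->
    <url>
      <loc>https://example.com/llms.html</loc>
    </url>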
SO: Perfect. Well, Tom, I know there’s a lot to discuss here, and we could go on for a very long time, but I hope this gives people a bit of an introduction to this idea and an opportunity, if they’re interested, to reach out to you and to the community. And there’s also a book or three. Any closing thoughts you want to pass on before we close this out?
TC: My personal opinion is that we should treat the machines as first-class citizens: don’t block them from our content, and create content that works for them. The more we do for them, the more they will do for us. If we treat them as an afterthought, the web won’t be as good as it could have been.
SO: Okay. Well, thank you so much. I’m glad we had an opportunity to talk.
And we will, again, put the links to the various resources that Tom mentioned, including the community. There are some RFCs, some standards drafts, a manifesto, and a book. We will put all of that in the show notes. So Tom, thank you again for being here, and I look forward to hearing more about this effort.
TC: Thank you very much, Sarah.
Conclusion with ambient background music
CC: Thank you for listening to Content Operations by Scriptorium. For more information, visit Scriptorium.com or check the show notes for relevant links.
