Replatforming structured content
Scriptorium is doing a lot of replatforming projects. We have customers with existing structured content—custom XML, DocBook, and DITA—who need to move their content operations from their existing CCMS to a new system.
These transitions, even DITA to DITA, require a solid business justification. Replatforming structured content is annoying and expensive. Most often, the organization’s needs have changed, and the current platform is no longer a good fit.
Note: This post focuses on transitions into DITA. There are surely DITA to not-DITA projects out there, but they are not in our current portfolio.
Custom XML to DITA considerations
Custom XML refers to a content model that was purpose-built starting from a non-DITA baseline. Customizations of DocBook are common, but you also see other standards and XML built out from scratch.
The first 80% or so will be easy. Most content models have an element for block paragraphs, so you map <para> or <paragraph> to <p>. A <warning> becomes a <note type="warning">, or you can specialize to create a DITA <warning> element.
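The easy part of that mapping can be sketched in a few lines of Python. This is a minimal illustration, not a conversion tool: the TAG_MAP table and the to_dita function are hypothetical names, and the sample legacy markup is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical legacy-to-DITA tag map; a real project needs its own table.
TAG_MAP = {"para": "p", "paragraph": "p"}


def to_dita(elem):
    """Return a converted copy of elem, renaming tags per TAG_MAP."""
    if elem.tag == "warning":
        # <warning> becomes <note type="warning">; legacy attributes on
        # <warning> are dropped in this sketch.
        new = ET.Element("note", {"type": "warning"})
    else:
        # Tags that are not in TAG_MAP pass through unchanged.
        new = ET.Element(TAG_MAP.get(elem.tag, elem.tag), dict(elem.attrib))
    new.text = elem.text
    new.tail = elem.tail
    for child in elem:
        new.append(to_dita(child))
    return new


legacy = ET.fromstring(
    "<section><para>Unplug the unit.</para>"
    "<warning><paragraph>Risk of shock.</paragraph></warning></section>"
)
print(ET.tostring(to_dita(legacy), encoding="unicode"))
```

Anything the table does not cover passes through untouched, and that is usually where the hard remainder of the conversion shows up.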
As always, the last 20% will be challenging. Typical problem areas are:
- Links, especially cross-document links
- Reused content
Publishing pipelines based on a custom XML model will have to be rebuilt for DITA content. There is a sliver of hope that you might be able to reuse some CSS-type logic.
DITA to DITA considerations
If you already have DITA content, replatforming your content should be easier. There are still a few things to keep in mind:
- Some DITA-based CCMSs extend DITA with proprietary features. If you are moving out of one of these CCMSs, you need to remap the proprietary markup onto standard DITA.
- Consider whether you want to update your content model as part of the replatforming effort. If you built on an older version of DITA, you have an opportunity to refine your content model: DITA 1.3 and the upcoming DITA 2.0 contain numerous useful enhancements.
- A content audit is helpful in understanding how the content model is being used in production. You may find that different authors interpreted guidance differently. A replatforming project gives you a chance to do some spring cleaning on your markup.
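A first pass at a content audit can be as simple as counting which elements and attribute values actually appear. The sketch below assumes the topics are already available as XML strings (in practice you would read them from your CCMS export). The audit function and the sample topics are made up for illustration.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def audit(xml_strings):
    """Count element names and (attribute, value) pairs across documents."""
    elements, attributes = Counter(), Counter()
    for source in xml_strings:
        for elem in ET.fromstring(source).iter():
            elements[elem.tag] += 1
            for name, value in elem.attrib.items():
                attributes[(name, value)] += 1
    return elements, attributes


# Two invented topics in which authors tagged the same audience differently.
topics = [
    '<topic id="a"><title>A</title><body><p audience="admin">x</p></body></topic>',
    '<topic id="b"><title>B</title><body><p audience="Administrator">y</p></body></topic>',
]
elements, attributes = audit(topics)
for (name, value), count in sorted(attributes.items()):
    print(name, value, count)
```

Inconsistent values such as "admin" and "Administrator" are exactly the kind of thing the spring cleaning should catch.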
Variables
DITA 1.3 uses keys for variables. Keys were introduced in DITA 1.2, so if your content model predates that version, you may need to modify how variables are set up.
Additionally, some DITA CCMSs have proprietary variable features. If you are moving content out of one of these CCMSs, you need to map the proprietary variables over to something that your new CCMS supports.
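One way to approach that mapping is to export the variables as name/value pairs and generate key definitions from them. The sketch below assumes you can get the variables into a plain dictionary. The build_keydefs function and the sample values are hypothetical, and the export step from the old CCMS is not shown.

```python
import xml.etree.ElementTree as ET


def build_keydefs(variables):
    """Build a DITA <map> with one <keydef> per variable name/value pair."""
    ditamap = ET.Element("map")
    for name, value in variables.items():
        keydef = ET.SubElement(ditamap, "keydef", {"keys": name})
        topicmeta = ET.SubElement(keydef, "topicmeta")
        keywords = ET.SubElement(topicmeta, "keywords")
        ET.SubElement(keywords, "keyword").text = value
    return ditamap


# Invented variables standing in for an export from the old system.
legacy_variables = {"product-name": "WidgetPro", "release": "2.0"}
key_map = build_keydefs(legacy_variables)
print(ET.tostring(key_map, encoding="unicode"))
```

Topics would then pull in the value with a key reference such as <ph keyref="product-name"/>. Variable names from the old system may need cleanup first, because key names cannot contain spaces.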
Conditionals
DITA uses attributes to identify conditional information, and ditaval files to specify how to process conditional tags. Most likely, your conditionals inside topics use a similar approach, even in custom XML or DocBook. The biggest challenge is extracting your conditional processing logic from the legacy system and moving it into ditaval files (or your new CCMS’s proprietary conditional logic).
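Once you have recovered the legacy rules, writing them out as a ditaval file is the mechanical part. The sketch below assumes the rules have already been extracted as (attribute, value, action) tuples. The build_ditaval function and the sample rules are hypothetical.

```python
import xml.etree.ElementTree as ET

# The four actions defined for DITAVAL <prop> elements.
VALID_ACTIONS = {"include", "exclude", "passthrough", "flag"}


def build_ditaval(rules):
    """Build a <val> element from (attribute, value, action) tuples."""
    val = ET.Element("val")
    for att, value, action in rules:
        if action not in VALID_ACTIONS:
            raise ValueError(f"unsupported action: {action}")
        ET.SubElement(val, "prop", {"att": att, "val": value, "action": action})
    return val


# Invented rules standing in for a legacy publishing profile.
operator_guide = build_ditaval([
    ("audience", "technician", "exclude"),
    ("audience", "operator", "include"),
])
print(ET.tostring(operator_guide, encoding="unicode"))
```

The hard part stays upstream: finding out where the old system keeps these rules and which deliverables depend on them.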
Metadata
DITA provides metadata at several levels: deliverable, topic, and element. Most organizations customize metadata values. For example, the audience attribute might allow for “user” and “system_administrator” in software documentation, but a hardware company needs “operator” and “technician.”
Aside from customizing metadata, the biggest challenge is deciding where the topic-level and deliverable-level metadata is stored. Most CCMSs provide a layer of metadata that you can store in the system, but you also have the option of putting metadata into the DITA files. Each approach has advantages and disadvantages, and when you replatform, the calculations change from one system to the next.
Links and URLs
If I could wave a magic wand and eliminate one replatforming challenge, this would be my choice. Links and URLs inflict a special kind of pain.
First, let’s talk about URLs. When you publish content out of a CCMS, you are going to get either:
- Semantic filenames, such as productX/subsystemY/replacing-the-battery.html
- Filenames with unique IDs, such as 431543531.html
If the old and new systems use different approaches, every published URL changes. But even if both of your CCMSs use the same approach, expect trouble. Just because both systems use unique IDs doesn’t mean that your “old” IDs will carry over to the new system.
So in addition to replatforming the CCMS, you have to think about the downstream implications when you change how you publish content.
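One common mitigation is a redirect map from old published URLs to new ones. The sketch below assumes each topic has some stable key that survives the migration. That is a big assumption, and building that key is often the real work. The build_redirects function, the keys, and the /docs/ paths are invented for illustration.

```python
def build_redirects(old_urls, new_urls):
    """Pair old and new URLs by a shared topic key and report the gaps."""
    redirects = {}
    unmatched = []
    for topic_key, old_url in old_urls.items():
        new_url = new_urls.get(topic_key)
        if new_url is None:
            # No counterpart in the new system; needs a manual decision.
            unmatched.append(topic_key)
        elif new_url != old_url:
            redirects[old_url] = new_url
    return redirects, sorted(unmatched)


# Hypothetical URL inventories keyed by a stable topic identifier.
old_urls = {
    "replacing-the-battery": "/docs/431543531.html",
    "charging": "/docs/431543532.html",
}
new_urls = {
    "replacing-the-battery": "/productX/subsystemY/replacing-the-battery.html",
}
redirects, unmatched = build_redirects(old_urls, new_urls)
print(redirects)
print(unmatched)
```

The unmatched list matters as much as the redirects. Those are the pages where inbound links and bookmarks will break unless someone decides where they should land.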
Second, you have links of at least three different types:
- Local: links from A to B where A and B are part of a single deliverable.
- Document-to-document: C links to D where C and D are different deliverables, but part of your document set.
- External: E links to F. F is a resource somewhere in the world that you do not own.
Each of these link types requires its own handling when you transfer content into the new system.
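Before deciding how to handle each type, it helps to know how many of each you have. The sketch below sorts href values into the three buckets above. It assumes you know which files belong to the deliverable being checked. The classify_link function and the sample filenames are hypothetical.

```python
from urllib.parse import urlparse

# URL schemes treated as external in this sketch; adjust for your content.
EXTERNAL_SCHEMES = {"http", "https", "mailto"}


def classify_link(href, deliverable_files):
    """Return 'external', 'local', or 'document-to-document' for an href."""
    if urlparse(href).scheme in EXTERNAL_SCHEMES:
        return "external"
    # Drop the fragment (the part after '#') to get the target file.
    target = href.split("#", 1)[0]
    if target == "" or target in deliverable_files:
        return "local"
    return "document-to-document"


# Invented file set for a single deliverable.
user_guide = {"battery.dita", "charging.dita"}
for href in ["battery.dita#battery/steps",
             "../install-guide/setup.dita",
             "https://example.com/spec"]:
    print(href, classify_link(href, user_guide))
```

Run over a full export, a tally like this shows early whether document-to-document links are a handful or a few thousand, which changes the plan considerably.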
Versioning, baselines, and branching
One of the core features of a CCMS is storing multiple versions of the same document. With version control, you can go back and look at a particular document at any point in time. The replatforming challenges occur when your versioning gets complex. For example:
- Baselines: You want to label a particular set of files as the “released” version or as the official version 1.2. When you replatform, do you keep all of the released versions or just the most recent version?
- Branching: You have a product that is “live” in multiple versions. Some customers have version 1.2 and some customers have version 2.0. Branching allows you to manage the different versions without duplicating the entire file set. But branching features are different in every CCMS, so you may need to change your approach when you replatform.
Publishing
If you’re coming from a non-DITA content model, replatforming probably requires a complete rebuild of your publishing pipelines. If your current platform is DITA-based, your transition will be more nuanced. Some items to consider:
- If you are changing the DITA content model, you’ll also have to update the publishing pipelines.
- If you are using an older version of the DITA Open Toolkit, you may want to upgrade your pipelines to a more current version.
- If you are using a proprietary CCMS-based publishing pipeline, replatforming means that you have to replace that pipeline in the new CCMS.
Content as a service
One big driver of replatforming projects is the need for content as a service (CaaS). If you need CaaS, you have to connect your content to the downstream content requestors.
Best practices for replatforming
- Give yourself plenty of time. Set up phases for the project.
- Separate the replatforming (systems) effort from the content modeling updates.
- Identify pain points in the current system and make the new system better.
- Avoid the “burning platform” problem. (“We need to finish before December 31 because that’s when our current CCMS maintenance contract expires.”) The pain of paying maintenance for an extra quarter or even a year is less than the pain of going live with a system that isn’t ready.
Need help replatforming structured content? Contact us.