The perversion of indexes

Sarah O'Keefe / Opinion

In addition to mixed-column layouts and copyfitting, the shift from desktop publishing to structured authoring may result in the demise of the traditional index.

In the past ten years, the percentage of technical communication that crosses my desk without an index has definitely increased. Fewer people are indexing technical documentation because:

  • Indexing is hard.
  • Many technical writers do not enjoy creating an index.
  • The acceleration of content production schedules (and the elimination of actual printing) has removed a lot of slack time where indexing used to take place.

There are also a couple of additional factors at play:

  • Modular, reusable content requires much stricter adherence to style guidelines in the index and more index editing to ensure that the document compilations have coherent indexes.
  • Technical communication groups have fewer resources, and indexing is seen as a luxury.
  • There is a perception that full-text search is a reasonable equivalent* to a crafted index.
  • Creating index entries in XML editors is less pleasant than you might expect. (And I don’t expect much.)

This trend can go two ways:

  • Indexing rates continue to decline, and indexing eventually becomes a lost art, just like production editing.
  • Software vendors develop better tools, technologies, and architectures for index creation. For example, it would be nice to be able to highlight a word and have the authoring tool automagically create an index entry for that word.

I’m betting on the first option.

* It’s not.

About the Author

Sarah O'Keefe


Content strategy consultant and founder of Scriptorium Publishing. Bilingual English-German, voracious reader, water sports, knitting, and college basketball (go Blue Devils!). Aversions to raw tomatoes, eggplant, and checked baggage.

15 Comments on “The perversion of indexes”

  1. I tend to agree with you, Sarah. Indexing will probably decrease, but maybe metadata will increase, which may improve findability in content or be used to generate some sort of index automagically.

    I can’t speak for other XML technologies, but DITA does allow for indexing in the DITA maps, which allows the index to be more contextual, thus eliminating the problem of indexing in a vacuum. Add to that the built-in reuse capabilities, and you could build an index for a deliverable from a central index to make it easier to provide consistency and adherence to guidelines.

    But, like you, I think we’ll be seeing less indexing in the future.

  2. That’s a good point, Julio. As you suggest, an index is a publication-level component, not a topic-level one, and in that sense it does stand to reason that the indexing should be done at the publication level.

    However, I’m curious how this approach might work in a highly reusable environment. If I’m using a topic in multiple publications, but the index entries are placed in the publication-level map, then won’t I have to replicate the index entries separately for each publication? I could add them once in one map, and then (I think) conref them as needed in all the other maps, but then how do I effectively maintain those index entries over time without suffering from “conref abuse”?

    I like your idea of using a central index-only map. However, I could also see how such an approach could become unwieldy over time if you have lots of publications and lots of reuse.

  3. I agree – indexing as we know it is on the wane. I hope that Julio is right about metadata fulfilling the same need.

    The hard part of indexing is also the important part: People don’t always guess the word you use for a given concept. That’s the “search isn’t a substitute” part.

    With an index, you can reach out to your readers with *their* words, and guide them to your own words. If you’re writing about hiring people and you always call them “applicants”, people won’t find what they need by searching for the word “candidates” – but your index can include an entry that says “candidates – see applicants.”

    You can do the same thing with metadata. Let’s teach people that they *should*.
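The “candidates – see applicants” idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SEE_REFERENCES` and `INDEX` tables, the topic titles, and the `look_up` function are hypothetical names invented here, not part of any authoring tool or of the commenter's setup.

```python
# Hypothetical "see" references: the reader's word -> the author's preferred term.
SEE_REFERENCES = {"candidates": "applicants"}

# Hypothetical index: preferred term -> topics where the term is discussed.
INDEX = {"applicants": ["Reviewing applicants", "Scheduling interviews"]}


def look_up(term):
    """Return (preferred_term, topics), following a "see" reference if one exists."""
    key = term.strip().lower()
    # Fall back to the reader's own word when no "see" reference is defined.
    preferred = SEE_REFERENCES.get(key, key)
    return preferred, INDEX.get(preferred, [])
```

Calling `look_up("candidates")` follows the “see” reference and returns the topics filed under “applicants”; a term with no entry and no reference returns an empty list. The same mapping could, in principle, be stored as metadata rather than as index entries.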

  4. Agree as well. Our web analytics also show that customers are not using the index files we still produce. Indexing is definitely not a priority given the resource pressures, etc. Doing ‘less with less’ will eventually lead to the demise of indexing for online content.

  5. Is indexing really declining… or is it just changing? Aren’t tag clouds (and the like) an “index”? And with folksonomy and collaborative tagging, we’re moving from author-created index terms to reader-created terms, which may help reduce the problem of finding the right term.

  6. My experience with indexes in online help was that most customers did not use the index at all. They relied on search. Both our QA people and the few customers who did use the index were confused by it – they expected it to be a list of topic links that exactly matched the title of each help topic.

  7. I agree with Janet. Besides the fact that in the past we spent lots of time creating indexes (tech writers together with developers), indexes tend to be unusable. End users use the document map or search for a specific topic within the document.

    Karen makes a point too: “With an index, you can reach out to your readers with *their* words, and guide them to your own words. If you’re writing about hiring people and you always call them ‘applicants’, people won’t find what they need by searching for the word ‘candidates’ – but your index can include an entry that says ‘candidates – see applicants.’”
    Therefore it is very difficult to make a general rule for using indexes. Maybe we could decide based on the specific situation (document type). Couldn’t we?

  8. Lots of thoughtful responses; thank you.

    The problem that I see is that the various automated or crowd-sourced solutions, whether collaborative tagging or magical metadata, are not as good as a carefully crafted index. But, as I said, I suspect that for technical content, the authored, edited, produced index will go the way of the production edit: yes, it increases the quality of the final deliverable, but it takes too long and costs too much, so it’s supplanted by the inferior automated alternative.

    Hmmm. That could be a new slogan: “Inferior automated alternatives”

  9. Interesting blog post and comments!

    If the PDFs are primarily intended for on-screen use, it’s not necessarily an issue of Index OR Find/Search.
    The index functionality may be adapted to the electronic medium and be combined with the search function, and not just simply mimic the printed book format.

    For example, the Search function may be employed with pre-defined search phrases, helping readers who are not fully aware as to what to search for, or simply making it easier by not having to type.
    Synonyms and related entries can be listed below the main entry, with a visual indicator (such as dashes or indentation to imply hierarchy). Alternatively, the main entry itself can also search for synonyms, so that searching for “font” would actually search for “font OR typeface”.

    Production is significantly simpler in the long run: after the specific mechanism and list of terms are set, the exact location of entries is determined dynamically by the search function (first result displayed on page, following results shown in context in the search panel); no need to update/re-generate the index.

    Depending on setup and needs, pre-defined searches can be expanded to include related terms, synonyms, multi-document or folder searches etc.
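The synonym-expanding search described above (“font” actually searching for “font OR typeface”) might look roughly like the following Python sketch. It is not how Acrobat or Reader implements search; the `SYNONYMS` table, the function names, and the plain substring matching are all assumptions made for illustration.

```python
# Hypothetical synonym list, maintained by whoever sets up the pre-defined searches.
SYNONYMS = {"font": ["typeface"]}


def terms_for(term):
    """Return the main entry followed by any synonyms defined for it."""
    key = term.strip().lower()
    return [key] + SYNONYMS.get(key, [])


def expand_query(term):
    """Build the OR query that a pre-defined search entry would run."""
    return " OR ".join(terms_for(term))


def search(pages, term):
    """Return 1-based page numbers whose text contains the term or a synonym."""
    wanted = terms_for(term)
    return [
        number
        for number, text in enumerate(pages, start=1)
        if any(word in text.lower() for word in wanted)
    ]
```

Because the page numbers are computed at search time, nothing needs to be regenerated when the text moves; only the list of terms and synonyms has to be maintained, which is the production advantage the comment points out.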

    The following sample PDFs demonstrate possible directions:

    * Pre-defined or custom searches through a drop-down menu + Search button (visible but not-printed):
    (note: PDFs can be enabled with Acrobat Pro 8 and higher versions so that end-user custom searches using Reader may be saved)

    * Pre-defined searches through bookmarks:

    * Pre-defined searches through pop-up menus:

  10. As it happens, I finished a 4+ day indexing effort just this minute! Cleaned up and updated the index for our command-line application manual (authored in Word, for PDF publication). It definitely took too long and was too painful to do, but I’m not yet ready to dump the index altogether.

    While I tend to enjoy the indexing, much of the process was frustrating. It is the last task, and the easiest to toss at the last minute (I think I did for the last release). The tools should be better, but I guess I’m not surprised that they’re not.

    Tragically (in my view, at least), I’m going with Sarah on this one. Quality hasn’t exactly been Job One in this business, has it?

  11. Sarah, So glad you wrote about this. Love your phrase “inferior automated alternatives.”

    Like you, I hate to see indexes — and the art of indexing — on the wane. A good index has been called “the user interface to the document.” Reference books, especially, are crippled if they have no indexes.

    Amen to your point that full-text search is no substitute for a crafted index. I once knew a tech writer who REMOVED a crafted index when putting a large catalog on CD “because people could just search.” Noooooo. A good index reveals what’s been called a “knowledge structure.” It leads readers to discoveries and deepens their understanding of the book’s contents in a way that no table of contents can (or should try to).

    You’ll surely win your bet on option one. But option two will never happen if we stop asking. Come on, tool-makers. Indexing support, please!

  12. Shlomo, Thanks for the info on predefined searches. Intriguing. The possibility of technology offering a well-crafted search experience is worth exploring.

    Whatever the tool, though, the human skill required to create a good user experience of an index (or predefined search) — what goes on between the indexer’s ears — is complex. Some study of the craft is needed. For example, several years ago I took a full-day indexing workshop with Lori Lathrop, a past president of the American Society for Indexing, and it was worth every hour.

    I’d also recommend these resources:

    — “Indexing Books” (second edition) by Nancy C. Mulvany

    — “Beyond Book Indexing: How to Get Started in Web Indexing, Embedded Indexing, and Other Computer-Based Media” edited by Diane Brenner and Marilyn Rowland

    — “Single-Source Indexing” by Jan C. Wright

  13. Ben says … “As you suggest, an index is a publication-level component”

    I disagree. The index is a suite-level or higher-level component. It can allow the user to find information efficiently in every book in a suite of books, or at a higher level, in all documents published by an organization with an indexing standard.

    It is a human-controlled search algorithm, just like Google, but on a smaller scale.

    As for the discussions about “What is an index in (for example) a help file … our users do not use them …” my take on that is that at least the traditional ways of indexing (including synonyms and the like) end up in the full text search … so the traditional index is still critical to finding information even if the information is in online help, the web, or whatever.

    And, because of the smaller scale, the authors in an organization have a good chance of being better than Google because of their knowledge of the reader.

  14. Pingback: Three Solutions to the Corporate Blogging Paradox | I'd Rather Be Writing

  15. In the absence of well-done indexes, the Search function becomes even more important. PDF producers can do a lot to help end users (viewing the PDF using Adobe Reader) make the most of Search (powerful yet often much neglected), including providing pre-defined search phrases for key terms/concepts and personalized search forms (which can be saved for future searches).

    I will demonstrate different techniques to help make the Search function useful for end users in a free 1-hour webinar, Tuesday, March 6, starting 11:30am PST
