Online documentation: what’s missing

Andy Oram
January 15, 2007

For several years I've been fascinated with the technical information people get online, instead of from books or journals. Everybody looks online for help installing software, finding programming library calls, fixing bugs, and solving any other technical problem they have on their systems. (Systems, not computers--computers are boring; Steve Jobs told us that. Nobody wants to be associated with computers any more.)

A lot of information is still missing online. That's good for O'Reilly, because it means the book industry still plays an important role. But this article is not about missing information in general. It's about a specific segment of missing information: the frustration you feel when you visit a site and don't understand it because it lacks some background you need.

I encountered innumerable background issues while coding up some JavaScript recently. For instance, after cloning a large and complex node, I was stymied for a long time because I tried to alter children of the node and kept being told they didn't exist. It was a surprise to find, ultimately, that white space between nested elements counts as a child node. (The console.log call in Firebug helped a lot to clear up the mystery.) This is just one example where a reader needs to bring his own background to online documentation.
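The white space surprise is easy to reproduce in a few lines. What follows is only a minimal sketch, not the code described above: it assumes a small made-up list fragment, and it uses Python's xml.dom.minidom module instead of a browser, on the assumption that its childNodes and cloneNode behave like their W3C DOM counterparts in JavaScript. The markup and the variable names are hypothetical.

```python
from xml.dom import minidom

# Hypothetical markup: two list items separated by newlines and indentation.
SOURCE = """<ul>
  <li>first</li>
  <li>second</li>
</ul>"""

original = minidom.parseString(SOURCE).documentElement
clone = original.cloneNode(True)  # deep clone, comparable to cloneNode(true)

# Every child of the clone, including whitespace-only text nodes.
all_children = list(clone.childNodes)

# Only the element children, which is what the markup appears to contain.
element_children = [n for n in all_children if n.nodeType == n.ELEMENT_NODE]

# Print each child so the text nodes between the elements become visible.
for index, node in enumerate(all_children):
    print(index, node.nodeName, repr(node.nodeValue))
```

The markup looks as if it has two children, but the clone reports five: the two li elements plus the runs of white space before, between, and after them. Code that indexes into childNodes expecting an element at position 0 finds a text node there instead, and filtering on nodeType is one way around that.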

Why information is missing

Missing information is not such a widespread problem in books, because authors have space to lay out background. In fact, I believe the defining characteristic of a well-written technical book is its success in providing the necessary background.

Journals have stable readerships, so authors can often guess the background their readers bring to their articles.

In contrast to these traditional media, online documentation tends to be short, hastily generated, and written in response to an immediate need. These documents are usually written for a narrow audience as conceived by the author (often an audience whose knowledge is very close to that of the author). The mediating advice of an editor is rarely available to online authors, except in the formal, paid articles put up by vendors or professional journals.

Therefore, much of what you see online is weak on background. In effect, the document probably wasn't written for you; it was written for an imaginary reader who matches the author's assumptions. This documentation can be very useful--and I am just as grateful as anyone to the people who write it--once you know what the author is talking about. You have to fill in the background yourself.

So what do readers do? Unless they accept their fate gracefully and go buy a book (highly recommended), they go searching for more online documents, and patch together a solution from multiple sources. But the fragmentation of knowledge in this procedure compounds the background problem, because each source comes from an author with different assumptions and makes different demands on the reader. The context is different, and even the terminology may be different.

The diversity of speakers in hypertext is called multivocality (I love that word!). Multivocality may not be the best word for this situation in technical documentation, actually, because the problem here is not the different voices but the different assumptions. Still, this analysis reveals one of the crying needs of community documentation: although it's easy to follow links to different documents that contain pieces of an answer, it's harder to make the mental links between these documents, written as they are with different assumptions and in different contexts.

A possible solution: collaborative background

Some of the more formal documents one finds online recognize the importance of background, and list prerequisites ("You are expected to know JavaScript..."). This is a gracious gesture, but it's often unnecessary because these coarse-grained requirements can be picked up implicitly. A glance at a page loaded with JavaScript examples tells a reader right away that the page is for JavaScript programmers.

It's the more subtle and fine-grained requirements that should be made explicit--for instance, the problem I encountered in trying to count child nodes. But one can't expect the author to anticipate all the dumb questions readers will have. How can we fill in the background?

This is a promising area for collaboration. Documents could include forms for readers to submit background documents. A reader who gets confused and goes elsewhere for background can return to the page and submit a note saying "Before reading this document, I read...in order to find out more about..." I would leave it up to the document's maintainer to review these suggestions and decide whether to include them. But it's a simple way to make online information fulfill its rich potential, by making it more than the sum of its parts.
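The submit-and-review cycle described above could be modeled roughly as follows. This is a sketch under stated assumptions, not a design for any existing site: the names BackgroundNote, Page, submit, review, and prerequisites are all hypothetical, and the maintainer's judgment is reduced to a simple function that accepts or rejects each note.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class BackgroundNote:
    """One reader's note: what they read first, and what it explained."""
    source: str  # the document the reader consulted for background
    topic: str   # what that document helped the reader understand


@dataclass
class Page:
    """A document that collects background notes (hypothetical model)."""
    title: str
    pending: List[BackgroundNote] = field(default_factory=list)   # awaiting review
    accepted: List[BackgroundNote] = field(default_factory=list)  # approved notes

    def submit(self, note: BackgroundNote) -> None:
        """Record a note sent in through the reader's form."""
        self.pending.append(note)

    def review(self, approve: Callable[[BackgroundNote], bool]) -> None:
        """Let the maintainer keep or drop each pending note."""
        for note in self.pending:
            if approve(note):
                self.accepted.append(note)
        self.pending.clear()

    def prerequisites(self) -> List[str]:
        """Render the accepted notes as background pointers for new readers."""
        return [
            f"Before reading this document, read {n.source} "
            f"to find out more about {n.topic}."
            for n in self.accepted
        ]


# Illustrative usage with made-up notes; the second one is empty and is rejected.
page = Page("Cloning DOM nodes")
page.submit(BackgroundNote("a DOM tutorial", "whitespace text nodes"))
page.submit(BackgroundNote("", ""))
page.review(lambda note: bool(note.source and note.topic))
print(page.prerequisites())
```

Keeping pending and accepted notes in separate lists reflects the point that the maintainer, not the reader, decides what appears on the page.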