Document QA

A large organisation will produce many documents over the course of a year, but most organisations spend almost no time assessing their quality. Some of the documents may form the basis of projects worth hundreds of millions or billions of dollars, or set in place contractual arrangements that last for decades, making the knowledge within the documents highly valuable.

In other areas, Quality Assurance (QA) is an essential part of the process:

Eggs are inspected for quality, washed bottles are checked for cleanliness, critical welds are routinely radiographed, and the composition of metal alloys is checked in a laboratory as they are produced. In the case of welds, the weldment may be subjected to a stress test – the document equivalent would be testing it with a sample of the target audience. Virtually the only time this happens now is with advertisements, yet other documents can be vastly more valuable to the organisation.

Why So Little?

Why do we do so little with documents – nothing past a simple spellcheck? Is it because we have faith in the ability of people to understand what the documents mean, or is it because we sheet home the blame for failure to the person who wrote or read the document, rather than the document itself?

Our company has been working on a system to read specifications and extract the semantic structure. To be fully effective, the system needs deep knowledge in the domain in which it is working, making it unsuitable to handle a wide range of documents. Part of the system is a lexical and structural analysis tool, which does not rely on deep domain knowledge. This tool can be used to run QA on important documents. It checks structural integrity using many different types of check – a referenced item really is there, an indexed list isn’t messed up, defined terms are used coherently, acronyms are either in a glossary or in the organisation’s database, dimensional units are valid. It runs every check that is possible without reading the document in depth. The result of this check can be used as an indicator of the quality of the document.
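To make the idea of a structural check concrete, here is a minimal sketch in Python of three of the checks mentioned above: missing cross-references, broken list numbering, and acronyms absent from a glossary. It is an illustration only, not the tool described in this article. The function names, the heading format ("2.1 Title" at the start of a line), the reference format ("Section 2.1") and the sample text are all assumptions made for the example.

```python
import re

def check_cross_references(text):
    """Return section numbers that are referenced but never appear as a heading."""
    # Assumption: headings look like "2.1 Title" at the start of a line.
    defined = set(re.findall(r"^(\d+(?:\.\d+)*)\s+\S", text, flags=re.MULTILINE))
    # Assumption: references look like "Section 2.1" or "section 2.1".
    referenced = set(re.findall(r"[Ss]ection\s+(\d+(?:\.\d+)*)", text))
    return sorted(referenced - defined)

def check_list_numbering(text):
    """Return (line_number, expected, found) for list items that are out of sequence."""
    problems = []
    expected = None
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Assumption: list items look like "1. text" or "1) text".
        match = re.match(r"\s*(\d+)[.)]\s+\S", line)
        if not match:
            expected = None  # any non-item line ends the current list
            continue
        found = int(match.group(1))
        if expected is not None and found != expected:
            problems.append((line_no, expected, found))
        expected = found + 1
    return problems

def check_acronyms(text, glossary):
    """Return upper-case tokens of two or more letters that are not in the glossary."""
    found = set(re.findall(r"\b[A-Z]{2,}\b", text))
    return sorted(found - set(glossary))

def run_checks(text, glossary):
    """Run all three checks and collect the findings in one report."""
    return {
        "missing_sections": check_cross_references(text),
        "list_numbering": check_list_numbering(text),
        "unknown_acronyms": check_acronyms(text, glossary),
    }

# Hypothetical sample document, invented for this example.
SAMPLE = """1 Scope
The pump shall comply with Section 2.1 and Section 4.1.
2 Requirements
2.1 Flow
The DMS entry lists the QA steps:
1. Inspect
2. Test
4. Report
"""

report = run_checks(SAMPLE, glossary={"DMS"})
```

On the sample text, the report flags the reference to a Section 4.1 that does not exist, the list that jumps from item 2 to item 4, and the acronym QA, which is not in the glossary. None of these checks needs any understanding of what the document is about, which is the point: the count of findings can serve as a rough quality indicator.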

We are suggesting that a small group of people be set up in a large organisation, with the purpose of providing QA on the documents being produced or introduced into the organisation. Their job is to act as a filter and stop rubbish getting into the system. The logical place for this filtering is just before the document is placed in the Document Management System (DMS), but checks of documents in draft form can also be made. The people in the group become familiar with all the machinery of knowledge found in documents – the way that large documents are structured, with defined terms either in a block at the start of the document or scattered through the document, a glossary, a hierarchical structure of sections, control of existence from one section to another.

As with other forms of QA, sampling is used to limit the workload. If documents from a source are regularly clean, then that source can be sampled infrequently. If documents from a source are regularly flagged with errors, then attention is paid there until the source is clean, or until it is decided that the documents are of limited value and the problems resulting from their poor quality are of little consequence.
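One way such a sampling policy might be expressed is sketched below in Python. It is a sketch under stated assumptions, not a recommended scheme: the linear scaling, the 5% floor, and the names sampling_rate and should_check are all invented for the illustration. The history for a source is assumed to be a list of recent results, with True meaning the document was flagged.

```python
import random

def sampling_rate(recent_results, floor=0.05, ceiling=1.0):
    """Fraction of a source's documents to check, given its recent history.

    recent_results: list of booleans, True if that document was flagged.
    floor and ceiling are assumed values, chosen only for the example.
    """
    if not recent_results:
        return ceiling  # no history yet: check everything from a new source
    flagged_fraction = sum(recent_results) / len(recent_results)
    # Assumption: scale linearly between the floor and the ceiling.
    return floor + (ceiling - floor) * flagged_fraction

def should_check(recent_results, rng=random.random):
    """Decide whether the next document from this source gets checked."""
    return rng() < sampling_rate(recent_results)

clean_source = [False] * 20               # never flagged: rarely sampled
mixed_source = [True, False] * 10         # flagged half the time
poor_source = [True] * 20                 # always flagged: every document checked
```

A source that has never been flagged is still sampled occasionally, so that a decline in quality is eventually noticed, while a source that is always flagged has every document checked until its record improves.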

"We have our lawyers check important documents" – lawyers are looking for quite different things, and are easily overwhelmed by the details of large documents – there is too much to hold in the head at once.