CHAPTER 2. DOCUMENT REPOSITORY
vegetables, to minerals. Grouping objects by similarity, however, is not quite as
simple as it sounds. Imagine that we throw a mixture of 30 knives, forks, and
spoons into a pile on a table and ask three people to group them by “similarity.”
Imagine our surprise when three different classifications result. One person
classifies into two groups of utensils, the long and the short. Another classifies
into three classes,—plastic, wooden, and silver. The third person classifies into
three groups,—knives, forks, and spoons. Whose classification is ‘best’? …
The lesson here should be obvious—a classification is no better than the
dimensions or variables on which it is based. If you follow the rules of
classification perfectly but classify on trivial dimensions, you will produce a
trivial classification. As a case in point, a classification [by whether] they have four legs or
two legs may produce a four-legged group consisting of a giraffe, a dining-room
table, and a dancing couple. Is this what we really want?
One basic secret to successful classification, then, is the ability to ascertain the
key or fundamental characteristics on which the classification is to be based. A
person who classifies mixtures of lead and gold on the basis of weight alone will
probably be sadder but wiser. It is crucial that the fundamental or defining
characteristics of the phenomena be identified. Unfortunately, there is no specific
formula for identifying key characteristics, whether the task is theory
construction, classification, or statistical analysis. In all of these diverse cases,
prior knowledge and theoretical guidance are required in order to make the right
decisions.”
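The utensil example in the quotation can be made concrete with a small sketch. The Python below is illustrative only: the records, the attribute names (`kind`, `material`, `length_cm`), the 20 cm cutoff for "long", and the `classify` helper are all assumptions introduced here, not part of Bailey's text or of any particular repository implementation. It shows the same set of objects yielding three different but equally valid groupings, depending solely on the dimension chosen.

```python
from collections import defaultdict

# Hypothetical utensil records; attribute names and values are illustrative.
utensils = [
    {"kind": "knife", "material": "silver", "length_cm": 22},
    {"kind": "fork", "material": "plastic", "length_cm": 15},
    {"kind": "spoon", "material": "wooden", "length_cm": 30},
    {"kind": "spoon", "material": "plastic", "length_cm": 14},
]


def classify(items, dimension):
    """Group items by the value that `dimension` returns for each item."""
    groups = defaultdict(list)
    for item in items:
        groups[dimension(item)].append(item)
    return dict(groups)


# Three classifications of the same objects, one per "person" in the example.
# The 20 cm threshold separating long from short is an arbitrary assumption.
by_length = classify(utensils, lambda u: "long" if u["length_cm"] >= 20 else "short")
by_material = classify(utensils, lambda u: u["material"])
by_kind = classify(utensils, lambda u: u["kind"])

for name, groups in [("length", by_length), ("material", by_material), ("kind", by_kind)]:
    print(name, {label: len(members) for label, members in groups.items()})
```

None of the three groupings is wrong. Which one is useful depends on the purpose it serves, and the choice of dimension is passed in by the caller rather than fixed in the data.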
Bailey’s example of people organizing the same objects into three different
classifications, each perfectly valid, illustrates precisely why categorization structures are
a good fit for organizing information in a document repository. Different people and
organizations have different ways of structuring the same data so as to best fit their needs.
For example, large semiconductor manufacturers, small auto body repair shops, and
authors of new regulations will probably each have different ways of organizing