8. Indexing

Creating an Index

Indexing is an iterative process. Your first pass at an index is merely the foundation on which to build your final index. The first pass will be full of similar primary entries that you need to break into secondary entries. You will probably also have problems with page ranges and spelling. See "Editing an Index" for information about possible problems.


When an Index Is Needed

A document needs an index if it has 10 or more pages. This rule applies to any type of document, from simple end-user guides to technical reference manuals.

Time Required to Create an Index

Writers usually set aside less time for preparing an index than for almost any other step in the process of creating a document. Most writers can't (or shouldn't) work on an index until the document is written. At that point, people are anxious to get their hands on the document and time is rapidly running out.

As a rule of thumb, allow one full day for every 25 pages of text. A 100-page document will take four full days to index. An experienced indexer may require less time; a first-time indexer may require more time. Certain types of documents are much more difficult to index than others and require even more time than given in this guideline.

Note - If you have your book's index prepared by a professional indexer using dedicated indexing software, approximately 50 to 60 pages of technical material can be indexed per day.
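
The scheduling rule of thumb above is simple arithmetic, and a small helper can make it easy to apply to a whole document set. The following is a minimal sketch only: the function name estimate_indexing_days is hypothetical, and rounding up to a whole day is an assumption made for scheduling, not part of the guideline.

```python
import math


def estimate_indexing_days(pages: int, pages_per_day: int = 25) -> int:
    """Estimate whole working days needed to index a document.

    pages_per_day defaults to the 25-page rule of thumb. Rounding up to a
    full day is an assumption made here for scheduling purposes.
    """
    if pages < 0:
        raise ValueError("pages must not be negative")
    if pages_per_day <= 0:
        raise ValueError("pages_per_day must be positive")
    return math.ceil(pages / pages_per_day)


if __name__ == "__main__":
    # A 100-page document at the rule-of-thumb rate.
    print(estimate_indexing_days(100))
    # The same document at a professional indexer's rate of 50 pages per day.
    print(estimate_indexing_days(100, pages_per_day=50))
```

Treat the result as a starting point. Difficult material or a first-time indexer can require more time than the estimate suggests.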

Deciding Which Parts of a Document to Index

The first decision in creating an index is which parts of the document to index.

Selecting Topics to Index

When you take on the task of creating an index, you must first decide what the pertinent statements or topics are. A topic can be a single word, a phrase, or even a concept. A topic has no minimum or maximum size.

As you analyze a topic for inclusion in your index, decide whether the topic contains information a reader may expect to find in the index. If it does, create one or more index entries.

To determine whether a topic requires an index entry, analyze the topic for the following attributes. Create an index entry when a topic:

Don't Index Superfluous Entries

Frequently, superfluous entries are included in an index because the person creating the index erroneously refers to every occurrence of selected words or phrases in the document. The index is not a concordance, but rather an information retrieval device.

For example, assume that the following sentence appeared in the text being indexed:

    Separate chapters of this manual are devoted to the use of disk and tape storage devices.

In this example, the sentence provides no information about disk or tape storage. Therefore, a reader would gain nothing from these index entries:

    disk storage, 37
    tape storage, 37

Those entries should be included in an index only if they refer a reader directly to useful information.

Avoid Overly Global Entries

Do not use entries so general that they apply to the document as a whole. Such an entry is too broad to tell a reader what the corresponding text contains.

For example, don't include entries such as "File Manager, creating files in" in a book about File Manager, or "features, of Lotus 1-2-3" in a book about the Lotus 1-2-3® application. Generally, the only entries under the name of the application you're documenting should deal with the application itself, such as installing or quitting it.

Include Common Industry Terminology

Common procedures or commands may be defined by different terms depending upon the technology or company. If you know of a common synonym for a process or command, include an entry to send a reader to the term used in your book, as shown in these examples:

    search, See find
    delete, See cut
    abort, See cancel
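
If you keep a list of industry synonyms while you write, you can generate these "See" entries mechanically. The sketch below assumes a plain mapping from each synonym to the term your book actually uses. The function name synonym_cross_references is hypothetical.

```python
def synonym_cross_references(synonyms: dict[str, str]) -> list[str]:
    """Build one "See" entry per synonym, sorted alphabetically by synonym.

    synonyms maps a term readers might look up to the term used in the book.
    """
    return [f"{synonym}, See {term}" for synonym, term in sorted(synonyms.items())]


if __name__ == "__main__":
    for line in synonym_cross_references(
        {"search": "find", "delete": "cut", "abort": "cancel"}
    ):
        print(line)
```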

Avoid Using Headings as Index Entries

Many writers choose to index every section head in the document. In general, you shouldn't use headings as the sole basis for an index. When you want to include the basic information contained in a heading in an index entry, avoid such entries as:

    using direct virtual memory access (DVMA), 87
    running the QuickView utility, 42
    What is a hidden file?, 28

Instead use entries such as these:

    direct virtual memory access (DVMA), using, 87
    QuickView utility, running, 42
    hidden file, definition, 28

Describing a Topic

Once you have determined that a topic merits an index entry, find one or more ways to describe it to a reader. The descriptions you create become the subjects of the index entries.

To describe a topic:

While looking for and describing topics, don't forget to create entries for Notes, Cautions, Warnings, acronyms, and abbreviations.

Anticipate a Reader's Needs

When a reader goes to an index, it is usually to answer one of the following questions:

By anticipating a reader's needs as you describe topics for your index, you can avoid many of the pitfalls of an inadequate index.

Include Only Terms a Reader Is Likely to Look Up

When creating an entry, ask yourself whether you, as a user, would be likely to look in the index for that entry.

Examples of some questionable entries:

    For example, "creating the file" should not be indexed under "creating."

    For example, in a book in which an exercise involves creating the buttons "Hello!" and "Adios!" don't include "Hello!" and "Adios!" as index entries.

Select Proper Words for Subjects

The words you select are an abstract of the topic. Choose words that are as descriptive as possible. Using words from the text as subjects of index entries may satisfy readers who know the terminology used in the document. For other readers, provide subjects or cross-references worded so that they can find the desired information without specific prior knowledge.

Arrange Words for Emphasis

Typically, you will make the most important word the first word of the subject. The choice of most important word depends upon what you want to stress or what is most important to a reader. For example, if the words you choose to describe a topic are "pixwin background color," the primary entries might be:

    background color, pixwin
    color, pixwin background
    pixwin background color

Be careful with the use of verbs as the main subject of an index entry. General verbs may not be helpful. Be sure the verb corresponds specifically to the task the user wants to accomplish. For example, for a discussion about how to use menu buttons, the proper index entry might be:

    menu buttons, using

Or, if there are additional (secondary) entries, you might format the entry as:

    menu buttons

Assign the Proper Font to the Entry

Keep in mind that certain terms may require different fonts if they are being referred to in different contexts. You also may need to include a word such as "command," "file," or "directory" after such terms to further clarify the entry.

    quit command

    quit, in contrast to exit

Group Entries

Grouping entries means combining entries that have common first words into primary entries and secondary entries.

When selecting a subject to be followed by secondary entries, be careful to group subjects properly. Don't merely select a word or phrase for a subject because it is common to several entries. For example, assume that the following subjects appeared in your document:

    rock, igneous
    rock, metamorphic
    rock music

To factor out "rock" and create three secondary entries would be wrong. If you analyze the use of "rock" in each entry, you can see that it is used in two different ways.

Wrong                  Right
rock                   rock
   igneous, 4-2           igneous, 4-2
   metamorphic, 4-18      metamorphic, 4-18
   music, 12-5         rock music, 12-5

Create Index Entries for Notes, Cautions, and Warnings

You want a reader to be able to locate the various restrictions in your document. The types of restrictions include:

Most of the topics that qualify as restrictions are not stated as such in the text. You must analyze the text to find the restrictions to index. The wording of the subject should describe the nature of the restriction.

In the next example, the correctly worded index entry enables readers who are familiar with the document organization to ignore the entry if they already know the restriction:

    symbolic names, maximum length of

However, the incorrect wording below would require nearly every reader to look up the subject to find the restriction:

    symbolic names, restriction

In many documents, certain restrictions are identified specifically because of their importance. For example, assume the following Caution notice appeared in your document:

Caution - Never turn on or off the system unit while a diskette is in the disk drive. You may damage the diskette.

This restriction should be described in the index and flagged with the word "Caution." The entry might be:

    turning system on or off, Caution notice

Create Index Entries for Acronyms and Abbreviations

Include an acronym or abbreviation in your index if it is unique to your document (or manual set) and not likely to be found in common usage. Many acronyms and abbreviations need not be included in an index. For example, the abbreviations for most units of measure (Btu, in., lbs, and so on) are not good candidates for indexing.

When you include an acronym or abbreviation in an index, follow it with the words (in parentheses) from which it was formed.

    CCP (console command processor), 1-5

Double-post the term by adding an entry for the words that form the acronym or abbreviation, followed by the acronym or abbreviation in parentheses.

    console command processor (CCP), 1-5
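
Because the two entries for an acronym always mirror each other, you can produce both from one record. This is a minimal sketch under that assumption. The function name acronym_entries is hypothetical.

```python
def acronym_entries(acronym: str, expansion: str, locator: str) -> list[str]:
    """Return the acronym-first entry and its double-posted, expansion-first twin."""
    return [
        f"{acronym} ({expansion}), {locator}",
        f"{expansion} ({acronym}), {locator}",
    ]


if __name__ == "__main__":
    for line in acronym_entries("CCP", "console command processor", "1-5"):
        print(line)
```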

Double-Posting Entries

Double-posting means identifying a topic in two different places in an index. For example, a topic that appears as "address switch" and "switch, address" is double-posted in the index. A topic that appears in three places is triple-posted, and so on.

Entry                        Double- or Triple-Posted Entries
power indicator              power indicator
                             indicator, power
C shell command interpreter  C shell command interpreter
                             command interpreter, C shell
                             interpreter, C shell command
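
The inverted forms in the table above follow a regular pattern: each meaningful unit of the phrase takes a turn at the front, and the words that preceded it move after a comma. The sketch below illustrates that pattern. The function name posting_variants is hypothetical, and it assumes you split the phrase into units yourself, so that "C shell" stays together.

```python
def posting_variants(units: list[str]) -> list[str]:
    """Return the original phrase plus one inverted entry per later unit.

    units is the phrase split into the pieces a reader might look up, for
    example ["C shell", "command", "interpreter"].
    """
    variants = []
    for i in range(len(units)):
        head = " ".join(units[i:])   # the part that leads the entry
        tail = " ".join(units[:i])   # the part moved after the comma
        variants.append(f"{head}, {tail}" if tail else head)
    return variants


if __name__ == "__main__":
    for line in posting_variants(["C shell", "command", "interpreter"]):
        print(line)
```

The output is a list of candidates, not a finished set of entries. Review each variant and drop any that a reader is unlikely to look up.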

Be careful of over-indexing with double-posting. Certain entries do not deserve double-posting. For instance, the following example might be acceptable in a document that refers to only a few commands.

Entry          Double-Posted Entries
grep command   grep command
               command, grep

However, for a manual with many commands, the entries under "command" might grow too numerous. In this case, rather than creating a primary entry of "command" with many secondary entries, index each command under its own name - do not double-post - and include a cross-reference.

Entries           Cross-Reference
cat command       commands, See specific command names
grep command
history command

Double-posting increases the number of index entries available to a reader, which broadens the scope of the index. The knowledgeable reader is not forced to scan the index for a general entry when seeking a specific topic.

Double-posting has a dramatic effect on usability; it is an essential technique for creating a high-quality index. Try to double-post entries for all key concepts and important terminology.

Keep in mind, however, that there is a fundamental tradeoff involved in double-posting an index. While an extensively double-posted index provides a denser, more comprehensive view of a document's topics, it also significantly increases your indexing workload. Be sure to include enough time in your schedule for double-posting your index.

Creating "See" and "See Also" References

You cross-reference index entries by creating "See" and "See also" references.

When to Use "See" and "See Also" References

Use a "See" reference when you have so many secondary entries that repeating them would be unreasonable.

    configuration, See measurement configuration

    measurement configuration
      applying storage thresholds
      calculating line speeds
      defining data fields
      defining entities

Use a "See" reference to send readers from a broad category to a more specific category. The next example is valid only if there are several secondary entries under "display thresholds," "exception thresholds," and "storage thresholds." Otherwise, you would double-post.

    thresholds, See display thresholds; exception thresholds; storage thresholds

Use a "See" reference to direct a reader from a term not used in the document to a term that is used as an index entry.

    cars, See automobiles

Consider using a "See also" reference to direct a reader to related information at another index entry. (Depending upon how your index is structured, you might also use a "See also" cross-reference from a specific category to a general one.)

    dBASE, 37
      See also database applications

Use a "See also" reference to avoid fourth-level entries.

    performance database
      backing up data in
        See also update, performance database
    update, performance database
      displaying status

How to Use "See" and "See Also" References

Place "See" references on the same line as the entry.

    UNIX, See operating system

Place "See also" references immediately below the entry on which they are based.

    exception conditions
       See also panel indicators
       sense command, 5-22
       status byte, 2-4
          See also PSW
       store violation, 3-7

Some noun modifiers (such as "data" and "file") are ubiquitous in the computer industry. If you have several long and complicated entries that start with the same word, readers might not look far enough to find a given topic. In this case, use a "See also" reference to help a reader.

    data
      See also data files; data records

Never use a "See" reference with an entry that has a page number. The following entry is wrong:

    structured files, 7-3
      See files, structured

Don't use unnecessary "See" references. If you can reasonably double-post an entry, do so. Readers have every right to be annoyed if you send them searching through an index just for one or two page numbers.

Wrong                            Right
command objects, See objects     command objects, 5-2, 5-8
   .                                .
   .                                .
   .                                .
objects, 5-2, 5-8                objects, 5-2, 5-8

In particular, don't send readers from a specific entry to a general entry, under which they must then search for the specific entry. However, be careful that you don't exclude general information that should be included under the specific entry as well. In the following example, "changing report attributes" must be included under the entries for specific reports, because a user might not look at the general entry.

Wrong                                 Right
forecast report, See reporting        forecast report, 3-67
.                                        changing report attributes, 3-122
.                                     .
.                                     .
reporting                             reporting
   automatic, 2-33, 3-174                automatic, 2-33, 3-174
   changing report attributes, 3-122     changing report attributes, 3-122
   forecast reports, 3-67                forecast reports, 3-67
   predefined reports, 3-71              predefined reports, 3-71

Don't include a page number with a "See also" reference. The following entry is wrong:

    structured files, 7-3
      See also chaotic files, 8-4

Don't use a "See also" reference to send a reader to a duplicate (double-posted) entry in an index. The following entries are wrong:

    entry-sequenced files, 7-3
      See also files, entry-sequenced
    files, entry-sequenced, 7-3

Make sure that a "See" or "See also" reference repeats the exact wording of the entry to which it refers.

Wrong                       Right
database, See PDB           database, See performance database (PDB)
.                           .
.                           .
.                           .
performance database (PDB)  performance database (PDB)
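
A mismatch between a cross-reference and its target is easy to miss by eye but simple to detect automatically. The sketch below assumes you can export your primary entries and cross-references as plain strings. The function name unresolved_targets and the data layout are hypothetical.

```python
def unresolved_targets(
    headings: list[str], cross_refs: dict[str, list[str]]
) -> list[tuple[str, str]]:
    """Return (source, target) pairs whose target matches no heading exactly.

    headings   -- the primary entries in the index, without page numbers
    cross_refs -- maps each entry to the targets of its "See"/"See also" references
    """
    known = set(headings)
    return [
        (source, target)
        for source, targets in cross_refs.items()
        for target in targets
        if target not in known
    ]


if __name__ == "__main__":
    headings = ["database", "performance database (PDB)"]
    # "PDB" alone does not repeat the exact wording of the target entry.
    print(unresolved_targets(headings, {"database": ["PDB"]}))
    print(unresolved_targets(headings, {"database": ["performance database (PDB)"]}))
```

The comparison is deliberately exact, including case and punctuation, because the rule requires that the reference repeat the wording of the target entry.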

Avoiding Indexing Problems

This section explains some established rules for indexing. If you disregard them, you will confuse or annoy your reader, and appear incompetent to any reader who is knowledgeable about indexing.

Don't Use a Two-Level Entry for a Single Topic

If you can use the primary entry alone, do so. A primary term with only one secondary term should be on a single text line. If you feel that the primary entry alone is misleading, rewrite it.

Wrong                   Right
optimization routines   optimization routines, 38
   use of, 38              or
                        optimization routines, use of, 38

Don't Use an Adjective Alone as a Primary Entry

Adjectives without related nouns do not provide enough information for a reader.

Wrong                     Right
implicit                  implicit commands
   logoff command, 6-4       logoff, 6-4
   open command, 6-1         open, 6-1
   wait command, 6-6         wait, 6-6

This rule applies (emphatically) to noun modifiers - that is, nouns that are being used as adjectives. This rule eliminates awkward, confusing constructions in which the primary entry is a noun relative to some secondary entries and an adjective relative to others.

Wrong               Right
data                data
   collecting, 81      collecting, 81
   files, 90           purging, 62
   purging, 62      data files, 90
   records, 47      data records, 47
.                   .
.                   .
.                   .
wait                wait command, 29
   command, 29      wait parameter, 33
   parameter, 33

Don't Use an Overly General Primary Entry

Generally, if a primary entry is followed by half a page or so of secondary entries, either the primary entry is too broad or you are over-indexing. For example, in a printer manual, the primary entry "printer" is too broad to be indexed as a term.

In a reference manual containing mostly commands, a primary entry of "commands" followed by a long list of commands is probably not helpful to a reader. Rather, each command should be indexed in its proper alphabetical place.

If you feel that it is necessary or helpful, use "commands" plus a "See" reference to send a reader to alternate methods of locating a given command.

Not Preferred      Preferred
commands           (no entry at all, if most of the document 
   alias, 14       describes commands)
   at, 19          
   batch, 23       or
   .               commands, See individual commands by name
   ypmatch, 132    or
   ypwhich, 134    
   zcat, 135       commands, summary of, 18

Don't Over-Index

Don't provide so many entries that they get in a reader's way. For example, in a reference manual containing many commands or utilities, you might be tempted to index the subheadings under each command. This often results in over-indexing, as shown below.

Not Preferred            Preferred
ast command              ast command, 33, 42
   attributes, 42        ast_process command, 44, 49   
   syntax, 33            ast_subvolume command, 51, 55
ast_process command      
   attributes, 49                                           
   syntax, 44                                               
ast_subvolume command                                    
   attributes, 55                                           
   syntax, 51                                               

Over-indexing also occurs if you create several secondary entries under a primary entry when all entries are on the same page.

Not Preferred       Preferred
input devices       input devices, 167
   buttons, 167                         
   dials, 167                           
   digitizer, 167                       
   scanner, 167                         
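
You can check for this kind of over-indexing mechanically: if every secondary entry points to the same page, the primary entry with that page number is enough. The following is a sketch under the assumption that each secondary entry is stored as a (text, page) pair. The function name format_entry is hypothetical.

```python
def format_entry(primary: str, subentries: list[tuple[str, str]]) -> list[str]:
    """Format a primary entry, collapsing secondary entries that share one page."""
    pages = {page for _, page in subentries}
    if len(pages) == 1:
        # Every secondary entry points to the same page, so one line is enough.
        return [f"{primary}, {next(iter(pages))}"]
    return [primary] + [f"   {text}, {page}" for text, page in subentries]


if __name__ == "__main__":
    devices = [("buttons", "167"), ("dials", "167"), ("digitizer", "167"), ("scanner", "167")]
    for line in format_entry("input devices", devices):
        print(line)
```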

Don't provide two adjacent entries that are very similar. The test: If you omit an entry, can a reader still find the right place in the document?

Not Preferred              Preferred
delete_file command, 41    delete_file command, 41
deleting a file, 41                                  

Don't Under-Index

Some kinds of under-indexing are very obvious, as they do not provide enough specific information to be useful to a reader.

Not Preferred           Preferred
reports, 31 - 39, 77    reports
                           exporting, 77
                           generating, 34
                           preformatting, 33
                           specifying format, 36 - 39
                           types of, 31 - 33

Other types of under-indexing aren't obvious, except to a reader. It is especially important to index concepts, not just the terms that appear in the document.

Not Preferred         Preferred
archive command, 77   archive command, 77
                      backing up data to tape, 77
                      tape backups, 77

Don't Alphabetize Subentries by Beginning Articles, Conjunctions, or Prepositions

Use only the key term when alphabetizing entries. This enables a reader to focus on the key term in the entry.

Wrong                          Right
transformation                 transformation
   matrix representation, 16      matrix representation, 16
   of raster images, 18           process, 14
   process, 14                    of raster images, 18
   refresh buffer, 3              refresh buffer, 3
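
If you sort subentries with a script, a sort key that skips leading function words reproduces the ordering in the Right column. This is a minimal sketch: the word list IGNORED_LEADING_WORDS and the function name subentry_sort_key are assumptions, so adjust the list to match your own style rules.

```python
# Assumed list of leading words to skip. Extend it to match your style rules.
IGNORED_LEADING_WORDS = {"a", "an", "the", "and", "or", "of", "in", "for", "to", "with", "on"}


def subentry_sort_key(subentry: str) -> str:
    """Return the subentry lowercased, minus leading articles, conjunctions, and prepositions."""
    words = subentry.lower().split()
    # Always keep at least one word so the key is never empty.
    while len(words) > 1 and words[0] in IGNORED_LEADING_WORDS:
        words.pop(0)
    return " ".join(words)


if __name__ == "__main__":
    subentries = ["matrix representation", "of raster images", "process", "refresh buffer"]
    for line in sorted(subentries, key=subentry_sort_key):
        print(line)
```

The key changes only the sort order. The subentry text itself, including the leading "of," is printed unchanged.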

Don't Alphabetize by Symbol for Path Names, File Names, File Prefixes, or Variables

For path, file, or variable entries that begin with a symbol, alphabetize these entries by the first letter of the first word following the symbol.

Not Preferred      Preferred
Symbols            C
_config            cancel command
/etc/uucp/Limits   _config
.info              Create menu
$PATH
E                  E
error reporting    error reporting
external files     /etc/uucp/Limits
                   external files
                   I
                   ID numbers
                   .info
                   P
                   $PATH
                   primary numbers
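
The same approach works for entries that begin with a symbol: strip the leading symbols before comparing. The sketch below assumes that any leading run of characters other than letters and digits should be ignored. The function name symbol_insensitive_key is hypothetical.

```python
import re


def symbol_insensitive_key(entry: str) -> str:
    """Return the entry without leading symbols, case-folded for comparison."""
    return re.sub(r"^[^0-9A-Za-z]+", "", entry).casefold()


if __name__ == "__main__":
    entries = [
        "_config",
        "/etc/uucp/Limits",
        "cancel command",
        "Create menu",
        "error reporting",
        "external files",
    ]
    for line in sorted(entries, key=symbol_insensitive_key):
        print(line)
```

Only the sort key is altered. Each entry keeps its symbol in the printed index.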

Don't Sort Subentries by a Word That Isn't the Key Term

Avoid beginning a subentry with a word that isn't the key term in the subentry. To help a reader, whenever possible you should reword the subentry so that the key term, rather than an irrelevant introductory word, appears at the beginning of the subentry.

Wrong                                Right
accounting                           accounting
   software for, 11                     command summary, 38
   summary of commands, 38              software for, 11
addressing                           addressing
   issues for virtual networks, 34      naming conventions, 52
   naming conventions, 52               virtual network issues, 34

Don't Capitalize Words Without a Reason

Avoid capitalizing words except proper nouns and acronyms that are necessarily capitalized.

Wrong                    Right
Data files               data files
pid (program ID number)  PID (program ID) number
ethernet                 Ethernet