Information Architecture: Catalog vs. Index

The terms catalog and index do not refer to the same thing, although they are often used interchangeably in common speech. Both have value for consumers of the information widely available on the internet. This article defines each term, sets out their differences, and suggests opportunities for future use.

Dictionary Definitions

Catalog:  1: list, register  2a: a complete enumeration of items arranged systematically with descriptive details  b: a pamphlet or book that contains such a list  c: material in such a list.

Index:  1: a list arranged usually in alphabetical order of some specified datum (as author, subject, or keyword) a: a list of items (as topics or names) treated in a printed work that gives for each item the page number where it may be found.

Entries in both catalogs and indexes are references to things located elsewhere.

Key Differences

Feature                    Catalog                                     Index

Which items are included   all                                         selected (only key topics and names)

How items are arranged     systematically, i.e., by taxonomic class    alphabetically

Contents of a list item    name or topic; taxonomic class; metadata    name or topic; page number
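
The differences in the table can be sketched as two small record types. This is only an illustration: the names CatalogEntry, IndexEntry, arrange_catalog, and arrange_index are hypothetical and do not come from any existing library, and the catalog ordering assumes classification codes that sort correctly as text (for example, zero-padded numbers).

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One entry per item in the collection: a catalog lists all items."""
    name: str
    taxon: str  # classification code, assumed to sort correctly as text
    metadata: dict[str, str] = field(default_factory=dict)


@dataclass
class IndexEntry:
    """One entry per selected topic or name, pointing at page numbers."""
    name: str
    pages: list[int] = field(default_factory=list)


def arrange_catalog(entries):
    # Systematic arrangement: order by classification code, then by name.
    return sorted(entries, key=lambda e: (e.taxon, e.name))


def arrange_index(entries):
    # Alphabetical arrangement, ignoring case.
    return sorted(entries, key=lambda e: e.name.casefold())
```

A catalog entry carries its class and descriptive details with it, while an index entry carries only a locator.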

Common Uses

A catalog we are all familiar with is the public library card catalog. The contents of a catalog entry differ between fiction and non-fiction books. Entries for both contain the title, author, publisher, publication date, and page count. In addition, the entry for a non-fiction book contains the classification code of its main subject and a list of related subjects.

The libraries I remember use the Dewey Decimal Classification (DDC) as the system of classification for non-fiction works. This system uses a three-digit number from 000 to 999 with optional decimals (for example, 123.45 falls within 123, determinism and indeterminism, in philosophy) to accommodate finer hierarchical distinctions in the subject. These numbers are written on the spine of each book, and the books are shelved in numerical order.
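
As a rough sketch of how such numbers encode a hierarchy and a shelf order, the following assumes spine labels are plain decimal strings such as "123.45". The function names ddc_levels and shelf_order are hypothetical, and real call numbers usually carry extra marks (author codes, for instance) that this ignores.

```python
from decimal import Decimal


def ddc_levels(number: str) -> list[str]:
    """Return the chain of increasingly specific classes for a DDC-style number."""
    whole, _, fraction = number.partition(".")
    whole = whole.zfill(3)
    # Hundreds digit = main class, tens digit = division, units digit = section.
    levels = [whole[0] + "00", whole[:2] + "0", whole]
    # Each further decimal digit narrows the subject one more step.
    for i in range(1, len(fraction) + 1):
        levels.append(f"{whole}.{fraction[:i]}")
    # Drop consecutive repeats, e.g. "100" would otherwise appear three times.
    chain = []
    for level in levels:
        if not chain or chain[-1] != level:
            chain.append(level)
    return chain


def shelf_order(spine_labels):
    # Books are shelved in numerical order of their classification numbers.
    return sorted(spine_labels, key=Decimal)
```

For example, ddc_levels("123.45") yields the chain 100, 120, 123, 123.4, 123.45, which is the path from the broadest class down to the specific subject.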

The Library of Congress uses a different scheme, the Library of Congress Classification (LCC). It is larger and more complex than the DDC, and it has been criticized as being essentially enumerative (a guide to the books actually in the library) rather than a true classification of everything.

The focused collections of content on corporate websites often warrant a classification scheme specific to the scope of the planned content, its audience, and their needs. This is because the goal of such a classification scheme is to minimize the effort needed to find the right unit of content.

An index we are all familiar with is the one at the back of most non-fiction books. These indexes are composed of entries in alphabetical order. The entries have either a page number or a cross-reference to another entry. Entries can have sub-entries to provide supplementary descriptive and/or relational information. Indexes rarely exceed a handful of pages, even for books with many hundreds of pages.
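
A back-of-book index of this kind can be sketched in a few lines. The function build_index and its input format are assumptions made for illustration: the indexer is presumed to have already chosen which topics to include and noted the pages where they appear.

```python
from collections import defaultdict


def build_index(mentions, see=None):
    """Build alphabetized index lines.

    mentions: iterable of (term, page) pairs chosen by the indexer.
    see: optional dict mapping a term to the entry it cross-references.
    """
    pages = defaultdict(set)
    for term, page in mentions:
        pages[term].add(page)  # a set merges repeated mentions of one page

    locators = {
        term: ", ".join(str(p) for p in sorted(found))
        for term, found in pages.items()
    }
    # A cross-reference entry points at another entry instead of a page.
    for term, target in (see or {}).items():
        locators[term] = f"see {target}"

    return [f"{term}, {locators[term]}" for term in sorted(locators, key=str.casefold)]
```

Sub-entries are left out to keep the sketch short; they could be modeled as a nested mapping under each term.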

Another index familiar to users of the internet is the one built by search engines. A search engine's website provides tools to search its index. The search results are a tiny subset of the overall index; each entry contains the item name, an excerpt of the content with the search words highlighted, and its address.
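
The general idea can be illustrated with a toy inverted index. This is a minimal sketch and not how any particular search engine works: the function names, the asterisk highlighting, and the requirement that every query word be present are all assumptions.

```python
import re
from collections import defaultdict


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def build_search_index(pages):
    """pages: dict of address -> (name, content). Returns word -> set of addresses."""
    index = defaultdict(set)
    for address, (_name, content) in pages.items():
        for word in tokenize(content):
            index[word].add(address)
    return index


def search(index, pages, query, width=30):
    """Return name, highlighted excerpt, and address for pages containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    results = []
    for address in sorted(hits):
        name, content = pages[address]
        # Centre the excerpt on the first occurrence of the first query word.
        first = re.search(re.escape(words[0]), content, re.IGNORECASE)
        start = max(first.start() - width, 0) if first else 0
        end = (first.end() if first else 0) + width
        excerpt = content[start:end]
        for w in words:
            # Mark each query word with asterisks as a stand-in for highlighting.
            excerpt = re.sub(rf"\b({re.escape(w)})\b", r"*\1*", excerpt, flags=re.IGNORECASE)
        results.append({"name": name, "excerpt": excerpt, "address": address})
    return results
```

Note that the whole index maps words to addresses, while a result list is only the small slice of it that matches the query.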

Opportunities

I believe the concept of a catalog has yet to be well supported by software, especially for single users and small work groups. The following ideas exploit the catalog concept.

  1. Application of a classification system, like the DDC, to individual pieces of content by authors. For web pages, this might be held in META tags, e.g.,
    <META NAME="taxa" CONTENT="main subject,sub-subject"> for a local classification system
    <META NAME="taxa" CONTENT="DDC,123.45"> for a named classification system, like DDC
    <META NAME="metadata" CONTENT="property1=value,property2=value,property3=value">
  2. An internet-based software tool that lets users browse or search a catalog composed of items referring to web pages by their taxa. This catalog could be built and maintained in a manner similar to that used for indexing by search engines. Such a tool needs a way for users to report incorrectly classified web pages.
  3. A software tool that lets computer users (1) classify electronic files with a taxonomy and metadata and (2) view the file catalog.
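
The first two ideas can be sketched together: read the classification from each page's META tags, then group page addresses by taxon. This is a minimal sketch under stated assumptions. It assumes the tags use HTML's standard name attribute with the proposed values "taxa" and "metadata", which are not an established convention. TaxaParser, classify_page, and build_catalog are hypothetical names; only html.parser comes from the Python standard library.

```python
from html.parser import HTMLParser


class TaxaParser(HTMLParser):
    """Collect the proposed "taxa" and "metadata" META tags from one page."""

    def __init__(self):
        super().__init__()
        self.taxa = []
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...> matches.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content") or ""
        if name == "taxa":
            self.taxa.append(tuple(part.strip() for part in content.split(",")))
        elif name == "metadata":
            for pair in content.split(","):
                key, sep, value = pair.partition("=")
                if sep:
                    self.metadata[key.strip()] = value.strip()


def classify_page(html):
    parser = TaxaParser()
    parser.feed(html)
    return parser.taxa, parser.metadata


def build_catalog(pages):
    """pages: dict of address -> HTML text. Returns taxon -> sorted list of addresses."""
    catalog = {}
    for address, html in pages.items():
        taxa, _metadata = classify_page(html)
        for taxon in taxa:
            catalog.setdefault(taxon, []).append(address)
    # Arrange systematically: by classification rather than alphabetically by title.
    return {taxon: sorted(found) for taxon, found in sorted(catalog.items())}
```

The same mapping from taxon to locations could serve the third idea, with file paths in place of web addresses.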

Revision: 8-6-2009.