LIS 558 - Database design problems

As an example of a design that creates problems, suppose that we have a bibliographic database of scientific articles and that, for each article, we include the names and institutional affiliations of the authors.


Redundancy is unnecessary repetition of data.

If we include the institutional affiliation with every article written by a given author, this seems redundant. We should be able to store the institutional affiliation information in one place.
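As a minimal sketch of the redundancy described above, the following Python fragment counts how often the same author and affiliation pair is stored. The record layout, field names, and sample values are all made up for illustration.

```python
from collections import Counter

# Hypothetical flat records: one per article, with the affiliation repeated each time.
articles = [
    {"title": "Article A", "author": "Smith, J.", "affiliation": "University X"},
    {"title": "Article B", "author": "Smith, J.", "affiliation": "University X"},
    {"title": "Article C", "author": "Lee, K.", "affiliation": "Institute Y"},
]

# Count how many times each (author, affiliation) pair is stored.
copies = Counter((r["author"], r["affiliation"]) for r in articles)

# Any pair stored more than once is redundant: one copy would be enough.
redundant = {pair: n for pair, n in copies.items() if n > 1}
print(redundant)
```

Here the affiliation of the first author is stored twice, once for each article.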


Update anomalies

When authors change their institutional affiliations, should all the relevant bibliographic records be updated? If so, how can we be sure of catching all the relevant records? The form of an author's name can vary slightly from one article to another, so matching on the name alone may miss some records.
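The update problem can be illustrated with a small sketch using Python's built-in sqlite3 module. The single-table layout, the column names, and the sample data are assumptions chosen for illustration, not a recommended design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flat table: the affiliation is stored with every article.
conn.execute("CREATE TABLE article (title TEXT, author TEXT, affiliation TEXT)")
conn.executemany(
    "INSERT INTO article VALUES (?, ?, ?)",
    [
        ("Article A", "Smith, J.", "University X"),
        ("Article B", "Smith, John", "University X"),  # same person, variant name form
    ],
)

# Record the author's move to a new institution, matching on one name form.
conn.execute(
    "UPDATE article SET affiliation = ? WHERE author = ?",
    ("Institute Y", "Smith, J."),
)

# The row with the variant name was missed, so the two records now disagree.
rows = conn.execute(
    "SELECT author, affiliation FROM article ORDER BY title"
).fetchall()
print(rows)
```

After the update, one record shows the new affiliation and the other still shows the old one.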

Insertion anomalies

Suppose we want to enter information on the institutional affiliation of a scientist who has not written any of the articles in our database. Do we create a dummy article? Do we try to find an article written by this person to include, even if the topic of the article does not fall within the scope of the database?
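A sketch of the insertion problem, again with sqlite3. It assumes, for illustration only, that the flat article table requires a title for every row; the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flat table in which every row must describe an article.
conn.execute(
    "CREATE TABLE article ("
    " title TEXT NOT NULL,"
    " author TEXT NOT NULL,"
    " affiliation TEXT)"
)

# Try to record a scientist's affiliation without any article to attach it to.
try:
    conn.execute(
        "INSERT INTO article (title, author, affiliation) VALUES (?, ?, ?)",
        (None, "Lee, K.", "Institute Y"),
    )
    inserted = True
except sqlite3.IntegrityError:
    # The design has no place for an author who has no article.
    inserted = False
print(inserted)
```

The only ways around the rejected insert are the ones questioned above: a dummy article, or an article that does not really belong in the database.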

Deletion anomalies

If, when weeding our database, we happen to delete all articles by a particular scientist, will we lose the information on the scientist's institutional affiliation?
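The deletion problem can be sketched the same way. The weeding rule used here (removing articles published before 1990), the table layout, and the sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flat table: the affiliation lives only in the article rows.
conn.execute(
    "CREATE TABLE article (title TEXT, year INTEGER, author TEXT, affiliation TEXT)"
)
conn.executemany(
    "INSERT INTO article VALUES (?, ?, ?, ?)",
    [
        ("Article A", 1985, "Lee, K.", "Institute Y"),
        ("Article B", 1988, "Lee, K.", "Institute Y"),
        ("Article C", 1999, "Smith, J.", "University X"),
    ],
)

# Weed the database: remove older articles.
conn.execute("DELETE FROM article WHERE year < 1990")

# Every article by this scientist was weeded, so the affiliation went with them.
found = conn.execute(
    "SELECT affiliation FROM article WHERE author = ?", ("Lee, K.",)
).fetchone()
print(found)
```

The query finds no row at all for the weeded author, even though nobody intended to discard the affiliation information.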

Multi-valued problems

Scientific articles often have more than one author. What do we do with additional authors?

If we add a bibliographic record for each author, repeating the same other information (title, source, etc.) about the article, this is redundant and creates other problems as well.

If we add fields for second author, third author, and so on, how many fields do we add? Field space will be wasted for articles with only one or two authors. Searching by author name or creating a list of authors also becomes more complicated.

If we include all the authors' names in a single field, how will we search for a single author's name or create a list of authors? How will we match the right institution to the right author?
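A short sketch of the single-field approach shows why searching and matching become awkward. The semicolon delimiter, the field names, and the sample records are assumptions for illustration.

```python
# Hypothetical records with all authors packed into one field.
records = [
    {"title": "Article A", "authors": "Smithson, J.; Lee, K.",
     "affiliations": "University X; Institute Y"},
    {"title": "Article B", "authors": "Smith, J.",
     "affiliations": "University X"},
]

# A naive substring search for "Smith" also matches "Smithson".
naive_hits = [r["title"] for r in records if "Smith" in r["authors"]]

def split_field(value):
    """Split a multi-valued field on the assumed ';' delimiter."""
    return [part.strip() for part in value.split(";")]

# A reliable search has to split the field before comparing names.
exact_hits = [r["title"] for r in records if "Smith, J." in split_field(r["authors"])]

# Matching institutions to authors depends entirely on position in each field.
pairs = list(zip(split_field(records[0]["authors"]),
                 split_field(records[0]["affiliations"])))
print(naive_hits, exact_hits, pairs)
```

The positional pairing only works if both fields always list their values in the same order and with the same number of entries, which nothing in this design enforces.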

Various design techniques, including entity-relationship modeling and normalization, can be used to overcome certain problems arising from poor database design.
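As one possible illustration of where such techniques might lead, the sketch below splits the data into separate author, article, and authorship tables. The table and column names are hypothetical, and the design assumes, as a simplification, that each author has a single current affiliation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (
    author_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    affiliation TEXT
);
CREATE TABLE article (
    article_id INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
-- One row per (article, author) pair, so an article can have any number of authors.
CREATE TABLE authorship (
    article_id INTEGER NOT NULL REFERENCES article(article_id),
    author_id INTEGER NOT NULL REFERENCES author(author_id),
    PRIMARY KEY (article_id, author_id)
);
""")
conn.executemany("INSERT INTO author VALUES (?, ?, ?)", [
    (1, "Smith, J.", "University X"),
    (2, "Lee, K.", "Institute Y"),
    (3, "Garcia, M.", "College Z"),  # no articles yet, and no dummy article needed
])
conn.executemany("INSERT INTO article VALUES (?, ?)",
                 [(1, "Article A"), (2, "Article B")])
conn.executemany("INSERT INTO authorship VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1)])

# The affiliation is stored once, so a single update covers every article.
conn.execute("UPDATE author SET affiliation = ? WHERE author_id = ?",
             ("Institute Y", 1))

rows = conn.execute("""
    SELECT article.title, author.name, author.affiliation
    FROM authorship
    JOIN article ON article.article_id = authorship.article_id
    JOIN author ON author.author_id = authorship.author_id
    ORDER BY article.title, author.name
""").fetchall()
print(rows)
```

In this layout, deleting articles leaves the author rows untouched, and an author with no articles can still be recorded.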

Last updated July 5, 2001.
This page maintained by Prof. Tim Craven
Faculty of Information and Media Studies
University of Western Ontario,
London, Ontario
Canada, N6A 5B7