CPSC 333: Supertypes and Subtypes



This material was covered during lectures on January 20, 1997.


Student Information System, Version Five

Now, consider a different kind of extension of Version Two of the Student Information System. Unlike versions three or four, this new extension will not keep track of sections of courses. However, it will keep track of two different kinds of courses - ``graded courses,'' and ``pass/fail courses.'' Each course that the system knows about is either a graded course, or a pass/fail course, but never both at once.

Both kinds of courses have discipline codes, course numbers, and course titles, as before. For each graded course, the system also keeps track of the number of credits that the course is ``worth'' to a student. On the other hand, this information is not maintained for pass/fail courses.

A student receives a grade - one of ``A,'' ``A-,'' ``B+,'' and so on - after completion of any graded course; the student is initially awarded a ``grade'' of ``NA'' on registration, and this is changed to the ``real grade'' earned, once this is available.

At most one ``grade'' is recorded for each student in each graded course at a time (so, if a student registers in a graded course again after having passed it, then the grade that the student had earned is ``forgotten,'' and can't be recovered by the system).
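The ``at most one grade at a time'' rule can be sketched as a mapping keyed by the pair (student, graded course). This is only an illustration: the names `GradeBook`, `register`, `record_grade`, and `grade_of`, and the student identifier "s1", are assumptions made for this sketch and are not part of the system's specification.

```python
class GradeBook:
    """Keeps at most one grade per (student, graded course) pair."""

    def __init__(self):
        # Maps (student_id, course_key) -> grade, where course_key is
        # the pair (discipline code, course number).
        self._grades = {}

    def register(self, student_id, course_key):
        # Registration (or re-registration) sets the grade to "NA".
        # Any grade earned earlier is overwritten and cannot be recovered.
        self._grades[(student_id, course_key)] = "NA"

    def record_grade(self, student_id, course_key, grade):
        if (student_id, course_key) not in self._grades:
            raise KeyError("student is not registered in this course")
        self._grades[(student_id, course_key)] = grade

    def grade_of(self, student_id, course_key):
        return self._grades[(student_id, course_key)]


book = GradeBook()
cpsc333 = ("CPSC", 333)
book.register("s1", cpsc333)
book.record_grade("s1", cpsc333, "B+")
book.register("s1", cpsc333)  # re-registration "forgets" the earlier B+
```

Because the mapping has exactly one slot per pair, the earlier grade is simply replaced when the student registers again.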

On the other hand, ``pass/fail courses'' are a lot like ``courses'' are in Version Two of the system: A student can be registered in a pass/fail course and the student can have passed it (but not both); a student can register in and then withdraw from a pass/fail course as many times as the student wishes to, until the student passes it. A student can't register in a pass/fail course after having passed it.
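The rules for pass/fail courses amount to a small state machine for each student and course. The sketch below is one possible reading of those rules; the names `Status` and `Participation` (the latter borrowed from the associative object suggested later in these notes) and the method names are hypothetical, and it assumes that a student can only pass a course while registered in it.

```python
from enum import Enum


class Status(Enum):
    NOT_REGISTERED = "not registered"
    REGISTERED = "registered"
    PASSED = "passed"


class Participation:
    """Tracks one student's status in one pass/fail course."""

    def __init__(self):
        self.status = Status.NOT_REGISTERED

    def register(self):
        # A student cannot register twice, or register after passing.
        if self.status is not Status.NOT_REGISTERED:
            raise ValueError("cannot register while " + self.status.value)
        self.status = Status.REGISTERED

    def withdraw(self):
        if self.status is not Status.REGISTERED:
            raise ValueError("cannot withdraw while " + self.status.value)
        self.status = Status.NOT_REGISTERED

    def pass_course(self):
        # Assumption: passing is only possible while registered.
        if self.status is not Status.REGISTERED:
            raise ValueError("cannot pass while " + self.status.value)
        self.status = Status.PASSED
```

A student may cycle between ``not registered'' and ``registered'' any number of times, but ``passed'' is a final state.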

A more complete description of this system is also available.

Description of an Unsatisfactory ERD

One way to model this would be to have two entities for the two different kinds of courses - an entity with name ``Graded Course'' and another with name ``Pass/Fail Course.'' Both would have attributes ``discipline code,'' ``course number,'' and ``course title,'' and both would have a primary key consisting of ``discipline code'' and ``course number.'' The ``type'' for the attribute ``discipline code'' of ``Graded Course'' would be the same as the ``type'' of the attribute ``discipline code'' for ``Pass/Fail Course.'' The same thing would be true for (the ``types'' of) the attribute ``course number,'' and also for ``course title.''

As well, ``Graded Course'' would have one additional attribute - ``number of credits.''

We could use an associative object between ``Student'' and ``Graded Course'' (perhaps called ``Completion''), with an attribute ``grade,'' to keep track of the grades that students have received (and registration information) for graded courses. We could use another associative object between ``Student'' and ``Pass/Fail Course'' (perhaps called ``Participation''), with an attribute ``status,'' to keep track of students' registrations in, and passing of, pass/fail courses.

Unfortunately, this solution would require that quite a bit of information about the names and types of attributes be maintained in two places - once, in a description of the attributes for ``Graded Course,'' and again, in a description of the attributes for ``Pass/Fail Course.'' In particular, the descriptions of ``discipline code,'' ``course number,'' and ``course title'' would probably need to be duplicated.

If we really did want to make sure that the types of ``both'' attributes with the same name are always the same, then this might not seem too serious when the system is first specified, but it might become a problem as the system is used and (more importantly) changed - for example, it would be quite easy for one of the descriptions of an attribute's type to be updated, without having the other description updated at the same time, or in precisely the same way.

Thus, it would be preferable to (somehow) make it clear that it isn't a ``coincidence'' that various attributes of ``Graded Course'' and ``Pass/Fail Course'' happen to have the same name and type, and it would be preferable to have information about attributes (and relationships) that are ``common to'' several entities in an ERD maintained in one place, so that there is no risk that ``multiple copies'' of information can become inconsistent.
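The duplication problem described in this section can be made concrete with a small sketch. The two classes below stand in for the two unrelated entities; the Python types chosen for the attributes (`str`, `int`) and the helper `duplicated_attributes` are assumptions made for illustration only.

```python
from dataclasses import dataclass, fields


@dataclass
class GradedCourse:
    discipline_code: str
    course_number: int
    course_title: str
    number_of_credits: int


@dataclass
class PassFailCourse:
    # These three declarations repeat the ones in GradedCourse and
    # must be kept in sync with them by hand.
    discipline_code: str
    course_number: int
    course_title: str


def duplicated_attributes(a, b):
    """Map each attribute name declared in both a and b to its two types."""
    types_a = {f.name: f.type for f in fields(a)}
    types_b = {f.name: f.type for f in fields(b)}
    return {name: (types_a[name], types_b[name])
            for name in types_a if name in types_b}


dup = duplicated_attributes(GradedCourse, PassFailCourse)
```

Nothing in these declarations records that the repeated names are meant to agree, so changing `course_number` to another type in only one class would go unnoticed unless a check like this were run.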

Definition of Supertype and Subtype

Suppose now that A and B are entities in an ERD. If A is a supertype of B, then B inherits all of the attributes and relationships of A, and B has the same primary key as A.

If A is a supertype of B, then B is a subtype of A.

Thus, in a sense, every instance of B is also an instance of A. We can think of B as being a ``specialization'' of A, and we can think of A as being a ``generalization'' of B.

In general, a supertype will have two or more subtypes. (Either that, or it will be possible for the supertype to have ``instances of its own,'' which aren't also instances of any subtype.) However, a subtype is (directly) a subtype of only one supertype. Thus, the use of supertypes and subtypes in ERDs resembles the use of ``single inheritance'' between classes, in object-oriented programming. On the other hand, an entity can be a ``subtype'' of one entity and a ``supertype'' of another one, at the same time, so that an entity can be an ``indirect'' subtype of more than one entity at once (its ``direct'' supertype, the supertype of its supertype, and so on).
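The analogy with single inheritance can be sketched with three classes. The class names and the helpers `direct_supertype` and `all_supertypes` are hypothetical; the sketch only illustrates the ideas of ``direct'' and ``indirect'' supertypes described above.

```python
class A:
    """Supertype of B."""


class B(A):
    """Subtype of A and, at the same time, supertype of C."""


class C(B):
    """Direct subtype of B, indirect subtype of A."""


def direct_supertype(entity):
    """Return the single direct supertype of an entity class, or None."""
    bases = [base for base in entity.__bases__ if base is not object]
    if len(bases) > 1:
        raise TypeError("an entity may have only one direct supertype")
    return bases[0] if bases else None


def all_supertypes(entity):
    """Return the direct supertype, then its supertype, and so on."""
    chain = []
    parent = direct_supertype(entity)
    while parent is not None:
        chain.append(parent)
        parent = direct_supertype(parent)
    return chain
```

Here `all_supertypes(C)` walks from B up to A, and every instance of C is also an instance of both B and A.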

The Entity-Relationship Diagram for Version Five of the Student Information System

In order to show that ``Pass/Fail Course'' and ``Graded Course'' have common attributes, we will create another entity called ``Course'' and make this a supertype of both ``Pass/Fail Course'' and ``Graded Course.'' On the system's ERD, the connection of these three entities will be shown in the following way.

Picture of Supertype and Subtypes

A plain text approximation of this picture is also available.

We would consider all the attributes that are shared by or common to all subtypes to be attributes of the supertype as well, so that the attributes of ``Course'' would be ``discipline code,'' ``course number,'' and ``course title,'' and the primary key for ``Course'' would include ``discipline code'' and ``course number.'' If there were any relationships shared by all the subtypes, then these would be listed as relationships of the supertype, instead.

The only attributes that are ``explicitly'' listed for each subtype are the ones that are new - that is, the ones that aren't ``inherited'' from the supertype. Thus, the only attribute explicitly listed for ``Graded Course'' is ``number of credits,'' and there are no attributes explicitly listed for ``Pass/Fail Course,'' at all.

The subtypes will have the same primary key as their supertype.

Similarly, the only relationships (or associative objects) that are explicitly shown as connected to subtypes are the ones that haven't been shown already as being connected to (and ``inherited from'') the supertype. In this case, each of the subtypes is connected to a (different) associative object that the other subtype (and the supertype) isn't connected to.
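The way attributes and the primary key are divided between ``Course'' and its subtypes can be sketched with class inheritance. This is an analogy, not a prescribed implementation: the attribute types, the `primary_key` property, and the helper `own_attributes` are assumptions introduced for this example.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Course:
    # Attributes common to all subtypes are declared once, here.
    discipline_code: str
    course_number: int
    course_title: str

    @property
    def primary_key(self):
        # Subtypes inherit this key unchanged.
        return (self.discipline_code, self.course_number)


@dataclass(frozen=True)
class GradedCourse(Course):
    # The only attribute listed explicitly for this subtype.
    number_of_credits: int


@dataclass(frozen=True)
class PassFailCourse(Course):
    # No attributes of its own; everything is inherited.
    pass


def own_attributes(subtype, supertype):
    """Names of attributes the subtype adds to those of its supertype."""
    inherited = {f.name for f in fields(supertype)}
    return [f.name for f in fields(subtype) if f.name not in inherited]
```

With these declarations, `own_attributes(GradedCourse, Course)` contains only `number_of_credits`, and `own_attributes(PassFailCourse, Course)` is empty, matching the description above.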

Now, the complete ERD for this system is as follows.

Picture of Fifth ERD

This is too complicated to give a readable (and easily produced) ``ASCII approximation'' of it. However, a short description of it is available, instead.

Two ``Types'' of Supertypes

In some cases, a ``supertype'' can have instances of its own that aren't instances of any of the supertype's subtypes. In this case, one would (probably) need to use a data table to list the instances of the supertype (alone), as well as data tables for each of the subtypes. We wouldn't generally want to use the same table for all these, because the subtypes would generally have more (and different) attributes than the supertype does.

However, it is also possible for a supertype to be a kind of ``virtual entity,'' in the sense that it has no instances of its own - that is, every ``instance'' of the supertype is also an instance of one of its subtypes. In this case, we wouldn't have a data table for the supertype (because we know that it would always be empty).

Note that a supertype of the second kind (that is, one that is a ``virtual entity'') must have at least two subtypes. Otherwise, there would be no reason for this ``supertype'' to exist!

Of course, you wouldn't list any given instance more than once, even when you have a supertype of the first kind, so that there is a data table for the supertype as well as for the subtypes.

Note that ``Course'' is the ``second'' kind of supertype - a ``virtual entity'' - in Version Five of the Student Information System, as described above.
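A ``virtual entity'' behaves much like an abstract class: it can't be instantiated directly, and it needs no data table of its own. The sketch below assumes one in-memory list per subtype as a stand-in for a data table; the method `kind`, the `tables` dictionary, and the `insert` function are hypothetical names.

```python
from abc import ABC, abstractmethod


class Course(ABC):
    """Virtual entity: every course is an instance of some subtype."""

    def __init__(self, discipline_code, course_number, course_title):
        self.discipline_code = discipline_code
        self.course_number = course_number
        self.course_title = course_title

    @abstractmethod
    def kind(self):
        """Name of the concrete kind of course."""


class GradedCourse(Course):
    def __init__(self, discipline_code, course_number, course_title,
                 number_of_credits):
        super().__init__(discipline_code, course_number, course_title)
        self.number_of_credits = number_of_credits

    def kind(self):
        return "graded"


class PassFailCourse(Course):
    def kind(self):
        return "pass/fail"


# One "data table" per subtype, and none for the virtual supertype.
tables = {GradedCourse: [], PassFailCourse: []}


def insert(course):
    # Each instance is listed exactly once, in the table for its own type.
    tables[type(course)].append(course)
```

Attempting to create a plain `Course` raises a `TypeError`, which mirrors the fact that the supertype's table would always be empty. For a supertype of the first kind, one would drop the abstract method and add a third table for the supertype's own instances.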

Placement of Attributes

At some point, you will need to decide whether something should be an attribute of a supertype, or whether it should be an attribute of one or more subtypes.

You can't make something an attribute of a supertype unless it is supposed to be inherited by all of the subtypes, and unless it is supposed to have the same type in every case. That is, a subtype can't ``choose not to inherit'' something from its supertype.

However, you should choose to make something an attribute of the supertype whenever possible - that is, whenever the above rule wouldn't be violated. This is consistent with the motivation given for supertypes, above - that is, we use them to avoid having to repeat information in the ERD (or its description) needlessly.
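The placement rule can be expressed as a small procedure: an attribute moves to the supertype exactly when every subtype has it with the same type. The sketch below is only one way to state that rule; the function name `split_attributes` and the type names (``string,'' ``integer'') are assumptions made for illustration.

```python
def split_attributes(subtypes):
    """Separate shared attributes from subtype-specific ones.

    subtypes maps each subtype name to a dict of attribute name -> type
    name. Returns (supertype_attributes, remaining_attributes_by_subtype).
    """
    if not subtypes:
        return {}, {}
    attribute_maps = list(subtypes.values())
    first = attribute_maps[0]
    # An attribute belongs to the supertype only if every subtype has it
    # and all of them agree on its type.
    common = {name: type_name for name, type_name in first.items()
              if all(m.get(name) == type_name for m in attribute_maps)}
    remaining = {subtype: {name: type_name
                           for name, type_name in attributes.items()
                           if name not in common}
                 for subtype, attributes in subtypes.items()}
    return common, remaining


common, remaining = split_attributes({
    "Graded Course": {
        "discipline code": "string",
        "course number": "integer",
        "course title": "string",
        "number of credits": "integer",
    },
    "Pass/Fail Course": {
        "discipline code": "string",
        "course number": "integer",
        "course title": "string",
    },
})
```

For Version Five, this places ``discipline code,'' ``course number,'' and ``course title'' on the supertype and leaves only ``number of credits'' on ``Graded Course.''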



Department of Computer Science
University of Calgary

Office: (403) 220-5073
Fax: (403) 284-4707

eberly@cpsc.ucalgary.ca