CPSC 333: Data Dictionary Definitions for ERDs



This material was covered during lectures on January 24-27, 1997.


A data dictionary for an entity-relationship diagram should include a definition for each attribute of an entity (or of an associative object, weak entity, supertype, or subtype), as well as a definition for each entity, relationship, weak entity, associative object, supertype, and subtype.

Definitions of Attributes

As previously described, each attribute of something in an entity-relationship diagram should have an ``elementary data type.'' That is, either it should have one of the ``standard'' data types (such as integer, real, or string), or the data type should be an ``enumerated set'' - that is, a fixed set of ``literal'' values.

Standard Data Types

The ``kind'' that should be listed for an attribute is ``At.''

If the ``data type'' of an attribute is one of the standard data types, then the name of that standard data type should be listed as the attribute's ``type.''

It would probably be helpful to a ``reader'' of the data dictionary if you identified the entity (or associative object, etc.) that has this as its attribute, as part of the attribute's ``description.'' You might also state here whether this attribute is part of the primary key, and (if the chosen name doesn't make this clear) you could describe what it is ``in the real world'' that this data item represents.

Other information that could be included in the ``description,'' which wouldn't be so easy to include elsewhere, includes

Note, though, that some of the above information is given elsewhere in a specification as well. For example, you will definitely be able to look at the definition of an entity, to check whether this is one of that entity's attributes. If you do include information in more than one place, then you should make sure that duplicated information is kept up-to-date (and consistent), as the specification is changed.

Finally, it might be a good idea to reread the earlier warning about overspecification as you consider what to include in your ``description'' of an attribute.

For example, the definition of ``ID number'' (as it's used in all versions of the Student Information System) is as follows.

Name: ID number
Kind: At
Type: integer
Description: A key attribute of ``Student;'' should be exactly eight digits long
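If you wanted to keep data dictionary entries in a program, one possible representation is sketched below. This is only an illustration: the class name ``Entry,'' the function ``is_valid_id_number,'' and the decision to disallow leading zeros are assumptions made for the example, and are not part of the notation used in these notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One data dictionary entry: name, kind, type, and description."""
    name: str
    kind: str          # "At" for an attribute
    type: str          # a standard data type, such as "integer"
    description: str   # informal text for human readers


def is_valid_id_number(value: int) -> bool:
    """Check the 'exactly eight digits long' remark from the description.

    Assumption: leading zeros are not allowed, so a valid value lies
    between 10000000 and 99999999. The notes do not state this.
    """
    return 10_000_000 <= value <= 99_999_999


# The definition of "ID number" given above, stored as a record.
id_number = Entry(
    name="ID number",
    kind="At",
    type="integer",
    description="A key attribute of Student; should be exactly eight digits long",
)
```

Note that the ``eight digits'' constraint lives only in the informal description, so a check like the one above has to be written by hand; nothing in the ``type'' field expresses it.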

Enumerated Sets

Most of the information given above for attributes with standard data types applies here, too. The primary difference is the way that the ``type'' is defined.

In this case, the entry for ``type'' should list each of the possible values that the attribute can assume. The following symbols should be used, as described:

For example, the attribute ``status'' of ``Student'' in Version One of the Student Information System might have the following definition.

Name: status
Kind: At
Type: [ `registered' | `passed' ]
Description: A non-key attribute of ``Student''
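The ``type'' of an enumerated set, written as in the example above, is regular enough to be read mechanically. The following is a rough sketch of how that might be done; the function name ``parse_enumerated_type'' is hypothetical, and the sketch assumes that literal values never contain the characters used as brackets, bars, or quotes.

```python
def parse_enumerated_type(type_text: str) -> list[str]:
    """Return the literal values listed in an enumerated-set type."""
    text = type_text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("expected an enumerated set of the form [ a | b ]")
    values = []
    for part in text[1:-1].split("|"):
        # Drop surrounding whitespace and the `...' quote marks
        # that enclose each literal value.
        values.append(part.strip().strip("`'"))
    return values


# The type of the attribute "status" from the definition above.
status_values = parse_enumerated_type("[ `registered' | `passed' ]")
```

Here ``status_values'' is the list of the two literals, ``registered'' and ``passed,'' in the order in which they were listed.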

Definitions of Entities

The ``kind'' that should be listed for an entity is ``E.''

The ``type'' definition for an entity should express it as an aggregation of its attributes, and should also show which attributes belong to the primary key. You'll need to use the following additional symbols in the formal definition of the entity's ``type:''

There's no specific information that you should be sure to include in the informal ``description'' of an entity. You might say what it is ``in the problem domain'' that the entity corresponds to, if the name chosen for the entity doesn't already make this clear.

For example, the entity ``Student'' in Version One of the Student Information System might have the following definition.

Name: Student
Kind: E
Type: @ID number + first name + middle initial + last name + status
Description: Corresponds to students who are registered in or who have recently passed the course.
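To see how the ``type'' of an entity can be read as a list of attributes and a primary key, consider the following sketch. It assumes (based on the example above) that ``+'' separates attributes and that ``@'' marks an attribute that belongs to the primary key; the function name ``parse_aggregate_type'' is made up for this illustration.

```python
def parse_aggregate_type(type_text: str) -> tuple[list[str], list[str]]:
    """Split 'a + b + c' into (all attributes, primary-key attributes)."""
    attributes = []
    primary_key = []
    for part in type_text.split("+"):
        name = part.strip()
        if name.startswith("@"):
            # "@" marks an attribute that belongs to the primary key.
            name = name[1:].strip()
            primary_key.append(name)
        attributes.append(name)
    return attributes, primary_key


# The type of the entity "Student" from the definition above.
attributes, primary_key = parse_aggregate_type(
    "@ID number + first name + middle initial + last name + status"
)
```

For ``Student,'' this produces five attributes, of which only ``ID number'' is in the primary key.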

Definitions of Relationships

The ``kind'' that should be listed for a relationship is ``R,'' together with the name of the relationship type for the relationship that is being defined. (Yes, the word ``type'' is being used here in two different ways - Sorry!)

The ``type'' definition for a relationship will resemble the ``type'' definition for an Entity: It will list the ``attributes'' corresponding to the columns of a data table for the relationship. That is, the attributes listed will be the elements of the primary keys of the entities that the relationship connects together. The ``@'' symbol will be used to identify the attributes in the relationship's primary key, just as it's used in the ``type definition'' for an entity.

The ``description'' of a relationship can be used to name the entities that the relationship connects together and to say more clearly (than is possible using the ``kind'' information) how many instances of the one entity any given instance of the other entity can be connected to.

For example, the relationship ``is registered in'' shown in the ERD for Version Two of the Student Information System could be given the following data dictionary definition.

Name: is registered in
Kind: R - Mc:Mc
Type: @ID number + @discipline code + @course number
Description: Relationship between ``Student'' and ``Course.'' Each student can be registered in zero or more courses, and each course can have zero or more students registered in it.
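The way that the ``type'' of this relationship is built from the primary keys of the entities it connects can be sketched as follows. The sketch covers only the ``Mc:Mc'' case, and it assumes (as in the example above) that every listed attribute then belongs to the relationship's primary key; the function name ``many_to_many_type'' is hypothetical.

```python
def many_to_many_type(*entity_keys: list[str]) -> str:
    """Build the type of an Mc:Mc relationship from the primary keys
    of the entities that it connects together."""
    # Assumption: for Mc:Mc, every key attribute of every connected
    # entity is part of the relationship's own primary key.
    parts = [f"@{attribute}" for key in entity_keys for attribute in key]
    return " + ".join(parts)


student_key = ["ID number"]
course_key = ["discipline code", "course number"]
registered_type = many_to_many_type(student_key, course_key)
```

The resulting string matches the ``Type'' line in the definition of ``is registered in'' given above.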

An Exceptional Case Involving Definitions of Relationships

Problems arise when a relationship connects an instance of an entity to another instance (or instances) of the same entity.

Clearly, values for the key attributes of more than one instance of this entity must be included in an instance of this kind of relationship, in order to model this kind of connection. Therefore, it will be necessary to choose a new name for each ``copy'' of each attribute that must be included, and to include all these in the definition of the relationship. In order for the data dictionary to be complete, the new names of the copies must be defined, as well.

Consider, for example, something like the relationship ``is a prerequisite of'' that you might have introduced in order to answer Question 3(b) on Lab Exercise #1. This would have relationship type ``Mc:Mc'' and would connect instances of ``Course'' to other instances of ``Course.'' If this relationship were being added (for example) to the ERD for Version Two of the Student Information System, then you might add the following definitions to the data dictionary.

Name: is a prerequisite of
Kind: R - Mc:Mc
Type: @prerequisite discipline code + @prerequisite course number + @postrequisite discipline code + @postrequisite course number
Description: Relationship between ``Course'' and ``Course;'' each course can be a prerequisite for zero or more other courses, and each course can have zero or more other courses as prerequisites

Name: postrequisite course number
Kind: Alias
Type: course number
Description: Used to define the relationship ``is a prerequisite of;'' this is the course number for the ``senior'' course that has another as a prerequisite

Name: postrequisite discipline code
Kind: Alias
Type: discipline code
Description: Used to define the relationship ``is a prerequisite of;'' this is the discipline code for the ``senior'' course that has another as a prerequisite

Name: prerequisite course number
Kind: Alias
Type: course number
Description: Used to define the relationship ``is a prerequisite of;'' this is the course number for the ``junior'' course that is a prerequisite for another

Name: prerequisite discipline code
Kind: Alias
Type: discipline code
Description: Used to define the relationship ``is a prerequisite of;'' this is the discipline code for the ``junior'' course that is a prerequisite for another

Aliases

Note the ``kind'' and ``type'' used in the last four of the above definitions. As these show, we'll use the kind ``Alias'' when we're introducing a new name for something that's already been defined, and we'll give the ``old'' name for the data item as the ``type'' of the copy.
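Since the ``type'' of an alias is just the name of another data item, anyone (or any program) reading the dictionary can follow an alias back to the item it renames. A minimal sketch of that lookup follows. The dictionary layout and the function name ``resolve_alias'' are assumptions made for the example, and so is the type ``integer'' shown for ``course number,'' which these notes do not define.

```python
def resolve_alias(name: str, dictionary: dict[str, dict[str, str]]) -> str:
    """Follow 'Alias' entries until a non-alias entry is reached."""
    seen = set()
    while dictionary[name]["kind"] == "Alias":
        if name in seen:
            raise ValueError(f"alias cycle involving {name!r}")
        seen.add(name)
        # The "type" of an alias is the old name of the data item.
        name = dictionary[name]["type"]
    return name


dictionary = {
    # Assumption: "integer" is only a placeholder type for this sketch.
    "course number": {"kind": "At", "type": "integer"},
    "prerequisite course number": {"kind": "Alias", "type": "course number"},
    "postrequisite course number": {"kind": "Alias", "type": "course number"},
}

original = resolve_alias("prerequisite course number", dictionary)
```

Both aliases resolve to ``course number,'' which is why the new names can be used in the relationship without any ambiguity about what kind of values they hold.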

Definitions of Weak Entities

The ``kind'' that should be listed for a weak entity is ``WE.''

The ``type'' definition for a weak entity should resemble the type definition for a (regular) entity. You'll need to include the attributes in the primary key of the entity that this weak entity ``depends on,'' in order to list a complete primary key in the type definition.

It would be helpful to say which entity this weak entity ``depends on,'' in the informal description for it. Note again, though, that this duplicates information you could discover by looking at the ERD, so you'll need to remember to keep this up-to-date if you do include it here.

For example, the weak entity ``Course Section'' in the entity-relationship diagram for Version Three of the Student Information System might have the following definition.

Name: Course Section
Kind: WE
Type: @discipline code + @course number + @term + @year + @section number + instructor + start time + duration + location
Description: Depends on the entity ``Course''
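The rule for the primary key of a weak entity can be sketched in the same style: the key attributes of the entity it depends on come first, followed by the weak entity's own key attributes, and then the remaining attributes. The function name ``weak_entity_type'' and the way the attributes are split into three lists are assumptions made for this illustration.

```python
def weak_entity_type(
    owner_key: list[str],
    partial_key: list[str],
    other_attributes: list[str],
) -> str:
    """Build a weak entity's type from the owner's primary key,
    the weak entity's own key attributes, and its other attributes."""
    key_parts = [f"@{attribute}" for attribute in owner_key + partial_key]
    return " + ".join(key_parts + other_attributes)


# "Course Section" depends on "Course", whose primary key is
# discipline code + course number.
course_key = ["discipline code", "course number"]
course_section_type = weak_entity_type(
    course_key,
    ["term", "year", "section number"],
    ["instructor", "start time", "duration", "location"],
)
```

The result is the same as the ``Type'' line in the definition of ``Course Section'' given above.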

Definitions of Associative Objects

The ``kind'' that should be listed for an associative object is ``AO,'' followed by the name of the type of associative object that is being defined (again, apologies for the use of the word ``type'' in two different ways, in these notes!).

The ``type definition'' for an associative object should resemble the type definition for an entity, in that it should list the attributes whose values would be defined in a ``data table'' for it, and an ``@'' should appear before each attribute that is included in the associative object's primary key.

You could include the same kind of extra helpful information in the description of an associative object, as you could in the description of a relationship.

For example, the associative object ``Completion'' in the entity-relationship diagram for Version Four of the Student Information System might have the following definition.

Name: Completion
Kind: AO - Mc:Mc Single
Type: @ID number + @discipline code + @course number + @term + @year + @section number + grade
Description: Each student can have completed work in zero or more course sections, and each course section can have zero or more students who have completed work in it.

Definitions of Supertypes and Subtypes

As far as their data dictionary definitions are concerned, we'll (generally) consider supertypes and subtypes just to be special cases of ``entities.'' Thus, the ``kind'' that is listed for a supertype or a subtype should generally just be ``E,'' the same as for an entity.

However, it's been mentioned already that some supertypes can have instances of their own (that aren't also instances of one of the supertype's subtypes), and that some other supertypes cannot. When we define this latter kind of supertype - one without instances of its own - then we'll add ``(Virtual)'' at the end of the ``kind,'' to make it clear that there wouldn't be a separate data table for this, in a set of data tables corresponding to the system's ERD.

The ``type definition'' of a supertype (that isn't also a subtype of something else) will look exactly like a type definition for an entity.

The ``type definition'' of a subtype will look a bit different: It will list the name of the supertype for which this is a subtype, along with any additional attributes that weren't inherited from the supertype. You won't see the symbol ``@'' anywhere in the ``type definition'' for the subtype, because it has the same primary key as its supertype, so that all the attributes in the primary key have been inherited, and aren't shown again here.

Helpful information to include in the ``description'' for a supertype would be the fact that it is one (since this might not be clear from the rest of the data dictionary definition), and a list of its subtypes. Helpful information to include in the description of a subtype would, again, include ``the fact that it is one,'' and the name of its supertype (in case the name of an entity looks just like the name of an attribute, so that it isn't easy to recognize the ``supertype'' in the type definition that's been given).

Note (yet again), though, that this repeats information that can be viewed on the ERD, so that you'll need to be sure that you update this information (if you include it here) whenever the ERD is changed.

For example, in the entity-relationship diagram for Version Five of the Student Information System, the supertype ``Course'' and its subtypes might have the following definitions.

Name: Course
Kind: E (Virtual)
Type: @discipline code + @course number + course title
Description: Supertype, with subtypes ``Graded Course'' and ``Pass/Fail Course''

Name: Graded Course
Kind: E
Type: Course + number of credits
Description: Subtype of ``Course''

Name: Pass/Fail Course
Kind: E
Type: Course
Description: Subtype of ``Course''
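Because a subtype's ``type'' names its supertype instead of repeating the inherited attributes, the full list of attributes has to be recovered by expanding that name. The following sketch shows one way this could be done. The function name ``expand_type'' and the use of a plain mapping from names to ``type'' strings are assumptions made for the example, and the sketch assumes that no attribute shares its name with an entity.

```python
def expand_type(name: str, types: dict[str, str]) -> list[str]:
    """Return the full attribute list of an entity, replacing the name
    of a supertype by the attributes inherited from it."""
    expanded = []
    for part in types[name].split("+"):
        part = part.strip()
        if part in types:
            # The part names another entity (the supertype), so its
            # attributes, including "@" key attributes, are inherited.
            expanded.extend(expand_type(part, types))
        else:
            expanded.append(part)
    return expanded


types = {
    "Course": "@discipline code + @course number + course title",
    "Graded Course": "Course + number of credits",
    "Pass/Fail Course": "Course",
}

graded_attributes = expand_type("Graded Course", types)
```

For ``Graded Course,'' the expansion lists the three attributes of ``Course'' (with the two key attributes still marked by ``@'') followed by ``number of credits.''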

Examples

Complete data dictionaries for the various versions of the Student Information System are available, or will be made available shortly.



Department of Computer Science
University of Calgary

Office: (403) 220-5073
Fax: (403) 284-4707

eberly@cpsc.ucalgary.ca