CPSC 333 --- Lecture 6 --- Friday, January 19, 1996

Creation of Entity Relationship Diagrams (Continued from Lecture #5)

3) List *transitive verbs or verb phrases* in the problem statement.
   If a transitive verb (or verb phrase) connects two or more things
   that were identified as *entities* in steps 1--2, taking them as
   its *subject* and its *object(s)*, and the system must remember
   the information corresponding to this verb, then the transitive
   verb is the name of a *relationship*.

   Again, avoid synonyms --- take care not to include the same
   "relationship" more than once under different names.

4) Match *attributes* (found, and rejected as entities, in
   steps 1--2) to *entities* or, if appropriate, to *relationships*.
   Any relationship that is given one or more attributes in this
   step becomes an *associative object* (instead of a regular
   "relationship").

   Choose a *primary key* for each entity. You might discover that
   an entity is really a *weak entity* at this point.
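
The results of steps 3--4 can be recorded in a small data structure.
The following is a minimal sketch only: the class names, field names,
and the Student/Course/"takes" example are all hypothetical and are
not taken from Handout 2.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    attributes: list[str] = field(default_factory=list)
    primary_key: tuple[str, ...] = ()   # chosen in step 4
    weak: bool = False                  # set if it turns out to be a weak entity


@dataclass
class Relationship:
    name: str                           # the transitive verb from step 3
    connects: tuple[str, ...]           # names of the entities it joins
    attributes: list[str] = field(default_factory=list)

    @property
    def is_associative_object(self) -> bool:
        # Step 4: a relationship that is given one or more attributes
        # becomes an associative object.
        return bool(self.attributes)


# Hypothetical example data.
student = Entity("Student", ["student_id", "name"], primary_key=("student_id",))
course = Entity("Course", ["course_code", "title"], primary_key=("course_code",))
takes = Relationship("takes", ("Student", "Course"))

print(takes.is_associative_object)   # False
takes.attributes.append("grade")     # matching an attribute to the relationship
print(takes.is_associative_object)   # True
```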

Note: Pressman introduces "grammatical parses," but uses them
to develop *different* models, not entity relationship diagrams.

 Pressman's "Practitioner's Guide" --- pp. 240--246
 Pressman's "Beginner's Guide"     --- pp. 38--42

Since Pressman is developing a *different model* using this
technique, the details for the method are *also* different.
Therefore, these sections of Pressman's books are *not*
reliable references for the use of a "grammatical parse" to
develop an ERD!

Example --- Starting the ERD for the "Student Information System" ---
using the problem statement on "Handout 2"

Note: If you're lucky, then rather than creating an entirely new
system and its model, you are modifying an existing one --- *and* an
entity relationship diagram was included as part of the specification
of requirements for the old system.

In this case it might be easier to modify the original system's
entity relationship diagram rather than using a "grammatical parse"
approach to construct a completely new one. It would be necessary to
delete any attributes, entities and relationships that are no longer
needed, as well as to add new ones representing any newly required
"information to be stored".

In any case, a "first draft" of an entity relationship diagram should
be examined in order to look for and eliminate problems that can be
expected to occur.

Distribute "Handout 3": Evaluating and Improving an Entity
Relationship Diagram

Material from Handout to be mentioned in Class:

 - Goals of these rules:

   - To make the data model as simple as possible, by ensuring that it
     "really does" provide a representation of stored information
     by a set of simple tables

   - To "purge" any derivable/redundant data from the resulting
     data base.

   The existence of redundant data in a data base complicates the
   job of keeping all the data consistent as the system data is
   updated.

   Redundancies *might* be introduced later on (during design,
   implementation, or, better yet, testing) if it is discovered
   that this would allow performance or reliability requirements
   to be met. More complicated designs and implementations
   (including data or file structures) might be chosen for the
   system data for similar reasons. However, these decisions
   *shouldn't* be made until performance requirements are available
   and, ideally, it's been "proved" that the simpler implementations
   won't satisfy these requirements.

   "Why?" --- The simpler implementations are easier to *maintain*.
   ... See the notes for lecture #1 ...

   Note that when you apply these rules in order to solve the
   problems described in this handout, you may end up adding
   *supertypes* and *subtypes* (and perhaps weak entities and
   associative objects as well) to your ERD.
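
   The consistency problem caused by redundant data, mentioned under
   the goals above, can be illustrated with a small sketch. All of the
   data and names here are hypothetical: a total that could be derived
   from the enrolment records is also stored separately.

```python
# Hypothetical data: each record is (student_id, course, credits).
enrolments = [("s1", "C1", 3), ("s1", "C2", 3)]

# Redundant copy: a total stored alongside the data it is derived from.
stored_total = {"s1": 6}


def derived_total(student_id):
    """Compute the total from the enrolment records on demand."""
    return sum(credits for sid, _, credits in enrolments if sid == student_id)


# An update that touches only the enrolment records...
enrolments.append(("s1", "C3", 3))

# ...leaves the stored copy out of date; the derived value stays correct.
print(stored_total["s1"], derived_total("s1"))   # 6 9
```

   Every function that changes the enrolment records would also have
   to update the stored total in order to keep the two consistent.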

 - Also mentioned in the handout: When a model of *system functions*
   is prepared, the two models can be compared to ensure that they're
   consistent: There should be a way (using a system function) to
   create new instances of every entity or relationship, change an
   existing instance of an entity (by changing non key attributes),
   delete existing instances, as well as access (read) instances. At
   least one function should search for an instance by using values
   for a primary key. Other functions might start with values for
   "non key" attributes, or ranges or sets of values, and produce a
   *set* of instances corresponding to the input information.

   If an entity or relationship is "write only" (information is stored
   but never used) or "read only" (there's no way to *change* the
   information) then this is evidence that either the "entity" or
   "relationship" might *not* belong in the diagram, OR that the model
   of system functions might be incomplete.
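
   The comparison of the two models can be sketched as a simple
   coverage check. This is only an illustration under stated
   assumptions: the function names, the entity names, and the
   (operation, name) encoding are hypothetical. The sketch also
   simplifies by checking entities only; the "update" check would
   not apply to a relationship that has no attributes to change.

```python
# Hypothetical function model: each system function lists the operations
# it performs on entities, as (operation, entity name) pairs.
functions = {
    "register_student": [("create", "Student")],
    "update_address":   [("read", "Student"), ("update", "Student")],
    "withdraw_student": [("delete", "Student")],
    "log_visit":        [("create", "Visit")],
}

data_model = ["Student", "Visit"]
REQUIRED = {"create", "read", "update", "delete"}


def missing_operations(data_model, functions):
    """Map each entity to the required operations that no function performs."""
    covered = {name: set() for name in data_model}
    for operations in functions.values():
        for op, name in operations:
            if name in covered:
                covered[name].add(op)
    return {name: REQUIRED - ops
            for name, ops in covered.items() if REQUIRED - ops}


gaps = missing_operations(data_model, functions)
for name, ops in gaps.items():
    print(name, "is missing:", sorted(ops))
```

   Here the hypothetical "Visit" entity is "write only": it is created
   but never read, changed, or deleted, so either it does not belong
   in the diagram or the function model is incomplete.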

 - Additional References for this Material:

    - A few additional examples of violations of normalization
      rules: Pressman's "Software Engineering: A Practitioner's
      Approach," pp. 259--260

    - Additional "normalization rules" and "normal forms" for
      relational data bases can be found (for example) in
      C. J. Date's book, "An Introduction to Database Systems ---
      Volume 1" --- used recently as the textbook for CPSC 471.
      Some of this material could be applied to ERDs as well.

    - E. Yourdon's book, "Modern Structured Analysis," includes
      a short section on entity relationship diagrams that discusses
      the structural problems included in this handout.

    - S. Shlaer and S. Mellor's book, "Object-Oriented Systems
      Analysis: Modeling the World in Data" includes additional
      helpful material about entity relationship diagrams ---
      and discusses their use in "object-oriented" development.

Finally: An ERD, and a corresponding set of data tables, is something
that can be shown to a "customer" for evaluation, in order to discover
whether the stored data represented by the diagram is necessary and
sufficient.