CPSC 333 --- Lecture 6 --- Friday, January 19, 1996

Creation of Entity Relationship Diagrams (Continued from Lecture #5)

3) List *transitive verbs or verb phrases* in the problem statement.

   If a transitive verb (or verb phrase) connects two or more things that were identified as *entities* in steps 1--2 (having them as its *subject* and its "object(s)"), and the system must remember the information corresponding to this verb, then the transitive verb is the name of a *relationship*.

   Again, avoid synonyms --- take care not to include the same "relationship" more than once.

4) Match *attributes* (found, and rejected as entities, in steps 1--2) to *entities* or, if appropriate, to *relationships*.

   Any relationship that is given one or more attributes in this step becomes an *associative object* (instead of a regular "relationship").

   Choose a *primary key* for each entity. You might discover that an entity is really a *weak entity* at this point.

Note: Pressman introduces "grammatical parses," but uses them to develop *different* models than entity relationship diagrams:

- Pressman's "Practitioner's Guide" --- pp. 240--246
- Pressman's "Beginner's Guide" --- pp. 38--42

Since Pressman is developing a *different model* using this technique, the details of the method are *also* different. Therefore, these sections of Pressman's books are *not* reliable references for the use of a "grammatical parse" to develop an ERD!

Example --- Starting the ERD for the "Student Information System" --- using the problem statement on "Handout 2"

Note: If you're lucky, then rather than creating an entirely new system and its model, you are modifying an existing one --- *and* an entity relationship diagram was included as part of the specification of requirements for the old system. In this case it might be easier to modify the original system's entity relationship diagram than to use a "grammatical parse" approach to construct a completely new one.
It would be necessary to delete any attributes, entities and relationships that are no longer needed, as well as to add new ones representing newly required "information to be stored".

In any case, a "first draft" of an entity relationship diagram should be examined in order to look for and eliminate problems that can be expected to occur.

Distribute "Handout 3": Evaluating and Improving an Entity Relationship Diagram

Material from Handout to be mentioned in Class:

- Goals of these rules:

  - To make the data model as simple as possible, by ensuring that it "really does" provide a representation of stored information by a set of simple tables

  - To "purge" any derivable/redundant data from the resulting data base. The existence of redundant data in a data base complicates the job of keeping all the data consistent as the system data is updated.

    Redundancies *might* be introduced later on (during design, implementation, or (better yet) testing) if it is discovered that this would allow performance or reliability requirements to be met. More complicated designs and implementations (including data or file structures) might be chosen for the system data for similar reasons. However, these decisions *shouldn't* be made until performance requirements are available and, ideally, it's been "proved" that the simpler implementations won't satisfy these requirements.

    "Why?" --- The simpler implementations are easier to *maintain*. ... See the notes for lecture #1 ...

  Note that when you apply these rules in order to solve the problems described in this handout, you may end up adding *supertypes* and *subtypes* (as well, perhaps, as weak entities and associative objects) to your ERD.

- Also mentioned in the handout: When a model of *system functions* is prepared, the two models can be compared to ensure that they're consistent:

  There should be a way (using a system function) to create new instances of every entity or relationship, to change an existing instance of an entity (by changing non key attributes), to delete existing instances, and to access (read) instances.

  At least one function should search for an instance by using values for a primary key. Other functions might start with values for "non key" attributes, or ranges or sets of values, and produce a *set* of instances corresponding to the input information.

  If an entity or relationship is "write only" (information is stored but never used) or "read only" (there's no way to *change* the information), then this is evidence either that the "entity" or "relationship" might *not* belong in the diagram, OR that the model of system functions might be incomplete.

- Additional References for this Material:

  - A few additional examples of violations of normalization rules: Pressman's "Software Engineering: A Practitioner's Approach," pp. 259--260

  - Additional "normalization rules" and "normal forms" for relational data bases can be found (for example) in C. J. Date's book, "An Introduction to Database Systems --- Volume 1" --- used recently as the textbook for CPSC 471. Some of this material could be applied to ERDs as well.

  - E. Yourdon's book, "Modern Structured Analysis," includes a short section on entity relationship diagrams that discusses the structural problems included in this handout

  - S. Shlaer and S. Mellor's book, "Object-Oriented Systems Analysis: Modeling the World in Data," includes additional helpful material about entity relationship diagrams --- and discusses their use in "object-oriented" development

Finally: An ERD, together with a corresponding set of data tables, is something that can be shown to a "customer" for evaluation, in order to discover whether the stored data represented by the diagram is necessary and sufficient.
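
The comparison between the data model and the model of system functions (the "write only" / "read only" check mentioned with the handout material) can be sketched as a small table of which function performs which operation on which entity or relationship. This is only an illustrative sketch: the entity, relationship and function names below are hypothetical and are *not* taken from "Handout 2", and the helper `check_consistency` is an assumption made for this example rather than part of the course material.

```python
# Operations that change stored information, and the full set of operations.
WRITE_OPS = {"create", "update", "delete"}
ALL_OPS = WRITE_OPS | {"read"}


def check_consistency(data_items, functions):
    """Return {item: [findings]} for entities/relationships that look suspicious.

    data_items: names of entities and relationships taken from the ERD.
    functions:  {function name: {item name: set of operations performed}}.
    """
    findings = {}
    for item in data_items:
        # Collect every operation that any system function performs on the item.
        ops = set()
        for used in functions.values():
            ops |= used.get(item, set())

        notes = []
        if not ops:
            notes.append("never used by any function")
        elif "read" not in ops:
            notes.append("write only")   # stored but never used
        elif not ops & WRITE_OPS:
            notes.append("read only")    # no way to change the information

        missing = sorted(ALL_OPS - ops)
        if missing:
            notes.append("missing: " + ", ".join(missing))
        if notes:
            findings[item] = notes
    return findings


# Hypothetical ERD contents and system functions (assumptions for illustration).
data_items = ["STUDENT", "COURSE", "ENROLLED_IN"]
functions = {
    "register student": {"STUDENT": {"create"}},
    "look up student":  {"STUDENT": {"read"}},
    "change address":   {"STUDENT": {"update"}},
    "remove student":   {"STUDENT": {"delete"}},
    "enrol in course":  {"STUDENT": {"read"}, "COURSE": {"read"},
                         "ENROLLED_IN": {"create"}},
}

findings = check_consistency(data_items, functions)
for item, notes in findings.items():
    print(item, "->", "; ".join(notes))
```

With this made-up input, STUDENT passes the check, COURSE is reported as "read only" and ENROLLED_IN as "write only". Each finding is only a prompt for a second look: either the item might not belong in the diagram, or the model of system functions might be incomplete. A "missing: update" note for a relationship with no attributes, for instance, may be perfectly acceptable.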