CPSC 333: Creation of a ``Class Diagram''

This material was covered during lectures on February 10-14, 1997.

A Process to Follow
Application of this Process to a Problem

A Process to Follow

Coad and Yourdon give a reasonably straightforward method to develop a class diagram. The process described below is based on Coad and Yourdon's method. However, it presents steps in an order that is a bit different from theirs: Coad and Yourdon recommend that you try to identify structures in the model before attributes and instance connections have been identified, rather than after.

Now, ``structures'' include ``inheritance'' (or generalization-specialization) structures, which you look for by considering common attributes and services for classes in your model. It seems that it might be easier to look for these after attributes have been considered, rather than before - so the order of steps has been changed. Indeed, one might even argue that you should try to identify services before you stop thinking about possible inheritance structures as well - but we won't push things that far.

Regardless of the order in which steps are performed, it will be clear that this method closely resembles a method for creating entity-relationship diagrams that has already been described in this course.

Another ``object-oriented'' method that can be considered to be an ``analysis method'' (but that was discussed as a design method during lectures in 1997) is Class-Responsibility-Collaborator Modeling. It might be helpful to look at this method at this point, as well, because it provides additional useful information about how you can go about allocating ``responsibilities'' to classes. You'll need to do this near the end of the method that's described below, when you try to allocate services to classes, and identify necessary message connections between them.

Finally, a ``confession'' might be in order: Software engineering texts frequently treat the development of a human-computer interface as something that follows the modeling of essential requirements (but comes before, or early in, software design). However, it seems that at least some assumptions are necessary about human-computer interaction in order to complete the preparation of a class diagram. In particular, we'll need to make some assumptions in order to identify services for classes, as described below.

Design of a human-computer interface is beyond the scope of this course. However, this is discussed in CPSC 481, which you can take after you pass CPSC 333.

Starting Point

Coad and Yourdon suggest that you try to gather quite a bit of information together before you try to construct a class diagram. For example, they recommend that you try to learn as much as you can about the application area, before you start. This seems sensible, but we'll forego this for now. Instead, we'll focus on the development of a class diagram from a problem statement, event list, and a specification of a similar system, if we're lucky enough to have one available.

The method will be applied, in part, to an example, below.

Finding Classes and Objects

This step resembles the first two steps in the method that has been described for the construction of an entity-relationship diagram:

A list of candidate classes is produced by choosing nouns and noun phrases in the problem statement, and then adding any leftover classes in a class diagram (modeling essential requirements) for a similar system, if one is available.
The list is ``pruned'' by discarding candidates that fail to satisfy most of a set of criteria for ``classes.''

The list of criteria to be used to identify classes is similar to the list used to identify entities on an ERD. However, some of the criteria for entities are to be applied a bit differently, some extra criteria are added, and we won't insist that a candidate meet them all. Instead, we'll expect a candidate to have ``most'' (nearly all) of the following properties, if we're to use it as a class.

The candidate should describe something in the problem domain that the system must either keep track of (store information about), or interact with, or both.
It should be necessary to remember information about the objects in a class that would be named by the candidate. That is, the class should have at least one ``attribute.'' Note, though, that this might (only) describe information about the object's current state, so that you might decide to keep ``candidate classes,'' even though you'd discard them as candidate entities on an ERD.
Objects in a class that would be named by the candidate must provide some service or behaviour that is required in order for the system to respond correctly to external events. (Note that this might, or might not, be just the ``behaviour'' of keeping track of and reporting the values of the objects' attributes).
It should usually be the case that objects in a class named by the candidate have multiple attributes. While this doesn't always need to be true, you should ask yourself whether the candidate should really be an attribute of something else, instead of a class by itself, if this criterion isn't satisfied. (You might choose to ``keep'' a candidate when this criterion isn't satisfied, because of services that the candidate's ``class'' must provide for the rest of the system.)
It should usually be the case that the candidate ``class'' can contain multiple instances. Again, you might make an exception and include a class when this criterion isn't satisfied, because the candidate class provides services required by the rest of the system.

There are other criteria that classes should satisfy, as well - their objects should all have the same (``common'') attributes and services. However, we'll consider these criteria in a later step.
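The pruning rule can be sketched as a simple checklist. The following is only an illustration: the names `Candidate` and `prune`, the yes/no answers given for each candidate, and the threshold of four criteria out of five (standing in for ``most'') are all assumptions made for this sketch, and are not part of Coad and Yourdon's method.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate class, with a yes/no answer for each of the five criteria."""
    name: str
    in_problem_domain: bool    # tracked by, or interacts with, the system
    has_attribute: bool        # at least one attribute (current state counts)
    provides_service: bool     # behaviour needed to respond to events
    multiple_attributes: bool  # more than one attribute
    multiple_instances: bool   # more than one object can exist

    def score(self) -> int:
        """Number of criteria that this candidate satisfies."""
        return sum([self.in_problem_domain, self.has_attribute,
                    self.provides_service, self.multiple_attributes,
                    self.multiple_instances])


def prune(candidates, threshold=4):
    """Keep the names of candidates that meet at least `threshold` criteria.

    The default threshold of 4 is an assumption that stands in for ``most''.
    """
    return [c.name for c in candidates if c.score() >= threshold]


# Hypothetical answers for three candidates.
candidates = [
    Candidate("Student", True, True, True, True, True),
    Candidate("Course", True, True, True, True, True),
    Candidate("course title", True, False, False, False, True),
]
kept = prune(candidates)
```

A fixed threshold is cruder than the judgement described above: a candidate that fails several criteria might still be kept because of the services it must provide.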

An application of this step to an example is given below.

Finding Attributes and Instance Connections

Finding attributes of classes is similar to finding attributes of entities; many of these will have been listed as a candidate ``class,'' but each will have been discarded because it fails to have multiple attributes of its own, and because any ``stored information'' requirements it represents (and services it represents, as well), could be included effectively by making it an attribute of some entity (or class) that has been included in the model.

Significant differences between attributes of entities and attributes of classes have already been described. When you identify attributes of classes, you should think more about ``patterns of use'' - the way the data will be used by the rest of the system - than you should about normalization rules, when deciding whether data should be represented by several attributes, or by one attribute (with several components).

As well, it isn't necessary to try to form keys using the attributes you've selected for classes.

Finding instance connections between classes is similar to finding binary (two-way) relationships between entities on entity-relationship diagrams.

If, in the process of allocating attributes to classes, you discover that something should be an ``attribute'' of an instance connection, then you should replace the instance connection with a class that has that attribute, and that has an instance connection with each of the classes that were connected by the instance connection you started with.

You'll need to discover the ``multiplicities'' of each instance connection - the number of objects in ``the other class'' that each object can be connected to, via the instance connection, in order to include this on the diagram (just as you needed to discover this information in order to determine the relationship's type when you produce an entity-relationship diagram and its data dictionary).

An application of this step to an example is given below.

Finding Structures

This step can be broken down into two substeps: finding generalization-specialization structures, and finding whole-part structures.

Recall that two classes should be linked in a generalization-specialization structure - class A is a ``specialization'' of class B - if class A represents something in the problem domain that is a ``kind of'' whatever it is in the problem domain that B represents.

Consider, as well, the following property of classes, which extends the list of properties given above:

Objects in a class all have common attributes (with a value defined for each) and common services that they provide for the rest of the system.

You should create a generalization-specialization structure - and, when necessary, add ``virtual'' classes that are generalizations of classes that already exist - for the same reasons that you'd use supertypes and subtypes on an ERD - so that common attributes are defined once and then ``inherited'' by all the classes that have them, rather than being defined repeatedly (and possibly inconsistently). You can use these structures to ensure that common services are defined once, and inherited when needed, as well.

While classes should be thought of as being connected by ``generalization-specialization'' structures, it's probably better to think of the objects in classes as connected by whole-part structures. Objects in classes A and B are linked within a whole-part structure, if an object in class B can be a ``part'' or ``component'' of an object in class A (in this case, of course, A is the ``whole'' and B is the ``part'').

Use ``whole-part'' structures to link classes A and B,

when objects in A represent ``assemblies of components'' (perhaps, a machine of some kind) and objects in B represent the components that the assemblies represented by A can include;
when an object in A represents some sort of physical ``container,'' and the objects in B represent things it can contain; Coad and Yourdon use an example involving an ``airplane'' and a ``pilot'' to illustrate this possibility;
when an object in A represents a club or organization of some kind (perhaps, some department that's part of a company) and objects in B can represent the club's or organization's members (or the company's employees that work in the department).

The connection represented by a link within a whole-part structure seems to resemble - and, is perhaps a special kind of - an ``instance connection.'' Thus, at this point, you might look back at the instance connections you've added to your class diagram, and ask whether any of them should be modeled using a whole-part structure, instead.

It doesn't seem that the ``connections'' represented by whole-part structures can have attributes of their own (at least, not as Coad and Yourdon define them). That is, ``whole-part structures'' seem to model special kinds of ``instance connections'' that look like ``relationships'' in an ERD, but not special kinds of ``instance connections'' that resemble ``associative objects.'' So, if the ``connections'' have attributes of their own, it will probably be easier to model them as classes, as described in the previous step.
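A whole-part structure such as the department and employee case can be sketched as one object that holds references to its parts. This is only an illustration: the class names `Department` and `Employee`, the attribute names, and the sample values are assumptions made for the sketch.

```python
class Employee:
    """The ``part'': an employee who works in a department."""

    def __init__(self, name):
        self.name = name


class Department:
    """The ``whole'': keeps track of the ``part'' objects it includes."""

    def __init__(self, title):
        self.title = title
        # Each entry is one whole-part link. The link itself carries no
        # attributes of its own; it only records which part belongs here.
        self.members = []

    def add_member(self, employee):
        self.members.append(employee)


dept = Department("Payroll")
dept.add_member(Employee("A. Smith"))
```

If a link needed data of its own (for example, a start date), a plain list of references would no longer be enough, which is the situation in which a separate class is the easier model.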

Each ``event'' in the event list can probably be responded to by the system in a variety of ways, depending on possible cases that might arise during ``normal'' execution, but also depending on the error conditions (and combinations of them) that might be detected and that must be reported.

Now, it will generally be the case that an object in a particular class (or, perhaps, the class itself) is somehow ``notified'' when a given event occurs. It's possible that this class or object can provide a complete system response to the event using one or more of the services it provides. However, it's also quite likely that the class or object will need to access information maintained by, and request a service provided by, some other object or class.

This is all applied to the example, below.

Finding Services and Message Connections

We will need to identify services that each class provides, as well as services that each object in the class provides. The class will always need to provide a way to create new objects in it, and will frequently need to provide a way to list the objects that currently belong to it, as well. Services that objects provide will generally include ways to report, and change, values of attributes, as well as at least one way for an object to be deleted.
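The difference between services provided by a class and services provided by its objects can be sketched as follows. This is a minimal illustration only: the service names (`create`, `list_all`, `get_name`, `set_name`, `delete`) and the use of a dictionary to hold the objects are assumptions, not something prescribed by the method.

```python
class Student:
    _objects = {}  # every Student object that currently exists, by ID number

    def __init__(self, id_number, name):
        self.id_number = id_number
        self._name = name

    # --- services provided by the class ---
    @classmethod
    def create(cls, id_number, name):
        """Create a new object in the class."""
        if id_number in cls._objects:
            raise ValueError("ID number already in use")
        student = cls(id_number, name)
        cls._objects[id_number] = student
        return student

    @classmethod
    def list_all(cls):
        """List the objects that currently belong to the class."""
        return list(cls._objects.values())

    # --- services provided by each object ---
    def get_name(self):
        """Report the value of an attribute."""
        return self._name

    def set_name(self, name):
        """Change the value of an attribute."""
        self._name = name

    def delete(self):
        """Remove this object from the class."""
        del Student._objects[self.id_number]


s = Student.create(1001, "Pat Lee")
s.set_name("Pat Q. Lee")
```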

In order to determine more precisely (and more ``completely'') which services are needed in order to respond to events, and in order to determine which classes can send messages to other classes to request that services are performed, we'll need to create and use an ``event list,'' just as we did in order to create a set of data flow diagrams that modeled system requirements during structured analysis.

However, it should be noted that the kinds of errors seen in examples so far might really only make sense assuming that a command-line interface or a simple menu-based interface will be used when the system is developed. It's probably reasonable to assume (or, hope) that this kind of simple (system-driven) interface will be used, if we're going to use ``structured'' analysis and design for software development.

On the other hand, the possible ``errors'' should probably be reconsidered if a more highly interactive interface, like a WIMP interface (involving Windows, Icons, Menus, and Pointers) is to be used instead - and this might be the kind of interface that will be used, if we're using object-oriented techniques for system development.

In particular, it's far less likely that the user will only get ``one chance'' to provide inputs, if a (well designed) interactive interface is being used. Furthermore, it might be the case that some ``syntax errors'' can be ruled out, if the user is being asked to identify an existing object (rather than create a new one), since the user might be asked to scroll through a list of the objects that exist, and select one from the list, rather than having to type information in.

It should be noted, too, that a user will probably have more ``choices'' of what to do than we've suggested so far: For example, a user might ask for online help when interacting with the system, or the user might request that an operation be ``cancelled'' before it's completed; we haven't allowed for these options before this.

So, while we'll continue by discussing the use of an ``event list'' in order to identify services, you should keep in mind that the ``event list'' (or, more accurately, the system's responses to the events on it) will be somewhat different than what we've assumed so far in the course.

Now, consider some event on the event list. It's likely that one particular object in a class - or, perhaps, one particular class - is notified when the event occurs (or, when the system is expected to respond to it). It's possible that this object, or class, can provide a complete response to the event by using one or more of its own services.

However, it's also extremely likely that the class or object would need access to data maintained by some other object (possibly, in some other class), in order to respond to the event. It's also possible that another object (again, possibly in some other class) must be created, or deleted, as part of the system's response.

In these cases, the object (or class) that was originally informed of the event must send a message to another object (or class), to ask it to help provide the system's response to the original event. Now, the object (or class) that receives this message might also need to send a message to some other object (or class), and so on. As well, once the object or class that received the message has completed its work, it will send some sort of response back to the object (or class) that sent the message in the first place. Based on this response, this object (or class) might need to send other messages to other objects (or classes), in order to complete the system's response.

A sequence of messages sent between objects (and classes), that provides the system's response to an event, is called a message thread or a thread of execution. Note that a message thread determines a sequence of objects (and/or classes) that send (and receive) messages, as well as a sequence of services that are performed. As mentioned above, you might be able to identify several message threads for each event, because of different cases or situations that can arise during normal processing, and also because of error conditions that might occur and might need special handling.

For each message thread, you should add a message connection from each class (whose object) that sends a message in the thread, to the class (whose object) that receives it, if this message connection doesn't already exist. You shouldn't add a message connection in the other direction, even though the object (or class) that received the message will probably need to send a response back again. You should also choose and add one or more services that the receiving object (or class) can use to perform whatever operation is being requested through the message.
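The rule for adding message connections can be sketched as a small function over message threads. This is only an illustration: representing a thread as a list of (sender, receiver, service) triples, the function name `message_connections`, and the placeholder class and service names are all assumptions. The sketch also leaves out messages sent from a class to itself, which is a further assumption.

```python
def message_connections(threads):
    """Collect one-directional (sender, receiver) class pairs from threads.

    Each thread is a list of (sender_class, receiver_class, service) triples.
    Responses sent back to the sender add no connection in the other
    direction. Messages from a class to itself are skipped (an assumption).
    """
    connections = set()
    for thread in threads:
        for sender, receiver, _service in thread:
            if sender != receiver:
                connections.add((sender, receiver))
    return connections


# A hypothetical thread involving three classes named A, B, and C.
thread = [
    ("A", "B", "do_x"),
    ("B", "C", "get_y"),
    ("B", "B", "update"),  # a class sending a message to one of its own objects
    ("A", "B", "do_z"),    # the connection from A to B already exists
]
connections = message_connections([thread])
```

Using a set means that a connection is added only if it doesn't already exist, as the rule requires.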

The message threads will also be useful for testing, so we'll need to keep track of them for later use. We'll see shortly how this might be done.

Finally, if you've done all this, and have noticed that one or more services that you thought were needed haven't actually been used as part of the system's response(s) to any event, then you should consider this to be evidence that something's wrong. Try to discover whether your event list (or set of responses) is incomplete, so that the service is needed by something that you've missed. If that isn't the case, consider deleting the service from the model, even though you thought it would be needed.
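This final check can be sketched as a set difference between the services that have been defined and the services that appear in some message thread. The function name `unused_services`, the triple format for messages, and the class and service names are assumptions made for this illustration.

```python
def unused_services(declared, threads):
    """Return the declared (class, service) pairs that no thread uses.

    `declared` lists (class, service) pairs in the model; each thread is a
    list of (sender_class, receiver_class, service) triples.
    """
    used = {(receiver, service)
            for thread in threads
            for _sender, receiver, service in thread}
    return sorted(set(declared) - used)


# Hypothetical model: class B offers two services, class C offers one.
declared = [("B", "do_x"), ("B", "archive"), ("C", "get_y")]
threads = [[("A", "B", "do_x"), ("B", "C", "get_y")]]
leftover = unused_services(declared, threads)
```

A non-empty result doesn't say which explanation applies. It only flags the services to re-examine against the event list.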

This is applied to an example below.

Finding Subjects (if Necessary)

If the class diagram is too large to fit comfortably onto one page, then Coad and Yourdon recommend that the set of classes on the diagram be partitioned into a set of subjects, so that each subject is a set of classes on the diagram, and each class belongs to exactly one subject.

Coad and Yourdon suggest that subjects might correspond to ``subsystems'' of logically related classes. Thus they recommend that you try to include all of a generalization-specialization structure in the same subject, and that you try to include all of a whole-part structure in the same subject, as well. They also recommend that you try to choose subjects so that the number of instance and message connections that ``cross subject boundaries'' is kept as small as possible.
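One way to compare two possible choices of subjects is to count the connections that cross subject boundaries. The sketch below is only an illustration: the function name `crossing_connections`, the class names, and the two partitions are assumptions.

```python
def crossing_connections(subject_of, connections):
    """Count connections whose two classes lie in different subjects.

    `subject_of` maps each class name to the subject that contains it.
    """
    return sum(1 for a, b in connections if subject_of[a] != subject_of[b])


# Hypothetical instance and message connections between four classes.
connections = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]

# Two candidate partitions of the classes into subjects 1 and 2.
partition_1 = {"A": 1, "B": 1, "C": 2, "D": 2}
partition_2 = {"A": 1, "B": 2, "C": 1, "D": 2}

crossings_1 = crossing_connections(partition_1, connections)
crossings_2 = crossing_connections(partition_2, connections)
```

Under this measure alone, the first partition would be preferred, because fewer connections cross its subject boundaries.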

The class diagrams we'll consider in this course won't be large enough for ``subjects'' to be useful. This is true of the example given below.

Application of this Process to a Problem

Now, we'll try to apply the above process in order to produce (part of) a class diagram for Version Two of the Student Information System.

Problem Statement

We'll use (almost) the same problem statement as we did when we used this as an example for the construction of an entity-relationship diagram.

The system will keep track of the students that are registered in or that have passed academic courses.

In order to meet requests for information, it is necessary to keep track of the ID number and name (first name, middle initial, and last name) of each student that the system knows about. ID numbers are unique; that is, no two students have the same ID number. Names are not necessarily unique.

Each course has a discipline code, a course number, and a course title. No two courses have the same discipline code and also the same course number at the same time. Course titles aren't necessarily unique.

All courses are pass/fail courses. If a student fails a course then that student is automatically re-registered in the course. However, a student can withdraw from a course if the student hasn't already passed it. As well, if a student withdraws from a course then that student can register in it again. However, a student can't register again in a course that the student has already passed.

We'll only make one addition: We'll add the information that the system will be used by a ``Registrar'' (in a Registrar's office), to maintain this information.

Finding Classes and Objects

The application of the first step to this example is very similar to the application of the first two steps of the method for creating an entity-relationship diagram. Instead of repeating this all over again, the differences between the applications of the two processes will be noted here.

The two versions of the problem statement used are almost identical; the only difference is that the problem statement used here added the fact that a ``Registrar'' would be using the system. Therefore, the list of ``candidates'' that we will start with will be the same as the list obtained when the first step of the method for building ERDs was applied, except that the noun ``Registrar'' will be added to the list.

When the second step of the method to create ERDs was applied, all the candidates were rejected except ``course'' and ``student.''

Each of the disqualified candidates was rejected because it failed to represent something in the problem domain that the system needed to keep track of, or named something that should be an ``attribute'' (or combination of several attributes) of something else that had been accepted. These things remain true here as well, and none of the candidates that were discarded from the original list provide ``services'' for the rest of the system that would justify keeping them, so we'll discard them when considering what to keep as classes, just as we discarded them when considering what to keep as entities.

On the other hand, it's easy to argue that both ``student'' and ``course'' will have objects that will provide services that the system will require, so these both satisfy all the criteria that have been identified for classes, and we'll keep them both.

We'll keep the new candidate, ``Registrar,'' as well. However, this may be the most questionable decision that's being made: There's (almost) no information that the system will need to store about ``Registrars'' in order to function, and it's even possible that there will never be more than one ``Registrar'' object in the system. It might, very likely, be reasonable to discard ``Registrar'' as a candidate, on these grounds. We'll keep it, because it certainly does represent something in the problem domain that the system will need to interact with, so that there will be needed behaviour (services) that this will need to provide. That is, we'll assume that, at the very least, the ``Registrar'' class and object will need to interact with the person(s) using the system, in order to discover which external events have occurred, and which the system should respond to - so, we'll use this class to support at least part of the interaction of the system with its user(s).

Thus, at the end of step 1, we will have identified three classes for this system: ``Course,'' ``Student,'' and ``Registrar.'' You'll see, shortly, that one might end up changing the list of classes when performing later steps in this process. Indeed, we'll add a fourth class to this list before we've finished applying it to this example.

Finding Attributes and Instance Connections

Again, an application of this step to find attributes for classes resembles the application of the process used to find attributes for entities on an ERD. However, since the student's entire ``name'' is likely to be used by any operation that uses any part of it, it will likely be more useful to consider ``name'' to be a single attribute of the class ``Student,'' with three components (a ``first name,'' ``middle initial,'' and ``last name''). Similarly, since a course's ``discipline code'' and ``course number'' seem always to be used together, we'll create an attribute of ``Course'' named ``course ID,'' and we'll let ``discipline code'' and ``course number'' be components of this.

Thus, we'll choose ``ID number'' and ``name'' to be attributes of ``Student,'' and we'll choose ``course ID'' and ``course title'' to be attributes of ``Course.'' At this point, we haven't found any attributes of ``Registrar.''
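These attribute choices can be sketched with small record types, in which ``name'' and ``course ID'' are single attributes that have components. This is a minimal sketch: the Python types chosen for each component (strings and integers) and the sample values are assumptions, since the problem statement doesn't specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """One attribute of Student, with three components."""
    first_name: str
    middle_initial: str
    last_name: str


@dataclass(frozen=True)
class CourseID:
    """One attribute of Course, with two components."""
    discipline_code: str
    course_number: int


@dataclass
class Student:
    id_number: int
    name: Name


@dataclass
class Course:
    course_id: CourseID
    course_title: str


# Hypothetical sample objects.
student = Student(1001, Name("Pat", "Q", "Lee"))
course = Course(CourseID("CPSC", 333), "An Example Course Title")
```

Grouping the components this way reflects ``patterns of use'': an operation that needs a student's name receives the whole name at once, rather than three separate attributes.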

Now, finding instance connections for this example is (initially) similar to the application of the method for finding relationships. After carrying out this process, we would end up with two instance connections, ``is registered in,'' and ``has passed.''

However, we've already seen that an alternative entity-relationship diagram for this system could have been used, in which one associative object replaced the relationships ``is registered in'' and ``has passed'' that we'd originally found. Using the associative object (and its ``data table'') for this example allowed us to convey slightly more information ``about the problem'' than the use of two relationships did, because it made it clear that a student couldn't be registered in a course if the student had also passed it.

For a similar reason, we won't use two instance connections named ``is registered in'' and ``has passed,'' to represent this information, on the class diagram we're creating for this problem. Instead, we'll add a fourth class to the class diagram, named ``Course Involvement,'' with an attribute called ``status.'' The value of ``status'' can be either ``registered'' or ``passed,'' just as in Version One of the Student Information System.

We'll also add two unnamed instance connections. One will connect ``Course Involvement'' to ``Student;'' it will be possible for each ``Student'' object to be connected to zero or more ``Course Involvement'' objects, but each ``Course Involvement'' object will be connected to exactly one ``Student'' object (and it will be the responsibility of the objects in the ``Course Involvement'' class to keep track of these connections). The other will connect ``Course Involvement'' and ``Course.'' Each ``Course'' object will be connected to zero or more instances of ``Course Involvement,'' while each ``Course Involvement'' object will be connected to exactly one ``Course'' object, and, again, it will be the responsibility of the objects in the class ``Course Involvement'' to keep track of these connections.
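The fourth class and its two instance connections can be sketched as follows. This is only an illustration: the constructor arguments, the validation of ``status,'' and the sample values are assumptions. Each ``Course Involvement'' object holds the references to its one student and its one course, which matches the decision that this class is responsible for keeping track of the connections.

```python
class Student:
    def __init__(self, id_number):
        self.id_number = id_number


class Course:
    def __init__(self, course_id):
        self.course_id = course_id


class CourseInvolvement:
    """Replaces the connections ``is registered in'' and ``has passed''."""

    VALID_STATUS = ("registered", "passed")

    def __init__(self, student, course, status="registered"):
        if status not in self.VALID_STATUS:
            raise ValueError(f"unknown status: {status}")
        self.student = student  # connected to exactly one Student
        self.course = course    # connected to exactly one Course
        self.status = status    # one value, so never both at once


student = Student(1001)
course = Course(("CPSC", 333))
involvement = CourseInvolvement(student, course)
```

Because ``status'' holds a single value, an involvement can't be both ``registered'' and ``passed,'' which is the extra information that two separate instance connections would not have conveyed.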

Finding Structures

We won't add anything to the model when we apply the third step (finding structures) to this example.

None of the classes that we've identified (Registrar, Student, Course, and Course Involvement) have attributes and services in common, so there doesn't seem to be any need to regard any of these as generalizations of any of the others, and there is also no apparent need to create new classes that generalize the ones we have. As well, there's no apparent need to create new classes that are specializations of these: In particular, as far as we know, all the attributes (for each class) apply to each object in that class. Once we've identified services, we'll be able to confirm that all the services for each class apply to (or, are supplied by) all the objects in that class, as well.

See Version Five of the Student Information System for an example in which it would make sense to introduce a generalization-specialization structure - namely, one with a class called ``Course'' that is a generalization of classes called ``Pass/Fail Course'' and ``Graded Course.''
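Such a structure corresponds closely to inheritance in an object-oriented programming language. The sketch below is only an illustration and is not taken from Version Five: the `describe` service and the `grading_scale` attribute are hypothetical, and are included only to show where common and specialized features would be defined.

```python
class Course:
    """Generalization: attributes and services common to all courses."""

    def __init__(self, course_id, course_title):
        self.course_id = course_id
        self.course_title = course_title

    def describe(self):
        # A common service, defined once and inherited by specializations.
        return f"{self.course_id}: {self.course_title}"


class PassFailCourse(Course):
    """Specialization that adds nothing of its own in this sketch."""


class GradedCourse(Course):
    """Specialization with one extra (hypothetical) attribute."""

    def __init__(self, course_id, course_title, grading_scale):
        super().__init__(course_id, course_title)
        self.grading_scale = grading_scale


graded = GradedCourse("CPSC 333", "An Example Course Title", "A-F")
pass_fail = PassFailCourse("CPSC 000", "Another Example Title")
```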

One could argue that the instance connection between ``Course Involvement'' and ``Course'' could be replaced by a whole-part structure (with ``Course'' as the ``whole,'' and ``Course Involvement'' as the ``part''). On the other hand, one could probably argue equally well that ``Course Involvement'' should be considered to be a part of ``Student'' (rather than linked to ``Student'' by an instance connection). Instead of making these changes (and, perhaps, using ``whole-part structures'' so generally that there's no clear difference between a ``whole-part'' connection and an ``instance connection'' at all), we'll leave the class diagram as it is.

For an example in which there might be a stronger argument for the use of a ``whole-part'' structure, see Version Three of the Student Information System, where it could be argued that a ``Course Section'' should be considered to be a ``part'' of a ``Course.''

Finding Services and Message Connections

Now, we will perform part of the fourth step (finding services and message connections), by considering an event, ``Student Passes Course,'' identifying message threads that correspond to the system's possible responses to this event, and adding in message connections and defining services that would be required for these.

We'll make the following assumptions about the human-computer interface for the system that will be developed (and its design):

At the beginning of the system's response to each event, some sort of interaction will be required in order to discover which event has occurred. The ``Registrar'' class and object(s) will provide the service(s) needed for this interaction.
All other interaction is ``decentralized,'' and is allocated to the rest of the system in such a way that the duplication of prompting for inputs, syntax and other error checking, is minimized. Thus, the ``Registrar'' class and object won't have much to do besides identifying the events that have occurred.
The system will use a modern WIMP interface, so that scrolling windows, pointers, etc., can be used to select (existing) information, rather than requiring that it be retyped by the user.

One thing to note at this point: An even more ``decentralized'' interface design, in which there isn't even a need for a ``Registrar'' object to be used for choice of a command, is at least plausible. If the ``Registrar'' object wasn't needed for command selection, then there would probably be no need (in this system) for a ``Registrar'' class at all, and it should be deleted from the class diagram if this is the case. However, we'll assume (as above) that this class will be useful for selection of commands.

Under these assumptions, the event might be described as follows.

Event Name: Student Passes Course
Inputs: ID number, course ID
Outputs: status message
Error Conditions:

Student isn't registered in course
Student has already passed course

Several other errors might have occurred to you - syntax errors, as well as the possibility that an ID number or course ID that isn't currently in use is selected. These are all possible if the user must supply the input by typing it in, and less likely (or impossible) if the user selects inputs by choosing items from lists provided by the system.

Now, here is a message thread that might correspond to the above event.

1. The ``Registrar'' object is informed that this event has occurred, and sends a ``Pass Student'' message to the ``Course Involvement'' class.
2. The ``Course Involvement'' class sends a ``Get Existing Course'' message to the ``Course'' class.
3. The ``Course'' class interacts with the system user, in order to select the course ID for one of the courses currently known to the system (that is, an existing ``Course'' object is identified). The ``course ID'' is returned to the ``Course Involvement'' class as a response.
4. The ``Course Involvement'' class sends a ``Get Existing Student'' message to the ``Student'' class.
5. The ``Student'' class interacts with the system user, in order to select the ID number for one of the students currently known to the system (that is, an existing ``Student'' object is identified). The ``ID number'' is returned to the ``Course Involvement'' class as a response.
6. Now, the ``Course Involvement'' class uses the course ID and ID number that have been supplied to identify a specific ``Course Involvement'' object, and sends this object a ``Pass'' message.
7. This ``Course Involvement'' object confirms that its status is currently ``registered,'' changes its status to ``passed,'' and returns an acknowledgment back to the ``Course Involvement'' class. The class returns this acknowledgment to the ``Registrar'' object. The ``Registrar'' object then interacts with the system user to report progress and (perhaps) prompt for the next command.

In order to support this message thread, we need to add a message connection from ``Registrar'' to ``Course Involvement,'' a message connection from ``Course Involvement'' to ``Course,'' and a message connection from ``Course Involvement'' to ``Student.'' It seems that message connections from a class to itself aren't required, using Coad and Yourdon's method (they'll always be needed, so there's no need to show them). Thus, these are all the message connections we need to add.

We also need to add the services mentioned above: A ``Pass Student'' service should be added for the ``Course Involvement'' class, and a ``Pass'' service should be added for ``Course Involvement'' objects. A ``Get Existing Course'' service should be added for the ``Course'' class, and a ``Get Existing Student'' service should be added for the ``Student'' class.

It doesn't seem that any other message threads are needed for ``normal processing,'' for this event. Two errors were identified, above. If the first error (``Student isn't registered'') occurs, then there will be no ``Course Involvement'' object corresponding to the data available to the ``Course Involvement'' class, at step 6, above, so the above message thread will be ``truncated'' at that point, and a suitable response will be returned by the ``Course Involvement'' class to the ``Registrar'' object. If the second error (``Student has already passed'') has occurred, then the error will be detected by the ``Course Involvement'' object at step 7, above, and different responses will be returned. However, the ``message thread'' will be the same, except for that.
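The message thread and its two error cases can be sketched in code. This is only an illustration under stated assumptions: the interaction with the user is replaced by hypothetical `choose` functions that pick an item from a list, the text of the status messages is made up, and the method names are simply the service names from above written as Python identifiers (`pass_` is used because `pass` is a reserved word).

```python
class Course:
    known_ids = [("CPSC", 333)]

    @classmethod
    def get_existing_course(cls, choose):
        # `choose` stands in for the user selecting from a list (assumption).
        return choose(cls.known_ids)


class Student:
    known_ids = [1001, 1002]

    @classmethod
    def get_existing_student(cls, choose):
        return choose(cls.known_ids)


class CourseInvolvement:
    _objects = {}  # (ID number, course ID) -> CourseInvolvement object

    def __init__(self, id_number, course_id, status="registered"):
        self.status = status
        CourseInvolvement._objects[(id_number, course_id)] = self

    @classmethod
    def pass_student(cls, choose_course, choose_student):
        course_id = Course.get_existing_course(choose_course)
        id_number = Student.get_existing_student(choose_student)
        involvement = cls._objects.get((id_number, course_id))
        if involvement is None:
            # First error: the thread is truncated here.
            return "error: student isn't registered in course"
        return involvement.pass_()

    def pass_(self):
        if self.status != "registered":
            # Second error: detected by the object itself.
            return "error: student has already passed course"
        self.status = "passed"
        return "ok: student passed course"


class Registrar:
    def student_passes_course(self, choose_course, choose_student):
        # Start the thread and relay the response back to the user.
        return CourseInvolvement.pass_student(choose_course, choose_student)


def first(items):
    return items[0]


def last(items):
    return items[-1]


CourseInvolvement(1001, ("CPSC", 333))  # student 1001 is registered
registrar = Registrar()
```

Calling `registrar.student_passes_course(first, first)` twice exercises normal processing and then the second error, while `registrar.student_passes_course(first, last)` selects a student with no involvement in the course and exercises the first error.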

If the user is able to ``cancel'' operations, then it's likely that the above message thread can be truncated in various ways. However, it doesn't seem likely that we'll need to add additional message connections, or services, in order to deal with this.

A brief consideration of other events, that this system will need to respond to, suggests that we'll also need to add message connections from ``Registrar'' to ``Student,'' from ``Registrar'' to ``Course,'' from ``Course'' to ``Course Involvement'' (for safe deletion of courses, and full reporting of course information), and from ``Student'' to ``Course Involvement'' (for similar reasons). It's possible that no other message connections will be required. Quite a few additional services will be needed, in order to respond to additional events.

Finding Subjects (if Necessary)

As mentioned above, the class diagrams considered in this course will be too small for ``subjects'' to be useful. Thus, there is nothing to do here, for this example.

Department of Computer Science
University of Calgary

Office: (403) 220-5073
Fax: (403) 284-4707

eberly@cpsc.ucalgary.ca