CPSC 333: Structured Analysis



This material was covered during lectures on February 5-7, 1997.


Introduction

Structured Analysis is a method for constructing data flow diagrams (and supporting models) in order to model requirements for a system.

The main reference used to prepare these notes is

E. Yourdon
Modern Structured Analysis
Yourdon Press (a division of Prentice-Hall), 1989

Yourdon's book is on reserve in the library for use by students in this course.

This method for requirements analysis has changed drastically over the years; the method for construction of data flow diagrams given in this reference (and described in this course) is very different from the method that Yourdon describes as ``Classical Structured Analysis.'' ``Classical Structured Analysis'' was an attempt to develop the requirements specification ``from the top down,'' and recommended that an existing system (one that is to be replaced, rather than developed) be specified in detail first, if a detailed specification for it wasn't already available. Working strictly from the top down often proved to be overly difficult, and specifying a system in detail, when that system was to be replaced anyway, often proved to be intolerable.

A Process to Follow

The method Yourdon proposed to replace ``classical structured analysis'' dispenses with the detailed specification of a system that's to be replaced, and works more ``from the middle out'' than from the top down.

In this class, it will be assumed that we have either constructed an entity-relationship diagram for the new system already, or that we're attempting to do this at the same time as we're applying structured analysis in order to construct a set of data flow diagrams for the system.

This method will also be applied to an example in these notes.

A: Construct an Event List

An event is something that happens outside the system, that the system will be made aware of and for which the system must provide a response - by changing its internal data, generating output, or both.

We'll begin by generating a list of the events that correspond to the system that's to be specified.

To find events, check the problem statement for descriptions of occurrences that the system must know about and that will cause it to change its internal data, as well as descriptions of reports that the system will be expected to provide.

If an entity-relationship diagram for the system (that you're to specify) is available, then you should consider that, too. Try to find events for which the system's response would include creation, modification, and deletion of instances of each entity, weak entity, and associative object, creation and deletion of instances of each relationship, and events whose system response would require that all this information be read.

If this system is replacing an existing one, and the old system's ``reports'' are available, then these might be extremely useful. (Note, though, that one of the reasons for replacing the old system will be to modify these reports.)

If you've found k events, then you should assign a distinct ``event number'' between 1 and k to each. You should also assign each event a meaningful name. Since an event is something that happens outside the system (that the system will need to know about and respond to), you should give each event a simple name that describes the external activity that triggers system behaviour, rather than the system's internal response.

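It will be useful later on if you also identify and write down some additional information for each event. In the worked example later in these notes, this consists of each event's inputs, outputs, and error conditions, along with a note on its effects where that helps.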

At this point we are still modeling essential requirements and assuming that the technology used to implement the system is ``perfect.'' Therefore it isn't necessary, or desirable, to list errors the system might have made, at this point.

If it seems helpful (probably, because the information wouldn't be clear, otherwise) then you could give a very brief description of the system's response, at this point, as well.

The application of this step to an example is also available.
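One way to keep an event list tidy is to record each event as a small structured record. The following is a minimal sketch only: the class name `Event`, its field names, and the helper `number_events` are illustrative assumptions, not part of Yourdon's method. The two sample events are taken from the ``Student Information System'' example in these notes.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One entry in an event list; the field names are illustrative."""
    name: str  # names the external activity, not the system's response
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    error_conditions: list = field(default_factory=list)
    response: str = ""  # optional, very brief description of the response
    number: int = 0     # filled in by number_events


def number_events(events):
    """Assign the distinct event numbers 1..k, in list order."""
    for k, event in enumerate(events, start=1):
        event.number = k
    return events


event_list = number_events([
    Event("Student registers in the course",
          inputs=["ID number", "name"], outputs=["status message"]),
    Event("Student passes the course",
          inputs=["ID number"], outputs=["status message"]),
])
for event in event_list:
    print(event.number, event.name)
```

Numbering is done in one place, after the list is complete, so that the numbers are guaranteed to be distinct and to run from 1 to k.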

B: Add Event-Level Processes and Terminators

In this step and the next one, a single, huge, data flow diagram that represents the entire system will be developed.

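``Step B'' is extremely simple (and almost completely mechanical). For each event in the event list that was generated in the previous step, add a process that is responsible for the system's response to that event, add the terminators that supply the event's inputs and receive its outputs (if they aren't on the diagram already), and draw the data flows between the process and these terminators.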

This step has also been applied to the ``Student Information System'' example.
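Because this step is almost mechanical, it can be sketched as a short loop over the event list. This is only one plausible reading of the step, under stated assumptions: each event is represented by a dictionary with the hypothetical keys `number`, `process`, `terminator`, `inputs`, and `outputs`, and a data flow is a `(source, destination, data)` triple.

```python
def add_event_level_processes(events):
    """Build the process, terminator, and flow lists for step B.

    Each event is a dict with hypothetical keys: 'number', 'process',
    'terminator', 'inputs', and 'outputs'.
    """
    processes = {}       # event number -> process name
    terminators = set()  # each terminator appears once
    flows = []           # (source, destination, data) triples
    for event in events:
        processes[event["number"]] = event["process"]
        terminators.add(event["terminator"])
        # Inputs flow from the terminator to the event-level process.
        for data in event["inputs"]:
            flows.append((event["terminator"], event["process"], data))
        # Outputs flow from the process back to the terminator.
        for data in event["outputs"]:
            flows.append((event["process"], event["terminator"], data))
    return processes, terminators, flows


events = [
    {"number": 1, "process": "Register Student", "terminator": "Instructor",
     "inputs": ["ID number", "name"], "outputs": ["status message"]},
    {"number": 6, "process": "List Registered Students",
     "terminator": "Instructor", "inputs": [],
     "outputs": ["registered students"]},
]
processes, terminators, flows = add_event_level_processes(events)
```

Note that no data flows between processes are produced here; at this stage each process only talks to terminators.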

C: Add Data Stores and Communication with Them

Add data stores needed to remember information created by the system's response to one (asynchronous) event and accessed or modified during the system's response to later events.

This information will usually (but not always) be accessed or modified during the system's response to two or more kinds of events. In that case, once data flows between data stores and processes have been added, the data store will appear to be ``shared'' by two or more of the ``event-level'' processes that were created during the previous step.

Add data flows between the data stores and processes, for creation, modification, deletion, and reading of information in the data store (as needed). The event list will be useful here, since it includes the information you need to decide which stores need to be accessed or modified in order to respond to events, as well as which stores need to be accessed for thorough error checking.

If an entity-relationship diagram has been developed for the system then an ``initial'' set of data stores can be derived from this diagram - there should be a data store for each component on the entity-relationship diagram, as previously described. Additional stores corresponding to a small (fixed) amount of information, rather than a data table, might need to be added.

Once you've added all the above data flows, it should be clear that it's possible to create, read, and delete instances of what's contained by each data store, as part of the system's response to some event. It should (probably) be possible to modify instances that are in data stores corresponding to entities, weak entities, and associative objects, as well. If this isn't the case, then both the entity-relationship diagram and the event list that's been prepared during structured analysis should be checked again.
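The check described above (every data store can be created, read, and deleted by some process) is easy to automate once the flows are written down. The sketch below assumes a hypothetical representation in which each access is a `(process, store, operation)` triple; the specific accesses listed for the ``Students'' store are an assumption inferred from the example's event list, not copied from a diagram.

```python
def missing_operations(accesses, stores, required=("create", "read", "delete")):
    """Return {store: set of required operations that no process performs}."""
    performed = {store: set() for store in stores}
    for _process, store, operation in accesses:
        performed.setdefault(store, set()).add(operation)
    gaps = {}
    for store, operations in performed.items():
        missing = set(required) - operations
        if missing:
            gaps[store] = missing
    return gaps


# Assumed accesses for part of the Student Information System example.
accesses = [
    ("Register Student", "Students", "read"),
    ("Register Student", "Students", "create"),
    ("Pass Student", "Students", "read"),
    ("Pass Student", "Students", "modify"),
    ("Withdraw Student", "Students", "read"),
    ("Withdraw Student", "Students", "delete"),
]
print(missing_operations(accesses, ["Students"]))
```

A non-empty result is a hint that the entity-relationship diagram and the event list should be checked again. To include modification in the check for stores that correspond to entities, pass `required=("create", "read", "modify", "delete")`.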

This step has been applied to the ``Student Information System'' example, below.

D: Level Up Toward the Context Diagram

At this point, you have a data flow diagram that describes the entire system. Unfortunately, it will be huge, and it will include processes that haven't yet been described using process specifications. It's likely that some of these processes are really too complex to be described using a reasonably sized process specification, as well - so we're not finished yet.

To begin this next step, we'll use the huge data flow diagram that we've just generated, as a ``first draft'' for the level 1 diagram in the leveled set that we'll eventually produce.

While this diagram is too large - that is, it includes more than seven data stores and processes - we'll perform the following steps.

  1. Choose some small set of the processes in this diagram that are logically related.
  2. Choose a name for a new, ``general'' process that includes all the processes in the set you've just chosen.
  3. Replace the set of processes in the level 1 diagram by the new, more general, process you've created.
  4. Create a lower level diagram ``refining'' the new process, that includes all the processes in the set you've just removed from the level 1 diagram.
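  5. Now, move the data stores. For each data store S, move S down to the new lower level diagram if every process that communicates with S has been moved there; otherwise, leave S on the level 1 diagram. Either way, there should still be exactly one occurrence of S, somewhere on the data flow diagrams.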
  6. Use conservation of flow to add data flows to and from the new process on the level 1 diagram (based on the data flows to and from the processes you've moved to a lower level).
  7. Renumber the processes on the level 1 diagram, all the processes you've just moved down to a new level 2 diagram, and all the processes (directly or indirectly) below those, so that your diagrams still use the numbering scheme that has already been described for processes on leveled sets of DFDs.
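Steps 3 to 6 above can be sketched in code. This is a rough illustration, not a definitive algorithm: the function name `level_up`, the `(source, destination, data)` representation of flows, and the flow label ``student'' are all assumptions. The rule used for data stores (a store moves down only if every process that communicates with it has moved down) is also an assumption, although it matches what happens to the ``Students'' store in the example in these notes.

```python
def level_up(flows, group, new_process, stores):
    """Replace the processes in `group` (a set) by `new_process`.

    flows  -- set of (source, destination, data) triples on the big diagram
    stores -- set of data store names
    Returns (upper_flows, lower_flows, stores_moved_down).
    """
    # Assumed rule: a store moves down only if every process that
    # communicates with it is in the group.
    moved = set()
    for store in stores:
        partners = ({src for src, dst, _ in flows if dst == store}
                    | {dst for src, dst, _ in flows if src == store})
        if partners and partners <= group:
            moved.add(store)

    lowered = group | moved
    upper, lower = set(), set()
    for src, dst, data in flows:
        src_in, dst_in = src in lowered, dst in lowered
        if src_in or dst_in:
            lower.add((src, dst, data))  # shown on the lower level diagram
        if not (src_in and dst_in):
            # Conservation of flow: flows crossing the boundary are
            # re-attached to the new, more general process.
            upper.add((new_process if src_in else src,
                       new_process if dst_in else dst, data))
    return upper, lower, moved


flows = {
    ("Instructor", "Register Student", "ID number"),
    ("Instructor", "Register Student", "name"),
    ("Register Student", "Instructor", "status message"),
    ("Register Student", "Students", "student"),
    ("Instructor", "Pass Student", "ID number"),
    ("Pass Student", "Instructor", "status message"),
    ("Students", "Pass Student", "student"),
    ("Pass Student", "Students", "student"),
    ("Instructor", "Delete Old Info", "ID number"),
    ("Delete Old Info", "Students", "student"),
}
group = {"Register Student", "Pass Student"}
upper, lower, moved = level_up(flows, group, "Record Students' Progress",
                               {"Students"})
```

Because flows are kept in a set, two flows that become identical after grouping (such as the two ``ID number'' inputs here) collapse into a single flow on the upper diagram.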

Once all this has been done, so that the level 1 diagram contains at most seven data stores and processes, you should finish this step by creating a context diagram and moving the terminators to it.

This step is applied to Version One of the ``Student Information System'' example, below.

E: Level Down to Process Specifications

At this point, the ``bottom level'' processes on the existing diagrams aren't really described in any way. Some of them may be too complex to be described using simple process specifications, so it's possible that additional data flow diagrams will be needed.

To solve these problems, do the following, until there are no ``unspecified'' processes left in the diagrams.

Choose a process P that doesn't yet have either a lower level diagram or a process specification (if there's more than one of these left, any choice of P will do).

  1. Try to create a process specification for P. It should be possible to do this - although you might discover that you need more information about what the process really should do (and you may need to ask potential system users for these details). Don't spend too much time on the process specification at this point, since you'll have the chance to clean it up later.
  2. If the resulting process specification is short (at most two or three pages long, and ideally even shorter) and simple, then use this process specification, and remove process P from your set of ``unspecified'' processes. However: if you discover at this point that P needs internal ``memory'' in order to function (that is, it must remember information left over from times it's been called previously, in order to work correctly in the future), then you should create a new data store that P ``talks to,'' and then revise the specification of P so that it really is a ``memoryless'' process, before you're done with it.
  3. Otherwise, since P is too complicated to be specified by a process specification of reasonable size, you should create a new lower level data flow diagram that refines process P. Now, you can remove process P from your set of ``unspecified'' processes, but you must add all the processes in the lower level diagram that ``refines P'' to this set.
  4. Working from your ``draft process specification'' for P, create draft process specifications for each of the processes in the lower level diagram that refines P (and throw the old ``process specification for P'' away).

Since this causes new processes (which need specifications) to be added at the same time as old ones are dealt with, it might not seem obvious that progress is being made. However, the new processes being created are all simpler than the old ones they replace, so you'll find that more and more of them can be specified with simple process specifications, and eventually the above procedure will terminate.
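The procedure above is a worklist loop, and its termination argument is easier to see in code. In the sketch below, the callables `draft_spec`, `is_simple`, and `refine` stand in for the analyst's judgment and are purely hypothetical; the toy example models a specification as a list of steps and calls it ``simple'' when it has at most three.

```python
from collections import deque


def level_down(processes, draft_spec, is_simple, refine):
    """Repeat step E until no "unspecified" processes are left.

    Returns (specs, children): the accepted process specifications, and
    for each refined process, the processes on its lower level diagram.
    """
    unspecified = deque(processes)
    specs, children = {}, {}
    while unspecified:
        p = unspecified.popleft()          # any choice of P will do
        spec = draft_spec(p)
        if is_simple(spec):
            specs[p] = spec                # keep the short specification
        else:
            children[p] = refine(p, spec)  # the draft for P is discarded
            unspecified.extend(children[p])
    return specs, children


# Toy stand-ins: a process is a (name, steps) pair.
def draft_spec(process):
    return list(process[1])


def is_simple(spec):
    return len(spec) <= 3


def refine(process, spec):
    name, mid = process[0], len(spec) // 2
    return [(f"{name}.1", tuple(spec[:mid])), (f"{name}.2", tuple(spec[mid:]))]


specs, children = level_down([("1", tuple("abcdefg"))],
                             draft_spec, is_simple, refine)
```

The loop terminates here because each refinement produces strictly shorter specifications; in real analysis the corresponding assumption is that each new process is simpler than the one it replaces.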

How Should Processes be Decomposed?

Use your draft process specification for P as a guide.

It may be possible to take a ``functional'' approach: Try to identify major ``subtasks'' in the process specification for P, and add processes for each of these in your lower level diagram.

Consider also the ``flow of data'' within P. You might decide to add processes that validate syntax, and perform necessary cross checking, for individual inputs, or for related sets of inputs. You might also decide to add one or more processes that construct and format one or more of the outputs.

Note that, until now, we haven't added data flows directly between processes, because all the processes we've seen so far have been responsible for dealing with separate asynchronous events. Therefore we haven't been able to depend on the processes being active at the same time.

Now that we're refining a process responsible for handling a single event, this isn't the case. You might be able to assume that two (or more) processes are active at the same time - so that these processes can communicate directly, and so that you can include data flows directly between them.

Finally, you may notice at this point that you're specifying ``the same process'' over and over again, because the same process is part of several of the event level processes. For example, a process for validating the syntax of a given input is probably part of every ``event level'' process that can receive that input.

Don't try to ``combine the processes on the diagrams'' at this point; you'll create a mess. However, it will be acceptable if you develop a reasonable process specification for one ``copy'' of the process, and then refer back to it in the ``process specifications'' for all the other copies of this. Later on, during system design, you'll be able to deal with this problem more effectively.

This is all applied to Version One of the ``Student Information System,'' below.

F: Complete the Process Specifications and Data Dictionary

``Clean up'' your process specifications, and add any additional detail (that you feel is necessary) to them. Extend the existing data dictionary for the entity-relationship diagram (if one is available) to add definitions for data stores and data flows, as previously described.

Finally, this is applied to the ``Student Information System'' example, below.

Application of This Process to a Problem

Now, we'll apply the above method in order to produce a set of data flow diagrams, and supporting data dictionary and process specifications, for Version One of the Student Information System. We'll begin with a problem statement, entity-relationship diagram, and a data dictionary for the entity-relationship diagram for this system.

A: Construction of an Event List

Here are the results of applying the first step of the above process to this problem. Of course, if two different people or groups performed this step, there might be minor differences in what they generated: events would certainly be named differently and, depending on what questions the analysts thought to ask, a few of these events might have been missed and one or two more might have been added. However, they'd probably end up with something like what's given below.

The only user of this system will be the ``Instructor'' of the course. Thus the ``Instructor'' is the terminator who will supply all inputs and receive all outputs.

An inspection of the problem statement suggests the following events.


Event Number: 1
Event Name: Student registers in the course
Inputs: ID number, name
Outputs: status message
Error Conditions:
  1. Syntactically incorrect input(s)
  2. The given ID number is already being used (that is, there's an instance of ``Student'' with this ID number in the system's data tables, already)

Event Number: 2
Event Name: Student passes the course
Inputs: ID number
Outputs: status message
Error Conditions:
  1. Syntactically incorrect input
  2. The given ID number is not in use
  3. The given ID number corresponds to a student who has already passed the course

Event Number: 3
Event Name: Student withdraws from the course
Inputs: ID number
Outputs: status message
Error Conditions:
  1. Syntactically incorrect input
  2. The given ID number is not in use
  3. The given ID number corresponds to a student who has already passed the course
Effects: Information about this student is deleted

Event Number: 4
Event Name: Instructor requests deletion of old information
Inputs: ID number
Outputs: old info, status message
Error Conditions:
  1. Syntactically incorrect input
Effects: All students whose ID numbers are less than or equal to the given one are deleted.

Event Number: 5
Event Name: Instructor requests student information
Inputs: ID number
Outputs: student info, status message
Error Conditions:
  1. Syntactically incorrect input
  2. ID number is not in use

Here are two more events that aren't mentioned in the problem statement. These might have been added after inspection of the reports that were generated by a previous system, or after conversations with the course instructor.


Event Number: 6
Event Name: Instructor requests a list of students who are registered in the course
Inputs: No inputs are required.
Outputs: registered students
Error Conditions: None.

Event Number: 7
Event Name: Instructor requests a list of students who have passed the course
Inputs: No inputs are required.
Outputs: passed students
Error Conditions: None.

In fact, the instructor will have to do something in order to make the requests mentioned in the sixth and seventh events, and the instructor will also need to do something besides supplying an ID number, when several of the other events occur (so that the system knows which event has taken place). We won't worry about these things now; they'll be dealt with later on in requirements analysis, when a human-computer interface is developed for the system.

Finally, the problem statement mentions at least one ``occurrence'' that you might think should belong to the above list - namely, a student's failing a course. However, the problem statement makes it clear that a student who fails is automatically registered in the course over again, and that there's no need for the system to keep track of how many times the student has failed (or even whether the student has failed in the past, at all). So, there's no need for the system to know about this occurrence, and therefore it shouldn't be listed as an ``event.''

B: Event-Level Processes and Terminators

To apply step B, we'll choose the following process names to correspond to the above events.

  1. Register Student
  2. Pass Student
  3. Withdraw Student
  4. Delete Old Info
  5. Display Student Info
  6. List Registered Students
  7. List Passed Students

Now, processes will be created for each. The only supplier of input and receiver of output is the ``Instructor,'' so there will be a single terminator, called ``Instructor,'' on the data flow diagram. Once data flows between the above processes and the terminator have been drawn in, the data flow diagram will look as follows.

Picture of Partial DFD

C: Data Stores and Communication with Them

To apply step C to this example, we'll first inspect the entity-relationship diagram for this system. This contains only one entity, ``Student,'' so we'll begin by adding one data store, ``Students,'' to the data flow diagram. Checking the events list for the system confirms that no other data stores are needed to store any information that is required in order for the system to respond correctly to these events. Thus, no other data stores will be added.

Once data flows between processes and the data store have been added, the data flow diagram will appear as follows. Note that an instance of ``student'' is read (as well as modified or deleted) by processes 2 and 3; this is necessary in order to check for the errors that were listed, in the above events list, for the corresponding events.

Picture of Large DFD

D: Leveling Up to the Context Diagram

The above diagram is too large to serve as the Level 1 diagram, so we'll begin to improve it by choosing logically related processes - processes #1, 2, and 3 - and replacing them by a single, more general, process - ``Record Students' Progress.''

We'll now move the old processes #1, 2, and 3, down to a new ``level 2'' diagram. Since one or more of the old processes that are left on the ``level 1'' diagram communicate with the ``Students'' data store, we won't be able to move this data store down to level 2 at this point.

After we've used conservation of flow to determine inputs and outputs to the new process, ``Record Students' Progress,'' and after we've renumbered the processes, the new ``level 1'' diagram will look as follows.

Picture of New Level 1 DFD

The new level 2 diagram will look like the following.

Picture of New Level 2 DFD

Since the current ``level 1'' diagram has only six processes and data stores, we could stop at this point. However, we'll take one more step, by replacing the processes currently numbered 3, 4, and 5 with a single process, ``List Student Course and Info,'' and creating another level 2 data flow diagram that refines this and includes the processes we've just moved off level 1.

Now, the level 1 diagram definitely is simple enough to suffice. We'll finish things up by creating a context diagram (and moving the terminator to it), as described above. The resulting context diagram is as follows:

Context Diagram

The level 1 diagram now looks as follows (with the processes renumbered in a way you might not have expected, partly to make the point that the following new numbering would be acceptable, but also partly because the instructor blew it and didn't want to spend more time renumbering bubbles on the diagrams that had already been prepared).

Picture of Level 1 DFD

Finally, the two level 2 diagrams that have been created by now are as follows.

Picture of First Level 2 Diagram

Picture of Second Level 2 Diagram

E: Leveling Down to Process Specifications

When step E of the above process is applied to this example, you'll draft process specifications for processes #1.1, 1.2, 1.3, 2.1, 2.2, 2.3, and 3 in the above diagrams (in some order). You'll probably find (as I did) that all seven of the resulting process specifications are short and simple enough to be acceptable. Thus, it isn't necessary to create any additional data flow diagrams (or processes) for this example.
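The process numbers listed above follow the usual convention for leveled sets of DFDs, in which the processes that refine process n are numbered n.1, n.2, and so on. The sketch below generates such numbers from a tree of processes. The numbering scheme itself is described elsewhere in the course, so this is an assumption that merely agrees with the numbers above; the assignment of names to numbers is also an assumption, since the final diagrams use a numbering that isn't reproduced in the text.

```python
def number_processes(tree, prefix=""):
    """Yield (number, name) for each process in a leveled set of DFDs.

    tree is a list of (name, children) pairs, where children is a list
    of the same shape (empty for a bottom-level process).
    """
    for position, (name, children) in enumerate(tree, start=1):
        number = f"{prefix}{position}"
        yield number, name
        yield from number_processes(children, prefix=number + ".")


# Assumed ordering of the level 1 processes and their refinements.
tree = [
    ("Record Students' Progress", [
        ("Register Student", []),
        ("Pass Student", []),
        ("Withdraw Student", []),
    ]),
    ("List Student Course and Info", [
        ("Display Student Info", []),
        ("List Registered Students", []),
        ("List Passed Students", []),
    ]),
    ("Delete Old Info", []),
]
numbered = list(number_processes(tree))
# Bottom-level processes are those whose number is not a prefix of another.
bottom_level = [num for num, _ in numbered
                if not any(other.startswith(num + ".") for other, _ in numbered)]
print(bottom_level)
```

The bottom-level numbers are exactly the processes that still need process specifications.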

The process specifications will be made available below.
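To give a feel for how short such a specification can be, here is a hedged sketch of what a specification for ``Register Student'' might look like if written as code. It is not the course's actual process specification. The syntax rules (a digits-only ID number and a non-empty name), the wording of the status messages, and the `passed` attribute are all assumptions. The data store is passed in explicitly, so the process itself stays ``memoryless.''

```python
def register_student(students, id_number, name):
    """Memoryless sketch of a "Register Student" process.

    students is the data store: a dict mapping ID number -> student
    record. Returns the status message sent back to the Instructor.
    """
    # Error condition 1: syntactically incorrect input(s).
    # Assumed syntax rules; the real ones belong in the data dictionary.
    if not (id_number.isdigit() and name.strip()):
        return "error: syntactically incorrect input"
    # Error condition 2: the given ID number is already being used.
    if id_number in students:
        return "error: ID number already in use"
    students[id_number] = {"name": name.strip(), "passed": False}
    return "ok: student registered"


students = {}
first = register_student(students, "1234", "A. Student")
second = register_student(students, "1234", "B. Student")
third = register_student(students, "12x4", "C. Student")
```

The two error checks correspond directly to the error conditions listed for event 1 in the event list.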

F: Completion of Process Specifications and Data Dictionary

The data dictionary that would be obtained by applying step F of the above procedure is available. To see the process specifications you could get by applying this process, click on the bubble in the picture of the context diagram that appears below, keep clicking until you've reached the bottom-level process that you're interested in, and then click once more to see its specification.

Clickable Picture of Context Diagram

Try a Lab Exercise

A lab exercise based on this material is available.



Department of Computer Science
University of Calgary

Office: (403) 220-5073
Fax: (403) 284-4707

eberly@cpsc.ucalgary.ca