CPSC 333: Creation of a ``First Cut'' Structure Chart

This material was covered during lectures on February 26, 1997.

Overview
Creation of One Large Data Flow Diagram
Types of Information Flow
Processing Diagrams with ``Independent Subsystems''
Processing Diagrams with ``Transaction Flow''
Processing Diagrams with ``Transform Flow''
Processing Trivial Diagrams
Adding Data Areas and Data Flows

Overview

In this first step of Structured Design, a set of data flow diagrams and process specifications, and (if available) an entity-relationship diagram for the system, are used to produce a ``first cut'' (or ``first draft'') structure chart.

This step is composed of numerous decisions, and straightforward procedures to follow for each possible outcome. Its goal is the production of a structure chart that isn't necessarily very good, but that does correspond to the requirements specification that's been given, and that can be refined or improved in a later step of Structured Design, to produce a design that can be implemented, tested, and maintained economically, for a long period of time.

Here is some good news about the method described below: It's not the end of the world if you make the ``wrong'' choice for some, or even many, of the decisions that are included in this first step. In the second stage of the method, you'll get a chance to look for and correct any problems that might have been introduced earlier on. So, if in doubt (in this first step), you should make the best decision you can without agonizing over it very much, document that decision, and then continue on.

It should be noted that this method won't work well if you apply it to a diagram that only models ``essential'' requirements - so that, in particular, a human-computer interface hasn't been included yet. Of course, this isn't really a problem, since you shouldn't be trying to create modules or determine their control structure until after interface and performance requirements have been considered!

Here is an outline of this first step. Each of the steps given here will be described in more detail, below, and the method will be applied to an example on a separate page.

Produce one large data flow diagram representing the entire system that is to be designed.
In this second part of the first step, a recursive process is used to map each of the processes in the data flow diagram to a module in a structure chart.
1. Identify the type of information flow shown on the diagram, as a whole. We'll consider four cases:
  1. Independent Subsystems
  2. Transaction Flow
  3. Transform Flow
  4. Trivial Diagram (at most three or four processes)
  The first is something that Yourdon and Constantine don't mention (neither do Page-Jones, or Pressman), but it does seem to arise from time to time, and is worth considering. The next two were identified by Yourdon and Constantine and are discussed in most references on Structured Design. The last is a ``base case'' needed for termination of the method.
2. Partition the data flow diagram: That is, split it up into pieces, based on the functions of the processes shown on the diagram and the type of ``information flow'' that was identified in the previous step.
3. Perform first level factoring: Create a set of ``controller'' modules that will appear at the top of the structure chart (or the ``subchart'') that is currently being developed.
4. Perform second level factoring: Create modules corresponding to some of the processes in the data flow diagram, connecting them to modules that have already been added. The remaining processes will belong to ``subdiagrams'' that will be processed using this second part of the first step of Structured Design, recursively, to produce ``subcharts'' whose main modules will each be connected to one of the ``controller'' modules added during ``first level factoring,'' above.
Add data areas and data flows to the structure chart.

Creation of One Large Data Flow Diagram

This should be easy, if you have a leveled set of data flow diagrams available.

Start with the context diagram, then repeat the following step for as long as possible (that is, as long as processes that are specified by lower level diagrams exist on the diagram you've obtained).

Choose a process on your diagram that's specified by a lower level data flow diagram, instead of a process specification. ``Cut and paste,'' replacing this process by the simpler processes (and any data stores) in the lower level data flow diagram. If the leveled set of data flow diagrams was correctly prepared then the conservation of flow rule was followed, so you should be able to ``match up'' the data flows to and from the process being replaced with the data flows into and out of the processes you're adding to replace it.

It should now be the case that all the processes on your diagram are refined by process specifications, and not by lower level data flow diagrams. You're ready for the next step.
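
The ``cut and paste'' loop above can be sketched in Python. This is only an illustration under assumptions that aren't part of these notes: the leveled set is represented by a dictionary (called refinements here) that maps the name of a process to the list of processes on its lower level diagram, the function name flatten is made up, and data flows and data stores are left out entirely. Apart from the name of the ``Student Information System,'' the process names are hypothetical.

```python
def flatten(processes, refinements):
    """Replace every process that is refined by a lower level diagram
    with the processes of that diagram (hypothetical representation)."""
    result = list(processes)
    # Repeat for as long as some process is specified by a lower level diagram.
    while any(p in refinements for p in result):
        expanded = []
        for p in result:
            # "Cut and paste": a refined process is replaced by its children;
            # a process with a process specification is kept as it is.
            expanded.extend(refinements.get(p, [p]))
        result = expanded
    return result


# Hypothetical leveled set; the context diagram has a single process.
refinements = {
    "Student Information System": ["Start Up", "Process Commands"],
    "Process Commands": ["Get Command", "Execute Command"],
}
flat = flatten(["Student Information System"], refinements)
print(flat)
```

When the loop stops, every process that remains is one that has no entry in refinements, which corresponds to a process that is refined by a process specification. The sketch assumes that no process is (directly or indirectly) refined by itself.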

Types of Information Flow

At the beginning of this step, you have a data flow diagram (possibly the one you started with, or possibly a piece of it you obtained by carving up the original one) that you wish to process. Inspect the diagram and try to identify (exactly) one of the following four kinds of diagrams, and kinds of ``information flow,'' that you have.

Independent Subsystems

Choose this case if the diagram is ``nontrivial'' - it includes four or more processes - and it is easy to split it up into pieces that represent ``subsystems,'' such that there is virtually no (direct) communication between subsystems, and the entire system works by invoking each of these subsystems, one at a time.

In the ``Student Information System'' that will eventually follow, we'll see that the diagram we start with has this property. In particular, it will include a ``startup'' process that can be considered to be one subsystem, which reads system data from a text file, and initializes data areas - and which is active only once, when the system is initiated. All the other processes in the system form a subsystem that repeatedly requests and obtains a user's command and executes it. This second ``subsystem'' is only activated after the first ``subsystem'' terminates. While the two subsystems share information indirectly, in the sense that the second subsystem uses the data areas that the first set up, the only direct communication between the two ``subsystems'' is the transmission of a control signal from the first subsystem to the second.

Transaction Flow

A nontrivial data flow diagram exhibits transaction flow if there is one process in the diagram - a transaction center - that results in multiple data streams flowing out of the transaction center. Each of these data streams corresponds to a major subsystem. Typically, each time the transaction center is activated, it responds by triggering activity in exactly one of these data streams.

As this may suggest, a data flow diagram exhibiting transaction flow often has a noticeable shape: One or more streams of ``input paths'' or command acquisition paths lead into the transaction center, and several activity paths stream or radiate out of the transaction center (so that the transaction center looks a bit like the hub of a wheel, or the top of the root of a plant).

The top (or second) level diagrams for transaction processing systems often exhibit transaction flow. These are systems that receive commands from users, when there is a fixed number of different kinds of commands that can be selected, and each of the different kinds of commands will be executed by a different subsystem (with each subsystem corresponding to one of the ``data streams'' or ``activity paths'' that radiate out of the transaction center on the system's data flow diagram).

Examples of ``transaction processing systems'' are an automated teller (that is, banking machine), or any (other) menu based system, which starts by displaying a menu of possible commands to a user and requiring the user to choose one of those commands.

Recall that the system's data flow diagrams might be changed as details about the human computer interface are added. The ``transaction center'' may well be one of those processes that don't appear on the first data flow diagrams (for ``essential requirements'') but that are added to the data flow diagrams in order to model the interface.

Transform Flow

A data flow diagram exhibits transform flow if the subsystem it represents is centered around a central transform - a process or a set of processes. Incoming data are collected for processing by chains of processes leading from terminators into the central transform; these processes prompt for data, check for syntax and semantical errors (and report these problems if they're discovered), and may also convert the validated inputs into ``logical'' or ``internal'' formats. Nontrivial computations using the collected (and validated) inputs are then performed in the transform center. After that, output flows from the transform center (and is formatted for display) along chains of processes toward the terminators receiving them.

In fact, almost any reasonable data flow diagram can be processed as if it exhibited transform flow: The other kinds of information flow (and kinds of data flow diagrams) can all be considered to be ``special cases'' of this one. Thus, here is an easy rule to follow in order to identify the type of information flow for a data flow diagram: If, based on an inspection of the system and the above descriptions of independent subsystems and transaction flow, you can convince yourself that one of these two cases applies, then use it. Otherwise, choose ``transform flow'' as a default (and continue).

A Trivial Diagram

This is any diagram with (approximately) three or fewer processes.
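
The rule for choosing a case can be summarized as a small Python function. This is a sketch only: the function name and its parameters are invented for this illustration, the two flags stand for judgements that the designer makes by inspecting the diagram, and the cutoff of three processes is the approximate one given above.

```python
def classify_flow(process_count, independent_subsystems=False,
                  transaction_center=False):
    """Pick exactly one of the four cases for a data flow diagram.

    The two flags are the designer's own judgements about the diagram;
    nothing here inspects the diagram itself."""
    if process_count <= 3:
        return "trivial diagram"          # base case of the recursion
    if independent_subsystems:
        return "independent subsystems"
    if transaction_center:
        return "transaction flow"
    return "transform flow"               # default for nontrivial diagrams


print(classify_flow(2))
print(classify_flow(9, transaction_center=True))
print(classify_flow(9))
```

The order of the tests mirrors the text: the trivial case is checked first, the two special cases are used only when the designer is convinced that one applies, and transform flow is the default.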

Processing Diagrams with ``Independent Subsystems''

Recall that you chose this type because the diagram was nontrivial and it was easy to break it up into pieces that represented (almost) independent subsystems.

Partitioning the Diagram

Partition the processes on the diagram into sets that correspond to the ``subsystems'' you identified. That is, choose a ``subdiagram'' for each subsystem, that includes all the processes in the subsystem. You should end up with each one of the processes on the original diagram shown on exactly one of the ``subdiagrams'' you've generated.

First Level Factoring

If necessary, create a main module for the system; give this a name that describes the system (or subsystem) it controls, but note as well that you'll probably want to use this as the name for the function that will implement it, in the computer program you eventually write.

It will be necessary to create a main module if you've just started - so that you haven't yet created any modules at all. It usually won't be necessary otherwise, because there will usually already be a module in the structure chart you've generated, that serves as the ``main module'' for this part of the system.

Second Level Factoring

Now, recursively apply this whole second part of ``Step 1'' (that is, the process to map a large data flow diagram to a set of modules), to each one of the ``subdiagrams'' you just created. Add a control connection (drawn as an arrow) from the ``main module'' you either created or identified during ``first level factoring,'' to the main module for each of the structure charts you've recursively generated during ``second level factoring,'' here.
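
As a rough Python sketch of first and second level factoring for this case: the structure chart is represented (as an assumption of this sketch) by a dictionary that maps each module to the list of modules it calls, and the recursive application of the method is passed in as a function, here called build_subchart. The stand-in single_module used in the example is hypothetical and simply treats each subdiagram as a one-process diagram; the names ``Start Up'' and ``Process Commands'' are made up as well.

```python
def factor_independent_subsystems(main_module, subdiagrams, build_subchart):
    """Connect a main module to the main module of each subchart.

    build_subchart(subdiagram) stands for the recursive application of
    the method; it must return (main module, call connections)."""
    calls = {main_module: []}
    for subdiagram in subdiagrams:
        sub_main, sub_calls = build_subchart(subdiagram)
        # Copy the connections of the subchart into the chart being built.
        for module, children in sub_calls.items():
            calls.setdefault(module, []).extend(children)
        # Control connection from the main module to the subchart's main module.
        calls[main_module].append(sub_main)
    return calls


def single_module(subdiagram):
    # Hypothetical stand-in for the recursion: one process, one module.
    name = subdiagram[0]
    return name, {name: []}


calls = factor_independent_subsystems(
    "Student Information System",
    [["Start Up"], ["Process Commands"]],   # hypothetical subdiagrams
    single_module,
)
print(calls["Student Information System"])
```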

Example

As mentioned above, this method will be applied to a large example on a separate page. This example will include the processing of a DFD with independent subsystems.

Processing Diagrams with ``Transaction Flow''

Recall that you identified this type because you found one or more ``command acquisition'' paths leading into a ``transaction center,'' and identified several ``activity paths'' leading out of it.

Partitioning the Diagram

Identify the transaction center, and identify those processes that are on the ``command acquisition path'' (or paths).

Partition the remaining processes into a set of subdiagrams, such that you have a subdiagram for each ``activity path.''

When you're done, you should find that each process except the transaction center is either a ``command acquisition'' process (on the ``command acquisition path''), or it is on at least one of the subdiagrams that correspond to activity paths.

If you developed your data flow diagrams using structured analysis then you should find that each process (that isn't the transaction center or on the command acquisition path) is in exactly one of the subdiagrams for activity paths - because, as it's been described here, structured analysis doesn't include a phase in which you look for similar processes and then combine them into a single one that's reused.

If you did use a single ``reusable'' process instead of multiple ``copies'' when creating your data flow diagrams, then you might discover that your activity paths must ``share'' some of the reusable processes you created.

In this case, the easiest way to apply the following rules is probably to duplicate the processes all over again, so that each activity path is disjoint (and each activity path has a ``copy'' of the process if it needs it). You'll have a chance, during a later design stage, to eliminate the resulting redundant processes, and it will probably be easier to take care of this then, than it would be to try to produce a ``first cut'' structure chart with reuse.

Therefore, for the rest of these notes, it will be assumed that the ``activity paths'' are disjoint (that is, they don't share any processes).

First Level Factoring

Create a ``main module'' for the (sub)system you're now examining. Create a ``controller'' module that the main module can call (and therefore that the main module ``controls''). This new ``command acquisition controller'' will be the main module for a subsystem responsible for acquiring the user's command(s). Again, choose names for these modules that describe the responsibilities of the (sub)systems they control.

Create one more module - one that corresponds to the data flow diagram process you identified as the ``transaction center.'' This module should be called by the main module, as well. Use the name of the process as the name of the module that's been created for it.

Second Level Factoring

Now, we'll map each of the ``command acquisition'' processes to modules on the structure chart.

Recall that these processes form one or more paths that obtain a command from a user (or perhaps from another system), that might validate this command (checking for syntax errors, etc.), and that eventually pass the command(s) to the transaction center.

In order to create modules, we'll work backwards (as far as the flow of data is concerned), starting with the processes that pass information directly to the transaction center, and working outwards towards the system.

In particular: create a module for each command acquisition process that passes data directly to the transaction center, and make each of these modules a child of (i.e., have it controlled by) the controller in charge of ``command acquisition,'' that was created during first level factoring.

Then, choose a command acquisition process that doesn't have a module yet, and that passes data directly to at least one of the command acquisition processes for which modules have been created. Add a module for this process, and show that it can be called by each of the modules for processes that receive data from it. Keep doing this until no command acquisition processes are left.

For each of these modules (that correspond directly to processes) use the name of the process the module corresponds to, as the name of the module, as well.
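
The backwards walk described above can be sketched in Python. Several things here are assumptions made for the illustration: the function name, the representation of data flows as (source, destination) pairs, the representation of the chart as a dictionary from each module to the modules it calls, and all of the process and module names in the example. The list of flows is assumed to contain only flows between command acquisition processes and flows from them into the transaction center, with no cycles.

```python
from collections import deque


def factor_acquisition_paths(flows, transaction_center, controller):
    """Create modules for command acquisition processes, working
    backwards from the transaction center."""
    calls = {controller: []}
    queue = deque()
    # Processes that pass data directly to the transaction center become
    # children of the command acquisition controller.
    for source, destination in flows:
        if destination == transaction_center and source not in calls:
            calls[source] = []
            calls[controller].append(source)
            queue.append(source)
    # A process that passes data to a process with a module gets a module
    # of its own, callable by each module whose process receives its data.
    while queue:
        receiver = queue.popleft()
        for source, destination in flows:
            if destination != receiver:
                continue
            if source not in calls:
                calls[source] = []
                queue.append(source)
            if source not in calls[receiver]:
                calls[receiver].append(source)
    return calls


# Hypothetical command acquisition path.
flows = [
    ("Read Command", "Validate Command"),
    ("Validate Command", "Select Transaction"),
]
calls = factor_acquisition_paths(flows, "Select Transaction",
                                 "Get Command Controller")
print(calls)
```

In the result, the module for the process that reads the command ends up furthest from the controller, which is the separation of interface modules that this step aims for.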

When this is done, you should find that all the modules for processes that communicate directly with users or other devices or systems are at the bottom of the subchart you've created, and that they are ``separated'' (that is, at a distance away) from the controller modules, and from the pieces of the structure chart corresponding to other subsystems. This is desirable: If the system's interface is changed (and this is a change that is commonly made), then these should ideally be the only modules for which changes are required, and they should be easy to find.

Now, we're left with the subdiagrams that correspond to activity paths. Process these recursively, to obtain a structure chart of each one. Add a control connection from the transaction center's module, to the main module of each one of these structure (sub)charts.

Examples

Pressman gives a ``transaction analysis'' example on pages 382-387 of the third edition of Software Engineering: A Practitioner's Approach. Unfortunately, he also includes the recursive application of the methods to the activity paths, at the same time as he performs the steps that are mentioned above, so that it might not be as easy to see (as it should be) what you'd get by performing first and second level factoring to the diagram as a whole, but before you recurse.

As mentioned above, this process will be applied to a large example on a second page. This will include processing a diagram with transaction flow.

Page-Jones also discusses the creation of structure charts from data flow diagrams with ``transaction flow,'' in Chapter 10 of The Practical Guide to Structured Systems Design.

Processing Diagrams with ``Transform Flow''

Recall that you chose this case because the system has processes forming a central transform (or ``processing center'') at its heart, with paths of ``input flow'' processes leading into it, and paths of ``output flow'' processes leading out of it. This is the ``default'' case for nontrivial diagrams: That is, you should choose this if you're unable to identify either ``independent subsystems'' or ``transaction flow.''

Partitioning the Diagram

Decide whether each process is an input flow process, an output flow process, or in the central transform.

The input flow processes collect data directly from the terminators, validate the syntax of individual inputs, and might also validate semantics. It is possible that cross referencing is done (with inputs compared to one another, or with inputs compared with the contents of the system's data areas), by ``input flow'' processes, in order to do this. You might also notice that some ``input flow'' processes are also generating some simple output - to prompt for data, or report errors.

The processes in the central transform perform nontrivial computations that often involve combining the inputs together or computing values of complex functions of the inputs.

The output flow processes reformat the outputs that were generated in the central transform, so that they can be transmitted directly to the people, systems, or devices that interact with the system that's being designed.

A good way to decide whether each process is input flow, output flow, or in the central transform, is to start with processes that receive input directly from the terminators (or, perhaps prompt for input), and work in the direction of the input flow. Decide that the processes you see are ``input flow'' processes as long as it's clear that they receive data directly from terminators, or do (almost) nothing more than check for and report syntax and semantical errors in inputs, or (perhaps) convert single inputs into internal formats. Once you aren't able to do this, you can conclude that you've found all the ``input flow'' processes (and that you're now finding processes that are in the central transform).

You should be able to draw a (dashed) line that separates all the ``input flow processes'' you found, from the rest of the processes in the system. Most, or all, of the processes that are on the other side of this line (and that communicate directly with input flow processes) are in the central transform. This line is called the input flow boundary.

Now, once again, consider the processes that send output directly to the terminators, and (now) work backwards (as far as data flow is concerned), moving into the system. As long as you can decide that processes do nothing more than interact directly with users or output devices, or reformat output for external display, then you can consider processes to be ``output flow'' processes. As soon as this seems not to be the case, you can decide that you've moved into the transform center.

You should be able to draw a (dashed) line that separates all the ``output flow'' processes you found, from the rest of the processes in the system. Most, or all, of the processes that are on the other side of this line (and that communicate directly with output flow processes) are in the central transform. This line is called the output flow boundary.

The processes that are left over after you do this - that is, the processes in the middle of the diagram, between the two flow boundaries - are all in the central transform.
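
The two walks (inward from the inputs, and backward from the outputs) can be sketched in Python as follows. This is an illustration under stated assumptions, not a mechanical replacement for the designer's judgement: the function name is invented, data flows are given as (source, destination) pairs, the decision about whether a process ``looks like'' an input flow or output flow process is supplied by the caller as a function, and the example diagram and all of its names are hypothetical.

```python
def flow_processes(flows, terminators, qualifies, forwards=True):
    """Walk inward from the terminators, collecting processes for as long
    as each process reached still qualifies (in the designer's judgement)."""
    found = set()
    frontier = set(terminators)
    while frontier:
        reached = set()
        for source, destination in flows:
            # Follow data flows forwards for input, backwards for output.
            here, there = (source, destination) if forwards else (destination, source)
            if (here in frontier and there not in found
                    and there not in terminators and qualifies(there)):
                found.add(there)
                reached.add(there)
        frontier = reached
    return found


# Hypothetical diagram with a single terminator.
terminators = {"User"}
flows = [
    ("User", "Read Marks"),
    ("Read Marks", "Validate Marks"),
    ("Validate Marks", "Compute Grade"),
    ("Compute Grade", "Format Report"),
    ("Format Report", "User"),
]
# The designer's judgements, written down as two sets for this example.
input_like = {"Read Marks", "Validate Marks"}
output_like = {"Format Report"}

input_flow = flow_processes(flows, terminators, lambda p: p in input_like)
output_flow = flow_processes(flows, terminators, lambda p: p in output_like,
                             forwards=False)
processes = {p for flow in flows for p in flow} - terminators
central_transform = processes - input_flow - output_flow
print(sorted(central_transform))
```

The walk stops at the first process that doesn't qualify, which is where the corresponding flow boundary would be drawn; whatever is left between the two boundaries is taken to be the central transform.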

Some subsystems receive virtually no user input - perhaps because they are responsible for displaying all the information of a certain kind that the system knows about. The diagrams for these systems might not have any input flow processes at all. Similarly, some subsystems supply virtually no output (except, perhaps, for error messages and status messages), and the diagrams for these might not have any output flow processes at all. However, you should always find either at least one input flow process or output flow process (or both).

Two people who apply this to the same diagram might obtain slightly different partitions - that is, they might draw the flow boundaries in slightly different places. This is not a serious problem - again, because this method does include a ``second step,'' in which design problems are checked for and corrected, and this second step often tends to reduce the differences between the structure charts you get, when you make slightly different choices early on.

First Level Factoring

Create a main module for the (sub)system being designed, and three ``controller'' modules that it can call: an input controller, a processing controller, and an output controller. Give each of these a name describing the responsibilities of the (sub)system it will control.

Second Level Factoring

Mapping ``input flow'' processes to modules is similar to mapping ``command acquisition'' processes to modules, when the diagram has transaction flow: Start by creating a module for each ``input flow'' process that sends data across the input flow boundary, to a process that's in the central transform (or, possibly, is an output flow process). Make each of these modules a child of the ``input controller,'' and use the name of the corresponding process as the name of the module as well.

Then, continue by selecting an input flow process that doesn't have a module yet, but that sends data to at least one input flow process that does - so that you're working ``backwards'' (or, ``outwards'' toward the terminators). Create a module for this process, name it with the name of the process, and show it as being controlled by each of the modules of the input flow processes to which its process sends data. Repeat this step until every input flow process has a module.

To create modules for output flow processes, start by creating a module for every output flow process that receives data directly from across the output flow boundary (from a process that's in the central transform, or possibly an input flow process). Name each module with the name of its process and make it a child of the ``output controller'' that was created during first level factoring.

Then, continue by selecting an output flow process that doesn't have a module yet, but that receives data from at least one output flow process that does - so that you're working ``forwards'' (or, again, ``outwards'') toward the terminators. Create a module for this process, name it with the name of the process, and show it as being controlled by each of the modules of the output flow processes from which its process receives data. Repeat this step until every output flow process has a module.

As for diagrams with transaction flow, you should find at the end of this that all the modules that interact directly with people (through devices) or systems are at the bottom of the structure chart, and easy to locate, in the event that the system's interface is changed.

Finally, treat the central transform as a subdiagram - so that it's to be processed by applying this method recursively.

Examples

Pressman gives an example on pages 374-381 of the third edition of Software Engineering: A Practitioner's Approach. Note, in particular, the description of the steps followed in the text, and Figures 11.7-11.13, which show how the diagram for a ``Monitor Sensors'' system is partitioned, and how a first cut structure chart is then created from it.

This method will be applied to a large example on a separate page; this will include processing a diagram with transform flow.

Finally, Page-Jones discusses this in more detail in Chapter 10 of The Practical Guide to Structured Systems Design.

Processing Trivial Diagrams

If necessary (that is, if one doesn't already exist, and the data flow diagram includes at least two processes), create a ``main module'' for the system and give it a name that describes the system's responsibilities. Then, create a module for each process in this (trivial) data flow diagram. Give each of these modules the name of the process it corresponds to, and make each of these modules a child of the main module - again, assuming that the diagram includes two or more processes.

If the diagram includes only one process then things are even simpler: Just create a module for this process and use the process name as the module's name as well. Thus, your ``structure chart'' will consist of only one module, if your ``data flow diagram'' only included one process.
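
This base case is small enough to write out in Python. The sketch below assumes (as an illustration only) that a structure chart is a dictionary mapping each module to the modules it calls; the function name, its parameters, and the names used in the example are all hypothetical.

```python
def trivial_chart(processes, system_name, existing_main=None):
    """Return (main module, call connections) for a trivial diagram."""
    if len(processes) == 1:
        # A single process becomes a single module with the same name.
        only = processes[0]
        return only, {only: []}
    # Reuse the main module if one already exists; otherwise create one.
    main = existing_main if existing_main is not None else system_name
    calls = {main: list(processes)}
    for process in processes:
        calls[process] = []     # each process becomes a child module
    return main, calls


main, calls = trivial_chart(["Find Student", "Display Record"],
                            "Show Student Record")
print(main, calls[main])
```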

Adding Data Areas and Data Flows

Apply the data design method that's been described to identify a set of ``data tables'' that should correspond to the components of the entity-relationship diagram for the system that is being designed. Draw a ``data area'' at the bottom of the structure chart for each of the data tables you obtained.

Then, add extra ``data areas'' for any data stores on the system's data flow diagrams that didn't correspond to anything on the ERD (but that represented ``registers'' - data areas of bounded size - instead).

For each module, and for each data area, draw an arrow down from the module to the data area if (and only if) the module represents a process that either reads data directly from, or writes data directly to (or causes data to be deleted from) the data store(s) that the data area represents.

Finally, draw in data flows that correspond to data flows between processes on the data flow diagram, or between processes and data stores - but not corresponding to communication with terminators.

For each data flow on the DFD (that should be shown), locate the module or data area corresponding to the ``source'' of the data on the DFD, and locate the module or data area corresponding to the ``destination'' of the data.

You should discover that there is exactly one (shortest) way to follow connections in the structure chart - backwards, as well as forwards - in order to go from the ``source'' to the ``destination.'' Data may need to flow ``up'' through several modules, from the source, in order to reach a module that's a (lowest) common ``ancestor'' of the source and destination. Then, it will need to flow back ``down'' through several other modules, in order to get to the destination. Add in data signals (or control signals, for control flows) along this (shortest) path of modules, for each of the data flows on the DFD that should be displayed.
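
The ``up, then down'' route can be computed with a short Python sketch. It makes a simplifying assumption that the method itself doesn't guarantee: every module has exactly one calling module, so that the chart is a tree and can be stored as a dictionary (here called parent) from each module to its caller. The function names and the modules in the example are hypothetical.

```python
def ancestors(module, parent):
    """Return the module followed by its callers, up to the main module."""
    chain = [module]
    while module in parent:
        module = parent[module]
        chain.append(module)
    return chain


def data_flow_path(source, destination, parent):
    """Route from source up to the lowest common ancestor, then down."""
    up = ancestors(source, parent)
    down = ancestors(destination, parent)
    # The first module on the way up that is also above the destination.
    common = next(m for m in up if m in down)
    return up[:up.index(common) + 1] + down[:down.index(common)][::-1]


# Hypothetical chart: each module is mapped to the module that calls it.
parent = {
    "Input Controller": "Main",
    "Processing Controller": "Main",
    "Read Marks": "Input Controller",
    "Compute Grade": "Processing Controller",
}
path = data_flow_path("Read Marks", "Compute Grade", parent)
print(path)
```

A data signal would then be drawn along each connection between consecutive modules on this path. If a module can be called by several modules, the chart is no longer a tree and a shortest path search over the connections would be needed instead.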

Department of Computer Science
University of Calgary

Office: (403) 220-5073
Fax: (403) 284-4707

eberly@cpsc.ucalgary.ca