CPSC 333 --- Lecture 30 --- Wednesday, March 27, 1996


                             Introduction to Z

Reference:

  J. B. Wordsworth
  Software Development with Z
  Addison-Wesley, 1992

Z is a specification language (*not* a programming language) that is
based on set theory. Data types are defined (when they're defined at
all) using sets.

Today, Z notation will be introduced that can be used to define data
stores, and to specify consistency and correctness conditions for them.
The system that's been developed in recent assignments will be used as
an ongoing example.

No attempt will be made to introduce all, or even most, of the
language "Z," in the time that's left. Instead, enough will be
introduced to define the system that's to be used as an example and to
give an idea of how this language is used (or what it looks
like). Wordsworth's book introduces additional parts of the language.


Elementary Data Types: Defining Types for Attributes

Sets of numbers accepted by Z as basic data types:

 - The integers, { ..., -3, -2, -1, 0, 1, 2, 3, ...}. As in your
   discrete math course (I hope!) these are represented using the
   symbol "Z" written using a "Blackboard Bold" font (so that there's
   a double stroke used for the diagonal line in the letter).

   In these notes I'll use "ZZ" to mean the symbol for the integers.

 - The set of natural numbers, { 0, 1, 2, 3, ... }, *isn't* (strictly
   speaking) a "type" in Z: It's a subset of the integers, "ZZ". However,
   it is often used in the same way that a "type" is. This set is
   represented using the letter "N", again written using a Blackboard
   Bold font.

   In these notes I'll use "NN" to mean the symbol for the natural
   numbers.

Z also allows you to introduce "enumerated" types: Types that are
finite sets, whose elements (possible values) should be specified when
the type is introduced. For example,

                       YesNo ::= yes | no

introduces a type whose name is "YesNo," identifies it as an
enumerated set, and shows that variables with this type can take
on only two possible (distinct!) values: "yes" and "no".

According to Wordsworth's book, the (combination of) symbol(s) ::=
is a *data type definition symbol*.

As Wordsworth's book notes, this statement is actually "shorthand"
for *five* statements:

 a) A statement in which the type "YesNo" is introduced (as a "given
    set" --- which I'll get to shortly):

         [ YesNo ]

 b) A statement in which "yes" is declared to have type "YesNo":

         yes : YesNo

 c) A statement in which "no" is declared to have type "YesNo":

         no : YesNo

 d) A statement asserting that "yes" is not equal to "no". This
    consists of "yes" followed by the usual "not equals"
    symbol followed by "no". In these notes I'll use "\ne" to
    stand for the "not equals" symbol, so that I'd type this
    statement as

         yes \ne no

 e) The statement

         YesNo = { yes, no }

    which asserts that there aren't any elements of YesNo *except*
    for "yes" and "no".

See page 33 of Wordsworth's book for the above statements written
"correctly" (that is, when the character set required by Z is
available).
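
If a programming analogy helps, the five statements can be mimicked
roughly in Python. This is only an illustration and is *not* Z: the
names YesNo, yes, and no follow the text, and everything else
(including the use of Python's Enum) is an assumption of this sketch.

```python
from enum import Enum

# (a), (b), (c): introduce the type and declare yes and no to have that type
class YesNo(Enum):
    yes = "yes"
    no = "no"

yes, no = YesNo.yes, YesNo.no

# (d): the two values are distinct
distinct = yes != no

# (e): there are no other values, so the type is exactly { yes, no }
exhaustive = set(YesNo) == {yes, no}

print(distinct, exhaustive)
```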

Z allows you to use "power sets" --- sets of sets --- to define
new types from old ones. If T is a type then PP T is also a type: Its
elements are exactly the sets whose elements have type T. Here, I'm
typing "PP" to replace the symbol "P in Blackboard Bold" (with a double
vertical line), which is a standard symbol for "Power Set".

For example: If YesNo is defined as above then PP YesNo includes the
following sets:

  - the empty set (written as a large circle with a "/" through it ---
     one of the "standard" ways to write an empty set)

  - { yes }

  - { no }

  - { yes, no }

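... and PP PP YesNo has size *sixteen* (2^4, since PP YesNo has four
elements). Eight of these are listed below; each of the other eight is
obtained by adding the empty set, as an element, to one of these: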

  - the empty set

  - { { yes } }

  - { { no } }

  - { { yes, no } }

  - { { yes }, { no } }

  - { { yes }, { yes, no } }

  - { { no }, { yes, no } }

  - { { yes }, { no }, { yes, no }}

Note that { yes, no } and { { yes }, { no } } aren't the same things,
and don't even have the same type: The first has type PP YesNo and
the second has type PP PP YesNo.
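
One way to check counts like these is to build the power sets
mechanically. The following Python sketch (not Z; the helper name
power_set and the use of strings for "yes" and "no" are assumptions
made for illustration) computes PP YesNo and PP PP YesNo and shows that
the two sets compared above live at different levels.

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of s, each as a frozenset."""
    items = list(s)
    return {
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    }

YesNo = frozenset({"yes", "no"})
pp_yesno = power_set(YesNo)        # plays the role of PP YesNo
pp_pp_yesno = power_set(pp_yesno)  # plays the role of PP PP YesNo

# { yes, no } is an element of PP YesNo ...
flat = frozenset({"yes", "no"})
# ... while { { yes }, { no } } is an element of PP PP YesNo
nested = frozenset({frozenset({"yes"}), frozenset({"no"})})

print(len(pp_yesno), len(pp_pp_yesno))
print(flat in pp_yesno, nested in pp_pp_yesno, flat in pp_pp_yesno)
```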

You can combine the power set with ZZ to obtain a type PP ZZ (which
includes all sets of integers), and so on.

Z also includes a way to introduce the name of a type without
specifying the values that the type can include --- using "given set"
notation. To do this, you simply write down the name of the type,
enclosed in square brackets. For example, in the example that will
follow all this,

                              [ Title ]

will be used to declare that "Title" is the name of a type, without
saying what values it can include.

As described by Wordsworth, and in this course, Z *does not* include

 - the set of real (or floating point) numbers
 - the set of character strings

as standard types. When we want to introduce a type that will include
character strings, we'll just use the "given set" notation (as shown
above, with "Title").

In order to avoid dealing with real numbers at all, I am going to make
one *change* to the problem that's to be specified: I am going to
assume *when using this as an example involving Z* (and, not for any
other reason), that the unit of measure for the attributes
"price_paid" and "resale_price" of "books" is *cents* rather than
*dollars*. Thus, if "price_paid" has value 100 then that means that
the price paid equals $1, while if "price_paid" has value 1 then that
means that the price paid is 1 cent. This will allow me to consider
price_paid and resale_price to have type ZZ (or even NN) rather than
something that Z doesn't support as directly.

Now, I'm ready to introduce names for all the attributes of books.
In general, I'll either introduce a "type" for an attribute using
"given set" notation (when it corresponds to a character string), or
I'll introduce the attribute by defining a set of integers. This *set*
(which will have type "PP ZZ") will include all the integers that are
*legal values* for the attribute. This *particular* example doesn't
include any attributes whose types are "enumerated sets", so you won't
see this used here --- but it could be, for a different example.

As well as a type declaration (or set definition) I will sometimes use
one or more additional *predicates* (logical statements) to give
more information about what each "type" or "set" can include.

---------------------------------------------------------------------

I'll start by defining ISBN_Number to be a set of integers:

    ISBN_Number : PP ZZ

and, I'll add a predicate that identifies the set of "legal"
ISBN numbers more precisely:

    ISBN_Number = { x : ZZ | (0 \le x) \and (x \le 999999999) }

I've made two more changes here in order to cope with the fact that
this is a plaintext file: I've used "\le" to represent the standard
mathematical symbol for "is less than or equal to" and I've used
"\and" to represent the standard logical symbol for "and" --- an
upside-down V. Later on, I'll likely be using \or and \not and
\implies, and \lt, \gt, and \ge, as well. (If you don't see what these
mean, read this paragraph over again, and then stop and think.)
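
For comparison, the set comprehension above reads much like a
membership test in a programming language. Here is a minimal Python
sketch (not Z); the names MAX_ISBN and in_isbn_number are made up for
this illustration, and the bound is the one in the predicate above.

```python
MAX_ISBN = 999_999_999  # upper bound copied from the predicate in the notes

def in_isbn_number(x: int) -> bool:
    """Membership test for { x : ZZ | (0 \\le x) \\and (x \\le 999999999) }."""
    return (0 <= x) and (x <= MAX_ISBN)

print(in_isbn_number(0), in_isbn_number(MAX_ISBN))
print(in_isbn_number(-1), in_isbn_number(MAX_ISBN + 1))
```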

As mentioned above, I'll use "given set" notation to introduce types
for attributes that aren't sets of integers, enumerated sets, etc.

   [ Title ]

(Note: If I'd had several of these to define, then I could have
introduced them all at once --- including all the names within the
same square brackets, separated by commas.)

Now, with the change I've made above, all the remaining attributes of
books correspond to sets of nonnegative integers.

   Price_Paid, Resale_Price : PP NN
   Number_Ordered, Number_Received, Number_Sold : PP NN

Since these all have the same type, I could have defined them all
using one (very long) line --- or I could have used five lines,
introducing each separately.

Easy Exercise: Introduce all the attributes for "Course" and
"Publisher" as well.

Now I'm ready to define "books". I will want to include the
information that, for every book, there is a well-defined value for
each attribute, and I'll want to identify "ISBN Number" as a primary
key.

Z includes *functions* and *relations* as well as sets. A "total
function" f from a type X to another type Y is a mapping with the
property that for every element x of X, there is exactly one element y
of the set Y such that f maps x to y: f(x) = y.

A "partial function" f from a type X to another type Y is a mapping
with the property that for every element x of X, there is *at most*
one element y of Y which f maps x to: either f(x) = y for exactly one
y in Y, or f(x) is undefined.

A "relation" from X to Y is a set of ordered pairs whose first entries
belong to X and whose second entries belong to Y. This is more general
than "partial function" because each element x of X could be "mapped
to" zero, one, or *several* elements of Y by a relation --- and
vice-versa.
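
The three definitions can be compared side by side if a mapping is
written out as a finite set of ordered pairs. The Python sketch below
is only an illustration (not Z): the function names and the small
sets X and Y are assumptions chosen for the example.

```python
def is_relation(pairs, X, Y):
    """Every pair has its first entry in X and its second entry in Y."""
    return all(x in X and y in Y for (x, y) in pairs)

def is_partial_function(pairs, X, Y):
    """A relation in which each x is mapped to at most one y."""
    firsts = [x for (x, _) in pairs]
    return is_relation(pairs, X, Y) and len(firsts) == len(set(firsts))

def is_total_function(pairs, X, Y):
    """A partial function in which every x in X is mapped to some y."""
    return is_partial_function(pairs, X, Y) and {x for (x, _) in pairs} == set(X)

X = {1, 2, 3}
Y = {"a", "b"}

total = {(1, "a"), (2, "a"), (3, "b")}   # every x has exactly one y
partial = {(1, "a"), (3, "b")}           # 2 is mapped to nothing
many = {(1, "a"), (1, "b")}              # 1 is mapped to two things

print(is_total_function(total, X, Y))
print(is_partial_function(partial, X, Y), is_total_function(partial, X, Y))
print(is_relation(many, X, Y), is_partial_function(many, X, Y))
```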

The set of all books that the system knows about at any given time can
be considered to be a *partial function* from ISBN Numbers to the set
of ordered tuples whose entries are titles, "price paid"'s, "resale
price"'s, etc., respectively.

The symbol in Z for a *total function* is a right arrow, just like it
is (in function declarations) in Math 271. I'll type this as
"\tf". The symbol for "partial function" is almost the same, except
that there's a small cross-hatch (vertical line) through the middle of
the horizontal line in the arrow. I'll type "\pf" whenever I want to
include this symbol.

Now, to define "books" to be a partial function that corresponds to
what we think "books" ought to be, I'll write

 books : ISBN_Number \pf ( Title x Price_Paid x Resale_Price
                            x Number_Ordered x Number_Received
                            x Number_Sold )

Here "x" is supposed to be the symbol you use in mathematics for the
Cartesian product (the set of ordered pairs, or longer tuples, built
from the given sets), rather than the "letter x". I'm breaking
the line to fit the entire declaration into the width of this page ---
not for any other reason. If your page (or screen) were wide enough,
you could write the whole thing on a single line.

Similarly --- after you've introduced the attributes for courses and
publishers --- you can write

 courses : ( Discipline_Code x Course_Number) \pf
                             ( Course_Title x Number_of_Students )

 publishers : Name \pf ( Street x City x Province x Postal_Code
                                             x Telephone_Number )

Now, we need to introduce the relations that were included in the
ERD in this system. I'll use functions and relations to do this
as well. In order to add predicates specifying correctness and
consistency conditions for the system data, I need to remind you of
two more things you should have seen already in Math 271, and
introduce Z's notation for them.

The *domain* of a function f (mapping elements of type X to elements
of type Y) is the set of elements x of X such that f(x) is defined.
This set is denoted "dom f" in Z. The *range* of a function f is a
subset of Y: It's the set of all things in Y that elements of X are
*mapped to.* That is, an element y of Y is in the range of f if there
exists some element x of X such that f(x) = y. The range of f is
denoted "ran f" in Z.

Now, since "publications" was a "many-to-one" relation, it can also be
introduced as a partial function:

   publications : ISBN_Number \pf Name

... so, you can think of this as mapping the ISBN Number for a given
book to the name of that book's publisher.

Every ISBN Number that *does* correspond to a book should be mapped to
a name (and no *other* ISBN Numbers should be in the domain of
the partial function "publications"):

  dom publications = dom books

Finally, any name that publications *uses* as the name of a publisher
should be in the domain of the partial function "publishers" that I'd
already introduced --- but we *didn't* require that every publisher in
the system have at least one book:

 ran publications \subseteq dom publishers

Here, I'm typing \subseteq instead of the standard mathematical symbol
for "is a subset of or is equal to".
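
To see what these two predicates demand of actual data, here is a
rough Python sketch (not Z) in which each partial function is modelled
as a dictionary, so that its keys form the domain and its values form
the range. The helper names dom and ran mirror the Z notation; all of
the sample ISBN numbers, titles, and publisher names are invented.

```python
# Partial functions modelled as dicts (all sample data is invented).
books = {
    111: ("Title A", 4500, 5995, 10, 10, 3),
    222: ("Title B", 2000, 2995, 5, 4, 1),
}
publishers = {
    "Publisher P": ("Street", "City", "Province", "Postal Code", "Phone"),
    "Publisher Q": ("Street", "City", "Province", "Postal Code", "Phone"),
}
publications = {111: "Publisher P", 222: "Publisher P"}

def dom(f):
    """The set of things on which the partial function f is defined."""
    return set(f.keys())

def ran(f):
    """The set of things that f maps something to."""
    return set(f.values())

# dom publications = dom books
same_domain = dom(publications) == dom(books)

# ran publications \subseteq dom publishers
known_publishers = ran(publications) <= dom(publishers)

print(same_domain, known_publishers)
```

Note that "Publisher Q" has no books here, which is allowed: the second
condition is a subset condition, not an equality.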

Now, "requirements" was a (biconditional) many-to-many relation: I
can't use a "partial function" to define it, because each book could
be required for *several* courses, and vice-versa. I'll use a
"relation" (in the *mathematical* sense) instead:

  requirements : PP (ISBN_Number x ( Discipline_Code x Course_Number ))

Next I need to include predicates to require that this relation "is
consistent with" the set of ISBN Numbers that are currently in use,
as well as the set of Discipline Code - Course Number pairs that are
currently in use to represent courses.

Here are two more standard logical symbols I can't include easily in a
plaintext file, and that will be useful:

 The "universal quantifier", "for all": In logic, written as an
 upside-down A. I'll type "\forall" instead.

 The "existential quantifier", "there exists": In logic, written
 as a backwards "E". I'll type "\exists" instead.

Another symbol from set theory that I'll need is the symbol for "is an
element of" (a stylized Greek letter "epsilon"). I'll type "\in" instead.

Z adds one more (nonstandard) symbol: A very large, filled-in circle,
"\bullet", is included in assertions that use "\forall" or "\exists",
just after the quantified variable and its type have been listed, and
before the rest of the assertion (about them) has been typed.

Finally, Z has an odd way of writing ordered pairs. Instead of using
(x, y) to represent an ordered pair with first entry x and second
entry y, Z throws away the brackets (which you can add back in, if you
want to, to improve readability) and replaces the comma with a special
symbol that I'll type as "\mapsto".  It looks a lot like the "partial
function" symbol --- it's also a right arrow with a small vertical
"cross hatch". In the "\mapsto" symbol, the cross hatch is moved
farther to the left --- so that it just touches the end of the
horizontal line for the arrow, instead of crossing through it.

Using all these, I'll now type predicates that state consistency
requirements for the "requirements" relation.

 \forall x : ISBN_Number \bullet
   ( ( \exists d : Discipline_Code \bullet
       (\exists n : Course_Number \bullet
          ( ( x \mapsto ( d \mapsto n ) ) \in requirements ) ) )
     \implies
     ( x \in dom books ) )

This is a formal (and somewhat opaque) way of saying that, if an ISBN
Number x is listed in any requirement (so, x is required for
something corresponding to some discipline code d and course number
n), then there must also be a book with ISBN number x.

A similar predicate says that you can't use a discipline code - course
number pair in a requirement unless it also represents a course:

 \forall d : Discipline_Code \bullet
    ( \forall n : Course_Number \bullet
         ( ( \exists x : ISBN_Number \bullet
               ( ( x \mapsto ( d \mapsto n ) ) \in requirements ) )
           \implies
           ( ( d \mapsto n ) \in dom courses ) ) )
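
Since every pair in "requirements" already has an ISBN Number as its
first entry and a (discipline code, course number) pair as its second,
the two quantified predicates amount to checking each pair in the
relation. The Python sketch below (not Z) does exactly that; the
function names and all of the sample data are invented for
illustration.

```python
# Only the domains of books and courses matter here (sample data is invented).
books = {111: "Title A", 222: "Title B"}
courses = {("CPSC", 333): "Course X", ("MATH", 271): "Course Y"}

# A relation is a set of pairs; each pair is (x, (d, n)).
requirements = {
    (111, ("CPSC", 333)),
    (222, ("CPSC", 333)),
    (222, ("MATH", 271)),
}

def isbns_are_books(relation, books):
    """First predicate: any x used in the relation is in dom books."""
    return all(x in books for (x, _course) in relation)

def courses_exist(relation, courses):
    """Second predicate: any (d, n) used in the relation is in dom courses."""
    return all(course in courses for (_x, course) in relation)

# Adding a pair that mentions an unknown book and an unknown course
# violates both predicates.
bad = requirements | {(333, ("CPSC", 999))}

print(isbns_are_books(requirements, books), courses_exist(requirements, courses))
print(isbns_are_books(bad, books), courses_exist(bad, courses))
```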

Since the "recommendations" data store is very similar, almost the
same three declarations-and-predicates are needed for this.

  recommendations : PP (ISBN_Number x ( Discipline_Code x Course_Number ))

  \forall x : ISBN_Number \bullet
   ( ( \exists d : Discipline_Code \bullet
       (\exists n : Course_Number \bullet
          ( ( x \mapsto ( d \mapsto n ) ) \in recommendations ) ) )
     \implies
     ( x \in dom books ) )

 \forall d : Discipline_Code \bullet
    ( \forall n : Course_Number \bullet
         ( ( \exists x : ISBN_Number \bullet
               ( ( x \mapsto ( d \mapsto n ) ) \in recommendations ) )
           \implies
           ( ( d \mapsto n ) \in dom courses ) ) )

We *didn't* require that every course have required or recommended
books. However, we *did* require that every book be either required or
recommended for at least one course.

 \forall x : ISBN_Number \bullet
   ( ( x \in dom books)
     \implies
     ( \exists d : Discipline_Code \bullet
         ( \exists n : Course_Number \bullet
           ( ( x \mapsto ( d \mapsto n ) ) \in requirements )
           \or
           ( ( x \mapsto ( d \mapsto n ) ) \in recommendations ) ) ) )
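
Read operationally, this predicate says that every ISBN Number in
dom books must show up as the first entry of some pair in
"requirements" or "recommendations". A minimal Python sketch of that
check follows (not Z; the function name and the sample data are
invented for illustration).

```python
def every_book_is_used(books, requirements, recommendations):
    """Each x in dom books appears in some requirement or recommendation."""
    used = {x for (x, _course) in requirements | recommendations}
    return all(x in used for x in books)

# Sample data (invented): book 111 is required, book 222 is only recommended.
books = {111: "Title A", 222: "Title B"}
requirements = {(111, ("CPSC", 333))}
recommendations = {(222, ("CPSC", 333))}

print(every_book_is_used(books, requirements, recommendations))

# A third book that is neither required nor recommended breaks the condition.
more_books = {**books, 333: "Title C"}
print(every_book_is_used(more_books, requirements, recommendations))
```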

"Exercise:" Read through all this and make sure you see how all these
logical predicates specify the consistency and correctness conditions
for this system's data stores, provided that "books", "courses", and
"publishers" are represented as partial functions, and so on.

"Next Exercise": In Assignment 4 we added one more set of conditions:
We imposed upper bounds on the number of books, courses, publishers,
requirements, and recommendations that the system could maintain at
any given time.

Let's declare these upper bounds:

 max_books, max_publishers, max_courses, max_requirements,
                                           max_recommendations : NN

In Z, if S is a (finite) set then we write #S to represent the size of
S. Functions and relations (from X to Y) are thought of as sets: Each
one is a subset of X x Y, the set of *all* ordered pairs whose first
entries belong to X and whose second entries belong to Y.

Write five more *very simple* predicates that specify the conditions
that the system doesn't know about more than max_books books,
max_courses courses, and so on.