CPSC 333 --- Lecture 30 --- Wednesday, March 27, 1996

Introduction to Z

Reference: J. B. Wordsworth
           Software Development with Z
           Addison-Wesley, 1992

Z is a specification language (*not* a programming language) that is based on set theory. Data types are defined (when they're defined at all) using set theory.

Today, Z notation will be introduced that can be used to define data stores, and to specify consistency and correctness conditions for them. The system that's been developed in recent assignments will be used as an ongoing example.

No attempt will be made to introduce all, or even most, of the language "Z" in the time that's left. Instead, enough will be introduced to define the system that's to be used as an example and to give an idea of how this language is used (or what it looks like). Wordsworth's book introduces additional parts of the language.

Elementary Data Types: Defining Types for Attributes

Sets of numbers accepted by Z as basic data types:

- The integers, { ..., -3, -2, -1, 0, 1, 2, 3, ... }. As in your discrete math course (I hope!) these are represented using the symbol "Z" written using a "Blackboard Bold" font (so that there's a double stroke used for the diagonal line in the letter). In these notes I'll use "ZZ" to mean the symbol for the integers.

- The natural numbers, { 0, 1, 2, 3, ... }. This set *isn't* (strictly speaking) a "type" in Z: It's a subset of the integers, "ZZ". However, it is often used in the same way that a "type" is. This set is represented using the letter "N", again written using a Blackboard Bold font. In these notes I'll use "NN" to mean the symbol for the natural numbers.

Z also allows you to introduce "enumerated" types: Types that are finite sets, whose elements (possible values) should be specified when the type is introduced. For example,

    YesNo ::= yes | no

introduces a type whose name is "YesNo," identifies it as an enumerated set, and shows that variables with this type can take on only two possible (distinct!)
values: "yes" and "no".

According to Wordsworth's book the (combination of) symbol(s) ::= is a *data type definition symbol*.

As Wordsworth's book notes, this statement is actually "shorthand" for *five* statements:

a) A statement in which the type "YesNo" is introduced (as a "given set" --- which I'll get to shortly):

    [ YesNo ]

b) A statement in which "yes" is declared to have type "YesNo":

    yes : YesNo

c) A statement in which "no" is declared to have type "YesNo":

    no : YesNo

d) A statement asserting that "yes" is not equal to "no". This consists of "yes" followed by the usual "not equals" symbol followed by "no". In these notes I'll use "\ne" to stand for the "not equals" symbol, so that I'd type this statement as

    yes \ne no

e) The statement

    YesNo = { yes, no }

   which asserts that there aren't any elements of YesNo *except* for "yes" and "no".

See page 33 of Wordsworth's book for the above statements written "correctly" (that is, when the character set required by Z is available).

Z allows you to use "power sets" (types whose elements are sets) to define new types from old ones. If T is a type then PP T is also a type --- and includes all the sets whose elements have type T. Here, I'm typing "PP" to replace the symbol "P in Blackboard Bold" (with a double vertical line), which is a standard symbol for "Power Set".

For example: If YesNo is defined as above then PP YesNo includes the following sets:

- the empty set (written as a large circle with a "/" through it --- one of the "standard" ways to write an empty set)
- { yes }
- { no }
- { yes, no }

... and PP PP YesNo has size *sixteen* (since PP YesNo has four elements), including

- the empty set
- { { yes } }
- { { no } }
- { { yes, no } }
- { { yes }, { no } }
- { { yes }, { yes, no } }
- { { no }, { yes, no } }
- { { yes }, { no }, { yes, no } }

The other eight elements of PP PP YesNo are obtained by adding the empty set, as one more element, to each of the eight sets listed above.

Note that { yes, no } and { { yes }, { no } } aren't the same things, and don't even have the same type: The first has type PP YesNo and the second has type PP PP YesNo.
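
The power set construction above can be checked mechanically. What follows is a minimal sketch in Python (not Z); the helper name "power_set" and the use of frozensets to stand in for Z sets are assumptions made only for this illustration. Counting the results gives 2^2 = 4 elements for PP YesNo and 2^4 = 16 elements for PP PP YesNo.

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of s, each represented as a frozenset.

    Hypothetical helper for illustration; Z itself just writes PP s.
    """
    items = list(s)
    return {
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    }

# YesNo ::= yes | no, modelled here as a two-element frozenset of strings.
YesNo = frozenset({"yes", "no"})

PP_YesNo = power_set(YesNo)        # all sets of YesNo values
PP_PP_YesNo = power_set(PP_YesNo)  # all sets of sets of YesNo values

print(len(PP_YesNo))     # 4
print(len(PP_PP_YesNo))  # 16

# { yes, no } is an element of PP YesNo ...
print(YesNo in PP_YesNo)
# ... while { { yes }, { no } } is an element of PP PP YesNo instead.
pair_of_singletons = frozenset({frozenset({"yes"}), frozenset({"no"})})
print(pair_of_singletons in PP_PP_YesNo)
print(pair_of_singletons in PP_YesNo)
```

The last three lines illustrate the remark about types: { yes, no } belongs to PP YesNo, while { { yes }, { no } } belongs to PP PP YesNo and not to PP YesNo.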
You can combine the power set with ZZ to obtain a type PP ZZ (which includes all sets of integers), and so on.

Z also includes a way to introduce the name of a type without specifying the values that the type can include --- using "given set" notation. To do this, you simply write down the name of the type, enclosed in square brackets. For example, in the example that will follow all this,

    [ Title ]

will be used to declare that "Title" is the name of a type, without saying what values it can include.

As described by Wordsworth, and in this course, Z *does not* include

- the set of real (or floating point) numbers
- the set of character strings

as standard types.

When we want to introduce a type that will include character strings, we'll just use the "given set" notation (as shown above, with "Title").

In order to avoid dealing with real numbers at all, I am going to make one *change* to the problem that's to be specified: I am going to assume *when using this as an example involving Z* (and not for any other reason) that the unit of measure for the attributes "price_paid" and "resale_price" of "books" is *cents* rather than *dollars*. Thus, if "price_paid" has value 100 then that means that the price paid equals $1, while if "price_paid" has value 1 then that means that the price paid is 1 cent.

This will allow me to consider price_paid and resale_price to have type ZZ (or even NN) rather than something that Z doesn't support as directly.

Now, I'm ready to introduce names for all the attributes of books. In general, I'll either introduce a "type" for an attribute using "given set" notation (when it corresponds to a character string), or I'll introduce the attribute by defining a set of integers. This *set* (which will have type "PP ZZ") will include all the integers that are *legal values* for the attribute.
This *particular* example doesn't include any attributes whose types are "enumerated sets", so you won't see this used here --- but it could be, for a different example.

As well as a type declaration (or set definition) I will sometimes use one or more additional *predicates* (logical statements) to give more information about what each "type" or "set" can include.

---------------------------------------------------------------------

I'll start by defining ISBN_Number to be a set of integers:

    ISBN_Number : PP ZZ

and, I'll add a predicate that identifies the set of "legal" ISBN numbers more precisely:

    ISBN_Number = { x : ZZ | (0 \le x) \and (x \le 999999999) }

I've made two more changes here in order to cope with the fact that this is a plaintext file: I've used "\le" to represent the standard mathematical symbol for "is less than or equal to" and I've used "\and" to represent the standard logical symbol for "and" --- an upside-down V. Later on, I'll likely be using \or and \not and \implies, and \lt, \gt, and \ge, as well. (If you don't see what these mean, read this paragraph over again, and then stop and think.)

As mentioned above, I'll use "given set" notation to introduce types for attributes that aren't sets of integers, enumerated sets, etc.

    [ Title ]

(Note: If I'd had several of these to define, then I could have introduced them all at once --- including all the names within the same square brackets, separated by commas.)

Now, with the change I've made above, all the remaining attributes of books correspond to sets of nonnegative integers.

    Price_Paid, Resale_Price : PP NN
    Number_Ordered, Number_Received, Number_Sold : PP NN

Since these all have the same type, I could have defined them all using one (very long) line --- or I could have used five lines, introducing each separately.

Easy Exercise: Introduce all the attributes for "Course" and "Publisher" as well.

Now I'm ready to define "books".
I will want to include the information that, for every book, there is a well-defined value for each attribute, and I'll want to identify "ISBN Number" as a primary key.

Z includes *functions* and *relations* as well as sets.

A "total function" f from a type X to another type Y is a mapping with the property that for every element x of X, there is exactly one element y of the set Y such that f maps x to y: f(x) = y.

A "partial function" f from a type X to another type Y is a mapping with the property that for every element x of X, there is *at most* one element y of Y which f maps x to: either f(x) = y for exactly one y in Y, or f(x) is undefined.

A "relation" from X to Y is a set of ordered pairs whose first entries belong to X and whose second entries belong to Y. This is more general than "partial function" because each element x of X could be "mapped to" zero, one, or *several* elements of Y by a relation --- and vice-versa.

The set of all books that the system knows about at any given time can be considered to be a *partial function* from ISBN Numbers to the set of ordered tuples whose entries are titles, "price paid"'s, "resale price"'s, etc., respectively.

The symbol in Z for a *total function* is a right arrow, just like it is (in function declarations) in Math 271. I'll type this as "\tf".

The symbol for "partial function" is almost the same, except that there's a small cross-hatch (vertical line) through the middle of the horizontal line in the arrow. I'll type "\pf" whenever I want to include this symbol.

Now, to define "books" to be a partial function that corresponds to what we think "books" ought to be, I'll write

    books : ISBN_Number \pf ( Title x Price_Paid x Resale_Price x
                              Number_Ordered x Number_Received x Number_Sold )

Here "x" is supposed to be the symbol you use in mathematics to define a set of ordered pairs, rather than the "letter x". I'm breaking the line to fit the entire predicate into the width of this page --- not for any other reason.
If your page (or screen) was wide enough, you could write the whole thing in a single line.

Similarly --- after you've introduced the attributes for courses and publishers --- you can write

    courses : ( Discipline_Code x Course_Number ) \pf
                  ( Course_Title x Number_of_Students )

    publishers : Name \pf ( Street x City x Province x
                            Postal_Code x Telephone_Number )

Now, we need to introduce the relations that were included in the ERD for this system. I'll use functions and relations to do this as well.

In order to add predicates specifying correctness and consistency conditions for the system data, I need to remind you of two more things you should have seen already in Math 271, and introduce Z's notation for them.

The *domain* of a function f (mapping elements of type X to elements of type Y) is the set of elements x of X such that f(x) is defined. This set is denoted "dom f" in Z.

The *range* of a function f is a subset of Y: It's the set of all things in Y that elements of X are *mapped to.* That is, an element y of Y is in the range of f if there exists some element x of X such that f(x) = y. The range of f is denoted "ran f" in Z.

Now, since "publications" was a "many-to-one" relation, it can also be introduced as a partial function:

    publications : ISBN_Number \pf Name

... so, you can think of this as mapping the ISBN Number for a given book to the name of that book's publisher.

Every ISBN Number that *does* correspond to a book should be mapped to a name (and no *other* ISBN Numbers should be in the domain of the partial function "publications"):

    dom publications = dom books

Finally, any name that publications *uses* as the name of a publisher should be in the domain of the partial function "publishers" that I'd already introduced --- but we *didn't* require that every publisher in the system have at least one book:

    ran publications \subseteq dom publishers

Here, I'm typing \subseteq instead of the standard mathematical symbol for "is a subset of or is equal to".
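
To make the partial-function view concrete, here is a minimal sketch in Python (not Z). It assumes that a finite partial function can be modelled as a dictionary: each key is mapped to exactly one value, and keys that are absent are "undefined". The sample ISBN numbers, titles, names, and addresses are made up for illustration, and the helper names "dom", "ran", and "publications_consistent" are hypothetical.

```python
def dom(f):
    """Domain of a finite partial function modelled as a dict."""
    return set(f.keys())

def ran(f):
    """Range of a finite partial function modelled as a dict."""
    return set(f.values())

# books : ISBN_Number \pf (Title x Price_Paid x Resale_Price x
#                          Number_Ordered x Number_Received x Number_Sold)
# Prices are in cents, as assumed in these notes. All values are made up.
books = {
    111111111: ("Sample Book A", 4500, 5995, 10, 10, 7),
    222222222: ("Sample Book B", 3000, 3995, 5, 4, 1),
}

# publishers : Name \pf (Street x City x Province x Postal_Code x Telephone_Number)
publishers = {
    "Example Press": ("1 Example St", "Calgary", "AB", "A1A 1A1", "555-0100"),
    "Unused Press": ("2 Example St", "Calgary", "AB", "A1A 1A2", "555-0101"),
}

# publications : ISBN_Number \pf Name
publications = {
    111111111: "Example Press",
    222222222: "Example Press",
}

def publications_consistent(books, publishers, publications):
    """Check  dom publications = dom books  and
    ran publications \\subseteq dom publishers."""
    return (dom(publications) == dom(books)
            and ran(publications) <= dom(publishers))

print(publications_consistent(books, publishers, publications))
```

Note that "Unused Press" has no books, which is allowed: the second condition is a subset test, not an equality.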
Now, "requirements" was a (biconditional) many-to-many relation: I can't use a "partial function" to define it, because each book could be required for *several* courses, and vice-versa. I'll use a "relation" (in the *mathematical* sense) instead:

    requirements : PP (ISBN_Number x ( Discipline_Code x Course_Number ))

Next I need to include predicates to require that this relation "is consistent with" the set of ISBN Numbers that are currently in use, as well as the set of Discipline Code - Course Number pairs that are currently in use to represent courses.

Here are two more standard logical symbols I can't include easily in a plaintext file, and that will be useful:

The "universal quantifier", "for all": In logic, written as an upside-down A. I'll type "\forall" instead.

The "existential quantifier", "there exists": In logic, written as a backwards "E". I'll type "\exists" instead.

Another symbol from set theory that I'll need is the symbol for "is an element of" (the Greek letter "epsilon"). I'll type "\in" instead.

Z adds one more (nonstandard) symbol: A very large, filled-in circle, "\bullet", is included in assertions that use "\forall" or "\exists", just after the quantified variable and its type have been listed, and before the rest of the assertion (about them) has been typed.

Finally, Z has an odd way of writing ordered pairs. Instead of using (x, y) to represent an ordered pair with first entry x and second entry y, Z throws away the brackets (which you can add back in, if you want to, to improve readability) and replaces the comma with a special symbol that I'll type as "\mapsto". It looks a lot like the "partial function" symbol --- it's also a right arrow with a small vertical "cross hatch". In the "\mapsto" symbol, the cross hatch is moved farther to the left --- so that it just touches the end of the horizontal line for the arrow, instead of crossing through it.
Using all these, I'll now type predicates that state consistency requirements for the "requirements" relation.

    \forall x : ISBN_Number \bullet
      ( ( \exists d : Discipline_Code \bullet
            ( \exists n : Course_Number \bullet
                ( ( x \mapsto ( d \mapsto n ) ) \in requirements ) ) )
        \implies ( x \in dom books ) )

This is a formal (and somewhat opaque) way of saying that, if an ISBN Number x is listed in any requirement (so, x is required for something corresponding to some discipline code d and course number n), then there must also be a book with ISBN number x.

A similar predicate says that you can't use a discipline code - course number pair in a requirement unless it also represents a course:

    \forall d : Discipline_Code \bullet
      ( \forall n : Course_Number \bullet
          ( ( \exists x : ISBN_Number \bullet
                ( ( x \mapsto ( d \mapsto n ) ) \in requirements ) )
            \implies ( ( d \mapsto n ) \in dom courses ) ) )

Since the "recommendations" data store is very similar, almost the same three declarations-and-predicates are needed for this.

    recommendations : PP (ISBN_Number x ( Discipline_Code x Course_Number ))

    \forall x : ISBN_Number \bullet
      ( ( \exists d : Discipline_Code \bullet
            ( \exists n : Course_Number \bullet
                ( ( x \mapsto ( d \mapsto n ) ) \in recommendations ) ) )
        \implies ( x \in dom books ) )

    \forall d : Discipline_Code \bullet
      ( \forall n : Course_Number \bullet
          ( ( \exists x : ISBN_Number \bullet
                ( ( x \mapsto ( d \mapsto n ) ) \in recommendations ) )
            \implies ( ( d \mapsto n ) \in dom courses ) ) )

We *didn't* require that every course have required or recommended books. However, we *did* require that every book be either required or recommended for at least one course.
    \forall x : ISBN_Number \bullet
      ( ( x \in dom books )
        \implies
        ( \exists d : Discipline_Code \bullet
            ( \exists n : Course_Number \bullet
                ( ( x \mapsto ( d \mapsto n ) ) \in requirements )
                \or
                ( ( x \mapsto ( d \mapsto n ) ) \in recommendations ) ) ) )

"Exercise:" Read through all this and make sure you see how all these logical predicates specify the consistency and correctness conditions for this system's data stores, provided that "books", "courses", and "publishers" are represented as partial functions, and so on.

"Next Exercise": In Assignment 4 we added one more set of conditions: We imposed upper bounds on the number of books, courses, publishers, requirements, and recommendations that the system could maintain at any given time. Let's declare these upper bounds:

    max_books, max_publishers, max_courses,
      max_requirements, max_recommendations : NN

In Z, if S is a (finite) set then we write #S to represent the size of S. Functions and relations (from X to Y) are thought of as sets: Subsets of the set of *all* ordered pairs from X to Y, X x Y.

Write five more *very simple* predicates that specify the conditions that the system doesn't know about more than max_books books, max_courses courses, and so on.
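
As a way of checking your reading of the quantified predicates for "requirements" and "recommendations" above, here is a minimal sketch in Python (not Z). It assumes that a relation is modelled as a set of pairs (isbn, (discipline_code, course_number)), that partial functions are modelled as dictionaries, and that discipline codes are strings; the function names and all sample data are hypothetical. Each "\forall ... \exists ... \implies" predicate is rephrased as an equivalent statement about the pairs that actually occur in the relation.

```python
def mentions_only_known_books(relation, books):
    """If x appears in some pair of the relation, then x is in dom books."""
    return all(isbn in books for (isbn, _course) in relation)

def mentions_only_known_courses(relation, courses):
    """If (d, n) appears in some pair of the relation, then (d, n) is in dom courses."""
    return all(course in courses for (_isbn, course) in relation)

def every_book_is_used(books, requirements, recommendations):
    """Every x in dom books is required or recommended for at least one course."""
    used = {isbn for (isbn, _course) in requirements | recommendations}
    return all(isbn in used for isbn in books)

# Made-up sample data, for illustration only.
books = {
    111111111: ("Sample Book A", 4500, 5995, 10, 10, 7),
    222222222: ("Sample Book B", 3000, 3995, 5, 4, 1),
}
courses = {
    ("CPSC", 333): ("Sample Course", 120),
    ("MATH", 271): ("Another Sample Course", 200),  # no books: that's allowed
}
requirements = {(111111111, ("CPSC", 333))}
recommendations = {(222222222, ("CPSC", 333))}

for relation in (requirements, recommendations):
    print(mentions_only_known_books(relation, books),
          mentions_only_known_courses(relation, courses))
print(every_book_is_used(books, requirements, recommendations))
```

With this sample data every check succeeds. Adding a pair such as (333333333, ("CPSC", 333)) to requirements would make the first check fail, since no book has that ISBN number.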