CPSC521:  Foundations of Functional Programming

Author: Robin Cockett
Date: 14th Oct. 2008



WHY  FUNCTIONAL PROGRAMMING?



It is reasonable to ask why you should learn yet another programming language.  For example, you may feel that you are already an excellent C++ programmer and that there is nothing to be gained from learning another language.  In order to compare functional programming with the imperative style of programming, let us consider a little example (essentially from the Haskell home page):



Quicksort in Haskell

qsort []     = []
qsort (x:xs) = (qsort  [y | y <- xs, y < x]) ++ [x] ++ (qsort [y | y <- xs, y >= x])

Quicksort in C

void qsort( int a[], int lo, int hi )
{
    int h, l, p, t;

    if (lo < hi) {
        l = lo;
        h = hi;
        p = a[hi];

        do {
            while ((l < h) && (a[l] <= p))
                l = l+1;
            while ((h > l) && (a[h] >= p))
                h = h-1;
            if (l < h) {
                t = a[l];
                a[l] = a[h];
                a[h] = t;
            }
        } while (l < h);

        t = a[l];
        a[l] = a[hi];
        a[hi] = t;

        qsort( a, lo, l-1 );
        qsort( a, l+1, hi );
    }
}



It should be clear that the Haskell code is considerably more compact.  This in itself is an advantage, as it reduces the sheer volume of code which has to be produced.  It can also make the intention of the code much clearer and provide the programmer with a clearer view of how to think about solving problems.  However, something has been lost: in particular, the C code sorts the "list" in place (in an array), so the C code is more space efficient ... clearly, if this is a crucial issue then there is a compelling reason to program in C.  For most modern programming applications (on modern machines), however, one only expects "reasonable" space efficiency, and often what Haskell offers will suffice.

The advantages of programming in a functional language such as Haskell are:
Functional programming allows you to focus on what is to be computed rather than the intimate details of how it is to be computed.  It is often characterized as specification-level programming.  However, it is still very much programming!  Writing good functional code is just as demanding as writing good imperative code.


For further reading in this area, see the classic paper of John Hughes: Why Functional Programming Matters.

Functional languages are type safe: this means that if your program passes the type-checking stage then there will be no core dumps or lost pointers.  Passing the type-checking stage does not mean that your program is correct: it may not do what you intended or, worse, it may never terminate.  A non-terminating program sometimes becomes apparent from its storage profile: a program which starts using excessive amounts of storage and eventually runs out of memory may well have a recursion with no base case.  Thus, if your program takes an unexpectedly long time to run, you should be suspicious that something is wrong!

Haskell is the leading lazy functional language.  A variety of application packages have been written for it ... which hopefully we will explore during the course of the semester.

One important advantage of a language with a solid theoretical basis is that its programs can be manipulated in provably correct ways.  Thus, program transformation and optimization are facilitated by an underlying well-defined semantics (which many programming languages out there simply do not have).  The Glasgow Haskell Compiler (GHC) actually uses a number of very sophisticated optimizations which would be unthinkable for a lower-level language.  The effect of this is that, while traditionally functional languages were regarded as being very inefficient, Haskell code (as delivered by GHC) in fact has an overall efficiency which is now reckoned to be second only to C!

Functional languages are based on the lambda calculus: during the course of the semester we will explore the theory of this calculus by writing Haskell programs to implement various aspects of this theory.  One particularly important aspect of the use of the lambda calculus in practice is the Hindley-Milner type system.  One of the main exercises will be to develop a parametric type checker for the typed lambda calculus.  




GETTING STARTED IN HASKELL: HUGS or GHCi



Your first task is to get started in Haskell: I am going to cover the start of the text by Hudak in short order, and you are required to read along and fill out the lecture material as we go -- as you should already have considerable programming experience, the introductory chapters should be relatively straightforward to follow.  There are various other introductory sources; in particular, the Gentle Introduction to Haskell 98 may be useful to you.

We shall be using the HUGS system or GHCi (the Glasgow Haskell Compiler interactive interpreter), which should be available to you on any of the Computer Science systems.  If you have any difficulty, let me and the software staff know.  HUGS is a Haskell interpreter which works by loading files (the :load filename command).  The way to work with the system is to have an editor window open at the same time as a HUGS window, and to gradually build up your file, testing all the little programs as you write them by loading them into Haskell.  A control-D gets you out of the HUGS or GHCi system.

When you start up the HUGS or GHCi system it reads in the standard prelude.  It is worth looking at this file: it tells you what functions are already available, and it also gives some good examples of how to write Haskell code.  You will find the file at /usr/share/hugs/lib/Prelude.hs ... the standard ending for Haskell program files is ".hs".  As the system reads your programs from a file, the order in which you declare your programs within that file is not important: the system collects all your definitions and sorts out the dependencies.  This means that you can write your programs in a human-readable order.
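For concreteness, a typical session might look like this (a sketch only: the prompt and the file name MyDefns.hs are illustrative, and the HUGS prompt differs slightly from GHCi's):

$ ghci
Prelude> :load MyDefns.hs
*Main> reverse [1,2,3]
[3,2,1]
*Main> 2 + 3
5

After editing your file, :reload brings the new definitions in; control-D ends the session.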


Starting programs - using patterns:



Perhaps one of the most delightful aspects of functional programming is the use of patterns to express what must be done.  To illustrate this, consider the problem of appending two lists:
 
Lists are defined in the prelude and take one of two forms: the empty list [] or a non-empty list x:xs, where x is the head of the list (an element of the list) and xs is the tail (and is itself a list).  The append function can then be written in the following way:
append:: ([a],[a]) -> [a]
append([],ys) = ys
append(x:xs,ys) = x:(append(xs,ys))


This definition starts with a specification of the type of the append function: the way I have written append here, it takes in a pair of lists of some type a (a type variable) and produces a single list of the same type.  The next two lines specify the function: this is done by considering the two possibilities for the first argument and describing what happens in each case.  This is often called a definition by structural recursion, as we break down the structure of the first argument.  When the first argument is the empty list the function returns the second argument, while if the first argument is a non-empty list we must recurse on the tail of the list in the expected manner.
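To see the structural recursion at work, here is how a small call unfolds, one equation at a time:

append ([1,2],[3])
= 1 : append ([2],[3])
= 1 : (2 : append ([],[3]))
= 1 : (2 : [3])
= [1,2,3]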

The system evaluates this function by trying each pattern in turn until it finds the first one which matches.   It then applies the transformation suggested by the RHS of that pattern.  This does mean that the order in which patterns are written is important.

Functional programs are all about using (defined) functions within other functions.  Here is a little example:  to reverse a list we can use the append function above:
reverse :: [a] -> [a]
reverse [] = []
reverse (x:xs) = append (reverse xs, [x])

(Note that to put this definition in a file you will want to hide the prelude's version: import Prelude hiding (reverse).)


This is an O(n^2) reverse (naive reverse) but it does express clearly the intention of reversing a list!  Here is an O(n) reverse:

rev :: [a] -> [a]
rev xs = rev1 (xs, [])
    where rev1 ([], ys)   = ys
          rev1 (x:xs, ys) = rev1 (xs, x:ys)

This code is marginally less clear: note the use of a local definition to introduce the second argument.  More on reverse!
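The accumulating parameter makes the O(n) behaviour visible in a trace:

rev [1,2,3]
= rev1 ([1,2,3], [])
= rev1 ([2,3], [1])
= rev1 ([3], [2,1])
= rev1 ([], [3,2,1])
= [3,2,1]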


Curried programs:



I have written all the above programs in an uncurried form: thus, if a function needs two arguments, I have passed them in as a tuple.  However, functional languages are better suited to defining functions in a curried form.  This means that one would tend to define append as follows:

append :: [a] -> [a] -> [a]
append [] ys = ys
append (x:xs) ys = x:(append xs ys)

where the type [a] -> [a] -> [a] (which should be read as [a] -> ([a] -> [a]) -- the association is to the right) says the function takes in an element of [a] and produces a function [a] -> [a].  This means we should read append x y as (append x) y, with association to the left!  Initially these conventions may seem a little confusing, but they are well-established and they do, overall, reduce the need for parentheses.  However, this does mean that when you intend to first apply f to x and then g to the result you must use parentheses (see also the composition notation) to get the desired effect:
g (f x)  not g f x = (g f) x.
Using the curried form does mean that the function with just one argument applied is a valid term:

append [1,2,3]:: [Integer] -> [Integer]

notice here that the type is specialized: the first argument determines that the element type must be Integer.
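Partial application is used constantly in Haskell.  As a small sketch (greet is my own name, not from the prelude), recall that String is just [Char], so the curried append specializes to strings as well:

greet :: String -> String
greet = append "Hello, "        -- a function still waiting for its second argument

so that greet "world" evaluates to "Hello, world".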



Here is a program listing for the programs above and some others I discussed in class.


Datatypes:


One of the most convenient features of a functional programming language is its datatype definitions.  These allow one to declare complex datatypes concisely and to use them in a very natural way.  We have already used the list datatype, so let us illustrate how one can write an insertion into a binary search tree.  First it is necessary to define the datatype of binary search trees:
data Ord a => STree a
    = Tip
    | Snode (STree a, a, STree a)
Here we have also required that the elements in the binary search tree must admit an order.  Tip and Snode are called constructors.  Their types are as follows:
Tip :: STree a
Snode :: (STree a, a, STree a) -> STree a
This allows us to define an insertion as follows:

insert :: Ord a => a -> (STree a) -> (STree a)
insert a Tip = Snode (Tip, a, Tip)
insert a (Snode (t1, b, t2))
              | a < b     = Snode (insert a t1, b, t2)
              | a == b    = Snode (t1, b, t2)
              | a > b     = Snode (t1, b, insert a t2)


Notice that we may use pattern matching for arbitrary datatypes by indicating the constructor involved.  Notice also that we have used guards in this example to separate the cases.
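As a further illustration of the same style (a sketch of my own, not from the class listing), a membership test for these search trees follows the shape of insert exactly:

memberTree :: Ord a => a -> (STree a) -> Bool
memberTree a Tip = False
memberTree a (Snode (t1, b, t2))
              | a < b     = memberTree a t1
              | a > b     = memberTree a t2
              | otherwise = True                -- a == b: found it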

Here is a program listing for an implementation of red/black trees.





Haskell Classes:


A particularly nice feature of Haskell is the ability to define ad hoc polymorphic operators.  This allows overloading of function symbols and a uniform notation for common operations such as comparisons (the Ord class), equality tests (the Eq class), and printing (the Show class).  This facility is used extensively in Haskell.  Here is the definition of the equality class:
class Eq a where
    (==), (/=) :: a -> a -> Bool
    x /= y = not (x == y)
This says that any type belonging to the equality class has two functions defined, called (/=) and (==), and, furthermore, there is a default definition of (/=) in terms of equality (==).  Here is an example of how we add an instance of the equality class:

instance (Eq a) => Eq (STree a) where
    Tip == Tip = True
    Snode (t1, a, t2) == Snode (s1, b, s2) = (t1 == s1) && (a == b) && (t2 == s2)
    _ == _ = False

This says that STree a is a member of the equality class provided a is.   Furthermore equality is calculated as indicated in the code.

Once one has the equality class one can write functions which apply to any type in that class: for example sorting algorithms will apply to types which are in the Ord class:
sort::  Ord a  =>  [a] -> [a]
Here is a little example of how one might write the set equality predicate:
seteq :: Eq a => [a] -> [a] -> Bool
seteq xs ys = (subset xs ys) && (subset ys xs) where
      subset [] ys = True
      subset (x:xs) ys = (member x ys) && (subset xs ys)
      member x [] = False
      member x (y:ys)
          | x == y    = True
          | otherwise = member x ys

Now this function can be greatly improved if one has ordered lists: try to write a more efficient set equality test under the assumption that the lists are always sorted:
setordeq :: Ord a => [a] -> [a] -> Bool

Higher-order classes:

Haskell supports higher-order classes: these allow one to make a class which covers a family of type constructors.  For example, lists and trees both allow a map function: rather than inventing a new name for mapping every time one uses a new datatype, one can introduce a class to catch all the instances:

class Functor m where
    fmap :: (a -> b) -> m a -> m b

where, if one wants to create an instance for mapping lists, one writes:

instance Functor [] where
    -- fmap :: (a -> b) -> [a] -> [b]
    fmap f [] = []
    fmap f (x:xs) = (f x) : (fmap f xs)

while for trees defined as follows (with all their information at their leaves):

data Tree a = Leaf a
            | Branch (Tree a) (Tree a)

one has the following instance:

instance Functor Tree where
    -- fmap :: (a -> b) -> Tree a -> Tree b
    fmap f (Leaf x) = Leaf (f x)
    fmap f (Branch t1 t2) = Branch (fmap f t1) (fmap f t2)
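The payoff is that the one name fmap now works at both types; for example (assuming Tree is given a Show instance, say by adding deriving Show to its declaration):

fmap (+1) [1,2,3]                          -- [2,3,4]
fmap (+1) (Branch (Leaf 1) (Leaf 2))       -- Branch (Leaf 2) (Leaf 3)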
          

List comprehension:


Another nice facility that Haskell has, which was originally introduced into Miranda by David Turner, is list comprehension.  As we shall discover, this is linked to another important concept, namely monads.  Haskell's I/O is largely managed through monads (monadic I/O) and, more generally, there is a do syntax associated with monads.  We will come to this later.

Here is an example of its use: suppose we want to find all the Pythagorean triples with entries up to some bound n (say 100); then we could write:
pythag :: Integer -> [(Integer,Integer,Integer)]
pythag n = [ (x,y,z) | x <- [1..n], y <- [1..n], z <- [1..n], pyth (x,y,z) ] where
                  pyth (x,y,z) = x * x + y * y == z * z

It really makes the code close to a specification.  But what is happening under the hood?  List comprehensions are translated out by the system using the following steps (flatten concatenates a list of lists; the prelude calls it concat):
[ e | ] = [ e ]
[ e | x <- xs, r ] = flatten (map f xs) where f x = [ e | r ]
[ e | p, r ] = if p then [ e | r ] else []
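As a small worked example (my own), the comprehension [ x*x | x <- [1..5], even x ] translates by these rules to:

concat (map f [1..5]) where f x = if even x then [x*x] else []

which evaluates to [4,16].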
List comprehensions allow a very succinct and straightforward definition of the quicksort algorithm (above):
qs :: Ord a => [a] -> [a]
qs [] = []
qs (a:as) = (qs [x | x <- as, x < a]) ++ [a] ++ (qs [x | x <- as, x >= a])

Exercise: Can you write a Haskell program to find all the irreducible Pythagorean triples (i.e. those where the three numbers have highest common factor 1)?  Try to make it efficient!   Here is a suggested solution ..
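One possible sketch (my own, not the linked solution; pythagIrred is an invented name) uses the prelude's gcd and narrows the generators so each triple is produced only once:

pythagIrred :: Integer -> [(Integer,Integer,Integer)]
pythagIrred n = [ (x,y,z) | x <- [1..n], y <- [x..n], z <- [y..n],
                            x*x + y*y == z*z, gcd x (gcd y z) == 1 ]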

 

Monads:


As mentioned above, Haskell's list comprehension is a specialization of the more general monad facilities of Haskell.  Monads, in fact, are an essential part of the Haskell language, as all I/O is managed through the IO monad.

So what are monads?   Monads are, in fact, many things!  In Haskell a monad is introduced as a higher-order class:

class Monad t where
    return :: a -> t a
    (>>=)  :: t a -> (a -> t b) -> t b

Haskell presents a monad essentially as a Kleisli triple (a presentation due to Ernie Manes); such a triple consists of three things: a type constructor T, a unit return :: a -> T a, and a lifting operation, which Haskell packages as the bind (>>=).
Here is the list monad as an instance of the Monad class:
instance Monad [] where
    return x = [x]
    [] >>= q = []
    (x:xs) >>= q = (q x) ++ (xs >>= q)
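A quick check in the interpreter shows how bind on lists maps and then flattens in a single step:

[1,2,3] >>= \x -> [x, 10*x]        -- gives [1,10,2,20,3,30]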

There is a special "do" syntax for monads which makes them much more convenient to use:
do { r } = r
do { x <- p ; C ; r } = p >>= q   where q x = do { C ; r }
do { r' ; C ; r } = r' >>= \_ -> ( do { C ; r } )
Here is the quicksort function written using this syntax:
qs [] = []
qs (x:xs) = (qs (do x1 <- xs
                    if x1 < x then return x1 else []))
            ++ [x] ++
            (qs (do x2 <- xs
                    if x2 >= x then return x2 else []))

Here is an example of the use of the exception monad for evaluating based numbers.  The difficulty with conventional code is that, if the base is octal and the digits 8 or 9 are used, then we must raise an exception which has to be propagated throughout the program.  In conventional code this would mean writing explicit code to propagate the exception in the correct manner.  The code using the "do" syntax hides this nicely.
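That linked code is not reproduced here, but a minimal sketch in the same spirit (using the prelude's Maybe as the exception monad; evalBase and step are invented names) shows the idea:

import Control.Monad (foldM)

evalBase :: Integer -> [Integer] -> Maybe Integer
evalBase base = foldM step 0
  where step acc d
          | d < 0 || d >= base = Nothing                 -- an illegal digit raises the exception
          | otherwise          = Just (acc * base + d)   -- otherwise accumulate as usual

Then evalBase 8 [1,7,0] gives Just 120, while evalBase 8 [1,8,0] gives Nothing: the failure propagates through the whole computation with no explicit plumbing.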

The above example is nice except for the fact that Haskell already has an exception-handling capability in the error facility.  The examples of the state monad and the parser monad are more convincing.  Here is a small example of a parser monad and here one for the state monad.



Monad identities:


In order to be a "reasonable" monad (i.e. one which behaves in a reasonable manner), the functions involved in the definition of a monad should satisfy various identities.  First, if T is a monad, notice that we can define
instance Monad t => Functor t where
    -- fmap :: (a -> b) -> (t a -> t b)
    fmap f p = p >>= (\x -> return (f x))
so that every monad is necessarily in the functor class.   Furthermore we can define another important map:
mult :: Monad t => t (t a) -> t a
mult xx = xx >>= \x -> x
The unit for the multiplication is the return:
unit :: Monad t => a -> t a
unit = return
with respect to these maps the following identities must hold:
mult . mult = mult . fmap mult :: T (T (T a)) -> T a
mult . unit = mult . fmap unit = \x -> x :: T a -> T a
The standard presentation of a monad is in fact the one given above: a type constructor with a unit and a multiplication satisfying the above identities.  Notice that the first identity is essentially an "associativity" condition, while the second set of identities expresses the fact that this multiplication has a "unit".

Both the multiplication and the unit transformation must be natural (or parametric).  This means that, for example, when g :: a -> b:

fmap g . unit = unit . g :: a -> T b

More generally, if T and S are type constructors belonging to the functor class and g :: T a -> S a, then g behaves parametrically (or naturally) in case the following two compositions of programs are equal:

                  g
        T t  ----------->  S t
         |                  |
  fmap h |                  | fmap h
         v                  v
        T t' ----------->  S t'
                  g

So that g . fmap h = fmap h . g.  An example to make you realize that this is quite natural is given by setting T and S to be the list type constructor, [_], and g to be the reverse function; then the above square becomes:
               reverse
        [t]  ----------->  [t]
         |                  |
  fmap h |                  | fmap h
         v                  v
        [t'] ----------->  [t']
               reverse

and this expresses the equality reverse . fmap h = fmap h . reverse, which with a little thought is most certainly correct.
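A quick concrete check of this equation in the interpreter (taking h to be show, say):

reverse (fmap show [1,2,3]) == fmap show (reverse [1,2,3])    -- True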

Haskell's monads:

There is another important presentation of a monad called the Kleisli triple presentation.  Haskell uses a minor modification of this presentation:
return = unit
p >>= q = mult (fmap q p)

and the identities can be re-expressed using this form as:
(\p -> unit p >>= (\x -> x)) = (\p -> p >>= unit) = (\p -> p) :: T a -> T a
(p >>= q) >>= r = p >>= (\x -> q x >>= r) :: T c

The original Kleisli presentation:

The original Kleisli triple presentation uses a slightly different form (due to Ernie Manes, although it does not bear his name!):
return = unit :: a -> T a
# :: (a -> T b) -> (T a -> T b)
The second function is called the Kleisli "lifting" (note how it "lifts" the domain of a function).  This allows yet another presentation of the monad identities, which is a little more transparent:
#(return) = id
#(f) . return = f
#(g) . #(f) = #(#(g) . f)
One can then prove that, for the type constructor T of the monad, the definitions:
 fmap h := #(return . h)
mult := #(id)
give a monad in the sense described above.  
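To make the lifting concrete, here is a sketch for the Maybe monad (written as an ordinary function klift, since # is not a legal Haskell name):

klift :: (a -> Maybe b) -> (Maybe a -> Maybe b)
klift f Nothing  = Nothing     -- a failed computation stays failed
klift f (Just x) = f x         -- otherwise pass the value along

The three identities can then be checked by cases; for instance klift return (Just x) = Just x and klift return Nothing = Nothing, so klift return = id.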

This last presentation is a little more transparent about what is happening.  A monad, in the sense given above, changes the "composition" operation from
(.) :: (b -> c) -> (a -> b) -> (a -> c)
to the more involved
(.') :: (b -> T c) -> (a -> T b) -> (a -> T c)
g .' f = #(g) . f
As composition occurs throughout a program, this has a dramatic effect on the meaning of the program!  In particular, it allows the modeling of a number of computational effects using monads (non-determinism, exceptions, state, ...).  This is also why monads seem, at first, a bit like magic ...
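In Haskell this changed composition can be written directly (a sketch: kcomp is my own name for it, as .' is not legal syntax):

kcomp :: Monad t => (b -> t c) -> (a -> t b) -> (a -> t c)
kcomp g f = \a -> f a >>= g

For example, in the Maybe monad kcomp chains two partial functions so that the first failure aborts the whole computation.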