CPSC411: COMPILER CONSTRUCTION I
Lecturer: Robin Cockett
Notes
2010
COURSE OUTLINE:
- Overview of the tasks of a compiler: lexical analysis, parsing,
semantic
analysis, code generation.
- Lexical analysis: regular languages and automata
- Context free grammars: LL(n), LR(n), LALR(1).
- Attribute grammars: inherited and synthesized attributes,
circularity.
- Compiling to a stack machine.
Who should take this course? Have you ever wondered how a
compiler
works? This course is an introduction to compiler
construction which concentrates on the front-end issues: parsing,
generating the
appropriate
internal data structures (abstract sysntax tree), and basic "ad hoc"
code
generation. The problem of parsing input led to
one of the most developed subjects in computer science:
the theory of context free languages. The parsing algorithms
which resulted for context free languages are useful far beyond
the specific task of generating a compiler.
These techniques provide the basis for the front-end of
all systems. Modern compilers and other systems use software
tools for facilitating the
construction
of these front-ends. The course will introduce the basic tools of
Lex and Yacc and end by building a compiler (to stack machine code) for
a simple language.
This course is a prerequisite for the main compiler course: CPSC610:
Compiler Code Generation and Optimization. This course is
centered round the
optimizations which are performed in the "back-end" of a compiler and a
project
in which one builds a significant component of a
compiler/system.
LECTURE NOTES (as they become available):
- Introduction
to
compilers
- Grammars
and
parse
trees
- Recursive
descent
parsing .... here is a
simple expression parser as a Haskell program.
- Recursive descent
grammars
- Introduction
to
LL(1)
parsing
- LL(1)
grammars
- Grammar
transformation
- Regular
languages
- LR(1)
grammars
- LALR(1) grammars (LALR(1)
follow sets)
- Runtime
organization and AM
- Symbol table
- Plumbing
diagrams
ASSIGNMENTS:
Assignments may be written in a language of your choice. The
choice must be discussed with your TA and the language must
support LEX and YACC (an LALR(1) parser generator tool). Your
assignment marks will depend on
- on correctness and completeness of your code;
- clear succinct documentation;
- your coding techniques;
- your performance on test examples and the test examples you
provide.
The main language of instruction will be the functional programming
language Haskell.
Discussion of comparison between implementation techniques in different
paradigms will take place in the labs and students are encouraged to
share their observations. Here are some possible languages and
the available tools:
Language:
|
Haskell
|
C
|
Java
|
Python
|
LEX
|
Alex
|
Lex/Flex
|
Jlex
|
lex.py
|
YACC
|
Happy
|
Yacc/Bison
|
JCup
|
yacc.py
|
Assignments for 2012:
- Due January 27: Using
lex here (3%)
- Due Febuary 24: Recursive
descent parsing and generating basic stack code here (10%)
- Due March 17: Using Lex
and Yacc (description of project here
with
the extended version here: tests here) (7%)
- Due April 7: Symbol
table
and semantic analysis (the description of the intermediate
representation is here: tests here) (10%).
- Due April 14: Stack
based code generation (the description of the AM machine is here) (10%).
Past assignments here.
EXAMS:
- March:
Midterm (25%) past exams are here:
this
is
mostly
on
context
free
languages
(if
you
have
not
done
so
already
check
out
the
Context
Free
Grammar Checker).
- April: Final (35%)
past exams are here.
LAB. NOTES:
Please find the tutorial notes here:
- Cheng Xu, Thomas Burt notes
for 2010 here.
- Andrew Kuiper's notes for 2009 here.
- David Aikema's notes for 2008 here.
- Ning Tangs notes for 2007 here.
- Karel Bermannn's notes for 2008 here.
- Brett Giles's notes for 2003 here
- mostly Python.
POLICIES:
Late policy: for each 24 hours late the maximum mark
you
can obtain will be reduced by 15%. After three days we will
not accept a late submissions.
Cheating: Any cases will automatically receive the following
treatment :
- [First offense:] You will receive a letter notifying you that you
were
involved in such behavior. This letter will be kept in your
file for the remainder of your stay at the university.
- [Second offense:] You will be subject to disciplinary action:
usually
removal
from your program.
My main concern is that you should be learning in this class if you are
cheating you are not. Furthermore, you may be setting an
unfortunate
trend for your future work habits. I regard it as an integral part of
my
job as a teacher to ensure that you approach your work with the right
attitude
and will muster all the resources at my disposal to achieve this.
When you hand in work to be graded you are saying that it is your
work.
For programs this means that you typed in the program, developed it,
and
debugged it yourself. If you use routines from a book or
another
source you must clearly document your source. Furthermore, before
handing in some code, you should spend a moment to consider how you had
arrived at that code. If you realize that some of the code which
should have been yours is not then this is the time to come and talk to
me or the TA before you get into trouble. This allows us to
apportion credit appropriately.
Discuss the course material! I strongly encourage you
to
discuss all aspect of the course material! Form study groups for
the exams. Your peers are probably your best learning
resource.
When you discuss programming projects do so "off line" and away from
the
terminals: a good idea is to do so on a blackboard which you clean
carefully
after the discussion. The golden rule is: after you
have finished such a discussion you must walk away only with what
is
in your head!
See the honest in academics website here.
Enjoy the course.
TEXT
- (Required) Compiler construction: principles and practice.
K.C.
Louden.
- (Recommended - and a useful book) Lex & Yacc. J.R. Levine, T.
Mason,
D. Brown.
- (Recommended) Compilers: principles techniques, and tools.
Aho,
Sethi,
Ullman.
(Course notes for 2009 here
by James Snell)
PARSING LINKS:
- A web based tool for
investigating
context
free
grammars
(developed
by
Mike
Burrell
a
former
CPSC411
student!).
It
allows
you
to:
transform
grammars
into
LL(1) (when possible) to look
at LL(1), LR(0), SLR(1), LALR(1), and LR(1) parsing tables and
automata.
It also has examples of context free grammars belonging to most of the
different classes considered in the course.
- A downloadable tool for (Early) parsing called Kakuy
(it
is
free
and
for
windows
platforms).
- Practical
guide to
parsing
by Dick Grune and Ceriel Jacobs.
- Essays by F.D. Lewis on Recursive
descent , top-down
parsing, and building
a
top
down
parser. and bottom
up
parsers. These are simple overviews of these parsing ideas.
- Java versions of Lex and Yacc the Java
CUP systems.
- A catalog of lexer
and parser generators.
- The ANTLR system: ANother
Tool for
Language Recognition which has become very popular recently: it uses
LL(k)
grammars.
- Yapps
is a
python
based parser generator.
- Dr.
Seuss
on
Parser
Monads: well some good fun for those who are interested
in
functional languages.
- Happy a parser
generator
for Haskell.
OTHER LINKS:
- The Haskell webpage.
- Yet another nice tutorial on
Haskell by Hal Daume (recommended by Karel).