THE ASSIGNMENT ....

Provide a solver for sudoku puzzles!

DISCUSSION:

Sudoku is a deceptively innocent looking puzzle and the first thing you should realize is that hiding underneath the surface is a graph coloring problem: these are well-known NP-complete problems. So you have to treat these puzzle with a little more respect than you might think. Sudoku puzzles are graded according to difficulty and this grading is essentially giving you an anatomy of the problem: many puzzles are very easy to solve (grade: easy, moderate) with a bit of logic.

However, when one gets to the harder puzzles (grade: hard, very hard, evil) all the simple methods fail.

SUGGESTIONS FOR HOW TO SOLVE IT ...

There are two basic methods:

- Look up a standard graph coloring algorithm and use it!
- Build an ad hoc sudoku oriented solution.

The second method, a special purpose solution, which I will sketch below (and which you should follow), approaches the problem much as a human would. In particular it allows one to grade the complexity of problems. There are two components of this solution:

A. Constrain propagation:

The idea here is that you associate to
each position on the board those values which that place can
take - or cannot take.
If it is in the same row, column, or square as a
given (or calculated) value then that value is not possible. One
can be a little clever than this by considering what values it cannot be given the arrangements
in the other squares. Then one can start being much
much cleverer: there are, in fact, all sorts of clever reasons why one
can
work out the value of a square given a board configuration.
However, don't get sucked in to this black hole! It
can be a lot of fun finding new clever special case rules but you must
ask yourself "how many cases will this actually catch?" Look on
the sudoku sites for some
basic situations to catch ... if you find yourself going beyond these
you are probably going too far!

Your aim is at this stage is to get the obvious constrains implemented and -- very importantly -- to discover whether a board configuration is not possible as soon as possible. While working out the constraints three thing can happen:

In the first case you are done! In the second case you want to continue to apply the constraints until there is nothing new to discover. In the last case you have to resort to guessing possible values ...

B. Search:Your aim is at this stage is to get the obvious constrains implemented and -- very importantly -- to discover whether a board configuration is not possible as soon as possible. While working out the constraints three thing can happen:

(a) you can actually complete the board
and get a solution

(b) you can discover that the board
configuration is not possible

(c) (most likely in hard puzzles) you
obtain a partially filled in
configuration with constraints.

In the first case you are done! In the second case you want to continue to apply the constraints until there is nothing new to discover. In the last case you have to resort to guessing possible values ...

The last stage is a search phase.
There are two basic methods: depth-first or breadth-first
search. Depth-first methods are better as there is no
sharing of board configurations and this search method require
considerably less storage. The basic idea is that you
keep
an agenda (=list) of partial solutions and you repeatedly develop
the "best" or "most promising" partial solution on the agenda.
The development involves working out the constraints on the board (if
the configuration is impossible move onto the next item on the agenda)
once one has updated the constraints replace the board configuration by
a list of possible configurations where you have guessed the possible
values of a promising position (usually one with the least number of
possibilities). These guesses go back on the agenda.

How big is the search? In most cases the search tree (if you do the constraints well!) will only be two or three steps. For a puzzle to be manageable by hand this is a requirement! In general, a rough estimate of the maximal depth of the tree (number of guesses you make) is about 10 (you will start with about 20 places given and by the time thirty have been filled in the constraints will force an answer). That said there are going to be really bad cases ...

A measure of the difficulty of the puzzle is the number of guesses you have to make: so you can actually grade the puzzles yourself!

HOW FAST SHOULD YOUR PROGRAM BE ....How big is the search? In most cases the search tree (if you do the constraints well!) will only be two or three steps. For a puzzle to be manageable by hand this is a requirement! In general, a rough estimate of the maximal depth of the tree (number of guesses you make) is about 10 (you will start with about 20 places given and by the time thirty have been filled in the constraints will force an answer). That said there are going to be really bad cases ...

A measure of the difficulty of the puzzle is the number of guesses you have to make: so you can actually grade the puzzles yourself!

Haskell, as provided by Hugs, is not
wildly efficient and certainly you are not expected to get into bit
manipulating tricks for the problem representation (which if you wanted
it to fly would be necessary) ... so while the program need not be
instantaneous on a 9 by 9 sudoku
board you should be able to get the solution in a couple of
seconds!

ENJOY!

- Takayuki Yato: Complexity and Completeness of finding another solution and its application to puzzles. Master's Thesis, University of Tokyo, 2003.
- Bertram Felenhauer aand Frazer Jarvis: There are 6670903752021072936960 Sudoku grids (here).
- Gordon Royle: Minimum Sudoku (here).
- Angus Johnson: Solving Sudoku
(here).