SUDOKU

THE ASSIGNMENT ....

Provide a solver for sudoku puzzles!

DISCUSSION:

Sudoku
is a deceptively innocent  looking puzzle and the first thing you should realize is that  hiding underneath the  surface is a graph coloring problem: these are well-known NP-complete problems.  So you have to treat these puzzle with a little more respect than you  might think.    Sudoku puzzles are graded according to difficulty and this grading is essentially giving you an anatomy of the problem:  many puzzles are very easy to  solve (grade: easy, moderate) with a bit of logic.
However, when one gets to the harder puzzles (grade: hard, very hard, evil) all the simple methods fail.

SUGGESTIONS FOR HOW TO SOLVE IT ...
There are two basic methods:
1. Look up a standard graph coloring algorithm and use it!
2. Build an ad hoc sudoku oriented solution.
The first method involves the translating sudoku into a form in which a graph coloring algorithm can be applied.   The downside is that when the problem is translated  into a general graph coloring problem all the special properties of a sudoku board are lost and so one cannot take special advantage of the geometry.  The upside is that general graph coloring algorithms can be pretty fast and in developing a special purpose solution one could well miss some of these basic techniques.

The second method, a special purpose solution, which I will sketch below (and which you should follow), approaches the problem much as a human would.  In particular it allows one to grade the complexity of problems.  There are two components of this solution:

A. Constrain propagation:
The idea here is that you associate to each position on the board those values which that place can take - or cannot take.   If it is in the same row, column, or square as a given (or calculated) value then that value is not possible.  One can be a little clever than this by considering what values it cannot be given the arrangements in the other squares.   Then one can start being much much cleverer: there are, in fact, all sorts of clever reasons why one can work out the value of a square given a board configuration.   However, don't get sucked in to this black hole!    It can be a lot of fun finding new clever special case rules but you must ask yourself "how many cases will this actually catch?"  Look on the sudoku sites for some basic situations to catch ... if you find yourself going beyond these you are probably going too far!

Your aim is at this stage is to get the obvious constrains implemented and -- very importantly -- to discover whether a board configuration is not possible as soon as possible.   While working out the constraints three thing can happen:
(a) you can actually complete the board and get a solution
(b) you can discover that the board configuration is not possible
(c) (most likely in hard puzzles) you obtain a partially filled in configuration with constraints.

In the first case you are done!  In the second case you want to continue to apply the constraints until there is nothing new to discover.  In the last case you have to resort to guessing possible values ...

B. Search:
The last stage is a search phase.  There are two basic methods: depth-first or breadth-first search.   Depth-first methods are better as there is no sharing of board configurations and this search method require considerably less  storage.   The basic idea is that you keep an agenda (=list) of partial solutions and you  repeatedly develop the "best" or "most promising" partial solution on the agenda.  The development involves working out the constraints on the board (if the configuration is impossible move onto the next item on the agenda) once one has updated the constraints replace the board configuration by a list of possible configurations where you have guessed the possible values of a promising position (usually one with the least number of possibilities).  These guesses go back on the agenda.

How big is the search?  In most cases the search tree (if you do the constraints well!) will only be two or three steps.   For a puzzle to be manageable by hand this is a requirement!  In general, a rough estimate of the maximal depth of the tree (number of guesses you make) is about 10 (you will start with about 20 places given and by the time thirty have been filled in the constraints will force an answer).  That said there are  going to be really bad cases ...

A measure of the difficulty of the puzzle is the number of guesses you have to make: so you can actually grade the puzzles yourself!

HOW FAST SHOULD YOUR PROGRAM BE ....

Haskell, as provided by Hugs, is not wildly efficient and certainly you are not expected to get into bit manipulating tricks for the problem representation (which if you wanted it to fly would be necessary) ... so while the program need not be instantaneous on a 9 by 9 sudoku board you  should be able to get the solution in a couple of seconds!

ENJOY!
OTHER REFERENCES (thanks to Sean Nichols,Vladimir Sedach, Andrew Seniuk)
• Takayuki Yato: Complexity and Completeness of finding another solution and its application to puzzlesMaster's Thesis, University  of Tokyo, 2003.
• Bertram Felenhauer aand Frazer Jarvis: There are 6670903752021072936960 Sudoku grids (here).
• Gordon Royle: Minimum Sudoku (here).
• Angus Johnson: Solving Sudoku (here).