The constraints, formally
A Sudoku puzzle is a partial function f: {1..9} × {1..9} → {1..9} (cells to digits) such that the unique completion satisfies:
- Row constraint: ∀r, the multiset
{f(r,c) | c ∈ {1..9}}={1, 2, 3, 4, 5, 6, 7, 8, 9}. - Column constraint: ∀c, the multiset
{f(r,c) | r ∈ {1..9}}={1, 2, 3, 4, 5, 6, 7, 8, 9}. - Box constraint: ∀ box B (one of the nine 3×3 sub-grids),
{f(r,c) | (r,c) ∈ B}={1, 2, 3, 4, 5, 6, 7, 8, 9}. - Uniqueness: exactly one valid completion exists.
The first three are constraint satisfaction. The fourth (uniqueness) is what distinguishes a well-formed Sudoku puzzle from a partially-filled grid that happens to be solvable.
Solvers: constraint propagation + backtracking
Modern fast Sudoku solvers use a two-stage strategy:
Stage 1: constraint propagation
For each empty cell, maintain a candidate set — the digits that could legally go there. Apply rules iteratively until no further deductions are possible:
- Naked single: a cell with only one candidate can be filled.
- Hidden single: a digit that can only legally go in one cell of a row, column, or box can be placed there even if the cell has other candidates.
- Naked pair: two cells in the same row/col/box with identical 2-candidate sets — eliminate those two digits from elsewhere in the group.
- Pointing pair / box-line reduction: if a digit’s candidates within a box all sit on a single row/col, eliminate from that row/col outside the box.
- X-wing, swordfish, jellyfish: 2-row, 3-row, 4-row patterns that confine a digit’s candidates to a small set of columns.
- XY-wing, XYZ-wing: 3-cell chains exploiting bivalue/trivalue cells.
Propagation reduces candidates without guessing. For Easy and most Medium puzzles, this stage solves the puzzle entirely.
Stage 2: backtracking
When propagation stalls (no further deductions possible) but the puzzle isn’t solved, the solver picks the cell with the fewest candidates and tries each in turn:
function solve(grid):
if grid is complete: return grid
cell = pickCellWithFewestCandidates(grid)
for each candidate value in cell.candidates:
place value in cell
propagate constraints (Stage 1 again)
if no contradiction:
result = solve(grid)
if result: return result
undo placement and propagation
return null # no valid completion
The “fewest candidates” cell choice is a most-constrained-variable heuristic from constraint satisfaction theory. Picking randomly works but is slower; picking the most-constrained cell minimizes the search tree.
Performance
Modern hand-rolled backtracking solvers solve 99% of human-difficulty Sudoku in under 1 millisecond on commodity hardware. The hardest known 9×9 puzzles (constructed adversarially) take seconds to tens of seconds even for optimized solvers. Our daily Sudoku generator at sudoku.html runs the solver in browser JavaScript and generates puzzles in 100–1000ms.
Generators: how to create a new puzzle
Generation is harder than solving because you’re searching for a puzzle with both a unique solution and a target difficulty. Standard approach:
Step 1: generate a complete valid grid
Run a randomized backtracking fill on an empty grid. Each cell is assigned a random digit from its candidate set; backtrack on contradictions. This produces a uniformly-random valid 9×9 Sudoku grid in microseconds.
Step 2: remove cells, checking uniqueness
Pick cells in random order. For each cell, temporarily remove its value and run the solver to count solutions. If the puzzle still has exactly one solution, accept the removal. If multiple solutions exist after removal, restore the cell.
solution = generateRandomGrid()
puzzle = copyOf(solution)
positions = shuffle(allCellPositions)
for (r, c) in positions:
removed = puzzle[r][c]
puzzle[r][c] = 0
if countSolutions(puzzle, limit=2) > 1:
puzzle[r][c] = removed # undo
The countSolutions call has a limit parameter — we stop counting at 2 because we only need to know “is it unique or not.” This shortcut dramatically reduces generation time vs. counting all solutions.
Step 3: difficulty tuning
Difficulty correlates with two things:
- Clue count: fewer clues generally = harder. Easy ~38 clues blank, Hard ~56 blank.
- Required techniques: Easy = scanning + naked singles only. Medium = + hidden singles. Hard = + pencil marks + naked pairs + X-wings.
Most generators target a clue count and call it done. Better generators run the propagation-only solver after each removal: if the propagation-only solver can’t solve the resulting puzzle, the puzzle is harder than the propagation rules permit, and you classify it accordingly.
Sudoku is NP-complete (with caveats)
Generalized Sudoku — the n²×n² version where each row, column, and n×n box must contain digits 1 through n² — is NP-complete. Proved in 2003 by Yato and Seta via reduction from Latin Square Completion (which Colbourn proved NP-complete in 1984).
The reduction:
- Take any instance of Latin Square Completion (a partially-filled n×n Latin Square; decide if it can be completed).
- Embed it as the upper-left n×n region of an n²×n² Sudoku, with all other cells initialized to enforce that the only valid Sudoku completion corresponds to a valid Latin Square completion.
- The construction takes polynomial time. Therefore Sudoku ≥ Latin Square Completion in complexity.
Caveat about 9×9. The classic 9×9 Sudoku has a constant-size input (always 81 cells), so it’s technically solvable in O(1) by lookup — you’d need a database of all 6.67×10²¹ possible grids, but it’s constant. The NP-completeness result applies to the parameterized n²×n² family.
In practice this distinction doesn’t matter. The algorithms that solve 9×9 are the same backtracking algorithms that solve n²×n². The complexity result tells us that fundamentally fast (polynomial-time) algorithms don’t exist for the general case, even though brute-force is fine for 9×9.
The 17-clue minimum proof
For decades, Sudoku enthusiasts asked: what’s the smallest number of clues a unique-solution Sudoku can have? Empirical answer was 17 — people had constructed many 17-clue puzzles — but no known 16-clue example existed.
In 2014, Gary McGuire, Bastian Tugemann, and Gilles Civario proved formally that 17 is the minimum. Method: exhaustive computer search.
- For each of the 5,472,730,538 essentially-different 9×9 Sudoku solutions, they checked every possible 16-clue subset (there are
C(81, 16) ≈ 3.3 × 10¹&sup4;per solution). - For each subset, they tested whether the puzzle had a unique solution.
- None did.
The proof took approximately 7.1 million CPU-hours — about 800 CPU-years — on a Stokes supercomputer cluster at the Irish Centre for High-End Computing.
The proof has been independently verified. It is now the minimum-clue Sudoku result.
How many Sudoku grids exist? (6.67 × 10²¹)
In 2005, Bertram Felgenhauer and Frazer Jarvis enumerated all valid 9×9 Sudoku grids:
N = 6,670,903,752,021,072,936,960
Their method:
- Fix the top band (rows 1, 2, 3). The number of valid top bands is small enough to enumerate directly:
9! × 56 × 6&sup6;= a few million arrangements after symmetry reduction. - For each top band, count the number of ways to complete it to a full grid.
- Use Burnside’s lemma to count by symmetry classes.
Result verified by independent groups in 2006. The count of essentially-different grids (after factoring out the 9! × 6&sup8; = 3,359,232 trivial symmetries: digit relabelings, row/column permutations within bands/stacks, full-grid transposition) is approximately 5,472,730,538 — about 5.5 billion.
Sudoku and Latin Squares: the relationship
A 9×9 Latin Square is a 9×9 grid where each row and column contains the digits 1–9 exactly once. There are 5.524 × 10²⁷ valid 9×9 Latin Squares.
A Sudoku grid is a Latin Square that also satisfies the box constraint. So Sudoku grids are a subset of Latin Squares: about 6.67 × 10²¹ of 5.5 × 10²⁷, or 1 in 800,000.
This ratio matches intuition: the box constraint is a strong filter. About 99.9999% of Latin Squares fail the box constraint somewhere.
Symmetry: what makes two Sudoku grids “the same”
Several operations preserve Sudoku validity:
- Digit relabeling: swap any two digits everywhere (1↔5 turns every 1 into a 5 and vice versa).
- Row permutation within a band: rows {1,2,3} can be reordered; same for {4,5,6} and {7,8,9}.
- Column permutation within a stack: columns {1,2,3} can be reordered; same for {4,5,6} and {7,8,9}.
- Band swap: the three bands can be reordered (rows {1,2,3} ↔ {4,5,6}, etc.).
- Stack swap: same for columns.
- Transposition: rows↔columns.
Two grids related by any composition of these are “essentially the same.” The factor of 9! × 6&sup8; / 2 = ~1.2 billion symmetries reduces the 6.67×10²¹ count to ~5.5 billion essentially-different grids.
Variants: Killer, Hyper, Diagonal, Samurai
- Killer Sudoku: classic constraints + cage constraints (cells in a marked region must sum to a given total). NP-complete by reduction from classic Sudoku.
- Hyper Sudoku: classic + four extra 3×3 boxes overlapping with the main 3×3 grid. Doesn’t change complexity class.
- Diagonal Sudoku (X-Sudoku): classic + both main diagonals must contain 1–9. Strictly more constrained, so fewer valid grids exist.
- Samurai Sudoku: five 9×9 grids overlapping in a plus-shape pattern. Each grid’s corner box is shared. Solving requires propagation across the shared boxes.
- 16×16 Sudoku (Hexadoku): n=4 generalization, digits 0–9+A–F. NP-complete.
- 4×4 Sudoku (kids): n=2, digits 1–4. Trivially solvable in microseconds; useful for teaching kids the mechanic.
Practical generator: Puzzle Cottage’s implementation
Our daily Sudoku generator (in JavaScript, in sudoku.html):
- Seed: use the date as a deterministic seed via LCG (
s = (1664525s + 1013904223) mod 2³²) so every player worldwide gets the same daily puzzle. - Generate complete grid: randomized backtracking fill of an empty 9×9.
- Remove cells: shuffle 81 positions; for each, try removing while validating uniqueness. Continue until target blank-count is reached or no further cell can be removed.
- Difficulty target: Easy = 38 blanks, Medium = 48 blanks, Hard = 56 blanks. Three independent puzzles per day, one per level.
Generation runs in 100–1000ms in the browser, cached in localStorage so the user only pays the cost once per puzzle. Source: sudoku.html.