Skip to content

Instantly share code, notes, and snippets.

@btpnlsl
Forked from cellularmitosis/Makefile
Last active January 26, 2025 20:48
Show Gist options
  • Save btpnlsl/8195b7af314cb6d27bf11fa53752179f to your computer and use it in GitHub Desktop.
Save btpnlsl/8195b7af314cb6d27bf11fa53752179f to your computer and use it in GitHub Desktop.
A trivial sudoku solver, in C

A trivial sudoku solver, in C

Update: this is a bit better at solving sudoku as it can match tuples and use that to eliminate possibilities which might cause a deadlock.

This is an extension to the sudoku solver written by Jason Pepas which adds some code to match simple tuples by row/col/grid.

It has not been optimized for performance but it can handle some puzzles which deadlocked the original code.

a.out: sudoku.c
gcc -std=c89 -Wall -Werror -O3 sudoku.c
clean:
rm -f a.out
.PHONY: clean
100080400
403600005
000100069
020953000
000802007
000000000
050008004
040500030
210000070
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>
#include <stdbool.h>
/* A sudoku solver. */
/*
Copyright (c) 2020 Jason Pepas
Released under the terms of the MIT license.
See https://opensource.org/licenses/MIT
*/
/*
Board file format: plain text, 9 lines of 9 numbers (zero for empty box).
Example:
530070000
600195000
098000060
800060003
400803001
700020006
060000280
000419005
000080079
*/
/*
Data structures:
- A 'box' is a set of the remaining possible solutions for a spot in the grid.
This set is represented as a uint16_t bitvector, with the sudoke numbers 1-9
represented as the first 9 bits in the bitvector.
- A solved box has only one bit set.
- A 'board' is a 2-dimensional array (9x9) of boxes (bitvectors).
- A solved board is a 9x9 grid of bitvectors which each have only one item.
*/
/*
The general approach is to iterate over the board and use logic to reduce
the number of possible solutions in each box until the board is solved.
Algorithm:
- For each row, build a set of the solved numbers in that row, then
remove those solutions from the remaining possibilities in each of the
unsolved boxes in that row.
- Do the same for each column.
- Do the same for each 3x3 grid.
- Repeat the above process until either the board is solved, or until
deadlock is detected (i.e. the board state remains unchanged after an
iteration).
*/
/*
These definitions are here to make the code more readable and to minimize
the use of magic numbers. You can't really change them without altering the
nature of the puzzle.
*/
#define BOARD_SIZE 9
#define GRID_SIZE 3
#define TAB_SIZE 8
/*
A note on coordinates and C array indexing:
The row/col and grid/idx coordinates which are used in the code below in the range [1-9].
Methods which access g_board will translate these variabiles into C 0-based indexes.
Use some typedefs here to try to keep variable types straight. Ideally these types would not
be interchangable except in situations where there are explicit methods to convert between
them.
Row, Col, Grid & Idx are types of Coords. Box is a type of BitVec.
*/
typedef int Coord;
typedef Coord Row;
typedef Coord Col;
typedef Coord Grid;
typedef Coord Idx;
typedef int Num;
typedef uint16_t BitVec;
typedef BitVec Box;
/* the board. access to the board should be confined to methods which translate between 1-indexed row/col or
grid/idx coordinates and C 0-indexes. */
Box g_board[BOARD_SIZE][BOARD_SIZE];
/* print traces during iteration. TODO: make this a command line flag */
bool g_trace = true;
/* some auditing for memory allocations / frees for detecting memory leaks */
uint32_t g_malloc_counter = 0;
uint32_t g_free_counter = 0;
void* dbg_malloc(size_t size) {
g_malloc_counter++;
return malloc(size);
}
void dbg_free(void* ptr) {
g_free_counter++;
assert(g_free_counter <= g_malloc_counter);
free(ptr);
}
void dbg_alloc_check() {
if (g_trace && (g_malloc_counter != g_free_counter)) {
fprintf(stderr, "!!! malloc counter: %d, free counter: %d !!!\n", g_malloc_counter, g_free_counter);
}
assert(g_malloc_counter == g_free_counter);
}
/* methods for translating between row/col coordinates and grid/idx coordinates. all coordinates are 1-indexed [1..9] */
Row get_row_for_grid(Grid grid, Idx index) {
assert(grid > 0 && grid <= BOARD_SIZE);
assert(index > 0 && index <= BOARD_SIZE);
return (((grid - 1) / GRID_SIZE) * GRID_SIZE) + ((index - 1) / GRID_SIZE) + 1;
}
Col get_col_for_grid(Grid grid, Idx index) {
assert(grid > 0 && grid <= BOARD_SIZE);
assert(index > 0 && index <= BOARD_SIZE);
return (((grid - 1) % GRID_SIZE) * GRID_SIZE) + ((index - 1) % GRID_SIZE) + 1;
}
Grid get_grid_for_coords(Row row, Col col) {
assert(row > 0 && row <= BOARD_SIZE);
assert(col > 0 && col <= BOARD_SIZE);
return (((row - 1) / GRID_SIZE) * GRID_SIZE) + ((col - 1) / GRID_SIZE) + 1;
}
Idx get_idx_for_coords(Row row, Col col) {
assert(row > 0 && row <= BOARD_SIZE);
assert(col > 0 && col <= BOARD_SIZE);
return (((row - 1) % GRID_SIZE) * GRID_SIZE) + ((col - 1) % GRID_SIZE) + 1;
}
/**
* Methods which directly access the global board are here.
*/
/* set all 9 bits in this box. */
void board_set_all(Row row, Col col) {
assert(row > 0 && row <= BOARD_SIZE);
assert(col > 0 && col <= BOARD_SIZE);
Box all_nums = /* 0b0000000111111111 */ 0x1FF;
g_board[row-1][col-1] = all_nums;
}
Box board_get_box(Row row, Col col) {
assert(row > 0 && row <= BOARD_SIZE);
assert(col > 0 && col <= BOARD_SIZE);
return g_board[row-1][col-1];
}
/* convert a number to a bitvector. */
BitVec num_to_bitvec(Num num) {
assert(num > 0 && num <= BOARD_SIZE);
return 0x1 << (num-1);
}
/* add num to the box's bitvector. */
void board_set_num(Row row, Col col, Num num) {
assert(row > 0 && row <= BOARD_SIZE);
assert(col > 0 && col <= BOARD_SIZE);
assert(num > 0 && num <= BOARD_SIZE);
g_board[row-1][col-1] |= num_to_bitvec(num);
}
/* returns true if the number is set for the specified box/bitvec */
bool box_is_num_set(Box box, Num num) {
assert(num > 0 && num <= BOARD_SIZE);
return (box & num_to_bitvec(num));
}
/* returns true if the box in the specified row/col is a subset of the specified bitmask */
bool box_is_subset(Row row, Col col, BitVec bitmask) {
assert(row > 0 && row <= BOARD_SIZE);
assert(col > 0 && col <= BOARD_SIZE);
return (g_board[row-1][col-1] & ~bitmask) == 0;
}
/* count the number of bits which are set in the bitvector. */
int count_set_bits(Box box) {
int i;
int set_bits_count = 0;
for (i = 0; i < 16; i++) {
if (box & /* 0b1 */ 0x1) {
set_bits_count++;
}
box = (box >> 1);
}
return set_bits_count;
}
/* subtract the possibilities in the specified bitvec from a box in the board */
void board_subtract_box(Row row, Col col, Box box) {
assert(row > 0 && row <= BOARD_SIZE);
assert(col > 0 && col <= BOARD_SIZE);
g_board[row-1][col-1] &= ~(box);
assert(count_set_bits(g_board[row-1][col-1]) > 0);
}
/**
* End of methods which directly access the board.
*/
/* is the box solved? */
bool box_is_solved(Row row, Col col) {
assert(row > 0 && row <= BOARD_SIZE);
assert(col > 0 && col <= BOARD_SIZE);
Box box = board_get_box(row, col);
int bitcount = count_set_bits(box);
/* a box is solved if there is only 1 possibility for the row/col */
return bitcount == 1;
}
/* is the board solved? */
bool board_is_solved() {
Row row;
Col col;
for (row = 1; row <= BOARD_SIZE; row++) {
for (col = 1; col <= BOARD_SIZE; col++) {
if (!box_is_solved(row, col)) {
return false;
}
}
}
return true;
}
/* gets a string representation of the numbers which are possibilites for a row/col */
char* box_get_str(Box box) {
Num num;
uint8_t bitcount = count_set_bits(box);
char* box_str = dbg_malloc(sizeof(char) * ((bitcount * 2) + 1));
char* pbox_str = box_str;
bool addSpace = false;
for (num = 1; num <= BOARD_SIZE; num++) {
if (box_is_num_set(box, num)) {
if (!addSpace) {
addSpace = true;
} else {
*pbox_str++ = ' ';
}
*pbox_str++ = (char)(num + '0');
}
}
*pbox_str++ = '\0';
return box_str;
}
/* return the character representation of the box. */
char* board_get_str(Row row, Col col) {
assert(row > 0 && row <= BOARD_SIZE);
assert(col > 0 && col <= BOARD_SIZE);
Box box = board_get_box(row, col);
char* box_str = box_get_str(box);
if (!box_is_solved(row, col)) {
char* bracket_box_str = dbg_malloc(sizeof(char) * (strlen(box_str) + 3));
sprintf(bracket_box_str, "[%s]", box_str);
dbg_free(box_str);
return bracket_box_str;
}
else {
return box_str;
}
}
/* print the board state. prints the possibilities for unsolved boxes in braces (i.e. [1 3 7]). */
void print_board() {
Row row;
Col col;
int ntabs;
char* s_board[BOARD_SIZE][BOARD_SIZE] = {0};
/* there is an attempt here to print the board in a way that aligns the boxes into a more readable format.
to do so it's necessary to know the maximum size of the text for any box so all columns have the same size.
TODO: add an outline (-, |) around the grids, but doing so will mess up the tab spacing
*/
for (row = 1; row <= BOARD_SIZE; row++) {
for (col = 1; col <= BOARD_SIZE; col++) {
s_board[row-1][col-1] = board_get_str(row, col);
}
}
/* calculate the maximum string size for any cell in order to calculate an appropriate number of tabs */
size_t max_slen = 0;
for (row = 1; row <= BOARD_SIZE; row++) {
for (col = 1; col <= BOARD_SIZE; col++) {
size_t slen = strlen(s_board[row-1][col-1]);
if (slen > max_slen) {
max_slen = slen;
}
}
}
int max_tabs = max_slen / TAB_SIZE;
for (row = 1; row <= BOARD_SIZE; row++) {
for (col = 1; col <= BOARD_SIZE; col++) {
char* box_str = s_board[row-1][col-1];
printf("%s", box_str);
/* add tabs so each box is lined up */
for (ntabs = strlen(box_str) / TAB_SIZE; ntabs <= max_tabs; ntabs++) {
putchar('\t');
}
dbg_free(box_str);
}
putchar('\n');
}
}
/* print the board if tracing is enabled */
void tprint_board() {
if (g_trace) {
print_board();
}
}
/* a struct which pairs a box with an coordinate in a row/col/grid */
typedef struct {
Box box;
Coord coord;
} BoxAndCoord;
/* helper method to allocate and initialize a new box/coord pair */
BoxAndCoord* new_box_and_coord(Box box, Coord coord) {
BoxAndCoord* box_and_coord = dbg_malloc(sizeof(BoxAndCoord));
box_and_coord->coord = coord;
box_and_coord->box = box;
return box_and_coord;
}
/* free an array of n BoxAndCoord pointers and the underlying BoxAndCoord structs */
void free_box_and_coord_array(BoxAndCoord** arr, int n) {
int i;
for (i = 0; i < n; i++) {
dbg_free(arr[i]);
}
dbg_free(arr);
}
/* a struct for tracking a tuple with its union box & coord bitvec */
typedef struct {
Box box;
BitVec coord_bitvec;
} Tuple;
/* helper method to allocate and initialze a new tuple from an array of box/coord pairs of size k*/
Tuple* new_tuple(BoxAndCoord* boxes[], int k) {
int i;
Tuple* tuple = dbg_malloc(sizeof(Tuple));
memset(tuple, 0, sizeof(Tuple));
for (i = 0; i < k; i++) {
tuple->box |= boxes[i]->box;
tuple->coord_bitvec |= num_to_bitvec(boxes[i]->coord);
}
return tuple;
}
/* a struct to keep track of all tuples which have been found when scanning a row/col/grid */
typedef struct {
int count;
Tuple** tuples;
} TupleCounter;
/* helper method to allocate a new tuple counter */
TupleCounter* new_tuple_counter() {
TupleCounter* counter = dbg_malloc(sizeof(TupleCounter));
memset(counter, 0, sizeof(TupleCounter));
return counter;
}
/* frees the memory allocated for a tuple counter and all associated tuples */
void free_tuple_counter(TupleCounter* counter) {
int i;
if (counter->count > 0) {
for (i = 0; i < counter->count; i++) {
dbg_free(counter->tuples[i]);
}
dbg_free(counter->tuples);
}
dbg_free(counter);
}
/* creates a new tuple counter which includes the specified tuple */
TupleCounter* counter_plus_tuple(
TupleCounter* counter,
Tuple* tuple
) {
int i;
TupleCounter* new_counter = new_tuple_counter();
/* initialize count & alloc memory for tuple pointer array */
new_counter->count = counter->count + 1;
new_counter->tuples = dbg_malloc(sizeof(Tuple*) * (counter->count + 1));
/* copy over existing tuple array */
for (i = 0; i < counter->count; i++) {
new_counter->tuples[i] = counter->tuples[i];
}
/* add the pointer for the new tuple to the end of the array */
new_counter->tuples[new_counter->count - 1] = tuple;
/* free the memory used by the old counter, but not the memory used by the tuples in the old counter's tuple array */
if (counter->count > 0) {
dbg_free(counter->tuples);
}
dbg_free(counter);
return new_counter;
}
/* print a message about finding a tuple if tracing is enabled */
void trace_tuple_found(
char* ordinal,
Coord coord,
Box box,
char* antiordinal,
BitVec match_bitvec
) {
if (g_trace) {
char* box_str = box_get_str(box);
char* match_str = box_get_str(match_bitvec);
printf("found a tuple in %s %d for [%s] in %s: %s\n", ordinal, coord, box_str, antiordinal, match_str);
dbg_free(box_str);
dbg_free(match_str);
}
}
/* print a message about finding possibilities in a row/col which are exclusive to a grid */
void trace_exclusive_found(
char* ordinal,
Coord coord,
Box box,
Grid grid
) {
if (g_trace) {
char* box_str = box_get_str(box);
printf("found a tuple in %s %d for [%s] exclusive to grid %d\n", ordinal, coord, box_str, grid);
dbg_free(box_str);
}
}
/* prints all the tuples found while scanning a row/col/grid */
void print_tuple_counter(
TupleCounter* counter,
char* ordinal,
Coord coord,
char* anitordinal
) {
int i;
for (i = 0; i < counter->count; i++) {
Tuple* tuple = counter->tuples[i];
trace_tuple_found(ordinal, coord, tuple->box, anitordinal, tuple->coord_bitvec);
}
}
/* returns true if the array of unsolved boxes contains a tuple of size k */
bool boxes_are_tuple(BoxAndCoord* boxes[], int k) {
int i;
Box union_box = 0;
for (i = 0; i < k; i++) {
assert(count_set_bits(boxes[i]->box) > 1);
union_box |= boxes[i]->box;
}
/* a set of boxes is a tuple if the count of bits in the union of all bitvector is equal to the number of boxes */
return count_set_bits(union_box) == k;
}
/* a recursive method to scan all combinations of size K in an array of unsolved box/coords of size N. if a tuple is matched return a tuple counter with the new tuple */
TupleCounter* recurse_tuple_combinations(
TupleCounter* counter,
BoxAndCoord* arr[],
int n,
int k,
BoxAndCoord* combo[],
int index,
int start
) {
int i;
assert(n > 2);
assert(k < n && k >= 2);
assert(index >= 0 && index <= k);
assert(start >= 0);
if (index == k) {
/* found a combo of unsolved box/indexes of size k */
if (boxes_are_tuple(combo, k)) {
/* the combo is a tuple! */
Tuple* tuple = new_tuple(combo, k);
return counter_plus_tuple(counter, tuple);
}
return counter;
}
for (i = start; i <= n - (k - index); i++) {
combo[index] = arr[i];
counter = recurse_tuple_combinations(counter, arr, n, k, combo, index + 1, i + 1);
}
return counter;
}
/* find all tuples in an array of unsolved box/coords for a row/col/grid */
TupleCounter* find_all_tuples(
BoxAndCoord* arr[],
int n
) {
int k;
assert(n > 2);
TupleCounter* counter = new_tuple_counter();
/* look for tuples of size [2 .. n-1] in the combinations of unsolved box/coords for the row/col/grid */
for (k = (n - 1); k >= 2; k--) {
BoxAndCoord* combo[k];
counter = recurse_tuple_combinations(counter, arr, n, k, combo, 0, 0);
}
return counter;
}
/* return the total count of all remaining possibliities in every box.
a solved board has a possibilities count of 81. */
int possibilities_count() {
Row row;
Col col;
int pcount = 0;
for (row = 1; row <= BOARD_SIZE; row++) {
for (col = 1; col <= BOARD_SIZE; col++) {
Box box = board_get_box(row, col);
int bitcount = count_set_bits(box);
pcount += bitcount;
}
}
return pcount;
}
/* read the board in from a file. */
void load_board(char* fname) {
Row row;
Col col;
FILE* f = fopen(fname, "r");
assert(f != NULL);
const int bufsize = 110; /* 9+2 per line * 9 lines + 1 null */
char buf[bufsize];
size_t count = fread(buf, 1, bufsize, f);
assert(count >= 90);
fclose(f);
int i = 0;
for (row = 1; row <= BOARD_SIZE; row++) {
for (col = 1; col <= BOARD_SIZE; col++) {
assert (i < bufsize);
char ch = buf[i];
if (ch == '0') {
board_set_all(row, col);
} else {
Num num = ch - '0';
board_set_num(row, col, num);
}
i++;
}
/* skip over the newline. */
char ch = buf[i];
while (ch == '\n' || ch == '\r') {
i++;
ch = buf[i];
continue;
}
}
}
/* return the combined bitmask of all solved boxes in the row. */
BitVec get_row_solved_bitmask(Row row) {
assert(row > 0 && row <= BOARD_SIZE);
Col col;
BitVec bitmask = 0;
for (col = 1; col <= BOARD_SIZE; col++) {
if (box_is_solved(row, col)) {
bitmask |= board_get_box(row, col);
}
}
return bitmask;
}
/* remove solved bits from a row */
void subtract_solved_from_row(Row row) {
assert(row > 0 && row <= BOARD_SIZE);
Col col;
BitVec solved = get_row_solved_bitmask(row);
for (col = 1; col <= BOARD_SIZE; col++) {
if (!box_is_solved(row, col)) {
/* subtract the solved bits from this box. */
board_subtract_box(row, col, solved);
}
}
}
/* get the unsolved box & coords for a specified row */
BoxAndCoord** get_row_unsolved_boxes(Row row, int nunsolved) {
int i;
Col col;
BoxAndCoord** arr = dbg_malloc(sizeof(BoxAndCoord*) * nunsolved);
i = 0;
for(col = 1; col <= BOARD_SIZE; col++) {
if (!box_is_solved(row, col)) {
arr[i] = new_box_and_coord(board_get_box(row, col), col);
i++;
}
}
assert(i == nunsolved);
return arr;
}
/* find tuples in a row and subtract them from any unsolved boxes which aren't part of the tuple */
void match_tuples_row(Row row) {
assert(row > 0 && row <= BOARD_SIZE);
int i;
Col col;
int nunsolved = BOARD_SIZE - count_set_bits(get_row_solved_bitmask(row));
/* there must be 3 or more unsolved boxes to do a meaningful check. */
if (nunsolved > 2) {
BoxAndCoord** arr = get_row_unsolved_boxes(row, nunsolved);
/* find tuples of size k in the different combinations of the unsolved columns array */
TupleCounter* counter = find_all_tuples(arr, nunsolved);
free_box_and_coord_array(arr, nunsolved);
if (counter->count > 0) {
print_tuple_counter(counter, "row", row, "col");
/* subtract tuples from other columns in the row */
for (i = 0; i < counter->count; i++) {
Tuple* tuple = counter->tuples[i];
for (col = 1; col <= BOARD_SIZE; col++) {
if (!box_is_num_set(tuple->coord_bitvec, col)) {
board_subtract_box(row, col, tuple->box);
}
}
}
}
/* free tuple counter and memory allocated to track unsolved boxes */
free_tuple_counter(counter);
}
}
/* subtract possibilities exclusive to a row from the other boxes in a grid */
void subtract_exclusive_from_grid_except_row(
Box exclusive_box,
Grid grid,
Row row
) {
assert(grid > 0 && grid <= BOARD_SIZE);
assert(row > 0 && row <= BOARD_SIZE);
Num num;
for (num = 1; num <= BOARD_SIZE; num++) {
Row r = get_row_for_grid(grid, num);
if (r != row) {
Col c = get_col_for_grid(grid, num);
if (!box_is_solved(r, c)) {
/* board_subtract_box(r, c, exclusive_box); */
}
}
}
}
/* find possibilities in a row which are exclusive to a single grid,
then subtract those possibilities from the other boxes in the grid */
void match_grid_exclusive_in_row(Row row) {
assert(row > 0 && row <= BOARD_SIZE);
Col col;
int i, j;
Box box_by_grid[GRID_SIZE] = {0};
BitVec solved = get_row_solved_bitmask(row);
/* create a union bitvec for all the possibilities for each grid in the row */
for (col = 1; col <= BOARD_SIZE; col++) {
if (!box_is_solved(row, col)) {
int index = (col - 1) / GRID_SIZE;
assert(index >= 0 && index < GRID_SIZE);
box_by_grid[index] |= board_get_box(row, col);
}
}
for (i = 1; i <= GRID_SIZE; i++) {
/* iterate through each grid in the row and bucket the bitvecs to either the current
grid or the union of the possibilites in all the other grids */
Box grid_box = box_by_grid[i - 1];
Box union_box = 0;
for (j = 1; j <= GRID_SIZE; j++) {
if (j != i) {
union_box |= box_by_grid[j - 1];
}
}
/* possibilities in the row are exclusive to the grid if they are not present in any of
the other grids in the row, and the possibility is not already solved */
Box exclusive = (grid_box &= ~(union_box | solved));
if (count_set_bits(exclusive) > 0) {
Grid grid = (((row - 1) / GRID_SIZE) * GRID_SIZE) + ((i - 1) % GRID_SIZE) + 1;
trace_exclusive_found("row", row, exclusive, grid);
subtract_exclusive_from_grid_except_row(exclusive, grid, row);
}
}
}
/* iterate over the rows, reducing possibilities. */
void iterate_rows(int pass) {
Row row;
for (row = 1; row <= BOARD_SIZE; row++) {
subtract_solved_from_row(row);
/* ignore the first pass */
if (pass > 1) {
match_tuples_row(row);
match_grid_exclusive_in_row(row);
}
}
}
/* return the combined bitmask of all solved boxes in the column. */
BitVec get_col_solved_bitmask(Col col) {
assert(col > 0 && col <= BOARD_SIZE);
Row row;
BitVec bitmask = 0;
for (row = 1; row <= BOARD_SIZE; row++) {
if (box_is_solved(row, col)) {
bitmask |= board_get_box(row, col);
}
}
return bitmask;
}
/* subtract solved possibilites from the specified column */
void subtract_solved_from_col(Col col) {
assert(col > 0 && col <= BOARD_SIZE);
Row row;
BitVec solved = get_col_solved_bitmask(col);
for (row = 1; row <= BOARD_SIZE; row++) {
if (!box_is_solved(row, col)) {
/* subtract the solved bits from this box. */
board_subtract_box(row, col, solved);
}
}
}
BoxAndCoord** get_col_unsolved_boxes(Col col, int nunsolved) {
int i, row;
BoxAndCoord** arr = dbg_malloc(sizeof(BoxAndCoord*) * nunsolved);
i = 0;
for(row = 1; row <= BOARD_SIZE; row++) {
if (!box_is_solved(row, col)) {
arr[i] = new_box_and_coord(board_get_box(row, col), row);
i++;
}
}
assert(i == nunsolved);
return arr;
}
/* find tuples in a column and subtract them from any unsolved boxes which aren't part of the tuple */
void match_tuples_col(Col col) {
assert(col > 0 && col <= BOARD_SIZE);
Row row;
int i;
int nunsolved = BOARD_SIZE - count_set_bits(get_col_solved_bitmask(col));
if (nunsolved > 2) {
BoxAndCoord** arr = get_col_unsolved_boxes(col, nunsolved);
TupleCounter* counter = find_all_tuples(arr, nunsolved);
free_box_and_coord_array(arr, nunsolved);
if (counter->count > 0) {
print_tuple_counter(counter, "col", col, "row");
/* subtract found tuples from other columns in the row */
for (i = 0; i < counter->count; i++) {
Tuple* tuple = counter->tuples[i];
for (row = 1; row <= BOARD_SIZE; row++) {
if (!box_is_num_set(tuple->coord_bitvec, row)) {
board_subtract_box(row, col, tuple->box);
}
}
}
}
free_tuple_counter(counter);
}
}
/* subtract possibilities exclusive to a grid in a col from the other boxes in a grid */
void subtract_exclusive_from_grid_except_col(
Box exclusive_box,
Grid grid,
Col col
) {
assert(grid > 0 && grid <= BOARD_SIZE);
assert(col > 0 && col <= BOARD_SIZE);
int num;
for (num = 1; num <= BOARD_SIZE; num++) {
int c = get_col_for_grid(grid, num);
if (c != col) {
int r = get_row_for_grid(grid, num);
if (!box_is_solved(r, c)) {
/* board_subtract_box(r, c, exclusive_box); */
}
}
}
}
/* find possibilities in a col which are exclusive to a single grid,
then subtract those possibilities from the other boxes in the grid */
void match_grid_exclusive_in_col(Col col) {
assert(col > 0 && col <= BOARD_SIZE);
Row row;
int i, j;
Box box_by_grid[GRID_SIZE] = {0};
BitVec solved = get_col_solved_bitmask(col);
for (row = 1; row <= BOARD_SIZE; row++) {
if (!box_is_solved(row, col)) {
int index = (row - 1) / GRID_SIZE;
assert(index >= 0 && index < GRID_SIZE);
box_by_grid[index] |= board_get_box(row, col);
}
}
for (i = 1; i <= GRID_SIZE; i++) {
Box grid_box = box_by_grid[i-1];
Box union_box = 0;
for (j = 1; j <= GRID_SIZE; j++) {
if (j != i) {
union_box |= box_by_grid[j-1];
}
}
Box exclusive = (grid_box &= ~(union_box | solved));
if (exclusive != 0) {
Grid grid = ((i - 1) * GRID_SIZE) + ((col - 1) / GRID_SIZE) + 1;
trace_exclusive_found("col", col, exclusive, grid);
subtract_exclusive_from_grid_except_col(exclusive, grid, col);
}
}
}
/* iterate over the columns, reducing possibilities. */
void iterate_columns(int pass) {
Col col;
for (col = 1; col <= BOARD_SIZE; col++) {
subtract_solved_from_col(col);
/* ignore the first pass */
if (pass > 1) {
match_tuples_col(col);
match_grid_exclusive_in_col(col);
}
}
}
/* return the combined bitmask of all solved boxes in the grid. */
BitVec get_grid_solved_bitmask(Grid grid) {
assert(grid > 0 && grid <= BOARD_SIZE);
int num;
BitVec bitmask = 0;
for (num = 1; num <= BOARD_SIZE; num++) {
Row row = get_row_for_grid(grid, num);
Col col = get_col_for_grid(grid, num);
if (box_is_solved(row, col)) {
bitmask |= board_get_box(row, col);
}
}
return bitmask;
}
/* subtract solved possibilites from unsolved boxes in the specified grid */
void subtract_solved_from_grid(Grid grid) {
assert(grid > 0 && grid <= BOARD_SIZE);
Num num;
BitVec solved = get_grid_solved_bitmask(grid);
for (num = 1; num <= BOARD_SIZE; num++) {
Row row = get_row_for_grid(grid, num);
Col col = get_col_for_grid(grid, num);
if (!box_is_solved(row, col)) {
/* subtract the solved bits from this box. */
board_subtract_box(row, col, solved);
}
}
}
/* get unsolved box/coords pairs for a grid */
BoxAndCoord** get_grid_unsolved_boxes(Grid grid, int nunsolved) {
int i = 0;
Num num;
BoxAndCoord** arr = dbg_malloc(sizeof(BoxAndCoord*) * nunsolved);
for(num = 1; num <= BOARD_SIZE; num++) {
Row row = get_row_for_grid(grid, num);
Col col = get_col_for_grid(grid, num);
if (!box_is_solved(row, col)) {
arr[i] = new_box_and_coord(board_get_box(row, col), num);
i++;
}
}
if (i != nunsolved) {
fprintf(stderr, "get_grid_unsolved_boxes(%d, %d)\n", grid, nunsolved);
print_board();
}
assert(i == nunsolved);
return arr;
}
/* find tuples in a grid and subtract them from any unsolved boxes which aren't part of the tuple */
void match_tuples_grid(Grid grid) {
assert(grid > 0 && grid <= BOARD_SIZE);
Num num;
int i, nunsolved = BOARD_SIZE - count_set_bits(get_grid_solved_bitmask(grid));
if (nunsolved > 2) {
BoxAndCoord** arr = get_grid_unsolved_boxes(grid, nunsolved);
/* find tuples of size k in the different combinations of the unsolved grid array */
TupleCounter* counter = find_all_tuples(arr, nunsolved);
free_box_and_coord_array(arr, nunsolved);
if (counter->count > 0) {
print_tuple_counter(counter, "grid", grid, "idxs");
/* subtract found tuples from other columns in the row */
for (i = 0; i < counter->count; i++) {
Tuple* tuple = counter->tuples[i];
for (num = 1; num <= BOARD_SIZE; num++) {
if (!box_is_num_set(tuple->coord_bitvec, num)) {
Row row = get_row_for_grid(grid, num);
Col col = get_col_for_grid(grid, num);
board_subtract_box(row, col, tuple->box);
}
}
}
}
free_tuple_counter(counter);
}
}
/* iterate over the grids, reducing possibilities. */
void iterate_grids(int pass) {
Grid grid;
for (grid = 1; grid <= BOARD_SIZE; grid++) {
subtract_solved_from_grid(grid);
/* ignore the first pass */
if (pass > 1) {
match_tuples_grid(grid);
}
}
}
/* iterate over the board, reducing possibilities. */
void iterate_board(int pass) {
if (g_trace) {
printf("pass: %d\n", pass);
}
iterate_rows(pass);
iterate_columns(pass);
iterate_grids(pass);
tprint_board();
}
int main(int argc, char* argv[]) {
load_board(argv[1]);
bool deadlocked = false;
int pass = 1;
int before_pcount = possibilities_count();
while (!board_is_solved()) {
iterate_board(pass++);
int after_pcount = possibilities_count();
if (after_pcount == before_pcount) {
deadlocked = true;
break;
} else {
before_pcount = after_pcount;
continue;
}
}
if (deadlocked) {
if (g_trace) {
printf("deadlocked:\n");
}
tprint_board();
fprintf(stderr, "Error: don't know how to solve this board.\n");
} else {
if (g_trace) {
printf("solution:\n");
}
print_board();
}
/* make sure that there are no memory leaks */
dbg_alloc_check();
return EXIT_SUCCESS;
}
503020000
000300000
900500060
000802006
000007002
000000395
060000900
080041000
002000040
530070000
600195000
098000060
800060003
400803001
700020006
060000280
000419005
000080079
@btpnlsl
Copy link
Author

btpnlsl commented Jan 8, 2025

One other housekeeping refactor I am doing in this code is make sure that the row/col/grid/idx coordinates start at 1 instead of 0, and to use consistent indexing throughout the code.

I suppose I should outline what I mean by coordinates.
The most straight forward are row/col coordinates, which refer to sudoku boxes like so:

      1    2     3     4     5     6     7     8     9
  1 [1,1] [1,2] [1,3] [1,4] [1,5] [1,6] [1,7] [1,8] [1,9]
  2 [2,1] ...
  3 [3,1]
  4 [4,1]
  5 [5,1]
  6 [6,1]
  7 [7,1]
  8 [8,1]
  9 [9,1]

Alternately a box can be referred to using grid/idx coordinates like so:

   [1,1] [1,2] [1,3] [2,1] [2,2] [2,3] [3,1] [3,2] [3,3]
   [1,4] [1,5] [1,6] ...
   [1,7] [1,8] [1,9]
   [4,1] [4,2] [4,3]
   [4,4] [4,5] [4,6]    
   [4,7] [4,8] [4,9]
   [7,1] [7,2] [7,3]
   [7,4] [7,5] [7,6]
   [7,7] [7,8] [7,9]

First we need some methods to convert between row/col coordinates and grid/idx coordinates:

/* methods for translating between row/col coordinates and grid/num coordinates. all coordinates are 1-indexed [1..9] */
int get_row_for_grid(int grid, int idx) {
    assert(grid > 0 && grid <= BOARD_SIZE);
    assert(num > 0 && num <= BOARD_SIZE);
    return (((grid - 1) / GRID_SIZE) * GRID_SIZE) + ((idx- 1) / GRID_SIZE) + 1;
}

int get_col_for_grid(int grid, int idx) {
    assert(grid > 0 && grid <= BOARD_SIZE);
    assert(num > 0 && num <= BOARD_SIZE);
    return (((grid - 1) % GRID_SIZE) * GRID_SIZE) + ((idx- 1) % GRID_SIZE) + 1;
}

int get_grid_for_coords(int row, int col) {
    assert(row > 0 && row <= BOARD_SIZE);
    assert(col > 0 && col <= BOARD_SIZE);
    return (((row - 1) / GRID_SIZE) * GRID_SIZE) + ((col - 1) / GRID_SIZE) + 1;
}

int get_idx_for_coords(int row, int col) {
    assert(row > 0 && row <= BOARD_SIZE);
    assert(col > 0 && col <= BOARD_SIZE);
    return (((row - 1) % GRID_SIZE) * GRID_SIZE) + ((col - 1) % GRID_SIZE) + 1;
}

To prevent confusion when using 1-based coordinates, it would help to limiting the access to the global board array to specific methods which properly convert the coordinates to C array indexes. So rather than have methods pull box possibilities directly from the board variable, let's make a method to get the box based on a [row/col]:

uint16_t board_get_box(int row, int col) {
    assert(row > 0 && row <= BOARD_SIZE);
    assert(col > 0 && col <= BOARD_SIZE);
    return g_board[row-1][col-1];
}

Likewise we can have methods which set specific possibility values for a box in the board or subtract a possibility bitvec from a box in the board:

/* add num to the box's bitvector. */
void board_set_num(int row, int col, int num) {
    assert(row > 0 && row <= BOARD_SIZE);
    assert(col > 0 && col <= BOARD_SIZE);
    assert(num > 0 && num <= BOARD_SIZE);
    g_board[row-1][col-1] |= num_to_bitvec(num);
}

/* subtract the possibilities in the specified bitvec from a box in the board */
void board_subtract_box(int row, int col, uint16_t bitvec) {
    assert(row > 0 && row <= BOARD_SIZE);
    assert(col > 0 && col <= BOARD_SIZE);      
    uint16_t box = g_board[row-1][col-1];
    box &= ~(bitvec);
    assert(count_set_bits(box) > 0);
    g_board[row-1][col-1] = box;
}

An added value for these methods is that the code can assert if variables are out of bounds or would cause an invalid state. For example in board_subtract_box the code will assert if subtracting the supplied bitvec would leave the box with no possibilities.

@btpnlsl
Copy link
Author

btpnlsl commented Jan 8, 2025

Another way to eliminate possibilities is to recognize when they are exclusive to a grid when scanning a row/col.

For example, if I have a puzzle like this

[5 6]           [1 2 4 6 8]     9               7               [1 2 5 8]       [1 4 8]         [2 4 5 6]       [2 4 6]         3
3               [2 4 6 8]       [2 4 5 7 8]     9               [2 5 8]         [4 8]           1               [2 4 6 7]       [2 4]
[5 7]           [1 2 4]         [1 2 4 5 7]     3               [1 2 5]         6               [2 4 5 7 9]     [2 4 7 9]       8

9               5               6               [1 2 8]         4               7               [2 3 8]         [1 2 3]         [1 2]
2               7               3               [1 8]           [1 8 9]         5               [4 8 9]         [1 4 9]         6
4               [1 8]           [1 8]           [2 6]           [2 3 6 9]       [3 9]           [2 3 9]         5               7

[6 7]           3               [4 7]           [1 4 6]         [1 6 7 9]       2               [4 6 7 9]       8               5
8               [2 4 6 9]       [2 4 5 7]       [1 4 5 6]       [1 3 5 6 7 9]   [1 3 4 9]       [2 3 4 6 7 9]   [1 2 3 4 6 7 9] [1 2 4 9]
1               [2 4 6 9]       [2 4 5 7]       [4 5 6 8]       [3 5 6 7 8 9]   [3 4 8 9]       [2 3 4 6 7 9]   [2 3 4 6 7 9]   [2 4 9]

The value 5 is exclusive to grid 1 in column 1 since it is only present in row/col [1,1] & [3,1] but not [7,1]. Because of that it is impossible for 5 to appear in any other column in grid 1, so 5 can be removed from r/c [2,3] & [3,3] (g/i [1, 6] & [1,9]).

There are possibilities in a row/col that are exclusive to a grid when the union (all unsolved boxes that are in the grid & r/c) - union(all other unsolved boxes in the r/c + all solved boxes in the r/c) > 0.

@btpnlsl
Copy link
Author

btpnlsl commented Jan 8, 2025

Expressing this in code looks something like this:

/* find possibilities in a row which are exclusive to a single grid, 
   then subtract those possibilities from the other boxes in the grid */
void match_grid_exclusive_in_row(Row row) {
    assert(row > 0 && row <= BOARD_SIZE);
    Col col;
    int i, j;
    Box box_by_grid[GRID_SIZE] = {0};
    BitVec solved = get_row_solved_bitmask(row);

    /* create a union bitvec for all the possibilies for each grid in the row */
    for (col = 1; col <= BOARD_SIZE; col++) {
        if (!box_is_solved(row, col)) {
            int index = (col - 1) / GRID_SIZE;
            assert(index >= 0 && index < GRID_SIZE);
            box_by_grid[index] |= board_get_box(row, col);
        }
    }

    for (i = 1; i <= GRID_SIZE; i++) {
        /* iterate through each grid in the row and bucket the bitvecs to either the current 
           grid or the union of the possibilites in all the other grids */
        Box grid_box = box_by_grid[i - 1];
        Box union_box = 0;
        for (j = 1; j <= GRID_SIZE; j++) {
            if (j != i) {
                union_box |= box_by_grid[j - 1];
            }
        }

        /* possibilities in the row are exclusive to the grid if they are not present in any of 
           the other grids in the row, and the possibility is not already solved */
        Box exclusive = (grid_box &= ~(union_box | solved));
        if (count_set_bits(exclusive) > 0) {
            Grid grid = (((row - 1) / GRID_SIZE) * GRID_SIZE) + ((i - 1) % GRID_SIZE) + 1;

            trace_exclusive_found("row", row, exclusive, grid);

            subtract_exclusive_from_grid_except_row(exclusive, grid, row);
        }
    }
}

/* subtract possibilities exclusive to a row from the other boxes in a grid */
void subtract_exclusive_from_grid_except_row(
    Box exclusive_box,
    Grid grid,
    Row row
) {
    assert(grid > 0 && grid <= BOARD_SIZE);
    assert(row > 0 && row <= BOARD_SIZE);
    Num num;

    for (num = 1; num <= BOARD_SIZE; num++) {
        Row r = get_row_for_grid(grid, num);
        if (r != row) {
            Col c = get_col_for_grid(grid, num);

            if (!box_is_solved(r, c)) {
                /* board_subtract_box(r, c, exclusive_box); */
            }
        }
    }
}

@btpnlsl
Copy link
Author

btpnlsl commented Jan 8, 2025

I've also started using C's typedef to make aliases for primitive data types for things like coordinates, boxes and bit vectors.

Ideally this would keep them from being interchangeable except for certain methods which would be designed to convert between say, a Row & Col to a Grid & Idx.

But typedef works enough so that I'm not just passing around ints and uint16_ts around and getting confused by which variable is which.

@btpnlsl
Copy link
Author

btpnlsl commented Jan 8, 2025

At this point the Sudoku solver has a pretty good success rate (meaning I haven't had a deadlock using the NYT puzzles for the past week).

There's some more work to do related to recognizing when a grid has possibilities that are exclusive to a row/col that would eliminate the other possibilities in the row/col (see the second situation referenced here. But I may do that if I discover an unsolvable puzzle.

@btpnlsl
Copy link
Author

btpnlsl commented Jan 16, 2025

The NYT Sudoku problems are:

specifically selected so they are always solvable with techniques called the Basics

  • Naked and Hidden Singles
  • Box/Line Reductions (pointing or claiming digits)
  • Naked and Hidden Subsets (Pairs, Triples, Quads)

There are a bunch of additional, harder techniques which can stump the sudoku solver.

@btpnlsl
Copy link
Author

btpnlsl commented Jan 26, 2025

The X-Wing pattern helps eliminate possibilities when there exists 2 rows or columns which both have 2 & only 2 possibilities for the column or row.

image

For example, in the 3rd & 8th row there number 5 is only possible in the 6th and 9th column.
The number 5 must be present in either the 6th or 9th column in rows 3 and 8 in order to solve the puzzle, therefore the number 5 can be eliminated as a possibility in all other rows (1, 2, 7 & 9) in the 6th and 9th column.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment