Entry point for the OCR + crossword-solver binary.

#include "src/cnn/cnn.h"
#include "src/cnn/model.h"
#include "src/preprocess/image.h"
#include "src/segment/segment.h"
#include "src/solver/solver.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

Include dependency graph for solve_main.c:

Classes
struct	SolveArgs
	Parsed command-line options for the solve binary.

Macros
#define	DEFAULT_MODEL_DIR "models/"
#define	TTA_N_CROPS 5

Functions
static void	usage (const char *prog)
	Print usage information to stderr.
static int	parse_args (int argc, char **argv, SolveArgs *args)
	Parse argv into a SolveArgs structure.
static int	resolve_model (SolveArgs *args)
	Ensure args->model_path is filled: auto-detect if not specified.
static void	forward_region (const Image *gray_img, int x1, int y1, int x2, int y2, CNN *net, float *probs_out)
	Run one forward pass on a region of the grayscale image.
static int	recognise_cell (const Image *gray_img, const BoundingBox *box, int cell_size, CNN *net)
	Predict a cell using Test-Time Augmentation (TTA).
static void	search_words (const CharGrid *grid, char *words)
	Search for all comma-separated words in `args->words`.
int	main (int argc, char **argv)

Detailed Description

Entry point for the OCR + crossword-solver binary.

Usage:

./solve --image <path> [--model <path>] [--words <word1,word2,...>] [-v]

Options:

--image <path>	(required) Input crossword image.
--model <path>	(optional) Model file to load. If absent, picks the most recent .bin in models/.
--words <word1,word2,...>	(optional) Comma-separated list of words to search for.
--verbose / -v	Print per-cell recognition details and grid.

Pipeline:

Load model.
Load and preprocess the image (grayscale, binarize).
Segment letter cells.
Recognise each cell with the CNN.
Build a character grid.
Solve the word search (if --words is given).

Exit codes:

0	Success.
1	Argument error.
2	Model error (not found / cannot load).
3	Image load error.
4	Segmentation error.
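The exit codes listed above could be captured in an enum so that each pipeline stage returns a named value. This is only a sketch: the identifiers `SolveExit`, `SOLVE_EXIT_*` and `solve_exit_describe` are hypothetical and do not appear in solve_main.c; only the numeric values and their meanings come from the list above.

```c
/* Hypothetical names; only the numeric values are taken from the
 * documented exit codes. */
enum SolveExit {
    SOLVE_EXIT_OK      = 0, /* Success. */
    SOLVE_EXIT_ARGS    = 1, /* Argument error. */
    SOLVE_EXIT_MODEL   = 2, /* Model not found / cannot load. */
    SOLVE_EXIT_IMAGE   = 3, /* Image load error. */
    SOLVE_EXIT_SEGMENT = 4  /* Segmentation error. */
};

/* Map an exit code to a short message, e.g. for a diagnostic on stderr. */
static const char *solve_exit_describe(int code)
{
    switch (code) {
    case SOLVE_EXIT_OK:      return "success";
    case SOLVE_EXIT_ARGS:    return "argument error";
    case SOLVE_EXIT_MODEL:   return "model error (not found / cannot load)";
    case SOLVE_EXIT_IMAGE:   return "image load error";
    case SOLVE_EXIT_SEGMENT: return "segmentation error";
    default:                 return "unknown exit code";
    }
}
```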

Macro Definition Documentation

◆ DEFAULT_MODEL_DIR

#define DEFAULT_MODEL_DIR "models/"

◆ TTA_N_CROPS

#define TTA_N_CROPS 5

Number of shifted crops averaged for Test-Time Augmentation.

Function Documentation

◆ forward_region()

void forward_region	(	const Image *	gray_img,
		int	x1,
		int	y1,
		int	x2,
		int	y2,
		CNN *	net,
		float *	probs_out )

static

Run one forward pass on a region of the grayscale image.

Extracts [x1,x2) × [y1,y2) from gray_img (RGBA, grayscale), binarizes the region locally, resizes to 28×28 and runs cnn_forward(). The CNN output probabilities are added into probs_out.

Parameters

gray_img	Full grayscale image (RGBA).
x1, y1, x2, y2	Region bounds (clamped to image boundaries internally).
net	Trained CNN.
probs_out	Array of CNN_N_CLASSES floats — result is added here.
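Two details of this contract are easy to get wrong: the region is half-open and clamped to the image, and probs_out is accumulated rather than overwritten. The sketch below illustrates both with plain integers and arrays. The helpers `clamp_region` and `accumulate_probs` are hypothetical and are not part of solve_main.c; the real function works on Image and CNN objects whose layout is not shown here.

```c
#include <stddef.h>

/* Hypothetical helper: clamp the half-open region [x1,x2) x [y1,y2) to a
 * width x height image. Returns 1 if the clamped region is non-empty,
 * 0 otherwise. */
static int clamp_region(int width, int height,
                        int *x1, int *y1, int *x2, int *y2)
{
    if (*x1 < 0) *x1 = 0;
    if (*y1 < 0) *y1 = 0;
    if (*x2 > width)  *x2 = width;
    if (*y2 > height) *y2 = height;
    return *x1 < *x2 && *y1 < *y2;
}

/* Hypothetical helper: add one forward pass's probabilities into the
 * caller's running sum. The caller must zero probs_out before the first
 * pass, because values are added, not assigned. */
static void accumulate_probs(float *probs_out, const float *pass, size_t n)
{
    for (size_t i = 0; i < n; i++)
        probs_out[i] += pass[i];
}
```

Because the output is a running sum, a caller that wants a single prediction has to zero the array first and divide by the number of passes afterwards.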

◆ main()

int main	(	int	argc,
		char **	argv )

◆ parse_args()

int parse_args	(	int	argc,
		char **	argv,
		SolveArgs *	args )

static

Parse argv into a SolveArgs structure.

Parameters

argc	Argument count.
argv	Argument vector.
args	Output structure (caller-allocated and zeroed).

Returns: 0 on success, -1 on error.
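A minimal sketch of a parser that honours this contract (0 on success, -1 on error, caller-zeroed output) is shown below. The struct `SketchArgs` and the function `sketch_parse_args` are hypothetical stand-ins: the actual SolveArgs fields are not listed on this page, so the field names here are assumptions based on the documented options.

```c
#include <string.h>

/* Hypothetical stand-in for SolveArgs; field names are assumed. */
typedef struct {
    const char *image_path; /* --image (required) */
    const char *model_path; /* --model (optional) */
    char       *words;      /* --words (optional, comma-separated) */
    int         verbose;    /* -v / --verbose */
} SketchArgs;

/* Parse argv into *args, which the caller has zeroed.
 * Returns 0 on success, -1 on an unknown flag, a flag with a missing
 * value, or a missing --image. */
static int sketch_parse_args(int argc, char **argv, SketchArgs *args)
{
    for (int i = 1; i < argc; i++) {
        const char *a = argv[i];
        if (strcmp(a, "-v") == 0 || strcmp(a, "--verbose") == 0) {
            args->verbose = 1;
        } else if (i + 1 < argc && strcmp(a, "--image") == 0) {
            args->image_path = argv[++i];
        } else if (i + 1 < argc && strcmp(a, "--model") == 0) {
            args->model_path = argv[++i];
        } else if (i + 1 < argc && strcmp(a, "--words") == 0) {
            args->words = argv[++i];
        } else {
            return -1; /* unknown flag, or value flag at end of argv */
        }
    }
    return args->image_path != NULL ? 0 : -1;
}
```

Storing pointers into argv avoids any allocation; the word list stays writable, which matters if it is later split in place.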

◆ recognise_cell()

int recognise_cell	(	const Image *	gray_img,
		const BoundingBox *	box,
		int	cell_size,
		CNN *	net )

static

Predict a cell using Test-Time Augmentation (TTA).

Runs TTA_N_CROPS forward passes with slightly different crop origins (the original crop + 4 shifts of ±shift pixels in x and y), averages the softmax outputs, and returns the argmax class.

The crop window is grid-aware: it is centred on the letter's bounding-box centre and sized to cell_size × cell_size (the detected grid pitch). This guarantees a consistent white border around the letter regardless of how tight the connected-component bounding box is.

Parameters

gray_img	Full grayscale image.
box	Tight bounding box from segmentation.
cell_size	Full grid cell side length (pixels). Pass 0 to fall back to the padding-fraction heuristic.
net	Trained CNN.

Returns: Best class index (0='A'…25='Z'), or -1 on error.
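The two ideas described above, a crop window centred on the bounding box and an argmax over averaged probabilities, can be sketched independently of the CNN. Everything named here is hypothetical: `centred_crop`, `tta_average_argmax` and `SKETCH_N_CLASSES` are illustrative, the class count of 26 is inferred from the documented 0='A'…25='Z' range, and the bounding box is passed as plain integers because the BoundingBox fields are not shown on this page.

```c
#include <stddef.h>

#define SKETCH_N_CLASSES 26 /* assumed: one class per letter A..Z */

/* Centre a cell_size x cell_size window on the centre of the tight
 * bounding box [bx1,bx2) x [by1,by2). The result may extend past the
 * image edge; the forward pass is documented to clamp internally. */
static void centred_crop(int bx1, int by1, int bx2, int by2, int cell_size,
                         int *cx1, int *cy1, int *cx2, int *cy2)
{
    int mx = (bx1 + bx2) / 2;
    int my = (by1 + by2) / 2;
    *cx1 = mx - cell_size / 2;
    *cy1 = my - cell_size / 2;
    *cx2 = *cx1 + cell_size;
    *cy2 = *cy1 + cell_size;
}

/* Turn per-class sums accumulated over n_crops passes into averages
 * (in place) and return the index of the largest one, or -1 on bad
 * input. Ties resolve to the lowest index. */
static int tta_average_argmax(float *prob_sum, int n_crops)
{
    if (prob_sum == NULL || n_crops <= 0)
        return -1;
    int best = 0;
    for (int c = 0; c < SKETCH_N_CLASSES; c++) {
        prob_sum[c] /= (float)n_crops;
        if (prob_sum[c] > prob_sum[best])
            best = c;
    }
    return best;
}
```

Dividing by the crop count does not change which class wins; it only keeps the reported values in the range of a probability, which is useful for verbose output.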

◆ resolve_model()

int resolve_model ( SolveArgs * args )

static

Ensure args->model_path is filled: auto-detect if not specified.

Parameters

args	SolveArgs with possibly empty model_path.

Returns: 0 if model_path is valid, -1 if no model could be found.
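The auto-detection step can be illustrated with a selection routine over already-listed directory entries. This is a sketch under two assumptions that the page does not confirm: that "most recent" means the latest modification time, and that candidates are recognised by a ".bin" suffix. The type `ModelCandidate` and both functions are hypothetical; directory scanning itself is left out.

```c
#include <string.h>
#include <time.h>

/* Hypothetical record for one directory entry. */
typedef struct {
    const char *name;  /* file name within the model directory */
    time_t      mtime; /* modification time (assumed recency criterion) */
} ModelCandidate;

/* Return 1 if name ends in ".bin" and has a non-empty stem. */
static int has_bin_suffix(const char *name)
{
    size_t n = strlen(name);
    return n > 4 && strcmp(name + n - 4, ".bin") == 0;
}

/* Return the index of the newest .bin candidate, or -1 if there is
 * none, mirroring the documented "-1 if no model could be found". */
static int pick_newest_model(const ModelCandidate *c, int count)
{
    int best = -1;
    for (int i = 0; i < count; i++) {
        if (!has_bin_suffix(c[i].name))
            continue;
        if (best < 0 || c[i].mtime > c[best].mtime)
            best = i;
    }
    return best;
}
```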

◆ search_words()

void search_words	(	const CharGrid *	grid,
		char *	words )

static

Search for all comma-separated words in args->words.

Prints each result to stdout in the format:

WORD: (r0,c0) → (r1,c1) DIRECTION

or:

WORD: not found

Parameters

grid	Recognised character grid.
words	Comma-separated word list (modified in-place by strtok).
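The in-place splitting and the two output formats can be sketched as follows. The helpers `split_words` and `format_result` are hypothetical, and the direction label is passed in as a string because the solver's direction names are not listed on this page. The actual function prints to stdout and calls the solver for each word; this sketch formats into a buffer instead.

```c
#include <stdio.h>
#include <string.h>

/* Split a comma-separated list in place with strtok and store up to
 * max token pointers in out. Returns the number of tokens stored.
 * The input buffer is modified: each comma becomes a NUL byte. */
static int split_words(char *words, char **out, int max)
{
    int n = 0;
    for (char *tok = strtok(words, ",");
         tok != NULL && n < max;
         tok = strtok(NULL, ","))
        out[n++] = tok;
    return n;
}

/* Format one result line in the documented layout. When found is 0 the
 * coordinates and dir are ignored. */
static void format_result(char *buf, size_t len, const char *word, int found,
                          int r0, int c0, int r1, int c1, const char *dir)
{
    if (found)
        snprintf(buf, len, "%s: (%d,%d) → (%d,%d) %s",
                 word, r0, c0, r1, c1, dir);
    else
        snprintf(buf, len, "%s: not found", word);
}
```

Since strtok writes into its argument, the word list must be a writable buffer (for example an element of argv), never a string literal. strtok also skips empty tokens, so "CAT,,DOG" yields two words.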

◆ usage()

void usage ( const char * prog )

static

Print usage information to stderr.

Parameters

prog	Program name (argv[0]).
