OCR Project
segment.h File Reference

Letter segmentation: detect crossword grid and extract letter cells.

#include <stddef.h>

Classes

struct  BoundingBox
 Axis-aligned bounding box of one letter cell.
struct  SegmentResult
 Result of a grid detection / segmentation pass.

Functions

SegmentResult * segment_image (const unsigned char *pixels, int width, int height)
 Segment a binarised image into letter cell bounding boxes.
int segment_detect_grid (const unsigned char *pixels, int width, int height, float min_span, SegmentResult *out)
 Detect grid lines and infer cell geometry.
int segment_connected_components (const unsigned char *pixels, int width, int height, SegmentResult *out)
 Extract letter bounding boxes via connected-component analysis.
void segment_sort_reading_order (SegmentResult *res)
 Sort a SegmentResult's cells in reading order (top-to-bottom, left-to-right).
void segment_result_free (SegmentResult *res)
 Free a SegmentResult returned by segment_image().

Detailed Description

Letter segmentation: detect crossword grid and extract letter cells.

Pipeline:

  1. Detect the crossword grid bounding box in the binarised image.
  2. Identify grid lines to compute cell size and origin.
  3. Return an array of BoundingBox, one per cell, in reading order (rows top-to-bottom, cells left-to-right within each row).
  4. Each bounding box can then be cropped, resized, and fed to the CNN.
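
The cropping step in item 4 can be sketched as follows. This is a minimal illustration, not the project's code: the `Box` layout (`x`, `y`, `w`, `h`) and the helper name `crop_box` are assumptions, since the actual BoundingBox members are not listed on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical box layout; the real BoundingBox members may differ. */
typedef struct { int x, y, w, h; } Box;

/* Copy the region covered by b out of a row-major, 1 byte/pixel image
   into dst, which must hold b.w * b.h bytes.
   Returns 0 on success, -1 if the box does not lie inside the image. */
static int crop_box(const unsigned char *pixels, int width, int height,
                    Box b, unsigned char *dst)
{
    assert(pixels != NULL && dst != NULL);
    if (b.w <= 0 || b.h <= 0) return -1;
    if (b.x < 0 || b.y < 0 || b.x > width - b.w || b.y > height - b.h) return -1;

    for (int row = 0; row < b.h; row++) {
        /* Start of this box row inside the source image. */
        const unsigned char *src =
            pixels + (size_t)(b.y + row) * (size_t)width + (size_t)b.x;
        memcpy(dst + (size_t)row * (size_t)b.w, src, (size_t)b.w);
    }
    return 0;
}
```

The cropped buffer is itself row-major with a stride of `b.w`, so it can be passed directly to a resize routine before classification.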

Alternatively, for images with clearly separated letter blobs, the connected-component analysis path extracts letter regions directly.

Function Documentation

◆ segment_connected_components()

int segment_connected_components ( const unsigned char * pixels,
int width,
int height,
SegmentResult * out )

Extract letter bounding boxes via connected-component analysis.

Uses an iterative flood-fill (queue-based, no recursion) to group connected black pixels into components. Components whose size and aspect ratio fall within the expected range for a letter are kept.
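
A queue-based flood fill of this kind might look like the sketch below. It is an illustration under stated assumptions, not the library's implementation: the `Box` layout, the name `find_components`, and the choice of 4-connectivity are all hypothetical, and the size and aspect-ratio filter is left out because its thresholds are not documented here.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical box layout; the real BoundingBox members may differ. */
typedef struct { int x, y, w, h; } Box;

/* Group 4-connected black (0) pixels into components using an explicit
   queue instead of recursion. Writes up to max_boxes bounding boxes to
   boxes and returns the total number of components found, or -1 on bad
   arguments or allocation failure. */
static int find_components(const unsigned char *pixels, int width, int height,
                           Box *boxes, int max_boxes)
{
    if (!pixels || !boxes || width <= 0 || height <= 0 || max_boxes < 0) return -1;
    if (width > INT_MAX / height) return -1;   /* pixel indices must fit in int */

    size_t n = (size_t)width * (size_t)height;
    unsigned char *seen = calloc(n, 1);
    int *queue = malloc(n * sizeof *queue);
    if (!seen || !queue) { free(seen); free(queue); return -1; }

    static const int dx[4] = {1, -1, 0, 0};
    static const int dy[4] = {0, 0, 1, -1};
    int count = 0;

    for (int start = 0; start < (int)n; start++) {
        if (pixels[start] != 0 || seen[start]) continue;

        int min_x = start % width, max_x = min_x;
        int min_y = start / width, max_y = min_y;
        size_t head = 0, tail = 0;
        queue[tail++] = start;
        seen[start] = 1;

        while (head < tail) {
            int p = queue[head++];
            int px = p % width, py = p / width;
            if (px < min_x) min_x = px;
            if (px > max_x) max_x = px;
            if (py < min_y) min_y = py;
            if (py > max_y) max_y = py;
            for (int k = 0; k < 4; k++) {
                int nx = px + dx[k], ny = py + dy[k];
                if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                int q = ny * width + nx;
                if (pixels[q] != 0 || seen[q]) continue;
                seen[q] = 1;           /* mark on enqueue: each pixel queued once */
                queue[tail++] = q;
            }
        }
        assert(tail <= n);

        if (count < max_boxes)
            boxes[count] = (Box){min_x, min_y, max_x - min_x + 1, max_y - min_y + 1};
        count++;
    }

    free(seen);
    free(queue);
    return count;
}
```

Marking a pixel as seen when it is enqueued, rather than when it is dequeued, guarantees the queue never holds more than `width * height` entries, so one up-front allocation is enough.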

Parameters
pixels  Grayscale pixel array (0=black, 255=white), row-major.
width  Image width.
height  Image height.
out  Pre-allocated SegmentResult to fill.
Returns
Number of components found (≥ 0), -1 on error.

◆ segment_detect_grid()

int segment_detect_grid ( const unsigned char * pixels,
int width,
int height,
float min_span,
SegmentResult * out )

Detect grid lines and infer cell geometry.

Finds horizontal and vertical runs of dark pixels that span at least the fraction min_span of the image dimension they run along, and treats them as grid lines. If the lines form a consistent grid, the regular cell positions are written to out.
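
Since this function is currently a stub, the following is only a sketch of how the horizontal half of the run test could work. The helper name `find_horizontal_lines` and its output parameters are hypothetical; the vertical case would scan columns the same way and compare against `min_span * height`.

```c
#include <assert.h>
#include <stddef.h>

/* Find rows containing a horizontal run of black (0) pixels at least
   min_span * width long. Writes up to max_rows row indices to rows and
   returns how many rows qualified, or -1 on bad arguments. */
static int find_horizontal_lines(const unsigned char *pixels, int width, int height,
                                 float min_span, int *rows, int max_rows)
{
    assert(max_rows >= 0);
    if (!pixels || !rows || width <= 0 || height <= 0) return -1;
    if (min_span < 0.0f || min_span > 1.0f) return -1;

    int found = 0;
    for (int y = 0; y < height; y++) {
        const unsigned char *row = pixels + (size_t)y * (size_t)width;
        int run = 0, longest = 0;
        for (int x = 0; x < width; x++) {
            run = (row[x] == 0) ? run + 1 : 0;   /* reset on any non-black pixel */
            if (run > longest) longest = run;
        }
        if (longest > 0 && (float)longest >= min_span * (float)width) {
            if (found < max_rows) rows[found] = y;
            found++;
        }
    }
    return found;
}
```

A grid line thicker than one pixel yields several adjacent row indices, so a real implementation would presumably merge neighbouring rows before estimating cell size and origin.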

Parameters
pixels  Grayscale pixel array (0=black, 255=white), row-major.
width  Image width.
height  Image height.
min_span  Minimum fraction of the image dimension a line must span, in [0, 1].
out  Pre-allocated SegmentResult to fill.
Returns
1 if a regular grid was detected and out was filled, 0 if no grid was found.
Note
STUB — full implementation deferred.

◆ segment_image()

SegmentResult * segment_image ( const unsigned char * pixels,
int width,
int height )

Segment a binarised image into letter cell bounding boxes.

Tries two strategies in order:

  1. Grid-line detection (for crossword-style images with visible grid).
  2. Connected-component analysis (for images without explicit grid lines).

The returned SegmentResult::cells array is in reading order: rows top-to-bottom, cells left-to-right within each row.

Parameters
pixels  Flat grayscale pixel array (1 byte/pixel, 0=black, 255=white), row-major, width × height bytes.
width  Image width in pixels.
height  Image height in pixels.
Returns
Heap-allocated SegmentResult on success, NULL on error. Free with segment_result_free().
Note
STUB: grid-line detection and connected-component paths are partially implemented; see internal comments for the full algorithm outline.

◆ segment_result_free()

void segment_result_free ( SegmentResult * res)

Free a SegmentResult returned by segment_image().

Parameters
res  Result to free. No-op if NULL.
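
The ownership contract (a heap-allocated result released by a free function that accepts NULL) can be illustrated with the sketch below. The `Result` layout is an assumption: only the `cells` member is named in this reference, and `count`, `result_new`, and `result_free` are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layouts; only the cells member is named in this reference. */
typedef struct { int x, y, w, h; } Box;
typedef struct { Box *cells; int count; } Result;

/* Allocate a result with room for count zero-initialised boxes.
   Returns NULL on a negative count or allocation failure. */
static Result *result_new(int count)
{
    if (count < 0) return NULL;
    Result *res = malloc(sizeof *res);
    if (!res) return NULL;
    /* Allocate at least one element so cells is never NULL on success. */
    res->cells = calloc(count > 0 ? (size_t)count : 1, sizeof *res->cells);
    if (!res->cells) { free(res); return NULL; }
    res->count = count;
    return res;
}

/* Release the cell array and the result itself; NULL is accepted. */
static void result_free(Result *res)
{
    if (!res) return;
    free(res->cells);
    free(res);
}
```

Accepting NULL lets callers free unconditionally on every exit path, mirroring the behaviour of `free()` itself.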

◆ segment_sort_reading_order()

void segment_sort_reading_order ( SegmentResult * res)

Sort a SegmentResult's cells in reading order (top-to-bottom, left-to-right).

Cells are grouped into rows based on vertical overlap, then sorted left-to-right within each row.
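
One way to realise this grouping is sketched below. It is not the project's code: the `Box` layout and the name `sort_reading_order` are assumptions, and the row test used here (a box joins the current row if its top edge lies above the bottom edge of that row's topmost box) is just one simple interpretation of "vertical overlap".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical box layout; the real BoundingBox members may differ. */
typedef struct { int x, y, w, h; } Box;

static int by_top(const void *a, const void *b)
{
    const Box *p = a, *q = b;
    return (p->y > q->y) - (p->y < q->y);
}

static int by_left(const void *a, const void *b)
{
    const Box *p = a, *q = b;
    return (p->x > q->x) - (p->x < q->x);
}

/* Sort cells in place: rows top-to-bottom, left-to-right within a row. */
static void sort_reading_order(Box *cells, size_t count)
{
    assert(count == 0 || cells != NULL);
    if (count < 2) return;

    /* 1. Order everything by top edge so rows become contiguous ranges. */
    qsort(cells, count, sizeof *cells, by_top);

    /* 2. Sweep: close a row when the next box starts below its bottom edge. */
    size_t row_start = 0;
    int row_bottom = cells[0].y + cells[0].h;
    for (size_t i = 1; i <= count; i++) {
        if (i < count && cells[i].y < row_bottom) continue;  /* same row */

        /* 3. Sort the finished row by left edge. */
        qsort(cells + row_start, i - row_start, sizeof *cells, by_left);
        if (i < count) {
            row_start = i;
            row_bottom = cells[i].y + cells[i].h;
        }
    }
}
```

Sorting by top edge alone would interleave rows whenever letters in the same line sit at slightly different heights, which is why the rows are grouped first.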

Parameters
res  SegmentResult to sort in-place.