OCR Project
Letter segmentation via connected-component analysis and grid detection.

Classes | |
| struct | Queue |
| Simple heap-allocated FIFO queue for pixel coordinates. | |
Macros | |
| #define | MIN_COMPONENT_SIZE 40 |
| #define | MAX_COMPONENT_SIZE 4000 |
| #define | MIN_ASPECT_RATIO 0.15f |
| #define | MAX_ASPECT_RATIO 6.0f |
Functions | |
| static Queue | queue_create (size_t cap) |
| Allocate a Queue with initial capacity cap pixel pairs. | |
| static int | queue_push (Queue *q, int x, int y) |
| Append a pixel coordinate to the queue, growing if needed. | |
| static void | queue_pop (Queue *q, int *x, int *y) |
| Pop the front element. Caller must ensure queue is non-empty. | |
| static int | queue_empty (const Queue *q) |
| Return 1 if the queue has elements, 0 if empty. | |
| static void | queue_free (Queue *q) |
| Free queue memory. | |
| int | segment_connected_components (const unsigned char *pixels, int width, int height, SegmentResult *out) |
| Extract letter bounding boxes via connected-component analysis. | |
| int | segment_detect_grid (const unsigned char *pixels, int width, int height, float min_span, SegmentResult *out) |
| Detect grid lines and infer cell geometry. | |
| static int | bbox_cmp (const void *a, const void *b) |
| qsort comparator: sort BoundingBox by y then x. | |
| void | segment_sort_reading_order (SegmentResult *res) |
| Sort a SegmentResult's cells in reading order (top-to-bottom, left-to-right). | |
| SegmentResult * | segment_image (const unsigned char *pixels, int width, int height) |
| Segment a binarised image into letter cell bounding boxes. | |
| void | segment_result_free (SegmentResult *res) |
| Free a SegmentResult returned by segment_image(). | |
Letter segmentation via connected-component analysis and grid detection.
| #define MAX_ASPECT_RATIO 6.0f |
Maximum aspect ratio (w/h) of a letter bounding box.
| #define MAX_COMPONENT_SIZE 4000 |
Maximum number of black pixels in a component (filters out large blobs and noise).
| #define MIN_ASPECT_RATIO 0.15f |
Minimum aspect ratio (w/h) of a letter bounding box.
| #define MIN_COMPONENT_SIZE 40 |
Minimum number of black pixels in a component to be considered a letter.
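
The four thresholds above suggest a per-component filter along the following lines. This is a minimal sketch, not the project's code: `component_is_letter` and its parameters are hypothetical, and treating the bounds as inclusive is an assumption.

```c
#include <assert.h>

#define MIN_COMPONENT_SIZE 40
#define MAX_COMPONENT_SIZE 4000
#define MIN_ASPECT_RATIO 0.15f
#define MAX_ASPECT_RATIO 6.0f

/* Hypothetical helper: returns 1 if a component with `count` black pixels
 * and a w x h bounding box passes the size and aspect-ratio thresholds,
 * 0 otherwise. Inclusive bounds are an assumption. */
static int component_is_letter(int count, int w, int h)
{
    assert(w > 0 && h > 0);
    if (count < MIN_COMPONENT_SIZE || count > MAX_COMPONENT_SIZE)
        return 0;
    float aspect = (float)w / (float)h;   /* aspect ratio is w/h */
    return aspect >= MIN_ASPECT_RATIO && aspect <= MAX_ASPECT_RATIO;
}
```

With these values, a 100-pixel component in a 10 x 20 box passes, while a 100 x 5 box (aspect ratio 20) is rejected as too wide.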
| static int bbox_cmp | ( | const void * | a, |
| const void * | b ) |
qsort comparator: sort BoundingBox by y then x.
Boxes with y-centres within 10 pixels of each other are considered the same row and sorted left-to-right.
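
A comparator with this row tolerance might look roughly like the sketch below. The `SketchBox` layout, the `ROW_TOLERANCE` name, and the choice to treat a centre distance of exactly 10 as the same row are all assumptions; the project's BoundingBox fields are not shown on this page.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical box layout: top-left corner plus width and height. */
typedef struct { int x, y, w, h; } SketchBox;

#define ROW_TOLERANCE 10   /* assumed name for the 10-pixel row tolerance */

/* qsort comparator: boxes whose y-centres differ by at most ROW_TOLERANCE
 * are treated as one row and ordered by x; otherwise ordered by y-centre. */
static int sketch_bbox_cmp(const void *a, const void *b)
{
    assert(a != NULL && b != NULL);
    const SketchBox *p = a, *q = b;
    int cy_p = p->y + p->h / 2;
    int cy_q = q->y + q->h / 2;
    if (abs(cy_p - cy_q) > ROW_TOLERANCE)
        return (cy_p > cy_q) - (cy_p < cy_q);
    return (p->x > q->x) - (p->x < q->x);
}
```

A tolerance-based comparison is not transitive in general (A near B, B near C, but A far from C), so the result of qsort is only well defined when rows are separated by more than the tolerance.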

| static Queue queue_create | ( | size_t | cap | ) |
Allocate a Queue with initial capacity cap pixel pairs.

| static int queue_empty | ( | const Queue * | q | ) |
Return 1 if the queue has elements, 0 if empty.

| static void queue_free | ( | Queue * | q | ) |
Free queue memory.

| static void queue_pop | ( | Queue * | q, |
| int * | x, | ||
| int * | y ) |
Pop the front element. Caller must ensure queue is non-empty.
| q | Queue. |
| x | Receives column. |
| y | Receives row. |

| static int queue_push | ( | Queue * | q, |
| int | x, | ||
| int | y ) |
Append a pixel coordinate to the queue, growing if needed.
| q | Queue to push to. |
| x | Column. |
| y | Row. |
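
The queue operations documented above could be implemented along these lines. This is only a sketch: the `SketchQueue` layout, the `sq_` names, and the return convention of the push function (0 on success, -1 on allocation failure) are assumptions, since the real Queue fields and return values are not shown on this page.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout: a growable array of interleaved (x, y) pairs. */
typedef struct {
    int *data;     /* 2 * cap ints */
    size_t head;   /* index of the next pair to pop */
    size_t tail;   /* index one past the last pair pushed */
    size_t cap;    /* capacity in pairs */
} SketchQueue;

static SketchQueue sq_create(size_t cap)
{
    SketchQueue q = {0};
    if (cap == 0) cap = 1;
    q.data = malloc(cap * 2 * sizeof *q.data);
    q.cap = q.data ? cap : 0;   /* cap 0 lets the first push retry */
    return q;
}

/* Appends (x, y), doubling the buffer when it is full. */
static int sq_push(SketchQueue *q, int x, int y)
{
    if (q->tail == q->cap) {
        size_t ncap = q->cap ? q->cap * 2 : 1;
        int *nd = realloc(q->data, ncap * 2 * sizeof *nd);
        if (!nd) return -1;     /* old buffer is still valid */
        q->data = nd;
        q->cap = ncap;
    }
    q->data[2 * q->tail] = x;
    q->data[2 * q->tail + 1] = y;
    q->tail++;
    return 0;
}

/* Removes the front pair; the caller must ensure the queue is non-empty. */
static void sq_pop(SketchQueue *q, int *x, int *y)
{
    assert(q->head < q->tail);
    *x = q->data[2 * q->head];
    *y = q->data[2 * q->head + 1];
    q->head++;
}

/* Number of pairs still waiting to be popped. */
static size_t sq_count(const SketchQueue *q) { return q->tail - q->head; }

static void sq_free(SketchQueue *q)
{
    free(q->data);
    q->data = NULL;
    q->head = q->tail = q->cap = 0;
}
```

Popped slots are never reused in this sketch, which keeps the code short and is acceptable for a flood fill where each pixel is enqueued at most once; a ring buffer would be the usual choice for a long-lived queue.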

| int segment_connected_components | ( | const unsigned char * | pixels, |
| int | width, | ||
| int | height, | ||
| SegmentResult * | out ) |
Extract letter bounding boxes via connected-component analysis.
Uses an iterative flood-fill (queue-based, no recursion) to group connected black pixels into components. Components whose size and aspect ratio fall within the expected range for a letter are kept.
| pixels | Grayscale pixel array (0=black, 255=white), row-major. |
| width | Image width. |
| height | Image height. |
| out | Pre-allocated SegmentResult to fill. |
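
The iterative flood fill described above can be sketched as follows. The `SketchComp` type, the function name, the `max_out` parameter, and the use of 4-connectivity are assumptions (the page does not say whether 4- or 8-connectivity is used), and the size and aspect-ratio filtering is left out to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical result type: bounding box plus black-pixel count. */
typedef struct { int x, y, w, h, count; } SketchComp;

/* Groups 4-connected black (0) pixels with a queue-based BFS.
 * Stores the first max_out components in out and returns the total
 * number of components found, or -1 on allocation failure. */
static int sketch_components(const unsigned char *pixels, int width, int height,
                             SketchComp *out, int max_out)
{
    assert(pixels && out && width > 0 && height > 0);
    size_t n = (size_t)width * (size_t)height;
    unsigned char *seen = calloc(n, 1);
    int *queue = malloc(n * 2 * sizeof *queue);   /* each pixel queued once */
    if (!seen || !queue) { free(seen); free(queue); return -1; }

    static const int dx[4] = {1, -1, 0, 0};
    static const int dy[4] = {0, 0, 1, -1};
    int found = 0;

    for (int sy = 0; sy < height; sy++) {
        for (int sx = 0; sx < width; sx++) {
            size_t si = (size_t)sy * width + sx;
            if (pixels[si] != 0 || seen[si]) continue;

            /* New component: restart the queue at this seed pixel. */
            size_t head = 0, tail = 0;
            int minx = sx, maxx = sx, miny = sy, maxy = sy, count = 0;
            seen[si] = 1;
            queue[2 * tail] = sx; queue[2 * tail + 1] = sy; tail++;

            while (head < tail) {
                int x = queue[2 * head], y = queue[2 * head + 1];
                head++;
                count++;
                if (x < minx) minx = x;
                if (x > maxx) maxx = x;
                if (y < miny) miny = y;
                if (y > maxy) maxy = y;
                for (int d = 0; d < 4; d++) {
                    int nx = x + dx[d], ny = y + dy[d];
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    size_t ni = (size_t)ny * width + nx;
                    if (pixels[ni] != 0 || seen[ni]) continue;
                    seen[ni] = 1;   /* mark on enqueue to avoid duplicates */
                    queue[2 * tail] = nx; queue[2 * tail + 1] = ny; tail++;
                }
            }
            if (found < max_out) {
                SketchComp c = {minx, miny, maxx - minx + 1, maxy - miny + 1, count};
                out[found] = c;
            }
            found++;
        }
    }
    free(seen);
    free(queue);
    return found;
}
```

Marking a pixel as seen when it is enqueued, rather than when it is dequeued, guarantees that each pixel enters the queue at most once, so a buffer of width × height pairs is always large enough.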


| int segment_detect_grid | ( | const unsigned char * | pixels, |
| int | width, | ||
| int | height, | ||
| float | min_span, | ||
| SegmentResult * | out ) |
Detect grid lines and infer cell geometry.
Finds horizontal and vertical runs of dark pixels spanning at least min_span proportion of the image dimension to identify grid lines. Returns regular grid cell positions if a consistent grid is detected.
| pixels | Grayscale pixel array (0=black, 255=white), row-major. |
| width | Image width. |
| height | Image height. |
| min_span | Minimum fraction of dimension a line must span [0, 1]. |
| out | Pre-allocated SegmentResult to fill. |
Returns a non-zero value with out filled if a grid was detected, 0 if no grid was found.
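
The horizontal half of the line search could be sketched as below; the vertical case would scan columns the same way. The function name, the `is_line` output array, and the use of the longest run per row as the criterion are assumptions, not details confirmed by this page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sets is_line[y] = 1 for every row whose longest
 * run of black (0) pixels covers at least min_span of the image width,
 * and 0 otherwise. Returns the number of rows marked. */
static int sketch_find_h_lines(const unsigned char *pixels, int width, int height,
                               float min_span, unsigned char *is_line)
{
    assert(pixels && is_line && width > 0 && height > 0);
    assert(min_span >= 0.0f && min_span <= 1.0f);
    int marked = 0;
    for (int y = 0; y < height; y++) {
        int run = 0, best = 0;
        for (int x = 0; x < width; x++) {
            if (pixels[(size_t)y * width + x] == 0) {
                if (++run > best) best = run;   /* extend the current run */
            } else {
                run = 0;                        /* a white pixel breaks it */
            }
        }
        is_line[y] = (float)best >= min_span * (float)width;
        marked += is_line[y];
    }
    return marked;
}
```

A grid line several pixels thick marks several adjacent rows, so a real detector would need to merge consecutive marked rows into one line before inferring cell positions.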
| SegmentResult * segment_image | ( | const unsigned char * | pixels, |
| int | width, | ||
| int | height ) |
Segment a binarised image into letter cell bounding boxes.
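Combines the two strategies this file implements, grid detection (segment_detect_grid()) and connected-component analysis (segment_connected_components()), trying one and then the other.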
The returned SegmentResult::cells array is in reading order: top-to-bottom, then left-to-right within each row.
| pixels | Flat grayscale pixel array (1 byte/pixel, 0=black, 255=white), row-major, width × height bytes. |
| width | Image width in pixels. |
| height | Image height in pixels. |


| void segment_result_free | ( | SegmentResult * | res | ) |
Free a SegmentResult returned by segment_image().
| res | Result to free. No-op if NULL. |

| void segment_sort_reading_order | ( | SegmentResult * | res | ) |
Sort a SegmentResult's cells in reading order (top-to-bottom, left-to-right).
Cells are grouped into rows based on vertical overlap, then sorted left-to-right within each row.
| res | SegmentResult to sort in-place. |

