OCR Project
image.h File Reference

Image loading, preprocessing, and normalisation.

#include <stddef.h>
#include <stdint.h>

Classes

struct  Image
 Heap-allocated RGBA pixel buffer (row-major, 4 bytes per pixel).
struct  PreprocessParams
 Parameters for the full preprocessing pipeline.

Functions

Image *image_load_png (const char *path)
 Load a PNG file into an Image.
void image_free (Image *img)
 Free an Image returned by image_load_png(), image_resize(), or image_rotate().
void image_to_grayscale (Image *img)
 Convert an Image to grayscale in-place.
void image_binarize (Image *img)
 Binarise a grayscale Image in-place using a global threshold.
Image *image_resize (const Image *img, int w, int h)
 Resize an Image to the given dimensions (nearest-neighbour).
Image *image_rotate (const Image *img, float angle_deg)
 Rotate an Image by angle_deg degrees clockwise.
void image_to_float (const Image *img, float *out)
 Convert an Image to a flat normalised float array.
int image_load_normalised (const char *path, const PreprocessParams *p, float *out, int out_h, int out_w)
 Full pipeline: load PNG → preprocess → resize → float array.
int image_save_png (const Image *img, const char *path)
 Save an Image as a PNG file.

Detailed Description

Image loading, preprocessing, and normalisation.

Uses libpng for PNG loading (no SDL2_image required). SDL2 is used only for rotation rendering.

Pipeline applied before CNN inference: load PNG → grayscale → binarize → (optional rotate) → resize 28×28 → float[784]

Convention: black ink = 1.0, white background = 0.0.

Function Documentation

◆ image_binarize()

void image_binarize (Image *img)

Binarise a grayscale Image in-place using a global threshold.

The threshold is the mean pixel luminance of the image. Pixels above the threshold → 255 (white/background). Pixels at or below → 0 (black/ink).
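
The thresholding rule above can be sketched on a raw RGBA buffer as follows. This is a minimal illustration, not the library's implementation: the function name `binarize_mean` and the raw-buffer signature are assumptions made for this example, and the page does not say how image_binarize() rounds the mean. The sketch avoids rounding by comparing `value * count` against the channel sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of image.h.
 * Binarises a grayscale RGBA buffer (row-major, 4 bytes per pixel) in place,
 * using the mean of the R channel as a global threshold. */
void binarize_mean(uint8_t *rgba, int width, int height)
{
    assert(rgba != NULL && width > 0 && height > 0);
    size_t count = (size_t)width * (size_t)height;

    /* First pass: sum the R channel, which holds the luminance. */
    uint64_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += rgba[4 * i];

    /* Second pass: v > mean is equivalent to v * count > sum,
     * so no division (and no rounding decision) is needed. */
    for (size_t i = 0; i < count; i++) {
        uint8_t *px = rgba + 4 * i;
        uint8_t v = ((uint64_t)px[0] * count > sum) ? 255 : 0;
        px[0] = px[1] = px[2] = v;   /* alpha (px[3]) is left untouched */
    }
}
```

Under this rule a perfectly uniform image comes out entirely black, because every pixel sits exactly at the mean and "at or below" maps to 0.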

Parameters
img  Grayscale Image (R channel used as luminance).

◆ image_free()

void image_free (Image *img)

Free an Image returned by image_load_png(), image_resize(), or image_rotate().

Parameters
img  Image to free. No-op if NULL.

◆ image_load_normalised()

int image_load_normalised (const char *path, const PreprocessParams *p, float *out, int out_h, int out_w)

Full pipeline: load PNG → preprocess → resize → float array.

Parameters
path   Path to the source PNG file.
p      Preprocessing parameters (may be NULL for defaults).
out    Output buffer of size out_h * out_w floats.
out_h  Target height (e.g. 28 for CNN input).
out_w  Target width (e.g. 28 for CNN input).
Returns
0 on success, -1 on error.

◆ image_load_png()

Image *image_load_png (const char *path)

Load a PNG file into an Image.

Parameters
path  Path to the PNG file.
Returns
Heap-allocated Image on success, or NULL on error. Free with image_free().

◆ image_resize()

Image *image_resize (const Image *img, int w, int h)

Resize an Image to the given dimensions (nearest-neighbour).

Returns a new Image; the original is not modified.
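
For readers unfamiliar with nearest-neighbour sampling, the sketch below shows one common way to do it on a raw RGBA buffer. The name `resize_nearest` and its signature are assumptions for this example only; the page does not state which source pixel image_resize() picks when the scale is not an integer, so the floor-based mapping here is just one reasonable choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of image.h.
 * Returns a newly allocated dst_w x dst_h RGBA buffer (row-major, 4 bytes
 * per pixel), or NULL on allocation failure. The source is not modified.
 * The caller releases the result with free(). */
uint8_t *resize_nearest(const uint8_t *src, int src_w, int src_h,
                        int dst_w, int dst_h)
{
    assert(src != NULL && src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);

    uint8_t *dst = malloc((size_t)dst_w * (size_t)dst_h * 4);
    if (dst == NULL)
        return NULL;

    for (int y = 0; y < dst_h; y++) {
        /* Map the destination row back to a source row (floor division). */
        int sy = (int)((long long)y * src_h / dst_h);
        for (int x = 0; x < dst_w; x++) {
            int sx = (int)((long long)x * src_w / dst_w);
            /* Copy all four channels of the chosen source pixel. */
            memcpy(dst + 4 * ((size_t)y * (size_t)dst_w + (size_t)x),
                   src + 4 * ((size_t)sy * (size_t)src_w + (size_t)sx),
                   4);
        }
    }
    return dst;
}
```

Upscaling a 2×2 buffer to 4×4 with this mapping turns each source pixel into a 2×2 block, which is the blocky look typical of nearest-neighbour resizing.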

Parameters
img  Source Image.
w    Target width in pixels.
h    Target height in pixels.
Returns
New Image on success, NULL on allocation failure. Free with image_free().

◆ image_rotate()

Image *image_rotate (const Image *img, float angle_deg)

Rotate an Image by angle_deg degrees clockwise.

Returns a new Image large enough to contain the full rotated content. Background pixels are set to white (255, 255, 255, 255).
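
As a guide to what "large enough" means, the standard axis-aligned bounding box of a W × H rectangle rotated by θ is given below. Whether image_rotate() uses exactly this size, or how it rounds, is not stated on this page; treat the formula as an assumption about the output dimensions W′ × H′.

```latex
W' = \left\lceil W\,\lvert\cos\theta\rvert + H\,\lvert\sin\theta\rvert \right\rceil,
\qquad
H' = \left\lceil W\,\lvert\sin\theta\rvert + H\,\lvert\cos\theta\rvert \right\rceil
```

For example, a 28×28 image rotated by 45° needs 28·√2 ≈ 39.6, so a 40×40 canvas under this formula, with the uncovered corners filled by the white background.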

Parameters
img        Source Image.
angle_deg  Rotation angle in degrees (clockwise).
Returns
New rotated Image, or NULL on failure. Free with image_free().

◆ image_save_png()

int image_save_png (const Image *img, const char *path)

Save an Image as a PNG file.

Parameters
img   Image to save.
path  Destination file path.
Returns
0 on success, -1 on error.

◆ image_to_float()

void image_to_float (const Image *img, float *out)

Convert an Image to a flat normalised float array.

Reads the R channel (expected to hold grayscale luminance). Ink (R=0) → 1.0f, background (R=255) → 0.0f.
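
The mapping above can be sketched as an inverted, scaled copy of the R channel. The name `rgba_to_ink_floats` and the raw-buffer signature are assumptions for this example. The page only documents the two endpoints (R=0 and R=255); the linear treatment of intermediate values here is an assumption, although after binarisation only the endpoints occur.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of image.h.
 * Writes width * height floats to out, one per pixel, in row-major order.
 * Ink (R = 0) becomes 1.0f and background (R = 255) becomes 0.0f. */
void rgba_to_ink_floats(const uint8_t *rgba, int width, int height, float *out)
{
    assert(rgba != NULL && out != NULL && width > 0 && height > 0);
    size_t count = (size_t)width * (size_t)height;
    for (size_t i = 0; i < count; i++) {
        /* Only the R channel is read; G, B, and alpha are ignored. */
        out[i] = 1.0f - (float)rgba[4 * i] / 255.0f;
    }
}
```

For a 28×28 input the output buffer must hold 784 floats, matching the float[784] stage of the pipeline described in the Detailed Description.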

Parameters
img  Source Image (must be grayscale).
out  Output buffer of size img->width * img->height floats.

◆ image_to_grayscale()

void image_to_grayscale (Image *img)

Convert an Image to grayscale in-place.

Uses ITU-R BT.601 luminance weights: L = 0.299·R + 0.587·G + 0.114·B

After this call the R, G, and B channels all hold the luminance value.
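
As a rough illustration of the formula above, the sketch below applies the BT.601 weights to a raw RGBA buffer using integer arithmetic. The name `rgba_to_grayscale` and the raw-buffer signature are assumptions for this example; the page does not show whether image_to_grayscale() uses floating point or how it rounds, so the round-to-nearest step here is one possible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of image.h.
 * Converts a raw RGBA buffer (row-major, 4 bytes per pixel) to grayscale
 * in place, writing the luminance to R, G, and B. */
void rgba_to_grayscale(uint8_t *rgba, int width, int height)
{
    assert(rgba != NULL && width > 0 && height > 0);
    size_t count = (size_t)width * (size_t)height;
    for (size_t i = 0; i < count; i++) {
        uint8_t *px = rgba + 4 * i;
        /* Weights 0.299, 0.587, 0.114 scaled by 1000 to stay in integers;
         * adding 500 before dividing rounds to the nearest value. */
        unsigned lum = (299u * px[0] + 587u * px[1] + 114u * px[2] + 500u) / 1000u;
        px[0] = px[1] = px[2] = (uint8_t)lum;   /* alpha (px[3]) is left untouched */
    }
}
```

Because the three weights sum to 1, pure white stays at 255 and pure black stays at 0, while a pure red pixel (255, 0, 0) maps to a luminance of 76 with this rounding.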

Parameters
img  Image to convert.