Patching

Patching strategies and factory for the next-generation dataset.

FixedPatching

Patching strategy that returns patches from a fixed sequence.

Implements the PatchingStrategy protocol.

Parameters:

  • fixed_patch_specs (Sequence[PatchSpecs]) –

    Sequence of patch specifications to return in order.

n_patches property

Number of patches; it also determines the maximum index that can be given to get_patch_spec.

Returns:

  • int

    Length of the fixed patch specs sequence.

__init__(fixed_patch_specs)

Constructor.

Parameters:

  • fixed_patch_specs (Sequence[PatchSpecs]) –

    Sequence of patch specifications to return in order.

get_patch_indices(data_idx)

Return all patch indices belonging to a specific image_stack.

Each image_stack corresponds to a given data_idx.

Parameters:

  • data_idx (int) –

    An index that corresponds to a given image_stack.

Returns:

  • sequence of int

    A sequence of patch indices belonging to a particular image_stack that can be used to index the CAREamicsDataset.

get_patch_spec(index)

Return the patch specs for a given index.

Parameters:

  • index (int) –

    A patch index.

Returns:

  • PatchSpecs

    A dictionary that specifies a single patch in a series of ImageStacks.
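
As a rough illustration of the behaviour described above, the sketch below mimics a fixed patching strategy in plain Python. `FixedPatchingSketch`, the local `PatchSpecs` stand-in and the example specs are hypothetical and are not taken from the CAREamics source.

```python
from collections.abc import Sequence
from typing import TypedDict


class PatchSpecs(TypedDict):
    """Stand-in for the PatchSpecs dictionary documented on this page."""

    data_idx: int
    sample_idx: int
    coords: Sequence[int]
    patch_size: Sequence[int]


class FixedPatchingSketch:
    """Hypothetical re-implementation: returns specs from a fixed sequence."""

    def __init__(self, fixed_patch_specs: Sequence[PatchSpecs]) -> None:
        self.fixed_patch_specs = list(fixed_patch_specs)

    @property
    def n_patches(self) -> int:
        return len(self.fixed_patch_specs)

    def get_patch_spec(self, index: int) -> PatchSpecs:
        return self.fixed_patch_specs[index]

    def get_patch_indices(self, data_idx: int) -> list[int]:
        # All patch indices whose spec points at the requested image stack.
        return [
            i
            for i, spec in enumerate(self.fixed_patch_specs)
            if spec["data_idx"] == data_idx
        ]


specs: list[PatchSpecs] = [
    {"data_idx": 0, "sample_idx": 0, "coords": (0, 0), "patch_size": (8, 8)},
    {"data_idx": 1, "sample_idx": 0, "coords": (4, 4), "patch_size": (8, 8)},
    {"data_idx": 0, "sample_idx": 1, "coords": (8, 0), "patch_size": (8, 8)},
]
patching = FixedPatchingSketch(specs)
```

In this sketch, `patching.get_patch_indices(0)` collects the positions of every spec whose `data_idx` is 0, which is the kind of lookup the method above describes.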

FixedRandomPatching

Deterministic random patching strategy for validation.

Parameters:

  • data_shapes (sequence of (sequence of int)) –

    Shapes of the underlying data (axes SC(Z)YX).

  • patch_size (sequence of int) –

    Patch size per spatial dimension (length 2 or 3).

  • seed (int or None, default: None ) –

    Seed for reproducibility.

Notes

The output of get_patch_spec is deterministic (same index gives same output). The number of patches per sample is based on sequential non-overlapping coverage.
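
The two properties in the note can be sketched as follows. This is a minimal illustration, assuming that "sequential non-overlapping coverage" means rounding up the number of grid cells per axis, and that determinism is obtained by drawing all coordinates once from a seeded generator. Both helper functions are hypothetical and may differ from the library's approach.

```python
import math
import random


def n_patches_per_sample(spatial_shape, patch_size):
    # Assumption: count the grid cells needed to cover the array,
    # rounding up at the edges.
    return math.prod(math.ceil(s / p) for s, p in zip(spatial_shape, patch_size))


def fixed_random_coords(spatial_shape, patch_size, seed=None):
    # Draw every coordinate once, up front, so that a given index always
    # maps to the same patch afterwards.
    rng = random.Random(seed)
    n = n_patches_per_sample(spatial_shape, patch_size)
    return [
        tuple(rng.randint(0, s - p) for s, p in zip(spatial_shape, patch_size))
        for _ in range(n)
    ]


coords = fixed_random_coords((100, 64), (32, 32), seed=42)
```

Calling `fixed_random_coords` again with the same seed yields the same list, which is the property that makes such a strategy suitable for validation.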

n_patches property

The number of patches that this patching strategy will return.

It also determines the maximum index that can be given to get_patch_spec.

Returns:

  • int

    Number of patches.

__init__(data_shapes, patch_size, seed=None)

A patching strategy for sampling random patches.

Parameters:

  • data_shapes (sequence of (sequence of int)) –

    The shapes of the underlying data. Each element is the dimension of the axes SC(Z)YX.

  • patch_size (sequence of int) –

    The size of the patch. The sequence will have length 2 or 3, for 2D and 3D data respectively.

  • seed (int or None, default: None ) –

    An optional seed to ensure the reproducibility of the random patches.

get_patch_indices(data_idx)

Return the patch indices for a specific image_stack.

The image_stack corresponds to the given data_idx.

Parameters:

  • data_idx (int) –

    An index that corresponds to a given image_stack.

Returns:

  • sequence of int

    A sequence of patch indices that, when used to index the CAREamicsDataset, return patches that come from the image_stack corresponding to the given data_idx.

get_patch_spec(index)

Return the patch specs for a given index.

Parameters:

  • index (int) –

    A patch index.

Returns:

  • PatchSpecs

    A dictionary that specifies a single patch in a series of ImageStacks.

PatchSpecs

Bases: TypedDict

A dictionary that specifies a single patch in a series of ImageStacks.

Attributes:

  • data_idx (int) –

    Determines which ImageStack a patch belongs to, within a series of ImageStacks.

  • sample_idx (int) –

    Determines which sample a patch belongs to, within an ImageStack.

  • coords (sequence of int) –

    The top-left corner (and first z-slice for 3D data) of a patch. The sequence will have length 2 or 3, for 2D and 3D data respectively.

  • patch_size (sequence of int) –

    The size of the patch. The sequence will have length 2 or 3, for 2D and 3D data respectively.
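
To make the four keys concrete, here is a small sketch of how such a dictionary could be used to slice a 2D patch out of nested lists. The `PatchSpecs` class below is a local stand-in built from the attributes listed above, and `extract_patch_2d` is a hypothetical helper, not a CAREamics function.

```python
from collections.abc import Sequence
from typing import TypedDict


class PatchSpecs(TypedDict):
    data_idx: int
    sample_idx: int
    coords: Sequence[int]
    patch_size: Sequence[int]


def extract_patch_2d(image_stacks, spec: PatchSpecs):
    """Slice a 2D patch out of nested lists laid out as [data][S][C][Y][X]."""
    sample = image_stacks[spec["data_idx"]][spec["sample_idx"]]
    y0, x0 = spec["coords"]
    h, w = spec["patch_size"]
    # Keep the channel axis, crop the two spatial axes.
    return [[row[x0:x0 + w] for row in channel[y0:y0 + h]] for channel in sample]


# One image stack with S=1, C=1, Y=4, X=4; each pixel value encodes its position.
stack = [[[[10 * y + x for x in range(4)] for y in range(4)]]]
spec: PatchSpecs = {
    "data_idx": 0,
    "sample_idx": 0,
    "coords": (1, 2),
    "patch_size": (2, 2),
}
patch = extract_patch_2d([stack], spec)
```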

Patching

Bases: Protocol

An interface for patching strategies.

Patching strategies are a component of the CAREamicsDataset; they determine how patches are extracted from the underlying data.

Attributes:

  • n_patches (int) –

    The number of patches that the patching strategy will return.

Methods:

  • get_patch_spec

    Get a patch specification for a given patch index.

n_patches property

The number of patches that the patching strategy will return.

It also determines the maximum index that can be given to get_patch_spec, and the length of the CAREamicsDataset.

Returns:

  • int

    Number of patches.

get_patch_indices(data_idx)

Return the patch indices for a specific image_stack.

The image_stack corresponds to the given data_idx.

Parameters:

  • data_idx (int) –

    An index that corresponds to a given image_stack.

Returns:

  • sequence of int

    A sequence of patch indices that, when used to index the CAREamicsDataset, return patches that come from the image_stack corresponding to the given data_idx.

get_patch_spec(index)

Get a patch specification for a given patch index.

This method is intended to be called from within CAREamicsDataset.__getitem__, which passes its index through to this method.

Parameters:

  • index (int) –

    A patch index.

Returns:

  • PatchSpecs

    A dictionary that specifies a single patch in a series of ImageStacks.
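
The relationship between a dataset and its patching strategy can be sketched with a structural protocol. The names `PatchingSketch`, `CornerPatching` and `ToyDataset` are hypothetical and only mirror the members documented above. The real CAREamicsDataset does considerably more than return the spec.

```python
from collections.abc import Sequence
from typing import Protocol, TypedDict


class PatchSpecs(TypedDict):
    data_idx: int
    sample_idx: int
    coords: Sequence[int]
    patch_size: Sequence[int]


class PatchingSketch(Protocol):
    """Structural interface mirroring the members documented above."""

    @property
    def n_patches(self) -> int: ...

    def get_patch_spec(self, index: int) -> PatchSpecs: ...

    def get_patch_indices(self, data_idx: int) -> Sequence[int]: ...


class CornerPatching:
    """Toy strategy: one patch at the origin of sample 0 in each image stack."""

    def __init__(self, n_stacks: int, patch_size: Sequence[int]) -> None:
        self.n_stacks = n_stacks
        self.patch_size = tuple(patch_size)

    @property
    def n_patches(self) -> int:
        return self.n_stacks

    def get_patch_spec(self, index: int) -> PatchSpecs:
        return {
            "data_idx": index,
            "sample_idx": 0,
            "coords": (0,) * len(self.patch_size),
            "patch_size": self.patch_size,
        }

    def get_patch_indices(self, data_idx: int) -> Sequence[int]:
        return [data_idx]


class ToyDataset:
    """Hypothetical dataset that delegates patch selection to a strategy."""

    def __init__(self, patching: PatchingSketch) -> None:
        self.patching = patching

    def __len__(self) -> int:
        # The dataset length is the number of patches of the strategy.
        return self.patching.n_patches

    def __getitem__(self, index: int) -> PatchSpecs:
        # The dataset index is passed straight through to the strategy.
        return self.patching.get_patch_spec(index)


dataset = ToyDataset(CornerPatching(n_stacks=3, patch_size=(16, 16)))
```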

RandomPatching

Random patching strategy.

Parameters:

  • data_shapes (sequence of (sequence of int)) –

    Shapes of the underlying data (axes SC(Z)YX).

  • patch_size (sequence of int) –

    Patch size per spatial dimension (length 2 or 3).

  • seed (int or None, default: None ) –

    Seed for reproducibility of random patches.

Notes

The output of get_patch_spec will be random, i.e. if the same index is given twice the two outputs can be different. The strategy still ensures a known number of patches per sample per image stack via bins; the index determines "data_idx" and "sample_idx" in the returned PatchSpecs, while "coords" are random. The number of patches per sample is based on sequential non-overlapping coverage of the array.
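
The binning idea in the note can be sketched as below: each (data_idx, sample_idx) pair owns a contiguous range of indices, and only the coordinates are drawn at random. `RandomPatchingSketch` is a hypothetical simplification, and the rounding-up rule for the number of patches per sample is an assumption.

```python
import bisect
import itertools
import math
import random


class RandomPatchingSketch:
    def __init__(self, data_shapes, patch_size, seed=None):
        self.data_shapes = data_shapes
        self.patch_size = tuple(patch_size)
        self.rng = random.Random(seed)
        n_dims = len(self.patch_size)
        # One bin per (data_idx, sample_idx); its width is the number of
        # non-overlapping patches needed to cover one sample.
        self.bin_keys = []
        counts = []
        for data_idx, shape in enumerate(data_shapes):
            spatial = shape[-n_dims:]
            per_sample = math.prod(
                math.ceil(s / p) for s, p in zip(spatial, self.patch_size)
            )
            for sample_idx in range(shape[0]):  # axis 0 is S
                self.bin_keys.append((data_idx, sample_idx))
                counts.append(per_sample)
        self.bin_edges = list(itertools.accumulate(counts))

    @property
    def n_patches(self):
        return self.bin_edges[-1]

    def get_patch_spec(self, index):
        if not 0 <= index < self.n_patches:
            raise IndexError(index)
        # The index fixes the bin, and therefore data_idx and sample_idx.
        bin_idx = bisect.bisect_right(self.bin_edges, index)
        data_idx, sample_idx = self.bin_keys[bin_idx]
        spatial = self.data_shapes[data_idx][-len(self.patch_size):]
        # The coordinates are drawn afresh on every call.
        coords = tuple(
            self.rng.randint(0, s - p) for s, p in zip(spatial, self.patch_size)
        )
        return {
            "data_idx": data_idx,
            "sample_idx": sample_idx,
            "coords": coords,
            "patch_size": self.patch_size,
        }


patching = RandomPatchingSketch([(2, 1, 64, 64), (1, 1, 32, 96)], (32, 32), seed=0)
```

With these shapes the bins hold 4, 4 and 3 indices, so indices 0 to 3 map to the first sample of the first stack, 4 to 7 to its second sample, and 8 to 10 to the second stack.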

n_patches property

The number of patches that this patching strategy will return.

It also determines the maximum index that can be given to get_patch_spec.

Returns:

  • int

    Number of patches.

__init__(data_shapes, patch_size, seed=None)

A patching strategy for sampling random patches.

Parameters:

  • data_shapes (sequence of (sequence of int)) –

    The shapes of the underlying data. Each element is the dimension of the axes SC(Z)YX.

  • patch_size (sequence of int) –

    The size of the patch. The sequence will have length 2 or 3, for 2D and 3D data respectively.

  • seed (int or None, default: None ) –

    An optional seed to ensure the reproducibility of the random patches.

get_patch_indices(data_idx)

Return the patch indices for a specific image_stack.

The image_stack corresponds to the given data_idx.

Parameters:

  • data_idx (int) –

    An index that corresponds to a given image_stack.

Returns:

  • sequence of int

    A sequence of patch indices that, when used to index the CAREamicsDataset, return patches that come from the image_stack corresponding to the given data_idx.

get_patch_spec(index)

Return the patch specs for a given index.

Parameters:

  • index (int) –

    A patch index.

Returns:

  • PatchSpecs

    A dictionary that specifies a single patch in a series of ImageStacks.

SequentialPatching

Grid patching strategy with optional overlap; prototype.

Parameters:

  • data_shapes (sequence of (sequence of int)) –

    Shapes of the underlying data (axes SC(Z)YX).

  • patch_size (sequence of int) –

    Patch size per spatial dimension.

  • overlaps (sequence of int or None, default: None ) –

    Overlap per axis; if None, no overlap.
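
A grid with per-axis overlap can be sketched as follows. The helpers `grid_starts` and `grid_coords` are hypothetical, and the choice to keep only patches that fit entirely inside the array is an assumption. How the prototype treats array edges is not stated here.

```python
import itertools


def grid_starts(length, patch, overlap=0):
    """Start positions of patches along one axis (step = patch - overlap)."""
    if not 0 <= overlap < patch:
        raise ValueError("overlap must satisfy 0 <= overlap < patch")
    step = patch - overlap
    # Assumption: only patches that fit entirely inside the axis are kept.
    return list(range(0, length - patch + 1, step))


def grid_coords(spatial_shape, patch_size, overlaps=None):
    """Patch coordinates for one sample, in row-major order."""
    if overlaps is None:
        overlaps = [0] * len(patch_size)
    axes = [
        grid_starts(s, p, o)
        for s, p, o in zip(spatial_shape, patch_size, overlaps)
    ]
    return list(itertools.product(*axes))


coords = grid_coords((8, 10), (4, 4), overlaps=(0, 2))
```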

n_patches property

Total number of patches.

Returns:

  • int

    Number of patches.

__init__(data_shapes, patch_size, overlaps=None)

Initialize sequential patching with optional overlap per axis.

Parameters:

  • data_shapes (sequence of (sequence of int)) –

    Shapes of the underlying data (axes SC(Z)YX).

  • patch_size (sequence of int) –

    Patch size per spatial dimension.

  • overlaps (sequence of int or None, default: None ) –

    Overlap per axis; if None, no overlap.

get_patch_indices(data_idx)

Return the patch indices for a specific image_stack.

The image_stack corresponds to the given data_idx.

Parameters:

  • data_idx (int) –

    An index that corresponds to a given image_stack.

Returns:

  • sequence of int

    A sequence of patch indices that, when used to index the CAREamicsDataset, return patches that come from the image_stack corresponding to the given data_idx.

get_patch_spec(index)

Return the patch spec for the given index.

Parameters:

  • index (int) –

    Patch index.

Returns:

  • PatchSpecs

    A dictionary that specifies a single patch in a series of ImageStacks.

StratifiedPatching

Stratified patching strategy allowing patches on a grid to be excluded.

Patches will be sampled from sampling regions that are two times the patch size in each dimension. Some sampling regions may be smaller than this because they are on the edge of an image or because a nearby patch has been excluded.

If the same index is used twice to sample a patch with get_patch_spec, the patch will come from the same sampling region with high probability, but this is not guaranteed. Smaller sampling regions may be binned together into a single index. Averaged over all pixels, the expected number of times a pixel is selected in a patch per epoch is 1.

The number of patches is determined from the number of selectable patch coordinates.

Parameters:

  • data_shapes (sequence of (sequence of int)) –

    Shapes of the underlying data (axes SC(Z)YX).

  • patch_size (sequence of int) –

    Patch size per spatial dimension (length 2 or 3).

  • seed (int or None, default: None ) –

    Seed for reproducibility.

n_patches property

The number of patches that this patching strategy will return.

It also determines the maximum index that can be given to get_patch_spec.

Returns:

  • int

    Number of patches.

__init__(data_shapes, patch_size, seed=None)

A patching strategy for sampling stratified patches.

Parameters:

  • data_shapes (sequence of (sequence of int)) –

    The shapes of the underlying data. Each element is the dimension of the axes SC(Z)YX.

  • patch_size (sequence of int) –

    The size of the patch. The sequence will have length 2 or 3, for 2D and 3D data respectively.

  • seed (int or None, default: None ) –

    An optional seed to ensure the reproducibility of the random patches.

exclude_patches(data_idx, sample_idx, grid_coords)

Exclude patches from being sampled.

Excluded patches must lie on a grid which starts at (0, 0) and has a spacing of the given patch_size.

After calling this method the number of patches will be recalculated and the excluded patches will never be returned by get_patch_spec.

Parameters:

  • data_idx (int) –

    The index of the "image stack" that the patches will be excluded from.

  • sample_idx (int) –

    An index that corresponds to the sample in the "image stack" that the patches will be excluded from.

  • grid_coords (Sequence[tuple[int, ...]]) –

    A sequence of 2D or 3D tuples. Each tuple corresponds to a grid coordinate that will be excluded from sampling. The grid starts at (0, 0) and has a spacing of the given patch_size.

get_all_grid_coords()

Get all the grid coordinates for sampling regions in the patching strategy.

Returns:

  • dict[tuple[int, int], list[tuple, ...]]

    Dictionary with keys being (data_idx, sample_idx) and values corresponding to the grid coords.

get_included_grid_coords()

Get all grid coordinates included in the patching strategy.

If a grid coordinate is not included, a patch can never be selected from the region [grid_coord*patch_size, (grid_coord+1)*patch_size].

Returns:

  • dict[tuple[int, int], list[tuple, ...]]

    Dictionary with keys being (data_idx, sample_idx) and values corresponding to the grid coords.
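
The grid bookkeeping described in this part of the page can be sketched with plain dictionaries. All names below are hypothetical. Counting partially filled edge cells is an assumption, and the example shape is chosen so that it makes no difference.

```python
import itertools
import math


def all_grid_coords(spatial_shape, patch_size):
    """Grid starts at the origin with a spacing of patch_size."""
    # Assumption: partially filled cells at the image edge are counted too.
    counts = [math.ceil(s / p) for s, p in zip(spatial_shape, patch_size)]
    return list(itertools.product(*(range(n) for n in counts)))


def grid_region(grid_coord, patch_size):
    """Pixel bounds (start, stop) of one grid cell along each axis."""
    return [(g * p, (g + 1) * p) for g, p in zip(grid_coord, patch_size)]


patch_size = (16, 16)
# Keys are (data_idx, sample_idx), as in the return values documented above.
grid = {(0, 0): all_grid_coords((64, 48), patch_size)}
excluded = {(0, 0): {(0, 0), (3, 2)}}
included = {
    key: [gc for gc in coords if gc not in excluded.get(key, set())]
    for key, coords in grid.items()
}
```

Here `grid_region((3, 2), patch_size)` gives the pixel bounds of the excluded cell, the region from which no patch would be selected.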

get_patch_indices(data_idx)

Return the patch indices for a specific image_stack.

The image_stack corresponds to the given data_idx.

Parameters:

  • data_idx (int) –

    An index that corresponds to a given image_stack.

Returns:

  • sequence of int

    A sequence of patch indices, used to index the CAREamicsDataset to return a patch that comes from the image_stack corresponding to the given data_idx.

get_patch_spec(index)

Return the patch specs for a given index.

Parameters:

  • index (int) –

    A patch index.

Returns:

  • PatchSpecs

    A dictionary that specifies a single patch in a series of ImageStacks.

set_region_probs(data_idx, sample_idx, probs)

Set the probability that regions will be sampled from at each epoch.

Parameters:

  • data_idx (int) –

    The index of the "image stack" whose region probabilities will be set.

  • sample_idx (int) –

    An index that corresponds to the sample in the "image stack" whose region probabilities will be set.

  • probs (dict[tuple[int, ...], float]) –

    The probabilities for each region. The keys of the dictionary correspond to the grid coordinates of the regions. The values of the dictionary are the probabilities.
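
One hypothetical way to build such a dictionary is to use the fraction of foreground pixels in each grid cell of a binary mask. The helper `region_probs_from_mask` is illustrative only. Whether the strategy expects probabilities scaled this way is not stated here.

```python
def region_probs_from_mask(mask, patch_size):
    """Fraction of foreground pixels in each grid cell of a 2D binary mask."""
    ph, pw = patch_size
    height, width = len(mask), len(mask[0])
    probs = {}
    # Only complete cells are considered in this sketch.
    for gy in range(height // ph):
        for gx in range(width // pw):
            cell = [
                mask[y][x]
                for y in range(gy * ph, (gy + 1) * ph)
                for x in range(gx * pw, (gx + 1) * pw)
            ]
            # Keys are grid coordinates, values are probabilities in [0, 1].
            probs[(gy, gx)] = sum(cell) / len(cell)
    return probs


mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
probs = region_probs_from_mask(mask, (2, 2))
```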

TileSpecs

Bases: PatchSpecs

A dictionary that specifies a single tile in a series of ImageStacks, including how to crop it and stitch it back into the image.

Attributes:

  • data_idx (int) –

    Determines which ImageStack a patch belongs to, within a series of ImageStacks.

  • sample_idx (int) –

    Determines which sample a patch belongs to, within an ImageStack.

  • coords (sequence of int) –

    The top-left corner (and first z-slice for 3D data) of a patch. The sequence will have length 2 or 3, for 2D and 3D data respectively.

  • patch_size (sequence of int) –

    The size of the patch. The sequence will have length 2 or 3, for 2D and 3D data respectively.

  • crop_coords (sequence of int) –

    The top-left side of where the tile will be cropped, in coordinates relative to the tile.

  • crop_size (sequence of int) –

    The size of the cropped tile.

  • stitch_coords (sequence of int) –

    Where the tile will be stitched back into an image, taking into account that the tile will be cropped, in coords relative to the image.

  • total_tiles (int) –

    Number of tiles belonging to the same data.
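
The crop and stitch fields can be read as instructions for reassembling an image. The sketch below applies them to two overlapping 2D tiles. `stitch_tile_2d` is a hypothetical helper, and the spec values are hand-written for this example rather than produced by the library.

```python
def stitch_tile_2d(image, tile, spec):
    """Crop a tile as described by `spec` and write it into `image` in place."""
    cy, cx = spec["crop_coords"]      # relative to the tile
    ch, cw = spec["crop_size"]
    sy, sx = spec["stitch_coords"]    # relative to the image
    for dy in range(ch):
        for dx in range(cw):
            image[sy + dy][sx + dx] = tile[cy + dy][cx + dx]


# A 4x6 source image; each pixel value encodes its position.
src = [[10 * y + x for x in range(6)] for y in range(4)]

# Two 4x4 tiles that overlap by two columns; each keeps three columns.
specs = [
    {"data_idx": 0, "sample_idx": 0, "coords": (0, 0), "patch_size": (4, 4),
     "crop_coords": (0, 0), "crop_size": (4, 3), "stitch_coords": (0, 0),
     "total_tiles": 2},
    {"data_idx": 0, "sample_idx": 0, "coords": (0, 2), "patch_size": (4, 4),
     "crop_coords": (0, 1), "crop_size": (4, 3), "stitch_coords": (0, 3),
     "total_tiles": 2},
]

stitched = [[None] * 6 for _ in range(4)]
for spec in specs:
    y0, x0 = spec["coords"]
    h, w = spec["patch_size"]
    # The extracted tile stands in for a model prediction on that tile.
    tile = [row[x0:x0 + w] for row in src[y0:y0 + h]]
    stitch_tile_2d(stitched, tile, spec)
```

After both tiles are written, `stitched` matches `src`, because the cropped regions cover the image exactly once.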

TiledPatching

Patching strategy used to extract overlapping tiles from an image.

The tiling strategy should be used for prediction. The get_patch_spec method returns TileSpecs dictionaries that contain information on how to stitch the tiles back together to create the full image.

Parameters:

  • data_shapes (sequence of (sequence of int)) –

    Shapes of the underlying data (axes SC(Z)YX).

  • patch_size (sequence of int) –

    Tile size per spatial dimension (length 2 or 3).

  • overlaps (sequence of int) –

    Overlap with adjacent tiles per spatial dimension.
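
A generic overlapping-tiles scheme along one axis can be sketched as follows. This is not necessarily the scheme used by TiledPatching: the function `tile_axis`, the shift of the last tile to fit inside the axis, and the cut at the midpoint of each overlap are all assumptions made for illustration.

```python
def tile_axis(length, tile, overlap):
    """Return (coord, crop_coord, crop_size, stitch_coord) per tile on one axis."""
    if tile > length:
        raise ValueError("tile must not exceed the axis length")
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    step = tile - overlap
    # Regular steps, with the last tile shifted back so that it ends
    # exactly at the end of the axis.
    starts = list(range(0, length - tile, step)) + [length - tile]
    # Neighbouring tiles are cut at the midpoint of their overlap.
    cuts = [0]
    cuts += [(a + tile + b) // 2 for a, b in zip(starts, starts[1:])]
    cuts.append(length)
    return [
        (start, cuts[i] - start, cuts[i + 1] - cuts[i], cuts[i])
        for i, start in enumerate(starts)
    ]


tiles = tile_axis(10, 4, 2)
```

Each tuple says where the tile starts, which part of it is kept, and where that part lands in the output. The kept parts are contiguous and cover the axis exactly once.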

n_patches property

The number of patches that this patching strategy will return.

It also determines the maximum index that can be given to get_patch_spec.

Returns:

  • int

    Number of patches.

__init__(data_shapes, patch_size, overlaps)

Constructor.

Parameters:

  • data_shapes (sequence of (sequence of int)) –

    The shapes of the underlying data. Each element is the dimension of the axes SC(Z)YX.

  • patch_size (sequence of int) –

    The size of the tile. The sequence will have length 2 or 3, for 2D and 3D data respectively.

  • overlaps (sequence of int) –

    How much a tile will overlap with adjacent tiles in each spatial dimension.

get_patch_indices(data_idx)

Return the patch indices for a specific image_stack.

The image_stack corresponds to the given data_idx.

Parameters:

  • data_idx (int) –

    An index that corresponds to a given image_stack.

Returns:

  • sequence of int

    A sequence of patch indices that, when used to index the CAREamicsDataset, return patches that come from the image_stack corresponding to the given data_idx.

get_patch_spec(index)

Return the tile specs for a given index.

Parameters:

  • index (int) –

    A patch index.

Returns:

  • TileSpecs

    A dictionary that specifies a single patch in a series of ImageStacks.

WholeSamplePatching

Patching strategy that returns one patch per sample (whole image).

Parameters:

  • data_shapes (sequence of (sequence of int)) –

    Shapes of the underlying data (axes SC(Z)YX).
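
One patch per sample can be sketched by enumerating the S axis of every image stack and using the full spatial shape as the patch size. `whole_sample_specs` is a hypothetical helper that assumes the SC(Z)YX axis order given above.

```python
def whole_sample_specs(data_shapes):
    """One spec per sample, covering the whole spatial extent."""
    specs = []
    for data_idx, shape in enumerate(data_shapes):
        # Axes are SC(Z)YX: axis 0 is S, axis 1 is C, the rest are spatial.
        n_samples, spatial = shape[0], tuple(shape[2:])
        for sample_idx in range(n_samples):
            specs.append(
                {
                    "data_idx": data_idx,
                    "sample_idx": sample_idx,
                    "coords": (0,) * len(spatial),
                    "patch_size": spatial,
                }
            )
    return specs


specs = whole_sample_specs([(2, 1, 64, 64), (1, 3, 8, 32, 48)])
```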

n_patches property

Total number of patches (one per sample).

Returns:

  • int

    Number of patches.

__init__(data_shapes)

Constructor.

Parameters:

  • data_shapes (sequence of (sequence of int)) –

    Shapes of the underlying data (axes SC(Z)YX).

get_patch_indices(data_idx)

Return the patch indices for a specific image_stack.

The image_stack corresponds to the given data_idx.

Parameters:

  • data_idx (int) –

    An index that corresponds to a given image_stack.

Returns:

  • sequence of int

    A sequence of patch indices that, when used to index the CAREamicsDataset, return patches that come from the image_stack corresponding to the given data_idx.

get_patch_spec(index)

Return the patch spec for the given index.

Parameters:

  • index (int) –

    Patch index.

Returns:

  • PatchSpecs

    A dictionary that specifies a single patch in a series of ImageStacks.

create_patching(data_shapes, patching_config)

Factory function to create a patching strategy based on the provided config.

Parameters:

  • data_shapes (list of Sequence of int) –

    The shapes of the data stacks to be patched.

  • patching_config (PatchingConfig) –

    The configuration for the desired patching.

Returns:

  • Patching

    An instance of the specified patching.
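
The factory pattern itself can be sketched as a dispatch on a config field. The fields of the real PatchingConfig are not described on this page, so the config classes, the `name` field and the two toy strategies below are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class WholeConfigSketch:
    name: str = "whole"


@dataclass
class GridConfigSketch:
    patch_size: tuple
    name: str = "grid"


class WholeSketch:
    def __init__(self, data_shapes):
        # One patch per sample (axis 0 is S).
        self.n_patches = sum(shape[0] for shape in data_shapes)


class GridSketch:
    def __init__(self, data_shapes, patch_size):
        n_dims = len(patch_size)
        # Non-overlapping patches that fit entirely inside each sample.
        self.n_patches = sum(
            shape[0] * math.prod(s // p for s, p in zip(shape[-n_dims:], patch_size))
            for shape in data_shapes
        )


def create_patching_sketch(data_shapes, config):
    """Dispatch on the (hypothetical) `name` field of the config."""
    if config.name == "whole":
        return WholeSketch(data_shapes)
    if config.name == "grid":
        return GridSketch(data_shapes, config.patch_size)
    raise ValueError(f"Unknown patching strategy: {config.name!r}")


patching = create_patching_sketch([(2, 1, 64, 64)], GridConfigSketch(patch_size=(32, 32)))
```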

is_tile_specs(specs)

Determine whether a given PatchSpecs is a TileSpecs.

Used for type checking.

Parameters:

  • specs (PatchSpecs) –

    The patch specs to check.

Returns:

  • bool

    Whether the given specs is a TileSpecs.