Patching
Patching strategies and factory for the next-generation dataset.
FixedPatching
Patching strategy that returns patches from a fixed sequence.
Implements the Patching protocol.
Parameters:
-
fixed_patch_specs(Sequence[PatchSpecs]) –Sequence of patch specifications to return in order.
n_patches
property
Number of patches; it also determines the maximum index that can be given to get_patch_spec.
Returns:
-
int–Length of the fixed patch specs sequence.
__init__(fixed_patch_specs)
Constructor.
Parameters:
-
fixed_patch_specs(Sequence[PatchSpecs]) –Sequence of patch specifications.
get_patch_indices(data_idx)
Return all patch indices belonging to a specific image_stack.
Each image_stack corresponds to a given data_idx.
Parameters:
-
data_idx(int) –An index that corresponds to a given
image_stack.
Returns:
-
sequence of int–A sequence of patch indices belonging to a particular
image_stack that can be used to index the CAREamicsDataset.
get_patch_spec(index)
Return the patch specs for a given index.
Parameters:
-
index(int) –A patch index.
Returns:
-
PatchSpecs–A dictionary that specifies a single patch in a series of
ImageStacks.
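As an illustration of the behaviour described above, here is a minimal sketch of a fixed-sequence strategy. The class name FixedPatchingSketch and the local PatchSpecs stand-in are hypothetical and only mirror the fields documented on this page; this is not the library implementation.

```python
from collections.abc import Sequence
from typing import TypedDict


class PatchSpecs(TypedDict):
    """Local stand-in mirroring the PatchSpecs fields documented on this page."""

    data_idx: int
    sample_idx: int
    coords: Sequence[int]
    patch_size: Sequence[int]


class FixedPatchingSketch:
    """Hypothetical re-implementation for illustration; not the library class."""

    def __init__(self, fixed_patch_specs: Sequence[PatchSpecs]) -> None:
        self.fixed_patch_specs = list(fixed_patch_specs)

    @property
    def n_patches(self) -> int:
        # Length of the fixed sequence.
        return len(self.fixed_patch_specs)

    def get_patch_spec(self, index: int) -> PatchSpecs:
        if not 0 <= index < self.n_patches:
            raise IndexError(f"patch index {index} out of range")
        return self.fixed_patch_specs[index]

    def get_patch_indices(self, data_idx: int) -> list[int]:
        # Positions in the fixed sequence whose spec points at this image stack.
        return [
            i
            for i, spec in enumerate(self.fixed_patch_specs)
            if spec["data_idx"] == data_idx
        ]


specs = [
    PatchSpecs(data_idx=0, sample_idx=0, coords=(0, 0), patch_size=(64, 64)),
    PatchSpecs(data_idx=1, sample_idx=0, coords=(32, 32), patch_size=(64, 64)),
    PatchSpecs(data_idx=0, sample_idx=1, coords=(0, 64), patch_size=(64, 64)),
]
strategy = FixedPatchingSketch(specs)
```

In this sketch, strategy.get_patch_indices(0) collects the first and third entries, because both refer to the image stack with data_idx 0.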
FixedRandomPatching
Deterministic random patching strategy for validation.
Parameters:
-
data_shapes(sequence of (sequence of int)) –Shapes of the underlying data (axes SC(Z)YX).
-
patch_size(sequence of int) –Patch size per spatial dimension (length 2 or 3).
-
seed(int or None, default:None) –Seed for reproducibility.
Notes
The output of get_patch_spec is deterministic (same index gives same output).
The number of patches per sample is based on sequential non-overlapping coverage.
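One way to obtain "same index gives same output" is to derive a private random generator from the seed and the patch index. The sketch below assumes that approach purely for illustration; the helper deterministic_coords is hypothetical and the library may achieve determinism differently.

```python
import random


def deterministic_coords(index, spatial_shape, patch_size, seed=0):
    """Hypothetical helper: same (seed, index) always gives the same coords."""
    # Derive a private RNG from the seed and the patch index (an assumption,
    # chosen only to illustrate index-level determinism).
    rng = random.Random(seed * 1_000_003 + index)
    # The top-left corner must leave room for a full patch along each axis.
    return tuple(
        rng.randint(0, dim - p) for dim, p in zip(spatial_shape, patch_size)
    )


first = deterministic_coords(3, (128, 128), (32, 32), seed=42)
second = deterministic_coords(3, (128, 128), (32, 32), seed=42)
```

Because the generator is rebuilt from the same inputs on every call, first and second are identical, which is the property that makes a validation set stable across epochs.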
n_patches
property
The number of patches that this patching strategy will return.
It also determines the maximum index that can be given to get_patch_spec.
Returns:
-
int–Number of patches.
__init__(data_shapes, patch_size, seed=None)
A patching strategy for sampling random patches.
Parameters:
-
data_shapes(sequence of (sequence of int)) –The shapes of the underlying data. Each element is the dimension of the axes SC(Z)YX.
-
patch_size(sequence of int) –The size of the patch. The sequence will have length 2 or 3, for 2D and 3D data respectively.
-
seed(int, default:None) –An optional seed to ensure the reproducibility of the random patches.
get_patch_indices(data_idx)
Get the patch indices that will return patches for a specific image_stack.
The image_stack corresponds to the given data_idx.
Parameters:
-
data_idx(int) –An index that corresponds to a given
image_stack.
Returns:
-
sequence of int–A sequence of patch indices that, when used to index the
CAREamicsDataset, will return a patch that comes from the image_stack corresponding to the given data_idx.
get_patch_spec(index)
Return the patch specs for a given index.
Parameters:
-
index(int) –A patch index.
Returns:
-
PatchSpecs–A dictionary that specifies a single patch in a series of
ImageStacks.
PatchSpecs
Bases: TypedDict
A dictionary that specifies a single patch in a series of ImageStacks.
Attributes:
-
data_idx(int) –Determines which ImageStack a patch belongs to, within a series of ImageStacks.
-
sample_idx(int) –Determines which sample a patch belongs to, within an ImageStack.
-
coords(sequence of int) –The top-left (and first z-slice for 3D data) of a patch. The sequence will have length 2 or 3, for 2D and 3D data respectively.
-
patch_size(sequence of int) –The size of the patch. The sequence will have length 2 or 3, for 2D and 3D data respectively.
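To make the role of each field concrete, the sketch below uses a spec to crop a 2D patch from a nested-list stand-in for an image stack with axes SCYX. The helper extract_patch is hypothetical and is not part of the documented API.

```python
def extract_patch(image_stack, spec):
    """Crop a 2D patch from nested lists with axes S, C, Y, X (illustration only)."""
    y0, x0 = spec["coords"]        # top-left corner of the patch
    h, w = spec["patch_size"]      # patch extent along Y and X
    sample = image_stack[spec["sample_idx"]]
    # Apply the same spatial crop to every channel of the selected sample.
    return [[row[x0:x0 + w] for row in channel[y0:y0 + h]] for channel in sample]


# One sample, one channel, a 4x5 image whose pixel value is 10*y + x.
stack = [[[[10 * y + x for x in range(5)] for y in range(4)]]]
spec = {"data_idx": 0, "sample_idx": 0, "coords": (1, 2), "patch_size": (2, 2)}
patch = extract_patch(stack, spec)
```

Here data_idx would select which stack to pass in; the remaining fields select the sample and the spatial window.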
Patching
Bases: Protocol
An interface for patching strategies.
Patching strategies are a component of the CAREamicsDataset; they determine
how patches are extracted from the underlying data.
Attributes:
Methods:
-
get_patch_spec–Get a patch specification for a given patch index.
n_patches
property
The number of patches that the patching strategy will return.
It also determines the maximum index that can be given to get_patch_spec,
and the length of the CAREamicsDataset.
Returns:
-
int–Number of patches.
get_patch_indices(data_idx)
Get the patch indices that will return patches for a specific image_stack.
The image_stack corresponds to the given data_idx.
Parameters:
-
data_idx(int) –An index that corresponds to a given
image_stack.
Returns:
-
sequence of int–A sequence of patch indices that, when used to index the
CAREamicsDataset, will return a patch that comes from the image_stack corresponding to the given data_idx.
get_patch_spec(index)
Get a patch specification for a given patch index.
This method is intended to be called from within the
CAREamicsDataset.__getitem__. The index will be passed through from this
method.
Parameters:
-
index(int) –A patch index.
Returns:
-
PatchSpecs–A dictionary that specifies a single patch in a series of
ImageStacks.
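The interface above can be sketched with typing.Protocol. The names PatchingSketch, SinglePatch and all_specs are hypothetical and exist only to show how a dataset-like consumer could rely on the three documented members.

```python
from collections.abc import Sequence
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class PatchingSketch(Protocol):
    """Hypothetical mirror of the interface documented above."""

    @property
    def n_patches(self) -> int: ...

    def get_patch_indices(self, data_idx: int) -> Sequence[int]: ...

    def get_patch_spec(self, index: int) -> dict[str, Any]: ...


class SinglePatch:
    """Toy strategy: one 8x8 patch from the first sample of the first stack."""

    @property
    def n_patches(self) -> int:
        return 1

    def get_patch_indices(self, data_idx: int) -> Sequence[int]:
        return [0] if data_idx == 0 else []

    def get_patch_spec(self, index: int) -> dict[str, Any]:
        if index != 0:
            raise IndexError(index)
        return {"data_idx": 0, "sample_idx": 0, "coords": (0, 0), "patch_size": (8, 8)}


def all_specs(strategy: PatchingSketch) -> list[dict[str, Any]]:
    # A dataset-like consumer: valid indices run from 0 to n_patches - 1.
    return [strategy.get_patch_spec(i) for i in range(strategy.n_patches)]


specs = all_specs(SinglePatch())
```

SinglePatch never inherits from the protocol; it satisfies it structurally, which is why any object with these members can serve as a patching strategy.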
RandomPatching
Random patching strategy.
Parameters:
-
data_shapes(sequence of (sequence of int)) –Shapes of the underlying data (axes SC(Z)YX).
-
patch_size(sequence of int) –Patch size per spatial dimension (length 2 or 3).
-
seed(int or None, default:None) –Seed for reproducibility of random patches.
Notes
The output of get_patch_spec will be random, i.e. if the same index is given
twice the two outputs can be different. The strategy still ensures a known number
of patches per sample per image stack via bins; the index determines
"data_idx" and "sample_idx" in the returned PatchSpecs, while "coords" are
random. The number of patches per sample is based on sequential non-overlapping
coverage of the array.
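The binning described in the notes can be sketched as follows. Two assumptions are made here and are not confirmed by this page: the per-sample count is taken as the per-axis ceiling of shape over patch size, and bins are stored as cumulative counts. The helper names are hypothetical.

```python
import math
from bisect import bisect_right
from itertools import accumulate


def patches_per_sample(spatial_shape, patch_size):
    # Assumption: ceil per axis, so non-overlapping patches cover the array.
    return math.prod(math.ceil(d / p) for d, p in zip(spatial_shape, patch_size))


def build_bins(data_shapes, patch_size):
    """Return (owners, edges): one cumulative bin edge per (data_idx, sample_idx)."""
    owners, counts = [], []
    for data_idx, shape in enumerate(data_shapes):
        n_samples, spatial = shape[0], shape[2:]  # axes are S, C, (Z), Y, X
        for sample_idx in range(n_samples):
            owners.append((data_idx, sample_idx))
            counts.append(patches_per_sample(spatial, patch_size))
    return owners, list(accumulate(counts))


def locate(index, owners, edges):
    # The bin containing the index fixes data_idx and sample_idx;
    # the coords would then be drawn at random inside that sample.
    return owners[bisect_right(edges, index)]


data_shapes = [(2, 1, 64, 64), (1, 1, 100, 64)]
owners, edges = build_bins(data_shapes, (32, 32))
```

With these shapes the two 64x64 samples contribute 4 patches each and the 100x64 sample contributes 8, so index 4 falls in the second sample of the first stack.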
n_patches
property
The number of patches that this patching strategy will return.
It also determines the maximum index that can be given to get_patch_spec.
Returns:
-
int–Number of patches.
__init__(data_shapes, patch_size, seed=None)
A patching strategy for sampling random patches.
Parameters:
-
data_shapes(sequence of (sequence of int)) –The shapes of the underlying data. Each element is the dimension of the axes SC(Z)YX.
-
patch_size(sequence of int) –The size of the patch. The sequence will have length 2 or 3, for 2D and 3D data respectively.
-
seed(int, default:None) –An optional seed to ensure the reproducibility of the random patches.
get_patch_indices(data_idx)
Get the patch indices that will return patches for a specific image_stack.
The image_stack corresponds to the given data_idx.
Parameters:
-
data_idx(int) –An index that corresponds to a given
image_stack.
Returns:
-
sequence of int–A sequence of patch indices that, when used to index the
CAREamicsDataset, will return a patch that comes from the image_stack corresponding to the given data_idx.
get_patch_spec(index)
Return the patch specs for a given index.
Parameters:
-
index(int) –A patch index.
Returns:
-
PatchSpecs–A dictionary that specifies a single patch in a series of
ImageStacks.
SequentialPatching
Grid patching strategy with optional overlap; prototype.
Parameters:
-
data_shapes(sequence of (sequence of int)) –Shapes of the underlying data (axes SC(Z)YX).
-
patch_size(sequence of int) –Patch size per spatial dimension.
-
overlaps(sequence of int or None, default:None) –Overlap per axis; if None, no overlap.
n_patches
property
__init__(data_shapes, patch_size, overlaps=None)
Initialize sequential patching with optional overlap per axis.
Parameters:
-
data_shapes(sequence of (sequence of int)) –Shapes of the underlying data (axes SC(Z)YX).
-
patch_size(sequence of int) –Patch size per spatial dimension.
-
overlaps(sequence of int or None, default:None) –Overlap per axis; if None, no overlap.
get_patch_indices(data_idx)
Get the patch indices that will return patches for a specific image_stack.
The image_stack corresponds to the given data_idx.
Parameters:
-
data_idx(int) –An index that corresponds to a given
image_stack.
Returns:
-
sequence of int–A sequence of patch indices that, when used to index the
CAREamicsDataset, will return a patch that comes from the image_stack corresponding to the given data_idx.
get_patch_spec(index)
Return the patch spec for the given index.
Parameters:
-
index(int) –Patch index.
Returns:
-
PatchSpecs–Patch spec for that index.
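A grid with overlap can be sketched by stepping along each axis with a stride of patch size minus overlap. The helpers below are hypothetical, and the edge handling (patches that would run past the array are simply dropped) is an assumption; this page does not document how the prototype treats edges.

```python
from itertools import product


def axis_starts(dim, patch, overlap=0):
    """Start positions along one axis for a stride of patch - overlap."""
    stride = patch - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than the patch size")
    # Assumption: only patches that fit entirely inside the axis are kept.
    return list(range(0, dim - patch + 1, stride))


def grid_coords(spatial_shape, patch_size, overlaps=None):
    """Top-left coordinates of every patch on the grid."""
    overlaps = overlaps or [0] * len(patch_size)
    per_axis = (
        axis_starts(d, p, o) for d, p, o in zip(spatial_shape, patch_size, overlaps)
    )
    return list(product(*per_axis))


no_overlap = grid_coords((8, 8), (4, 4))
with_overlap = grid_coords((8, 8), (4, 4), overlaps=(2, 2))
```

Adding an overlap of 2 on an 8x8 image with 4x4 patches raises the count from 4 to 9 patches, since each axis then has starts 0, 2 and 4.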
StratifiedPatching
Stratified patching strategy allowing patches on a grid to be excluded.
Patches will be sampled from sampling regions that are two times the patch size in each dimension. Some sampling regions may be smaller than this because they are on the edge of an image or because a nearby patch has been excluded.
If the same index is passed to get_patch_spec twice, the patch will very likely,
but not certainly, come from the same sampling region. Smaller sampling regions
may be binned together into a single index. Averaged over all pixels, the expected
number of times a pixel is selected in a patch per epoch is 1.
The number of patches is determined from the number of selectable patch coordinates.
Parameters:
-
data_shapes(sequence of (sequence of int)) –Shapes of the underlying data (axes SC(Z)YX).
-
patch_size(sequence of int) –Patch size per spatial dimension (length 2 or 3).
-
seed(int or None, default:None) –Seed for reproducibility.
n_patches
property
The number of patches that this patching strategy will return.
It also determines the maximum index that can be given to get_patch_spec.
Returns:
-
int–Number of patches.
__init__(data_shapes, patch_size, seed=None)
A patching strategy for sampling stratified patches.
Parameters:
-
data_shapes(sequence of (sequence of int)) –The shapes of the underlying data. Each element is the dimension of the axes SC(Z)YX.
-
patch_size(sequence of int) –The size of the patch. The sequence will have length 2 or 3, for 2D and 3D data respectively.
-
seed(int, default:None) –An optional seed to ensure the reproducibility of the random patches.
exclude_patches(data_idx, sample_idx, grid_coords)
Exclude patches from being sampled.
Excluded patches must lie on a grid which starts at (0, 0) and has a spacing of
the given patch_size.
After calling this method the number of patches will be recalculated and the
excluded patches will never be returned by get_patch_spec.
Parameters:
-
data_idx(int) –The index of the "image stack" that the patches will be excluded from.
-
sample_idx(int) –An index that corresponds to the sample in the "image stack" that the patches will be excluded from.
-
grid_coords(Sequence[tuple[int, ...]]) –A sequence of 2D or 3D tuples. Each tuple corresponds to a grid coordinate that will be excluded from sampling. The grid starts at (0, 0) and has a spacing of the given
patch_size.
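The grid convention can be illustrated with a short sketch: grid coordinates count cells of size patch_size starting from the origin, so a grid coordinate multiplied by the patch size gives the pixel position of the cell. The helpers below are hypothetical, and counting partial cells at the image edge with a ceiling is an assumption.

```python
import math
from itertools import product


def all_grid_coords(spatial_shape, patch_size):
    """Every cell of a grid that starts at the origin with spacing patch_size."""
    # Assumption: a partial cell at the image edge still counts as a cell.
    return list(
        product(*(range(math.ceil(d / p)) for d, p in zip(spatial_shape, patch_size)))
    )


def grid_region(grid_coord, patch_size):
    """Pixel region [grid_coord*patch_size, (grid_coord+1)*patch_size] of a cell."""
    start = tuple(g * p for g, p in zip(grid_coord, patch_size))
    stop = tuple((g + 1) * p for g, p in zip(grid_coord, patch_size))
    return start, stop


# Grid coordinates to exclude for one (data_idx, sample_idx) pair.
excluded = {(0, 0), (1, 1)}
included = [gc for gc in all_grid_coords((128, 128), (64, 64)) if gc not in excluded]
region = grid_region((1, 0), (64, 64))
```

For a 128x128 sample and a 64x64 patch size there are four cells; excluding the two diagonal cells leaves (0, 1) and (1, 0) available for sampling.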
get_all_grid_coords()
get_included_grid_coords()
Get all grid coordinates included in the patching strategy.
If a grid coordinate is not included, a patch can never be selected from the
region [grid_coord*patch_size, (grid_coord+1)*patch_size].
Returns:
get_patch_indices(data_idx)
Return the patch indices for a specific image_stack.
The image_stack corresponds to the given data_idx.
Parameters:
-
data_idx(int) –An index that corresponds to a given
image_stack.
Returns:
-
sequence of int–A sequence of patch indices, used to index the
CAREamicsDataset to return a patch that comes from the image_stack corresponding to the given data_idx.
get_patch_spec(index)
Return the patch specs for a given index.
Parameters:
-
index(int) –A patch index.
Returns:
-
PatchSpecs–A dictionary that specifies a single patch in a series of
ImageStacks.
set_region_probs(data_idx, sample_idx, probs)
Set the probability that regions will be sampled from at each epoch.
Parameters:
-
data_idx(int) –The index of the "image stack" whose region probabilities will be set.
-
sample_idx(int) –An index that corresponds to the sample in the "image stack" whose region probabilities will be set.
-
probs(dict[tuple[int, ...], float]) –The probabilities for each region. The keys of the dictionary correspond to the grid coordinates of the regions. The values of the dictionary are the probabilities.
TileSpecs
Bases: PatchSpecs
A dictionary that specifies a single patch in a series of ImageStacks.
Attributes:
-
data_idx(int) –Determines which ImageStack a patch belongs to, within a series of ImageStacks.
-
sample_idx(int) –Determines which sample a patch belongs to, within an ImageStack.
-
coords(sequence of int) –The top-left (and first z-slice for 3D data) of a patch. The sequence will have length 2 or 3, for 2D and 3D data respectively.
-
patch_size(sequence of int) –The size of the patch. The sequence will have length 2 or 3, for 2D and 3D data respectively.
-
crop_coords(sequence of int) –The top-left side of where the tile will be cropped, in coordinates relative to the tile.
-
crop_size(sequence of int) –The size of the cropped tile.
-
stitch_coords(sequence of int) –Where the tile will be stitched back into an image, taking into account that the tile will be cropped, in coords relative to the image.
-
total_tiles(int) –Number of tiles belonging to the same data.
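How crop_coords, crop_size and stitch_coords fit together is easiest to see in one dimension. The sketch below assumes a simple scheme in which half of each overlap is cropped from interior tile edges and the signal length fits a whole number of strides; both helpers are hypothetical and the library may compute these values differently.

```python
def tile_specs_1d(length, tile, overlap):
    """Build 1D tile specs with the tile-specific fields documented above."""
    stride = tile - overlap
    half = overlap // 2
    starts = list(range(0, length - tile + 1, stride))
    specs = []
    for i, start in enumerate(starts):
        first, last = i == 0, i == len(starts) - 1
        crop_start = 0 if first else half        # keep the outer edge of the first tile
        crop_stop = tile if last else tile - half  # keep the outer edge of the last tile
        specs.append({
            "coords": (start,),
            "patch_size": (tile,),
            "crop_coords": (crop_start,),               # relative to the tile
            "crop_size": (crop_stop - crop_start,),
            "stitch_coords": (start + crop_start,),     # relative to the image
            "total_tiles": len(starts),
        })
    return specs


def stitch_1d(signal, specs):
    """Crop every tile and write it back at its stitch position."""
    out = [None] * len(signal)
    for s in specs:
        (start,), (size,) = s["coords"], s["patch_size"]
        tile = signal[start:start + size]  # stand-in for a prediction on the tile
        (c0,), (cs,), (st,) = s["crop_coords"], s["crop_size"], s["stitch_coords"]
        out[st:st + cs] = tile[c0:c0 + cs]
    return out


signal = list(range(16))
specs = tile_specs_1d(16, tile=8, overlap=4)
restored = stitch_1d(signal, specs)
```

The cropped pieces have sizes 6, 4 and 6, so they tile the 16-element signal exactly once and the stitched result equals the input.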
TiledPatching
Patching strategy used to extract overlapping tiles from an image.
The tiling strategy should be used for prediction. The get_patch_spec
method returns TileSpecs dictionaries that contain information on how to
stitch the tiles back together to create the full image.
Parameters:
-
data_shapes(sequence of (sequence of int)) –Shapes of the underlying data (axes SC(Z)YX).
-
patch_size(sequence of int) –Tile size per spatial dimension (length 2 or 3).
-
overlaps(sequence of int) –Overlap with adjacent tiles per spatial dimension.
n_patches
property
The number of patches that this patching strategy will return.
It also determines the maximum index that can be given to get_patch_spec.
Returns:
-
int–Number of patches.
__init__(data_shapes, patch_size, overlaps)
Constructor.
Parameters:
-
data_shapes(sequence of (sequence of int)) –The shapes of the underlying data. Each element is the dimension of the axes SC(Z)YX.
-
patch_size(sequence of int) –The size of the tile. The sequence will have length 2 or 3, for 2D and 3D data respectively.
-
overlaps(sequence of int) –How much a tile will overlap with adjacent tiles in each spatial dimension.
get_patch_indices(data_idx)
Get the patch indices that will return patches for a specific image_stack.
The image_stack corresponds to the given data_idx.
Parameters:
-
data_idx(int) –An index that corresponds to a given
image_stack.
Returns:
-
sequence of int–A sequence of patch indices that, when used to index the
CAREamicsDataset, will return a patch that comes from the image_stack corresponding to the given data_idx.
WholeSamplePatching
Patching strategy that returns one patch per sample (whole image).
Parameters:
-
data_shapes(sequence of (sequence of int)) –Shapes of the underlying data (axes SC(Z)YX).
n_patches
property
__init__(data_shapes)
Constructor.
Parameters:
-
data_shapes(sequence of (sequence of int)) –Shapes of the underlying data (axes SC(Z)YX).
get_patch_indices(data_idx)
Get the patch indices that will return patches for a specific image_stack.
The image_stack corresponds to the given data_idx.
Parameters:
-
data_idx(int) –An index that corresponds to a given
image_stack.
Returns:
-
sequence of int–A sequence of patch indices that, when used to index the
CAREamicsDataset, will return a patch that comes from the image_stack corresponding to the given data_idx.
get_patch_spec(index)
Return the patch spec for the given index.
Parameters:
-
index(int) –Patch index.
Returns:
-
PatchSpecs–Patch spec for that index.
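A "one patch per sample" strategy can be sketched by emitting, for each sample, a spec that starts at the origin and spans the full spatial shape. The helper whole_sample_specs is hypothetical, and the choice of zero coords with patch_size equal to the spatial shape is an assumption about what "whole image" means here.

```python
def whole_sample_specs(data_shapes):
    """One spec per sample, covering the whole spatial extent (illustration only)."""
    specs = []
    for data_idx, shape in enumerate(data_shapes):
        n_samples, spatial = shape[0], tuple(shape[2:])  # axes are S, C, (Z), Y, X
        for sample_idx in range(n_samples):
            specs.append({
                "data_idx": data_idx,
                "sample_idx": sample_idx,
                "coords": (0,) * len(spatial),   # start at the origin
                "patch_size": spatial,           # span the full sample
            })
    return specs


# A 2D stack with two samples and a 3D stack with one sample.
specs = whole_sample_specs([(2, 1, 32, 48), (1, 3, 8, 16, 16)])
```

In this sketch the number of patches equals the total number of samples across all stacks, and patches from different stacks may have different sizes.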
create_patching(data_shapes, patching_config)
Factory function to create a patching strategy based on the provided config.
Parameters:
-
data_shapes(list of Sequence of int) –The shapes of the data stacks to be patched.
-
patching_config(PatchingConfig) –The configuration for the desired patching.
Returns:
-
Patching–An instance of the specified patching.
is_tile_specs(specs)
Determine whether a given PatchSpecs is a TileSpecs.
Used for type checking.
Parameters:
-
specs(PatchSpecs) –A patch specification.
Returns:
-
bool–Whether the given specs is a TileSpecs.
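A check of this kind can be sketched by testing for the keys that TileSpecs adds on top of PatchSpecs. The function looks_like_tile_specs below is a hypothetical stand-in, not the library function, and the key-presence test is an assumption about how such a check could work.

```python
from collections.abc import Mapping
from typing import Any

# Keys documented for TileSpecs that are absent from plain PatchSpecs.
TILE_ONLY_KEYS = {"crop_coords", "crop_size", "stitch_coords", "total_tiles"}


def looks_like_tile_specs(specs: Mapping[str, Any]) -> bool:
    """Return True when every tile-only key is present (illustration only)."""
    return TILE_ONLY_KEYS.issubset(specs)


patch = {"data_idx": 0, "sample_idx": 0, "coords": (0, 0), "patch_size": (8, 8)}
tile = {
    **patch,
    "crop_coords": (0, 0),
    "crop_size": (6, 6),
    "stitch_coords": (0, 0),
    "total_tiles": 4,
}
```

For static type narrowing, a function like this could be annotated with typing.TypeGuard instead of bool, which would fit the "used for type checking" note above.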