Dataset

Next-generation dataset components.

ImageRegionData

Bases: NamedTuple, Generic[RegionSpecs]

Data structure for arrays produced by the dataset and propagated through models.

An ImageRegionData may be a patch during training/validation, a tile during prediction with tiling, or a whole image during prediction without tiling.

data_shape may differ from the shape of the original data when only a subset of the channels has been requested. In that case, the channel dimension may be smaller than in the original data and corresponds only to the number of requested channels.
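
As an illustration of the channel-subsetting note above, the following minimal sketch uses plain NumPy. The array, the SCYX layout, and the `requested_channels` name are hypothetical and chosen only for the example; they are not part of the documented API.

```python
import numpy as np

# Hypothetical original image in SCYX order: 1 sample, 3 channels, 64x64 pixels.
original = np.zeros((1, 3, 64, 64), dtype=np.uint16)

# Hypothetical request for channels 0 and 2 only.
requested_channels = [0, 2]
subset = original[:, requested_channels]

# Following the description above, data_shape would report the number of
# requested channels rather than the original channel count.
data_shape = subset.shape
print(original.shape)  # (1, 3, 64, 64)
print(data_shape)      # (1, 2, 64, 64)
```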

ImageRegionData may be collated in batches during training by the DataLoader. In that case:

- data: arrays are collated into NDArray of shape (B,C,Z,Y,X)
- source: list of str, length B
- data_shape: list of tuples of int, each tuple being of length B and representing the shape of the original images in the corresponding dimension
- dtype: list of str, length B
- axes: list of str, length B
- region_spec: dict of {str: sequence}, each sequence being of length B
- additional_metadata: list of dict
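
The collated layout described above can be sketched with NumPy alone. In practice the collation is performed by the DataLoader; `RegionSketch` and `collate_sketch` below are hypothetical stand-ins written for this illustration, and the `region_spec` keys (`coords`, `patch_size`) are assumed names, not confirmed fields of PatchSpecs or TileSpecs.

```python
from typing import Any, NamedTuple

import numpy as np


class RegionSketch(NamedTuple):
    """Simplified, hypothetical stand-in for ImageRegionData (not generic)."""

    data: Any  # array in C(Z)YX format, or (B, C, (Z), Y, X) once collated
    source: Any  # str, or list of str once collated
    data_shape: Any  # tuple of int, or list of tuples once collated
    dtype: Any  # str, or list of str once collated
    axes: Any  # str, or list of str once collated
    region_spec: Any  # dict, or dict of sequences once collated
    additional_metadata: Any  # dict, or list of dict once collated


def collate_sketch(regions: list[RegionSketch]) -> RegionSketch:
    """Batch regions so that each field follows the layout listed above."""
    keys = regions[0].region_spec.keys()
    return RegionSketch(
        # Stack along a new leading batch dimension B.
        data=np.stack([r.data for r in regions]),
        source=[r.source for r in regions],
        # One tuple per dimension, each tuple of length B.
        data_shape=list(zip(*(r.data_shape for r in regions))),
        dtype=[r.dtype for r in regions],
        axes=[r.axes for r in regions],
        # One sequence of length B per key.
        region_spec={k: [r.region_spec[k] for r in regions] for k in keys},
        additional_metadata=[r.additional_metadata for r in regions],
    )


# Two hypothetical 8x8 patches taken from 32x32 single-channel images.
regions = [
    RegionSketch(
        data=np.zeros((1, 8, 8), dtype=np.float32),
        source=f"image_{i}.tiff",
        data_shape=(1, 1, 32, 32),
        dtype="float32",
        axes="YX",
        region_spec={"coords": (0, 8 * i), "patch_size": (8, 8)},
        additional_metadata={},
    )
    for i in range(2)
]
batch = collate_sketch(regions)
print(batch.data.shape)  # (2, 1, 8, 8)
print(batch.data_shape)  # [(1, 1), (1, 1), (32, 32), (32, 32)]
```

Note that after collation `data_shape` is indexed by dimension first and by batch element second, which is the transpose of what one might expect from a per-sample list of shapes.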

Description of the fields is given for the uncollated case (non-batched).

additional_metadata instance-attribute

Additional metadata to be stored with the image region. Currently used to store chunk and shard information for zarr image stacks.

axes instance-attribute

Axes of the original data array. SCTZYX dimensions are allowed in any order.
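
A minimal sketch of what an axes string of this kind looks like is given below. The `is_valid_axes` helper is hypothetical and not part of the documented API; the rule that each letter appears at most once is an assumption made for the example.

```python
ALLOWED_AXES = "SCTZYX"


def is_valid_axes(axes: str) -> bool:
    """Return True if axes is non-empty and uses only S, C, T, Z, Y, X, each at most once."""
    return (
        len(axes) > 0
        and set(axes) <= set(ALLOWED_AXES)
        and len(set(axes)) == len(axes)
    )


print(is_valid_axes("YX"))     # True
print(is_valid_axes("TZYXC"))  # True
print(is_valid_axes("YXQ"))    # False
```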

data instance-attribute

Patch, tile or image in C(Z)YX format.

data_shape instance-attribute

Shape of the image in SC(Z)YX format and order. If channels are subsetted, the channel dimension corresponds to the number of requested channels.

dtype instance-attribute

Data type of the original image as a string.

original_data_shape instance-attribute

Original shape of the data before any reshaping.

region_spec instance-attribute

Specifications of the region within the original image from which the data is extracted. Of type PatchSpecs during training/validation and prediction without tiling, and of type TileSpecs during prediction with tiling.

source instance-attribute

Source of the data, e.g. file path, zarr URI, or "array" for in-memory arrays.