Dataset

Dataset module.

`InMemoryDataset`

Bases: Dataset

Dataset that stores data in memory and generates patches from it.

Parameters:

Name	Type	Description	Default
`data_config`	`CAREamics DataConfig`	Data configuration (see careamics.config.data_model.DataConfig).	required
`inputs`	`ndarray or list[Path]`	Input data.	required
`input_target`	`ndarray or list[Path]`	Target data, by default None.	`None`
`read_source_func`	`Callable`	Read source function for custom types, by default read_tiff.	`read_tiff`
`**kwargs`	`Any`	Additional keyword arguments, unused.	`{}`

`get_data_statistics()`

Return training data statistics.

This does not return the target data statistics, only those of the input.

Returns:

Type	Description
`tuple of list of floats`	Means and standard deviations across channels of the training data.
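To illustrate what "means and standard deviations across channels" refers to, here is a minimal NumPy sketch. It assumes the data is a single array with axes `SCYX` (sample, channel, Y, X). The function name `channel_statistics` is hypothetical, and this is not the CAREamics implementation.

```python
import numpy as np


def channel_statistics(data: np.ndarray) -> tuple[list[float], list[float]]:
    """Per-channel mean and std of an array with assumed axes SCYX."""
    # Reduce over every axis except the channel axis (axis 1).
    reduce_axes = (0, 2, 3)
    means = data.mean(axis=reduce_axes)
    stds = data.std(axis=reduce_axes)
    return means.tolist(), stds.tolist()


# Two samples, two channels, 4x4 pixels:
# channel 0 is all zeros, channel 1 is all ones.
example = np.zeros((2, 2, 4, 4), dtype=np.float32)
example[:, 1] = 1.0
means, stds = channel_statistics(example)
```

Each returned list has one entry per channel, which matches the `tuple of list of floats` return type described above.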

`split_dataset(percentage=0.1, minimum_patches=1)`

Split a new dataset away from the current one.

This method is used to extract random validation patches from the dataset.

Parameters:

Name	Type	Description	Default
`percentage`	`float`	Percentage of patches to extract, expressed as a fraction between 0 and 1, by default 0.1.	`0.1`
`minimum_patches`	`int`	Minimum number of patches to extract, by default 1.	`1`

Returns:

Type	Description
`CAREamics InMemoryDataset`	New dataset with the extracted patches.

Raises:

Type	Description
`ValueError`	If `percentage` is not between 0 and 1.
`ValueError`	If `minimum_patches` is not between 1 and the number of patches.
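The random extraction of validation patches can be sketched as follows. This is an illustration under stated assumptions, not the library's code: patches are assumed to live in one array with the patch index on axis 0, the number of extracted patches is assumed to be the larger of the percentage-based count (rounded up) and `minimum_patches`, and the function name `split_patches` and its `seed` argument are hypothetical.

```python
import math

import numpy as np


def split_patches(patches, percentage=0.1, minimum_patches=1, seed=None):
    """Move a random subset of patches into a separate validation array."""
    n_total = patches.shape[0]
    if percentage < 0 or percentage > 1:
        raise ValueError(f"percentage must be between 0 and 1, got {percentage}.")
    if minimum_patches < 1 or minimum_patches > n_total:
        raise ValueError(
            f"minimum_patches must be between 1 and {n_total}, "
            f"got {minimum_patches}."
        )

    # Assumption: take whichever is larger, the percentage-based count or the minimum.
    n_val = max(math.ceil(percentage * n_total), minimum_patches)

    # Draw distinct patch indices at random.
    rng = np.random.default_rng(seed)
    val_indices = rng.choice(n_total, size=n_val, replace=False)

    val_patches = patches[val_indices]
    train_patches = np.delete(patches, val_indices, axis=0)
    return train_patches, val_patches


# 20 patches of 8x8 pixels; 10% would be 2 patches, so the minimum of 5 applies.
patches = np.arange(20 * 8 * 8, dtype=np.float32).reshape(20, 8, 8)
train, val = split_patches(patches, percentage=0.1, minimum_patches=5, seed=0)
```

The sketch removes the extracted patches from the training array so that the two sets stay disjoint, in the spirit of "split a new dataset away from the current one".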

`InMemoryPredDataset`

Bases: Dataset

Simple prediction dataset returning images along the sample axis.

Parameters:

Name	Type	Description	Default
`prediction_config`	`InferenceConfig`	Prediction configuration.	required
`inputs`	`NDArray`	Input data.	required

`InMemoryTiledPredDataset`

Bases: Dataset

Prediction dataset storing data in memory and returning tiles of each image.

Parameters:

Name	Type	Description	Default
`prediction_config`	`InferenceConfig`	Prediction configuration.	required
`inputs`	`NDArray`	Input data.	required
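As a rough picture of what "returning tiles of each image" can look like, the sketch below cuts a 2D image into overlapping square tiles. The parameters `tile` and `overlap` and both function names are hypothetical and chosen for illustration. How CAREamics actually computes tile coordinates is not described in this section.

```python
import numpy as np


def tile_starts(length: int, tile: int, overlap: int) -> list[int]:
    """Start indices of overlapping tiles covering an axis of the given length."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile.")
    if tile >= length:
        return [0]
    step = tile - overlap
    starts = list(range(0, length - tile, step))
    # Add a final tile flush with the end so the whole axis is covered.
    starts.append(length - tile)
    return starts


def extract_tiles(image: np.ndarray, tile: int, overlap: int):
    """Yield (y, x, tile_array) for every tile of a 2D image."""
    for y in tile_starts(image.shape[0], tile, overlap):
        for x in tile_starts(image.shape[1], tile, overlap):
            yield y, x, image[y : y + tile, x : x + tile]


image = np.arange(10 * 10).reshape(10, 10)
tiles = list(extract_tiles(image, tile=4, overlap=2))
```

Keeping the `(y, x)` origin next to each tile is one simple way to make it possible to stitch predictions back into a full image afterwards.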

`IterablePredDataset`

Bases: IterableDataset

Simple iterable prediction dataset.

Parameters:

Name	Type	Description	Default
`prediction_config`	`InferenceConfig`	Inference configuration.	required
`src_files`	`List[Path]`	List of data files.	required
`read_source_func`	`Callable`	Read source function for custom types, by default read_tiff.	`read_tiff`
`**kwargs`	`Any`	Additional keyword arguments, unused.	`{}`

Attributes:

Name	Type	Description
`data_path`	`Union[str, Path]`	Path to the data, must be a directory.
`axes`	`str`	Description of axes in format STCZYX.
`mean`	`Optional[float]`	Expected mean of the dataset, by default None.
`std`	`Optional[float]`	Expected standard deviation of the dataset, by default None.
`patch_transform`	`Optional[Callable]`	Patch transform callable, by default None.

`IterableTiledPredDataset`

Bases: IterableDataset

Tiled prediction dataset.

Parameters:

Name	Type	Description	Default
`prediction_config`	`InferenceConfig`	Inference configuration.	required
`src_files`	`list of pathlib.Path`	List of data files.	required
`read_source_func`	`Callable`	Read source function for custom types, by default read_tiff.	`read_tiff`
`**kwargs`	`Any`	Additional keyword arguments, unused.	`{}`

Attributes:

Name	Type	Description
`data_path`	`str or Path`	Path to the data, must be a directory.
`axes`	`str`	Description of axes in format STCZYX.
`mean`	`float, optional`	Expected mean of the dataset, by default None.
`std`	`float, optional`	Expected standard deviation of the dataset, by default None.
`patch_transform`	`Callable, optional`	Patch transform callable, by default None.

`PathIterableDataset`

Bases: IterableDataset

Dataset that extracts patches without loading the whole data into memory.

Parameters:

Name	Type	Description	Default
`data_config`	`DataConfig`	Data configuration.	required
`src_files`	`list of pathlib.Path`	List of data files.	required
`target_files`	`list of pathlib.Path`	Optional list of target files, by default None.	`None`
`read_source_func`	`Callable`	Read source function for custom types, by default read_tiff.	`read_tiff`

Attributes:

Name	Type	Description
`data_path`	`list of pathlib.Path`	Path to the data, must be a directory.
`axes`	`str`	Description of axes in format STCZYX.
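The idea of extracting patches without holding all data in memory can be sketched with a plain Python generator that reads one file at a time. This is a simplified illustration, not the CAREamics implementation: the function `iter_patches`, its `patch_size` argument, and the single-argument reader signature are assumptions, and the non-overlapping grid stands in for whatever patching strategy the library actually uses. `np.load` on `.npy` files stands in for `read_tiff` so the example has no extra dependencies.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterator

import numpy as np


def iter_patches(
    files: list[Path],
    read_func: Callable[[Path], np.ndarray],
    patch_size: int,
) -> Iterator[np.ndarray]:
    """Yield square patches file by file, keeping one image in memory at a time."""
    for path in files:
        image = read_func(path)  # only this file is loaded
        for y in range(0, image.shape[0] - patch_size + 1, patch_size):
            for x in range(0, image.shape[1] - patch_size + 1, patch_size):
                yield image[y : y + patch_size, x : x + patch_size]


# Write three small 8x8 images to disk, then stream 4x4 patches from them.
with tempfile.TemporaryDirectory() as tmp:
    files = []
    for i in range(3):
        path = Path(tmp) / f"image_{i}.npy"
        np.save(path, np.full((8, 8), i, dtype=np.float32))
        files.append(path)
    patches = list(iter_patches(files, np.load, patch_size=4))
```

Passing the reader as an argument mirrors the role of `read_source_func`: swapping the file format only requires swapping the callable.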

`get_data_statistics()`

Return training data statistics.

Returns:

Type	Description
`tuple of list of floats`	Means and standard deviations across channels of the training data.

`get_number_of_files()`

Return the number of files in the dataset.

Returns:

Type	Description
`int`	Number of files in the dataset.

`split_dataset(percentage=0.1, minimum_number=5)`

Split up dataset in two.

Splits the dataset using a percentage of the data (files) to extract, or the minimum number if the percentage is less than the minimum number.

Parameters:

Name	Type	Description	Default
`percentage`	`float`	Percentage of files to split up, expressed as a fraction between 0 and 1, by default 0.1.	`0.1`
`minimum_number`	`int`	Minimum number of files to split up, by default 5.	`5`

Returns:

Type	Description
`IterableDataset`	Dataset containing the split data.

Raises:

Type	Description
`ValueError`	If the percentage is smaller than 0 or larger than 1.
`ValueError`	If the minimum number is smaller than 1 or larger than the number of files.
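The rule for how many files are split off can be written out in a few lines. The sketch below is an assumption-laden illustration, not the library's code: the function name `split_file_list` is hypothetical, rounding the percentage-based count up is an assumption, and taking the files from the end of the list is an arbitrary choice made to keep the example deterministic.

```python
import math
from pathlib import Path


def split_file_list(files, percentage=0.1, minimum_number=5):
    """Return (remaining_files, split_files) following the rule described above."""
    n_files = len(files)
    if percentage < 0 or percentage > 1:
        raise ValueError(f"percentage must be between 0 and 1, got {percentage}.")
    if minimum_number < 1 or minimum_number > n_files:
        raise ValueError(
            f"minimum_number must be between 1 and {n_files}, got {minimum_number}."
        )

    # Use the percentage-based count unless it falls below the minimum number.
    n_split = max(math.ceil(percentage * n_files), minimum_number)

    # Assumption: split off the last n_split files.
    return files[:-n_split], files[-n_split:]


# 30 files at 10% would give 3 files, which is below the minimum of 5,
# so 5 files are split off and 25 remain.
files = [Path(f"img_{i:03d}.tif") for i in range(30)]
remaining, split = split_file_list(files)
```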