Dataset
Dataset module.
InMemoryDataset
Bases: Dataset
Dataset that stores data in memory and generates patches from it.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_config | CAREamics DataConfig | Data configuration (see careamics.config.data_model.DataConfig). | required |
| inputs | ndarray or list[Path] | Input data. | required |
| input_target | ndarray or list[Path] | Target data, by default None. | None |
| read_source_func | Callable | Read source function for custom types, by default read_tiff. | read_tiff |
| **kwargs | Any | Additional keyword arguments, unused. | {} |
get_data_statistics()
Return training data statistics.
This does not return the target data statistics, only those of the input.
Returns:
| Type | Description |
|---|---|
| tuple of list of floats | Means and standard deviations across channels of the training data. |
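To make the shape of the returned statistics concrete, here is a minimal sketch in plain Python. The nested-list data layout, the helper name `channel_statistics`, and the choice of the population standard deviation are assumptions made for illustration; they are not taken from the CAREamics implementation.

```python
from statistics import fmean, pstdev


def channel_statistics(samples):
    """Return per-channel means and standard deviations.

    `samples` is a list of samples; each sample is a list of channels and
    each channel is a flat list of pixel values (hypothetical layout).
    """
    n_channels = len(samples[0])
    means, stds = [], []
    for c in range(n_channels):
        # Pool the pixel values of channel `c` across all samples.
        values = [v for sample in samples for v in sample[c]]
        means.append(fmean(values))
        # Population standard deviation: an assumption of this sketch.
        stds.append(pstdev(values))
    return means, stds


# Two samples with two channels each.
samples = [
    [[0.0, 2.0], [10.0, 10.0]],
    [[4.0, 6.0], [10.0, 10.0]],
]
means, stds = channel_statistics(samples)
```

The result is a pair of lists with one entry per channel, matching the "tuple of list of floats" return type described above.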
split_dataset(percentage=0.1, minimum_patches=1)
Split a new dataset away from the current one.
This method is used to extract random validation patches from the dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| percentage | float | Percentage of patches to extract, by default 0.1. | 0.1 |
| minimum_patches | int | Minimum number of patches to extract, by default 1. | 1 |
Returns:
| Type | Description |
|---|---|
| CAREamics InMemoryDataset | New dataset with the extracted patches. |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
| ValueError | If |
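The idea of splitting random validation patches away from a training set can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name `split_patches`, the `seed` argument, the rounding rule, and the use of a plain list of patches are all hypothetical, and input validation is omitted.

```python
import random


def split_patches(patches, percentage=0.1, minimum_patches=1, seed=0):
    """Split `patches` into (remaining, extracted) lists.

    Hypothetical helper: extracts at least `minimum_patches` patches,
    or `percentage` of the total if that is larger.
    """
    n_total = len(patches)
    # Rounding rule is an assumption of this sketch.
    n_val = max(round(percentage * n_total), minimum_patches)
    rng = random.Random(seed)
    # Draw distinct indices at random, without replacement.
    val_indices = set(rng.sample(range(n_total), n_val))
    extracted = [p for i, p in enumerate(patches) if i in val_indices]
    remaining = [p for i, p in enumerate(patches) if i not in val_indices]
    return remaining, extracted


patches = [f"patch_{i}" for i in range(20)]
train, val = split_patches(patches, percentage=0.1, minimum_patches=1)
```

With 20 patches and a percentage of 0.1, two patches end up in the extracted list and the other 18 stay in the original set; no patch appears in both.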
InMemoryPredDataset
Bases: Dataset
Simple prediction dataset returning images along the sample axis.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prediction_config | InferenceConfig | Prediction configuration. | required |
| inputs | NDArray | Input data. | required |
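"Returning images along the sample axis" means that indexing the dataset yields one sample at a time. A minimal sketch of that access pattern, using a hypothetical stand-in class and nested lists instead of the real dataset and arrays:

```python
class SampleAxisDataset:
    """Hypothetical stand-in: one item per entry of the first (sample) axis."""

    def __init__(self, inputs):
        # `inputs` is indexed as inputs[sample][row][column] in this sketch.
        self.inputs = inputs

    def __len__(self):
        # Number of samples, i.e. the length of the first axis.
        return len(self.inputs)

    def __getitem__(self, index):
        # Return the full image stored at position `index`.
        return self.inputs[index]


# Three 2x2 "images" stacked along the sample axis.
inputs = [[[0, 1], [2, 3]], [[4, 5], [6, 7]], [[8, 9], [10, 11]]]
dataset = SampleAxisDataset(inputs)
second_image = dataset[1]
```

Any preprocessing the actual class may apply to each sample is not modelled here.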
InMemoryTiledPredDataset
Bases: Dataset
Prediction dataset storing data in memory and returning tiles of each image.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prediction_config | InferenceConfig | Prediction configuration. | required |
| inputs | NDArray | Input data. | required |
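To illustrate what "returning tiles of each image" can look like, the sketch below cuts a 2D image into fixed-size tiles that may overlap. The helper names `tile_starts` and `extract_tiles`, the `tile_size` and `overlap` arguments, and the rule that the last tile is shifted to end at the image border are assumptions of this example, not a description of the CAREamics tiling code.

```python
def tile_starts(length, tile_size, overlap):
    """Start coordinates of tiles covering an axis of size `length`."""
    if overlap >= tile_size:
        raise ValueError("overlap must be smaller than tile_size")
    if tile_size >= length:
        return [0]
    step = tile_size - overlap
    starts = list(range(0, length - tile_size, step))
    # Shift the last tile so that it ends exactly at the border.
    starts.append(length - tile_size)
    return starts


def extract_tiles(image, tile_size, overlap):
    """Return a list of ((y, x), tile) pairs for a 2D nested-list image."""
    height, width = len(image), len(image[0])
    tiles = []
    for y in tile_starts(height, tile_size, overlap):
        for x in tile_starts(width, tile_size, overlap):
            tile = [row[x:x + tile_size] for row in image[y:y + tile_size]]
            tiles.append(((y, x), tile))
    return tiles


# A 6x6 image whose pixel value encodes its position.
image = [[10 * r + c for c in range(6)] for r in range(6)]
tiles = extract_tiles(image, tile_size=4, overlap=2)
```

Each tile carries its top-left coordinate so that predictions could later be placed back at the right position in the full image.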
IterablePredDataset
Bases: IterableDataset
Simple iterable prediction dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prediction_config | InferenceConfig | Inference configuration. | required |
| src_files | List[Path] | List of data files. | required |
| read_source_func | Callable | Read source function for custom types, by default read_tiff. | read_tiff |
| **kwargs | Any | Additional keyword arguments, unused. | {} |
Attributes:
| Name | Type | Description |
|---|---|---|
| data_path | Union[str, Path] | Path to the data, must be a directory. |
| axes | str | Description of axes in format STCZYX. |
| mean | Optional[float], optional | Expected mean of the dataset, by default None. |
| std | Optional[float], optional | Expected standard deviation of the dataset, by default None. |
| patch_transform | Optional[Callable], optional | Patch transform callable, by default None. |
IterableTiledPredDataset
Bases: IterableDataset
Tiled prediction dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prediction_config | InferenceConfig | Inference configuration. | required |
| src_files | list of pathlib.Path | List of data files. | required |
| read_source_func | Callable | Read source function for custom types, by default read_tiff. | read_tiff |
| **kwargs | Any | Additional keyword arguments, unused. | {} |
Attributes:
| Name | Type | Description |
|---|---|---|
| data_path | str or Path | Path to the data, must be a directory. |
| axes | str | Description of axes in format STCZYX. |
| mean | float, optional | Expected mean of the dataset, by default None. |
| std | float, optional | Expected standard deviation of the dataset, by default None. |
| patch_transform | Callable, optional | Patch transform callable, by default None. |
PathIterableDataset
Bases: IterableDataset
Dataset that extracts patches without loading the whole dataset into memory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_config | DataConfig | Data configuration. | required |
| src_files | list of pathlib.Path | List of data files. | required |
| target_files | list of pathlib.Path | Optional list of target files, by default None. | None |
| read_source_func | Callable | Read source function for custom types, by default read_tiff. | read_tiff |
Attributes:
| Name | Type | Description |
|---|---|---|
| data_path | list of pathlib.Path | Path to the data, must be a directory. |
| axes | str | Description of axes in format STCZYX. |
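The general pattern behind an iterable, file-based dataset is to read one file at a time as the consumer asks for it. The sketch below shows that pattern with a generator. The function name `iterate_sources`, the fake reader, and the assumption that a read source function takes a single path and returns the image data are all hypothetical; the real `read_source_func` signature may differ.

```python
from pathlib import Path


def iterate_sources(src_files, read_source_func):
    """Yield (path, data) pairs, reading each file only when requested."""
    for path in src_files:
        # The file is read here, not when the generator is created.
        yield path, read_source_func(path)


loaded = []  # Records which files have actually been read.


def fake_read(path):
    """Stand-in reader that returns a dummy 4x4 image (no disk access)."""
    loaded.append(path.name)
    return [[0.0] * 4 for _ in range(4)]


files = [Path("a.tiff"), Path("b.tiff"), Path("c.tiff")]
stream = iterate_sources(files, fake_read)

# Only the first file is read at this point.
first_path, first_image = next(stream)
```

Because the generator is lazy, memory use is bounded by the size of a single file rather than by the size of the whole dataset.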
get_data_statistics()
Return training data statistics.
Returns:
| Type | Description |
|---|---|
| tuple of list of floats | Means and standard deviations across channels of the training data. |
get_number_of_files()
Return the number of files in the dataset.
Returns:
| Type | Description |
|---|---|
| int | Number of files in the dataset. |
split_dataset(percentage=0.1, minimum_number=5)
Split up dataset in two.
Splits the dataset using a percentage of the data (files) to extract, or the minimum number if the percentage is less than the minimum number.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| percentage | float | Percentage of files to split up, by default 0.1. | 0.1 |
| minimum_number | int | Minimum number of files to split up, by default 5. | 5 |
Returns:
| Type | Description |
|---|---|
| IterableDataset | Dataset containing the split data. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the percentage is smaller than 0 or larger than 1. |
| ValueError | If the minimum number is smaller than 1 or larger than the number of files. |
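The file-level split rule and the two error conditions described above can be sketched in plain Python. The function name `split_files`, the `seed` argument, the use of `math.ceil`, and the random choice of files are assumptions of this example; only the percentage/minimum rule and the `ValueError` conditions come from the description.

```python
import math
import random
from pathlib import Path


def split_files(files, percentage=0.1, minimum_number=5, seed=0):
    """Split `files` into (remaining, extracted) lists of paths."""
    if percentage < 0 or percentage > 1:
        raise ValueError(f"Percentage must be between 0 and 1 (got {percentage}).")
    if minimum_number < 1 or minimum_number > len(files):
        raise ValueError(
            f"Minimum number must be between 1 and {len(files)} (got {minimum_number})."
        )
    # Use the percentage, unless it yields fewer files than the minimum.
    n_split = max(math.ceil(percentage * len(files)), minimum_number)
    rng = random.Random(seed)
    # Picking files at random is an assumption of this sketch.
    chosen = set(rng.sample(range(len(files)), n_split))
    extracted = [f for i, f in enumerate(files) if i in chosen]
    remaining = [f for i, f in enumerate(files) if i not in chosen]
    return remaining, extracted


files = [Path(f"img_{i}.tiff") for i in range(8)]
# 25% of 8 files is 2, which is below the minimum of 5, so 5 files are split off.
train_files, val_files = split_files(files, percentage=0.25, minimum_number=5)
```

Note that with the default `minimum_number=5`, a dataset of fewer than five files would trigger the second `ValueError` under this reading of the rule.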