Iterable Dataset
Iterable dataset used to load data file by file.
PathIterableDataset
Bases: IterableDataset
Dataset that extracts patches without loading the whole dataset into memory.
Parameters:
-
data_config(DataConfig) –Data configuration.
-
src_files(list of pathlib.Path) –List of data files.
-
target_files(list of pathlib.Path, default:None) –Optional list of target files, by default None.
-
read_source_func(Callable, default:read_tiff) –Read source function for custom types, by default read_tiff.
Attributes:
-
data_path(list of pathlib.Path) –Path to the data, must be a directory.
-
axes(str) –Description of axes in format STCZYX.
__init__(data_config, src_files, target_files=None, read_source_func=read_tiff)
Constructor.
Parameters:
-
data_config(GeneralDataConfig) –Data configuration.
-
src_files(list[Path]) –List of data files.
-
target_files(list[Path] or None, default:None) –Optional list of target files, by default None.
-
read_source_func(Callable, default:read_tiff) –Read source function for custom types, by default read_tiff.
__iter__()
Iterate over the data source and yield one patch at a time.
Yields:
-
ndarray–Single patch.
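The file-by-file behaviour can be pictured with a plain generator. This is a minimal sketch, not the library's implementation: `iter_patches`, `read_func`, `fake_read`, and the non-overlapping tiling are assumptions made for illustration, and images are nested lists rather than arrays so the example needs no third-party packages.

```python
from typing import Callable, Iterator, List, Sequence

Image = List[List[float]]  # hypothetical 2D image stored as rows of pixel values


def iter_patches(
    files: Sequence[str],
    read_func: Callable[[str], Image],
    patch_size: int,
) -> Iterator[Image]:
    """Yield square patches one at a time, reading one file at a time."""
    for path in files:
        image = read_func(path)  # only this file is held in memory
        height, width = len(image), len(image[0])
        # Tile the image with non-overlapping patches (an assumed strategy).
        for y in range(0, height - patch_size + 1, patch_size):
            for x in range(0, width - patch_size + 1, patch_size):
                yield [row[x : x + patch_size] for row in image[y : y + patch_size]]


def fake_read(path: str) -> Image:
    """Stand-in reader: builds a 4x4 image instead of reading from disk."""
    return [[float(r * 4 + c) for c in range(4)] for r in range(4)]


patches = list(iter_patches(["a.tiff", "b.tiff"], fake_read, patch_size=2))
```

Because the generator reads the next file only after the patches of the current one are exhausted, peak memory is bounded by the largest single file rather than by the whole dataset.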
get_data_statistics()
Return training data statistics.
Returns:
-
tuple of list of floats–Means and standard deviations across channels of the training data.
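One way to obtain per-channel means and standard deviations without holding every file in memory is to accumulate running sums. The sketch below illustrates that idea only; `channel_statistics`, the channel-first layout, and the use of the population standard deviation are assumptions, and the library may compute its statistics differently.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def channel_statistics(
    images: Iterable[Sequence[Sequence[float]]],
) -> Tuple[List[float], List[float]]:
    """Return (means, stds), one value per channel.

    Each image is a sequence of channels, and each channel is a flat
    sequence of pixel values (an assumed layout for this sketch).
    """
    counts: List[int] = []
    sums: List[float] = []
    sq_sums: List[float] = []
    for image in images:  # images can come from a generator, one file at a time
        for c, channel in enumerate(image):
            if c == len(counts):  # first time this channel index is seen
                counts.append(0)
                sums.append(0.0)
                sq_sums.append(0.0)
            counts[c] += len(channel)
            sums[c] += sum(channel)
            sq_sums[c] += sum(v * v for v in channel)

    means = [s / n for s, n in zip(sums, counts)]
    # Population variance: E[x^2] - mean^2, clamped at 0 against rounding error.
    stds = [
        math.sqrt(max(sq / n - m * m, 0.0))
        for sq, n, m in zip(sq_sums, counts, means)
    ]
    return means, stds


# Two tiny two-channel "images" with made-up values.
images = [
    [[1.0, 2.0], [10.0, 10.0]],
    [[3.0, 4.0], [10.0, 10.0]],
]
means, stds = channel_statistics(images)
```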
get_number_of_files()
Return the number of files in the dataset.
split_dataset(percentage=0.1, minimum_number=5)
Split the dataset in two.
Splits the dataset using a percentage of the data (files) to extract, or the minimum number of files if the percentage yields fewer files than the minimum number.
Parameters:
-
percentage(float, default:0.1) –Percentage of files to split off, expressed as a fraction between 0 and 1, by default 0.1.
-
minimum_number(int, default:5) –Minimum number of files to split off, by default 5.
Returns:
-
IterableDataset–Dataset containing the split data.
Raises:
-
ValueError–If the percentage is smaller than 0 or larger than 1.
-
ValueError–If the minimum number is smaller than 1 or larger than the number of files.
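The rules above (a fraction of the files, a lower bound on the count, and the two validation errors) can be sketched on a plain list of file names. This is an illustration under assumptions, not the library's code: `split_files`, the `seed` parameter, the rounding, and the random selection are hypothetical, and the sketch returns both lists because the text above does not state what happens to the original dataset.

```python
import random
from typing import List, Sequence, Tuple


def split_files(
    files: Sequence[str],
    percentage: float = 0.1,
    minimum_number: int = 5,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Return (remaining_files, split_files) following the documented rules."""
    if percentage < 0 or percentage > 1:
        raise ValueError(f"Percentage must be between 0 and 1, got {percentage}.")
    if minimum_number < 1 or minimum_number > len(files):
        raise ValueError(
            f"Minimum number must be between 1 and {len(files)}, got {minimum_number}."
        )

    # Use the percentage unless it yields fewer files than the minimum.
    n_split = max(round(len(files) * percentage), minimum_number)

    # Random selection without replacement (an assumed selection strategy).
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(files)), n_split))

    split = [f for i, f in enumerate(files) if i in chosen]
    remaining = [f for i, f in enumerate(files) if i not in chosen]
    return remaining, split


files = [f"img_{i}.tiff" for i in range(20)]
remaining, split = split_files(files)  # 10% of 20 is 2, so the minimum of 5 applies
```

With the defaults, the minimum number takes over for any dataset of fewer than roughly 50 files, which is worth keeping in mind when a small dataset is split.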