Iterable Dataset

Iterable dataset used to load data file by file.

PathIterableDataset

Bases: IterableDataset

Dataset that extracts patches without loading the whole data into memory.

Parameters:

  • data_config (DataConfig) –

    Data configuration.

  • src_files (list of pathlib.Path) –

    List of data files.

  • target_files (list of pathlib.Path, default: None ) –

    Optional list of target files, by default None.

  • read_source_func (Callable, default: read_tiff ) –

    Read source function for custom types, by default read_tiff.

Attributes:

  • data_path (list of pathlib.Path) –

    Path to the data, must be a directory.

  • axes (str) –

    Description of axes in format STCZYX.

__init__(data_config, src_files, target_files=None, read_source_func=read_tiff)

Constructor.

Parameters:

  • data_config (GeneralDataConfig) –

    Data configuration.

  • src_files (list[Path]) –

    List of data files.

  • target_files (list[Path] or None, default: None ) –

    Optional list of target files, by default None.

  • read_source_func (Callable, default: read_tiff ) –

    Read source function for custom types, by default read_tiff.

__iter__()

Iterate over data source and yield single patch.

Yields:

  • ndarray

    Single patch.
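The file-by-file iteration described above can be illustrated with a minimal sketch. It is not the library's implementation: `iter_patches`, the 1-D sequential tiling, and the `patch_size` argument are assumptions made for illustration, and any callable that maps a path to a sequence can stand in for the read function.

```python
from collections.abc import Callable, Iterator, Sequence
from pathlib import Path


def iter_patches(
    files: Sequence[Path],
    read_func: Callable[[Path], Sequence],
    patch_size: int,
) -> Iterator[Sequence]:
    """Yield patches one at a time, holding a single file in memory.

    Hypothetical helper: real patch extraction is multi-dimensional;
    a 1-D non-overlapping tiling keeps the idea visible.
    """
    for path in files:
        # Only this file's data is loaded; it is released on the next loop turn.
        data = read_func(path)
        # Walk the data in steps of patch_size, dropping any incomplete tail.
        for start in range(0, len(data) - patch_size + 1, patch_size):
            yield data[start : start + patch_size]
```

Because the function is a generator, a consumer that stops early never triggers the reading of the remaining files.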

get_data_statistics()

Return training data statistics.

Returns:

  • tuple of list of floats

    Means and standard deviations across channels of the training data.
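As a rough illustration of how per-channel statistics can be gathered without holding every file in memory, the sketch below accumulates running sums over a stream of images. The function name `channel_statistics` and the input layout (each image as a list of channels, each channel a flat list of floats) are assumptions, not the library's internals.

```python
import math
from collections.abc import Iterable, Sequence


def channel_statistics(
    images: Iterable[Sequence[Sequence[float]]],
) -> tuple[list[float], list[float]]:
    """Return (means, stds) per channel from a stream of images.

    Hypothetical helper: uses running sums and sums of squares, so each
    image can be discarded as soon as it has been visited.
    """
    counts: list[int] = []
    sums: list[float] = []
    sq_sums: list[float] = []
    for image in images:
        if not counts:
            # Size the accumulators from the first image's channel count.
            counts = [0] * len(image)
            sums = [0.0] * len(image)
            sq_sums = [0.0] * len(image)
        for c, channel in enumerate(image):
            counts[c] += len(channel)
            sums[c] += sum(channel)
            sq_sums[c] += sum(v * v for v in channel)
    means = [s / n for s, n in zip(sums, counts)]
    # Population variance E[x^2] - E[x]^2, clamped at 0 against rounding error.
    stds = [
        math.sqrt(max(q / n - m * m, 0.0))
        for q, n, m in zip(sq_sums, counts, means)
    ]
    return means, stds
```

The sum-of-squares form is simple but can lose precision on large, high-valued data; a Welford-style update would be the more numerically stable choice.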

get_number_of_files()

Return the number of files in the dataset.

Returns:

  • int

    Number of files in the dataset.

split_dataset(percentage=0.1, minimum_number=5)

Split the dataset in two.

Splits the dataset using a percentage of the data (files) to extract, or the minimum number if the percentage yields fewer files than the minimum number.

Parameters:

  • percentage (float, default: 0.1 ) –

    Percentage of files to split off, expressed as a fraction between 0 and 1, by default 0.1.

  • minimum_number (int, default: 5 ) –

    Minimum number of files to split up, by default 5.

Returns:

  • IterableDataset

    Dataset containing the split data.

Raises:

  • ValueError

    If the percentage is smaller than 0 or larger than 1.

  • ValueError

    If the minimum number is smaller than 1 or larger than the number of files.
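The split rule and its validation can be sketched on a plain list of paths. This is an approximation under stated assumptions, not the method's actual code: the helper name `split_files`, the rounding with `math.ceil`, the random selection of files, and the `seed` argument are all hypothetical.

```python
import math
import random
from pathlib import Path


def split_files(
    files: list[Path],
    percentage: float = 0.1,
    minimum_number: int = 5,
    seed: int = 0,
) -> tuple[list[Path], list[Path]]:
    """Return (kept, split) file lists following the documented rule."""
    n_files = len(files)
    if percentage < 0 or percentage > 1:
        raise ValueError(f"Percentage must be between 0 and 1 (got {percentage}).")
    if minimum_number < 1 or minimum_number > n_files:
        raise ValueError(
            f"Minimum number must be between 1 and {n_files} (got {minimum_number})."
        )
    # Take the percentage of files, or the minimum number if that is larger.
    # Rounding up is an assumption; the original may round differently.
    n_split = max(math.ceil(percentage * n_files), minimum_number)
    # Assumption: files are picked at random, seeded for reproducibility.
    picked = set(random.Random(seed).sample(range(n_files), n_split))
    kept = [f for i, f in enumerate(files) if i not in picked]
    split = [f for i, f in enumerate(files) if i in picked]
    return kept, split
```

With 40 files and `percentage=0.25`, ten files are split off; with 20 files and the defaults, the 10% share (two files) is below the minimum, so five files are taken instead.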