Grouped Index Sampler

Module for the GroupedIndexSampler.

GroupedIndexSampler

Bases: Sampler

A PyTorch Sampler that iterates through groups of indices.

The order of the groups and the order of indices within each group are shuffled.

This sampler is useful for iterative file loading: only one file should be loaded at a time, so indices belonging to the same file are kept together in one group, while the order of the files and the order of the indices within each file are still shuffled.

Parameters:

  • grouped_indices (Sequence of (Sequence of int)) –

    The indices to iterate through, grouped (e.g. by file).

  • rng (Generator or None) –

    Random number generator for shuffling. If None, a default generator is used.
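As an illustration of the shape expected for grouped_indices, the sketch below builds one group per file from a list of per-file patch counts. The helper name group_indices_by_file and the assumption that global indices are assigned contiguously, file after file, are hypothetical and not taken from the library.

```python
from itertools import accumulate


def group_indices_by_file(patches_per_file):
    """Return one list of global indices per file.

    Assumption (not from the library): indices are numbered contiguously,
    so file 0 owns the first block of indices, file 1 the next, and so on.
    """
    ends = list(accumulate(patches_per_file))  # exclusive end index of each file
    starts = [0] + ends[:-1]                   # inclusive start index of each file
    return [list(range(start, end)) for start, end in zip(starts, ends)]


# Three hypothetical files holding 3, 2 and 4 patches respectively.
grouped = group_indices_by_file([3, 2, 4])
# grouped == [[0, 1, 2], [3, 4], [5, 6, 7, 8]]
```

A nested sequence of this form is what the grouped_indices parameter describes: the outer sequence has one entry per group, and each inner sequence lists the indices in that group.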

__init__(grouped_indices, rng)

Initialize the sampler from grouped index sequences.

Parameters:

  • grouped_indices (Sequence of (Sequence of int)) –

    The indices to iterate through, grouped (e.g. by file).

  • rng (Generator or None) –

    Random number generator for shuffling. If None, a default generator is used.

__iter__()

Iterate over indices with groups and within-group order shuffled.

Returns:

  • Iterator[int]

    Indices from all groups in shuffled group order and shuffled order within each group.
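The iteration order described above can be sketched as a plain generator. This is a minimal stand-in rather than the library's implementation: the function name iter_grouped_shuffled is hypothetical, and Python's random.Random is used in place of the Generator type, which the actual class may define differently.

```python
import random
from collections.abc import Iterator, Sequence
from typing import Optional


def iter_grouped_shuffled(
    grouped_indices: Sequence[Sequence[int]],
    rng: Optional[random.Random] = None,
) -> Iterator[int]:
    """Yield every index once, group by group, with both levels shuffled."""
    # Fall back to a default generator when none is given.
    rng = rng if rng is not None else random.Random()

    # Shuffle the order in which groups (e.g. files) are visited.
    group_order = list(range(len(grouped_indices)))
    rng.shuffle(group_order)

    for g in group_order:
        # Copy the group so the caller's data is not modified, then shuffle it.
        group = list(grouped_indices[g])
        rng.shuffle(group)
        # All indices of one group are emitted before moving to the next group.
        yield from group


groups = [[0, 1, 2], [3, 4], [5, 6, 7, 8]]
order = list(iter_grouped_shuffled(groups, random.Random(0)))
```

In this sketch, indices from the same group are always adjacent in the output, so a loader that caches one file at a time never has to switch back to a file it has already finished. Passing a seeded generator makes the order reproducible.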

from_dataset(dataset, rng=None) classmethod

Create the sampler from a CareamicsDataset.

The grouped indices will be retrieved from the dataset's patching strategy.

Parameters:

  • dataset (CareamicsDataset) –

    An instance of the CareamicsDataset to create the sampler for.

  • rng (Generator or None, default: None ) –

    Random number generator used by the sampler for shuffling. If None, a default generator is used.

Returns:

  • GroupedIndexSampler

    A sampler yielding indices grouped by the dataset's patching strategy.