Config

CAREamics Pydantic configurations.

CAREAlgorithm

Bases: UNetBasedAlgorithm

CARE algorithm configuration.

Attributes:

Name Type Description
algorithm care

CARE Algorithm name.

loss {mae, mse}

CARE-compatible loss function.

algorithm = 'care' class-attribute instance-attribute

CARE Algorithm name.

loss = 'mae' class-attribute instance-attribute

CARE-compatible loss function.

lr_scheduler = LrSchedulerConfig() class-attribute instance-attribute

Learning rate scheduler to use, defined in SupportedLrScheduler.

model instance-attribute

UNet without a final activation function and without the n2v2 modifications.

optimizer = OptimizerConfig() class-attribute instance-attribute

Optimizer to use, defined in SupportedOptimizer.

get_algorithm_citations()

Return a list of citation entries of the current algorithm.

This is used to generate the model description for the BioImage Model Zoo.

Returns:

Type Description
List[CiteEntry]

List of citation entries.

get_algorithm_description()

Get the algorithm description.

Returns:

Type Description
str

Algorithm description.

get_algorithm_friendly_name()

Get the algorithm friendly name.

Returns:

Type Description
str

Friendly name of the algorithm.

get_algorithm_keywords()

Get algorithm keywords.

Returns:

Type Description
list[str]

List of keywords.

get_algorithm_references()

Get the algorithm references.

This is used to generate the README of the BioImage Model Zoo export.

Returns:

Type Description
str

Algorithm references.

get_compatible_algorithms() classmethod

Get the list of compatible algorithms.

Returns:

Type Description
list of str

List of compatible algorithms.

is_supervised() classmethod

Return whether the algorithm is supervised.

Returns:

Type Description
bool

Whether the algorithm is supervised.

CheckpointConfig

Bases: BaseModel

Checkpoint saving callback Pydantic model.

The parameters correspond to those of pytorch_lightning.callbacks.ModelCheckpoint.

See: https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.ModelCheckpoint.html#modelcheckpoint

auto_insert_metric_name = Field(default=False) class-attribute instance-attribute

When True, the checkpoint filenames will contain the metric name. Note that val_loss is already embedded in the default filename pattern, so enabling this field will produce redundant metric names in the filename.

every_n_epochs = Field(default=None, ge=1, le=100) class-attribute instance-attribute

Number of epochs between checkpoints.

every_n_train_steps = Field(default=None, ge=1, le=1000) class-attribute instance-attribute

Number of training steps between checkpoints.

mode = Field(default='min') class-attribute instance-attribute

One of {min, max}. If save_top_k != 0, the decision to overwrite the current save file is made based on either the maximization or the minimization of the monitored quantity. For 'val_acc', this should be 'max', for 'val_loss' this should be 'min', etc.

monitor = Field(default='val_loss') class-attribute instance-attribute

Quantity to monitor, currently only val_loss.

save_last = Field(default=True) class-attribute instance-attribute

When True, saves a {experiment_name}_last.ckpt copy whenever a checkpoint file gets saved.

save_top_k = Field(default=3, ge=(-1), le=100) class-attribute instance-attribute

If save_top_k == k, the best k models according to the monitored quantity will be saved. If save_top_k == 0, no models are saved. If save_top_k == -1, all models are saved.

save_weights_only = Field(default=False) class-attribute instance-attribute

When True, only the model's weights will be saved (model.save_weights).

train_time_interval = Field(default=None) class-attribute instance-attribute

Checkpoints are monitored at the specified time interval.

verbose = Field(default=False) class-attribute instance-attribute

Verbosity mode.
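Since these fields mirror pytorch_lightning.callbacks.ModelCheckpoint, the defaults above can be summarized as the keyword arguments they translate to. A minimal sketch, as a plain dictionary rather than the actual Pydantic model:

```python
# Defaults of CheckpointConfig as documented above, expressed as the
# keyword arguments they would be forwarded to
# pytorch_lightning.callbacks.ModelCheckpoint with.
checkpoint_defaults = {
    "monitor": "val_loss",             # only val_loss is currently supported
    "mode": "min",                     # val_loss should be minimized
    "save_top_k": 3,                   # keep the 3 best checkpoints
    "save_last": True,                 # also keep a *_last.ckpt copy
    "save_weights_only": False,        # save the full training state
    "auto_insert_metric_name": False,  # avoid redundant metric names
    "verbose": False,
}

# The documented bound on save_top_k: -1 (keep all) up to 100.
assert -1 <= checkpoint_defaults["save_top_k"] <= 100
```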

Configuration

Bases: BaseModel

CAREamics configuration.

The configuration defines all parameters used to build and train a CAREamics model. These parameters are validated to ensure that they are compatible with each other.

It contains three sub-configurations:

  • AlgorithmModel: configuration for the algorithm training, which includes the architecture, loss function, optimizer, and other hyperparameters.
  • DataModel: configuration for the dataloader, which includes the type of data, transformations, mean/std and other parameters.
  • TrainingModel: configuration for the training, which includes the number of epochs or the callbacks.

Attributes:

Name Type Description
experiment_name str

Name of the experiment, used when saving logs and checkpoints.

algorithm AlgorithmModel

Algorithm configuration.

data DataModel

Data configuration.

training TrainingModel

Training configuration.

Methods:

Name Description
set_3D

Switch configuration between 2D and 3D.

model_dump

Export configuration to a dictionary, with signature (exclude_defaults: bool = False, exclude_none: bool = True, **kwargs) -> dict.

Raises:

Type Description
ValueError

Configuration parameter type validation errors.

ValueError

If the experiment name contains invalid characters or is empty.

ValueError

If the algorithm is 3D but there is no "Z" in the data axes, or if the algorithm is 2D but "Z" is present in the data axes.

ValueError

Algorithm, data or training validation errors.

Notes

We provide convenience methods to create standard configurations, for instance:

>>> from careamics.config import create_n2v_configuration
>>> config = create_n2v_configuration(
...     experiment_name="n2v_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
... )

The configuration can be exported to a dictionary using the model_dump method:

config_dict = config.model_dump()

Configurations can also be exported or imported from yaml files:

from careamics.config.utils.configuration_io import save_configuration
from careamics.config.utils.configuration_io import load_configuration

path_to_config = save_configuration(config, my_path / "config.yml")
other_config = load_configuration(path_to_config)

Examples:

Minimum example:

>>> from careamics import Configuration
>>> config_dict = {
...         "experiment_name": "N2V_experiment",
...         "algorithm_config": {
...             "algorithm": "n2v",
...             "loss": "n2v",
...             "model": {
...                 "architecture": "UNet",
...             },
...         },
...         "training_config": {},
...         "data_config": {
...             "data_type": "tiff",
...             "patch_size": [64, 64],
...             "axes": "SYX",
...         },
...     }
>>> config = Configuration(**config_dict)

algorithm_config = Field(discriminator='algorithm') class-attribute instance-attribute

Algorithm configuration, holding all parameters required to configure the model.

data_config instance-attribute

Data configuration, holding all parameters required to configure the training data loader.

experiment_name instance-attribute

Name of the experiment, used to name logs and checkpoints.

training_config instance-attribute

Training configuration, holding all parameters required to configure the training process.

version = '0.1.0' class-attribute instance-attribute

CAREamics configuration version.

get_algorithm_citations()

Return a list of citation entries of the current algorithm.

This is used to generate the model description for the BioImage Model Zoo.

Returns:

Type Description
List[CiteEntry]

List of citation entries.

get_algorithm_description()

Return a description of the algorithm.

This method is used to generate the README of the BioImage Model Zoo export.

Returns:

Type Description
str

Description of the algorithm.

get_algorithm_friendly_name()

Get the algorithm name.

Returns:

Type Description
str

Algorithm name.

get_algorithm_keywords()

Get algorithm keywords.

Returns:

Type Description
list[str]

List of keywords.

get_algorithm_references()

Get the algorithm references.

This is used to generate the README of the BioImage Model Zoo export.

Returns:

Type Description
str

Algorithm references.

get_safe_experiment_name()

Return the experiment name safe for use in paths and filenames.

Spaces are replaced with underscores to avoid issues with folder creation and checkpoint naming.

Returns:

Type Description
str

Experiment name with spaces replaced with underscores.
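The behavior described above amounts to a simple substitution; a sketch of it (the standalone helper name is illustrative, the real method lives on Configuration):

```python
def safe_experiment_name(name: str) -> str:
    """Replace spaces with underscores, as get_safe_experiment_name
    is documented to do, so the name is safe for paths and filenames."""
    return name.replace(" ", "_")

print(safe_experiment_name("my denoising run"))  # my_denoising_run
```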

model_dump(**kwargs)

Override model_dump method in order to set default values.

As opposed to the parent model_dump method, this method sets exclude_none to True by default.

Parameters:

Name Type Description Default
**kwargs Any

Additional arguments to pass to the parent model_dump method.

{}

Returns:

Type Description
dict

Dictionary containing the model parameters.

no_symbol(name) classmethod

Validate experiment name.

A valid experiment name is a non-empty string containing only letters, numbers, underscores, dashes and spaces.

Parameters:

Name Type Description Default
name str

Name to validate.

required

Returns:

Type Description
str

Validated name.

Raises:

Type Description
ValueError

If the name is empty or contains invalid characters.
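The rule above can be sketched with a regular expression; the exact pattern used by no_symbol is not shown here, so this is an illustrative re-implementation:

```python
import re

def validate_experiment_name(name: str) -> str:
    """Sketch of the no_symbol check: non-empty, and only letters,
    numbers, underscores, dashes and spaces (pattern is illustrative)."""
    if not name:
        raise ValueError("Experiment name is empty.")
    if not re.fullmatch(r"[A-Za-z0-9_\- ]+", name):
        raise ValueError(f"Invalid characters in experiment name: {name!r}")
    return name

# Valid names pass through unchanged; invalid ones raise ValueError.
assert validate_experiment_name("N2V_experiment-1") == "N2V_experiment-1"
```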

set_3D(is_3D, axes, patch_size)

Set 3D flag and axes.

Parameters:

Name Type Description Default
is_3D bool

Whether the algorithm is 3D or not.

required
axes str

Axes of the data.

required
patch_size list[int]

Patch size.

required

validate_3D()

Change algorithm dimensions to match data.axes.

Returns:

Type Description
Self

Validated configuration.

validate_n2v_mask_pixel_perc()

Validate that there will always be at least one blind-spot pixel in every patch.

The probability of creating a blind-spot pixel is a function of the chosen masked pixel percentage and patch size.

Returns:

Type Description
Self

Validated configuration.

Raises:

Type Description
ValueError

If the expected number of masked pixels within a patch is less than 1 for the chosen masked pixel percentage and patch size.
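The idea behind this validator can be illustrated by computing the expected number of blind-spot pixels per patch; the exact formula used by validate_n2v_mask_pixel_perc is an assumption here:

```python
def expected_masked_pixels(patch_size, masked_pixel_percentage):
    """Expected number of blind-spot pixels in one patch.

    Illustrative only: the real validator's exact formula may differ,
    but the idea is that percentage-of-patch-area must yield >= 1 pixel.
    """
    n_pixels = 1
    for dim in patch_size:
        n_pixels *= dim
    return n_pixels * masked_pixel_percentage / 100.0

# A 64x64 patch with 0.2% masking yields ~8 blind-spot pixels on average.
assert expected_masked_pixels([64, 64], 0.2) >= 1
# An 8x8 patch with 0.2% masking would fail the documented check.
assert expected_masked_pixels([8, 8], 0.2) < 1
```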

DataConfig

Bases: BaseModel

Data configuration.

If std is specified, mean must be specified as well. Note that setting the std first and then the mean (if they were both None before) will raise a validation error; prefer set_means_and_stds to set both at once. Means and stds are expected to be lists of floats, one for each channel. For supervised tasks, the mean and std of the target can be different from the input data.

All supported transforms are defined in the SupportedTransform enum.

Examples:

Minimum example:

>>> data = DataConfig(
...     data_type="array", # defined in SupportedData
...     patch_size=[128, 128],
...     batch_size=4,
...     axes="YX"
... )

To change the image_means and image_stds of the data:

>>> data.set_means_and_stds(image_means=[214.3], image_stds=[84.5])

One can pass also a list of transformations, by keyword, using the SupportedTransform value:

>>> from careamics.config.support import SupportedTransform
>>> data = DataConfig(
...     data_type="tiff",
...     patch_size=[128, 128],
...     batch_size=4,
...     axes="YX",
...     transforms=[
...         {
...             "name": "XYFlip",
...         }
...     ]
... )

axes instance-attribute

Axes of the data, as defined in SupportedAxes.

batch_size = Field(default=1, ge=1, validate_default=True) class-attribute instance-attribute

Batch size for training.

data_type instance-attribute

Type of input data, numpy.ndarray (array) or paths (tiff, czi, and custom), as defined in SupportedData.

image_means = Field(default=None, min_length=0, max_length=32) class-attribute instance-attribute

Means of the data across channels, used for normalization.

image_stds = Field(default=None, min_length=0, max_length=32) class-attribute instance-attribute

Standard deviations of the data across channels, used for normalization.

patch_size = Field(..., min_length=2, max_length=3) class-attribute instance-attribute

Patch size, as used during training.

target_means = Field(default=None, min_length=0, max_length=32) class-attribute instance-attribute

Means of the target data across channels, used for normalization.

target_stds = Field(default=None, min_length=0, max_length=32) class-attribute instance-attribute

Standard deviations of the target data across channels, used for normalization.

train_dataloader_params = Field(default={'shuffle': True}, validate_default=True) class-attribute instance-attribute

Dictionary of PyTorch training dataloader parameters. The dataloader parameters should include the shuffle key, which is set to True by default. We strongly recommend keeping it set to True to ensure the best training results.

transforms = Field(default=[XYFlipConfig(), XYRandomRotate90Config()], validate_default=True) class-attribute instance-attribute

List of transformations to apply to the data, available transforms are defined in SupportedTransform.

val_dataloader_params = Field(default={}, validate_default=True) class-attribute instance-attribute

Dictionary of PyTorch validation dataloader parameters.

all_elements_power_of_2_minimum_8(patch_list) classmethod

Validate patch size.

Patch size must be powers of 2 and minimum 8.

Parameters:

Name Type Description Default
patch_list list of int

Patch size.

required

Returns:

Type Description
list of int

Validated patch size.

Raises:

Type Description
ValueError

If the patch size is smaller than 8.

ValueError

If the patch size is not a power of 2.
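The documented constraints (power of 2, minimum 8) can be sketched as follows; this is an illustrative re-implementation, not the validator itself:

```python
def validate_patch_size(patch_list):
    """Sketch of the documented check: each patch dimension must be a
    power of two and at least 8."""
    for size in patch_list:
        if size < 8:
            raise ValueError(f"Patch size {size} is smaller than 8.")
        # A power of two has a single set bit, so size & (size - 1) == 0.
        if size & (size - 1) != 0:
            raise ValueError(f"Patch size {size} is not a power of 2.")
    return patch_list

assert validate_patch_size([64, 64]) == [64, 64]
```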

axes_valid(axes) classmethod

Validate axes.

Axes must:

  • be a combination of 'STCZYX'
  • not contain duplicates
  • contain at least 2 contiguous axes: X and Y
  • contain at most 4 axes
  • not contain both S and T axes

Parameters:

Name Type Description Default
axes str

Axes to validate.

required

Returns:

Type Description
str

Validated axes.

Raises:

Type Description
ValueError

If axes are not valid.
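The rules above can be sketched in plain Python; the contiguity check assumes the usual 'YX' ordering, and the whole function is an illustrative re-implementation rather than the actual validator:

```python
VALID_AXES = set("STCZYX")

def validate_axes(axes: str) -> str:
    """Sketch of the documented axes rules (illustrative)."""
    if len(axes) > 4:
        raise ValueError("Axes must contain at most 4 axes.")
    if len(set(axes)) != len(axes):
        raise ValueError("Axes must not contain duplicates.")
    if not set(axes) <= VALID_AXES:
        raise ValueError("Axes must be a combination of 'STCZYX'.")
    if "YX" not in axes:  # assumes the contiguous spatial axes are 'YX'
        raise ValueError("Axes must contain contiguous X and Y axes.")
    if "S" in axes and "T" in axes:
        raise ValueError("Axes must not contain both S and T.")
    return axes

assert validate_axes("SYX") == "SYX"
```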

set_3D(axes, patch_size)

Set 3D parameters.

Parameters:

Name Type Description Default
axes str

Axes.

required
patch_size list of int

Patch size.

required

set_default_pin_memory(dataloader_params) classmethod

Set default pin_memory for dataloader parameters if not provided.

  • If 'pin_memory' is not set, it defaults to True if CUDA is available.

Parameters:

Name Type Description Default
dataloader_params dict of {str: Any}

The dataloader parameters.

required

Returns:

Type Description
dict of {str: Any}

The dataloader parameters with pin_memory default applied.
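A sketch of this default: the real validator queries torch.cuda.is_available(), which is replaced here by an explicit flag so the example stays framework-free:

```python
def set_default_pin_memory(dataloader_params: dict, cuda_available: bool) -> dict:
    """Sketch: default pin_memory to True when CUDA is available,
    leaving any user-provided value untouched."""
    dataloader_params.setdefault("pin_memory", cuda_available)
    return dataloader_params

# Unset -> defaults to CUDA availability; explicit values are preserved.
assert set_default_pin_memory({}, cuda_available=True) == {"pin_memory": True}
assert set_default_pin_memory({"pin_memory": False}, True) == {"pin_memory": False}
```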

set_default_train_workers(dataloader_params) classmethod

Set default num_workers for training dataloader if not provided.

  • If 'num_workers' is not set, it defaults to the number of available CPU cores.

Parameters:

Name Type Description Default
dataloader_params dict of {str: Any}

The training dataloader parameters.

required

Returns:

Type Description
dict of {str: Any}

The dataloader parameters with num_workers default applied.

set_means_and_stds(image_means, image_stds, target_means=None, target_stds=None)

Set mean and standard deviation of the data across channels.

This method should be used instead of setting the fields directly, as doing so would otherwise trigger a validation error.

Parameters:

Name Type Description Default
image_means (ndarray, tuple or list)

Mean values for normalization.

required
image_stds (ndarray, tuple or list)

Standard deviation values for normalization.

required
target_means (ndarray, tuple or list)

Target mean values for normalization, by default None.

None
target_stds (ndarray, tuple or list)

Target standard deviation values for normalization, by default None.

None

set_val_workers_to_match_train()

Set validation dataloader num_workers to match training dataloader.

If num_workers is not specified in val_dataloader_params, it will be set to the same value as train_dataloader_params["num_workers"].

Returns:

Type Description
Self

Validated data model with synchronized num_workers.
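The synchronization described above can be sketched as a plain-dictionary operation (the standalone function name is illustrative):

```python
def sync_val_workers(train_params: dict, val_params: dict) -> dict:
    """Sketch: copy num_workers from the training dataloader params to
    the validation params when the latter does not specify it."""
    if "num_workers" in train_params:
        val_params.setdefault("num_workers", train_params["num_workers"])
    return val_params

# The validation dataloader inherits num_workers unless it sets its own.
assert sync_val_workers({"num_workers": 4}, {}) == {"num_workers": 4}
assert sync_val_workers({"num_workers": 4}, {"num_workers": 0}) == {"num_workers": 0}
```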

shuffle_train_dataloader(train_dataloader_params) classmethod

Validate that "shuffle" is included in the training dataloader params.

A warning will be raised if shuffle=False.

Parameters:

Name Type Description Default
train_dataloader_params dict of {str: Any}

The training dataloader parameters.

required

Returns:

Type Description
dict of {str: Any}

The validated training dataloader parameters.

Raises:

Type Description
ValueError

If "shuffle" is not included in the training dataloader params.

std_only_with_mean()

Check that mean and std are either both None, or both specified.

Returns:

Type Description
Self

Validated data model.

Raises:

Type Description
ValueError

If std is not None and mean is None.

validate_dimensions()

Validate 2D/3D dimensions between axes, patch size and transforms.

Returns:

Type Description
Self

Validated data model.

Raises:

Type Description
ValueError

If the transforms are not valid.

GaussianMixtureNMConfig

Bases: BaseModel

Gaussian mixture noise model.

max_signal = Field(default=1.0, ge=0.0) class-attribute instance-attribute

Maximum signal intensity expected in the image.

min_sigma = Field(default=125.0, ge=0.0) class-attribute instance-attribute

Minimum value of standard deviation allowed in the GMM. All values of standard deviation below this are clamped to this value.

min_signal = Field(default=0.0, ge=0.0) class-attribute instance-attribute

Minimum signal intensity expected in the image.

n_coeff = Field(default=2, ge=2) class-attribute instance-attribute

Number of coefficients to describe the functional relationship between gaussian parameters and the signal. 2 implies a linear relationship, 3 implies a quadratic relationship and so on.

n_gaussian = Field(default=1, ge=1) class-attribute instance-attribute

Number of gaussians used for the GMM.

observation = Field(default=None, exclude=True) class-attribute instance-attribute

Path to the file containing observation or respective numpy array.

path = None class-attribute instance-attribute

Path to the directory where the trained noise model (*.npz) is saved in the train method.

signal = Field(default=None, exclude=True) class-attribute instance-attribute

Path to the file containing signal or respective numpy array.

tol = Field(default=1e-10) class-attribute instance-attribute

Tolerance used in the computation of the noise model likelihood.

weight = None class-attribute instance-attribute

A [3*n_gaussian, n_coeff] sized array containing the values of the weights describing the GMM noise model, with each row corresponding to one parameter of each gaussian, namely mean, standard deviation and weight. Specifically, rows are organized as follows:

  • the first n_gaussian rows correspond to the means,
  • the next n_gaussian rows correspond to the weights,
  • the last n_gaussian rows correspond to the standard deviations.

If weight=None, the weight array is initialized using the min_signal and max_signal parameters.

validate_path()

Validate that the path points to a valid .npz file if provided.

Returns:

Type Description
Self

Returns itself.

Raises:

Type Description
ValueError

If the path is provided but does not point to a valid .npz file.

HDNAlgorithm

Bases: VAEBasedAlgorithm

HDN algorithm configuration.

optimizer = OptimizerConfig() class-attribute instance-attribute

Optimizer to use, defined in SupportedOptimizer.

algorithm_cross_validation()

Validate the algorithm model based on algorithm.

Returns:

Type Description
Self

The validated model.

get_algorithm_citations()

Return a list of citation entries of the current algorithm.

This is used to generate the model description for the BioImage Model Zoo.

Returns:

Type Description
List[CiteEntry]

List of citation entries.

get_algorithm_description()

Get the algorithm description.

Returns:

Type Description
str

Algorithm description.

get_algorithm_friendly_name()

Get the algorithm friendly name.

Returns:

Type Description
str

Friendly name of the algorithm.

get_algorithm_keywords()

Get algorithm keywords.

Returns:

Type Description
list[str]

List of keywords.

get_algorithm_references()

Get the algorithm references.

This is used to generate the README of the BioImage Model Zoo export.

Returns:

Type Description
str

Algorithm references.

get_compatible_algorithms() classmethod

Get the list of compatible algorithms.

Returns:

Type Description
list of str

List of compatible algorithms.

output_channels_validation()

Validate the consistency between number of out channels and noise models.

Returns:

Type Description
Self

The validated model.

predict_logvar_validation()

Validate the consistency of predict_logvar throughout the model.

Returns:

Type Description
Self

The validated model.

InferenceConfig

Bases: BaseModel

Configuration class for the prediction model.

axes instance-attribute

Data axes (TSCZYX) in the order of the input data.

batch_size = Field(default=1, ge=1) class-attribute instance-attribute

Batch size for prediction.

data_type instance-attribute

Type of input data: numpy.ndarray (array) or path (tiff, czi, or custom).

image_means = Field(..., min_length=0, max_length=32) class-attribute instance-attribute

Mean values for each input channel.

image_stds = Field(..., min_length=0, max_length=32) class-attribute instance-attribute

Standard deviation values for each input channel.

tile_overlap = Field(default=None, min_length=2, max_length=3) class-attribute instance-attribute

Overlap between tiles, only effective if tile_size is specified.

tile_size = Field(default=None, min_length=2, max_length=3) class-attribute instance-attribute

Tile size of prediction, only effective if tile_overlap is specified.

tta_transforms = Field(default=True) class-attribute instance-attribute

Whether to apply test-time augmentation (all 90 degrees rotations and flips).

all_elements_non_zero_even(tile_overlap) classmethod

Validate tile overlap.

Overlaps must be non-zero, positive and even.

Parameters:

Name Type Description Default
tile_overlap list[int] or None

Tile overlap.

required

Returns:

Type Description
list[int] or None

Validated tile overlap.

Raises:

Type Description
ValueError

If a tile overlap value is 0.

ValueError

If a tile overlap value is not even.
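The documented rule (non-zero, positive and even) can be sketched as follows; an illustrative re-implementation, not the validator itself:

```python
def validate_tile_overlap(tile_overlap):
    """Sketch of the documented check: each overlap must be non-zero,
    positive and even. None (no tiling) passes through."""
    if tile_overlap is None:
        return None
    for overlap in tile_overlap:
        if overlap <= 0:
            raise ValueError(f"Tile overlap {overlap} must be positive.")
        if overlap % 2 != 0:
            raise ValueError(f"Tile overlap {overlap} must be even.")
    return tile_overlap

assert validate_tile_overlap([32, 32]) == [32, 32]
assert validate_tile_overlap(None) is None
```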

axes_valid(axes) classmethod

Validate axes.

Axes must:

  • be a combination of 'STCZYX'
  • not contain duplicates
  • contain at least 2 contiguous axes: X and Y
  • contain at most 4 axes
  • not contain both S and T axes

Parameters:

Name Type Description Default
axes str

Axes to validate.

required

Returns:

Type Description
str

Validated axes.

Raises:

Type Description
ValueError

If axes are not valid.

set_3D(axes, tile_size, tile_overlap)

Set 3D parameters.

Parameters:

Name Type Description Default
axes str

Axes.

required
tile_size list of int

Tile size.

required
tile_overlap list of int

Tile overlap.

required

std_only_with_mean()

Check that mean and std are either both None, or both specified.

Returns:

Type Description
Self

Validated prediction model.

Raises:

Type Description
ValueError

If std is not None and mean is None.

tile_min_8_power_of_2(tile_list) classmethod

Validate that each entry is greater or equal than 8 and a power of 2.

Parameters:

Name Type Description Default
tile_list list of int

Tile size.

required

Returns:

Type Description
list of int

Validated tile size.

Raises:

Type Description
ValueError

If the tile size is smaller than 8.

ValueError

If the tile size is not a power of 2.

validate_dimensions()

Validate 2D/3D dimensions between axes and tile size.

Returns:

Type Description
Self

Validated prediction model.

LVAEConfig

Bases: ArchitectureConfig

LVAE model.

decoder_conv_strides = Field(default=[2, 2], validate_default=True) class-attribute instance-attribute

Dimensions (2D or 3D) of the convolutional layers.

input_shape = Field(default=(64, 64), validate_default=True) class-attribute instance-attribute

Shape of the input patch (Z, Y, X) or (Y, X) if the data is 2D.

is_3D()

Return whether the model is 3D or not.

Returns:

Type Description
bool

Whether the model is 3D or not.

model_dump(**kwargs)

Dump the model as a dictionary, ignoring the architecture keyword.

Parameters:

Name Type Description Default
**kwargs Any

Additional keyword arguments from Pydantic BaseModel model_dump method.

{}

Returns:

Type Description
{str: Any}

Model as a dictionary.

set_3D(is_3D)

Set 3D model by setting the conv_dims parameters.

Parameters:

Name Type Description Default
is_3D bool

Whether the algorithm is 3D or not.

required

validate_conv_strides()

Validate the convolutional strides.

Returns:

Type Description
list

Validated strides.

Raises:

Type Description
ValueError

If the number of strides is not 2.

validate_decoder_even(decoder_n_filters) classmethod

Validate that decoder_n_filters is even.

Parameters:

Name Type Description Default
decoder_n_filters int

Number of channels.

required

Returns:

Type Description
int

Validated number of channels.

Raises:

Type Description
ValueError

If the number of channels is odd.

validate_encoder_even(encoder_n_filters) classmethod

Validate that encoder_n_filters is even.

Parameters:

Name Type Description Default
encoder_n_filters int

Number of channels.

required

Returns:

Type Description
int

Validated number of channels.

Raises:

Type Description
ValueError

If the number of channels is odd.

validate_input_shape(input_shape) classmethod

Validate the input shape.

Parameters:

Name Type Description Default
input_shape list

Shape of the input patch.

required

Returns:

Type Description
list

Validated input shape.

Raises:

Type Description
ValueError

If the number of dimensions is not 3 or 4.

validate_multiscale_count()

Validate the multiscale count.

Returns:

Type Description
Self

The validated model.

validate_z_dims(z_dims)

Validate the z_dims.

Parameters:

Name Type Description Default
z_dims tuple

Tuple of z dimensions.

required

Returns:

Type Description
tuple

Validated z dimensions.

Raises:

Type Description
ValueError

If the number of z dimensions is not 4.

LVAELossConfig

Bases: BaseModel

LVAE loss configuration.

denoisplit_weight = 0.9 class-attribute instance-attribute

Weight for the denoiSplit loss (used in the muSplit-denoiSplit loss).

kl_params = KLLossConfig() class-attribute instance-attribute

KL loss configuration.

kl_weight = 1.0 class-attribute instance-attribute

Weight for the KL loss in the total net loss. (i.e., net_loss = reconstruction_weight * rec_loss + kl_weight * kl_loss).

loss_type instance-attribute

Type of loss to use for LVAE.

musplit_weight = 0.1 class-attribute instance-attribute

Weight for the muSplit loss (used in the muSplit-denoiSplit loss).

non_stochastic = False class-attribute instance-attribute

Whether to sample latents and compute KL.

reconstruction_weight = 1.0 class-attribute instance-attribute

Weight for the reconstruction loss in the total net loss (i.e., net_loss = reconstruction_weight * rec_loss + kl_weight * kl_loss).
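The two weight fields combine exactly as stated in their descriptions; a worked sketch of the total loss (the standalone function name is illustrative):

```python
def lvae_net_loss(rec_loss, kl_loss, reconstruction_weight=1.0, kl_weight=1.0):
    """Total loss as documented above:
    net_loss = reconstruction_weight * rec_loss + kl_weight * kl_loss."""
    return reconstruction_weight * rec_loss + kl_weight * kl_loss

# With the default weights (both 1.0), the losses simply add up.
assert lvae_net_loss(2.0, 0.5) == 2.5
```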

MicroSplitAlgorithm

Bases: VAEBasedAlgorithm

MicroSplit algorithm configuration.

optimizer = OptimizerConfig() class-attribute instance-attribute

Optimizer to use, defined in SupportedOptimizer.

algorithm_cross_validation()

Validate the algorithm model based on algorithm.

Returns:

Type Description
Self

The validated model.

get_algorithm_citations()

Return a list of citation entries of the current algorithm.

This is used to generate the model description for the BioImage Model Zoo.

Returns:

Type Description
List[CiteEntry]

List of citation entries.

get_algorithm_description()

Get the algorithm description.

Returns:

Type Description
str

Algorithm description.

get_algorithm_friendly_name()

Get the algorithm friendly name.

Returns:

Type Description
str

Friendly name of the algorithm.

get_algorithm_keywords()

Get algorithm keywords.

Returns:

Type Description
list[str]

List of keywords.

get_algorithm_references()

Get the algorithm references.

This is used to generate the README of the BioImage Model Zoo export.

Returns:

Type Description
str

Algorithm references.

get_compatible_algorithms() classmethod

Get the list of compatible algorithms.

Returns:

Type Description
list of str

List of compatible algorithms.

output_channels_validation()

Validate the consistency between number of out channels and noise models.

Returns:

Type Description
Self

The validated model.

predict_logvar_validation()

Validate the consistency of predict_logvar throughout the model.

Returns:

Type Description
Self

The validated model.

MultiChannelNMConfig

Bases: BaseModel

Noise Model config aggregating noise models for single output channels.

noise_models instance-attribute

List of noise models, one for each target channel.

N2NAlgorithm

Bases: UNetBasedAlgorithm

Noise2Noise Algorithm configuration.

algorithm = 'n2n' class-attribute instance-attribute

N2N Algorithm name.

loss = 'mae' class-attribute instance-attribute

N2N-compatible loss function.

lr_scheduler = LrSchedulerConfig() class-attribute instance-attribute

Learning rate scheduler to use, defined in SupportedLrScheduler.

model instance-attribute

UNet without a final activation function and without the n2v2 modifications.

optimizer = OptimizerConfig() class-attribute instance-attribute

Optimizer to use, defined in SupportedOptimizer.

get_algorithm_citations()

Return a list of citation entries of the current algorithm.

This is used to generate the model description for the BioImage Model Zoo.

Returns:

Type Description
List[CiteEntry]

List of citation entries.

get_algorithm_description()

Get the algorithm description.

Returns:

Type Description
str

Algorithm description.

get_algorithm_friendly_name()

Get the algorithm friendly name.

Returns:

Type Description
str

Friendly name of the algorithm.

get_algorithm_keywords()

Get algorithm keywords.

Returns:

Type Description
list[str]

List of keywords.

get_algorithm_references()

Get the algorithm references.

This is used to generate the README of the BioImage Model Zoo export.

Returns:

Type Description
str

Algorithm references.

get_compatible_algorithms() classmethod

Get the list of compatible algorithms.

Returns:

Type Description
list of str

List of compatible algorithms.

is_supervised() classmethod

Return whether the algorithm is supervised.

Returns:

Type Description
bool

Whether the algorithm is supervised.

N2VAlgorithm

Bases: UNetBasedAlgorithm

N2V Algorithm configuration.

algorithm = 'n2v' class-attribute instance-attribute

N2V Algorithm name.

loss = 'n2v' class-attribute instance-attribute

N2V loss function.

lr_scheduler = LrSchedulerConfig() class-attribute instance-attribute

Learning rate scheduler to use, defined in SupportedLrScheduler.

monitor_metric = 'val_loss' class-attribute instance-attribute

Metric to monitor for the learning rate scheduler.

optimizer = OptimizerConfig() class-attribute instance-attribute

Optimizer to use, defined in SupportedOptimizer.

get_algorithm_citations()

Return a list of citation entries of the current algorithm.

This is used to generate the model description for the BioImage Model Zoo.

Returns:

Type Description
List[CiteEntry]

List of citation entries.

get_algorithm_description()

Return a description of the algorithm.

This method is used to generate the README of the BioImage Model Zoo export.

Returns:

Type Description
str

Description of the algorithm.

get_algorithm_friendly_name()

Get the friendly name of the algorithm.

Returns:

Type Description
str

Friendly name.

get_algorithm_keywords()

Get algorithm keywords.

Returns:

Type Description
list[str]

List of keywords.

get_algorithm_references()

Get the algorithm references.

This is used to generate the README of the BioImage Model Zoo export.

Returns:

Type Description
str

Algorithm references.

get_compatible_algorithms() classmethod

Get the list of compatible algorithms.

Returns:

Type Description
list of str

List of compatible algorithms.

is_struct_n2v()

Check if the configuration is using structN2V.

Returns:

Type Description
bool

Whether the configuration is using structN2V.

is_supervised() classmethod

Return whether the algorithm is supervised.

Returns:

Type Description
bool

Whether the algorithm is supervised.

set_n2v2(use_n2v2)

Set the configuration to use N2V2 or the vanilla Noise2Void.

This method ensures that N2V2 is set correctly and remains coherent, as opposed to setting the different parameters individually.

Parameters:

Name Type Description Default
use_n2v2 bool

Whether to use N2V2.

required

validate_n2v2()

Validate that the N2V2 strategy and models are set correctly.

Returns:

Type Description
Self

The validated configuration.

Raises:

Type Description
ValueError

If N2V2 is used with the wrong pixel manipulation strategy.

NGDataConfig

Bases: BaseModel

Next-Generation Dataset configuration.

NGDataConfig is used for both training and prediction, with the patching strategy determining how the data is processed. Note that random (or stratified) patching is used for training, fixed_random for validation, and tiled or whole for prediction.

All supported transforms are defined in the SupportedTransform enum.

augmentations = Field(default=(XYFlipConfig(), XYRandomRotate90Config()), validate_default=True) class-attribute instance-attribute

List of augmentations to apply to the data, available transforms are defined in SupportedTransform.

axes instance-attribute

Axes of the data, as defined in SupportedAxes.

batch_size = Field(default=1, ge=1, validate_default=True) class-attribute instance-attribute

Batch size for training.

channels = Field(default=None) class-attribute instance-attribute

Channels to use from the data. If None, all channels are used.

coord_filter = Field(default=None, discriminator='name') class-attribute instance-attribute

Coordinate filter to apply when using random patching. Only available if mode is training.

data_type instance-attribute

Type of input data.

in_memory = Field(default_factory=default_in_memory, validate_default=True) class-attribute instance-attribute

Whether to load all data into memory. This is only supported for the 'array', 'tiff' and 'custom' data types, and must be True for 'array'. If None, it defaults to True for 'array', 'tiff' and 'custom', and to False for 'zarr' and 'czi' data types.

mode instance-attribute

Dataset mode, either training, validating or predicting.

n_val_patches = Field(default=8, ge=0, validate_default=True) class-attribute instance-attribute

The number of patches to set aside for validation during training. This parameter will be ignored if separate validation data is specified for training.

normalization = Field(...) class-attribute instance-attribute

Normalization configuration to use.

patch_filter = Field(default=None, discriminator='name') class-attribute instance-attribute

Patch filter to apply when using random patching. Only available if mode is training.

patch_filter_patience = Field(default=5, ge=1) class-attribute instance-attribute

Number of consecutive patches not passing the filter before accepting the next patch.

patching = Field(..., discriminator='name') class-attribute instance-attribute

Patching strategy to use. Note that random and stratified are the strategies supported for training, fixed_random for validation, and tiled and whole for prediction.

pred_dataloader_params = Field(default={}) class-attribute instance-attribute

Dictionary of PyTorch prediction dataloader parameters.

seed = Field(default_factory=generate_random_seed, gt=0) class-attribute instance-attribute

Random seed for reproducibility. If not specified, a random seed is generated.

train_dataloader_params = Field(default={'shuffle': True}, validate_default=True) class-attribute instance-attribute

Dictionary of PyTorch training dataloader parameters. These should include the shuffle key, which is set to True by default; we strongly recommend keeping it as True to ensure the best training results.

val_dataloader_params = Field(default={}) class-attribute instance-attribute

Dictionary of PyTorch validation dataloader parameters.

axes_valid(axes, info) classmethod

Validate axes.

Axes must:

  • be a combination of 'STCZYX'
  • not contain duplicates
  • contain at least 2 contiguous axes: X and Y
  • contain at most 4 axes
  • not contain both S and T axes

Parameters:

Name Type Description Default
axes str

Axes to validate.

required
info ValidationInfo

Validation information.

required

Returns:

Type Description
str

Validated axes.

Raises:

Type Description
ValueError

If axes are not valid.
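The rules above can be sketched as a standalone check. This is an illustrative re-implementation of the documented rules, not CAREamics' actual validator (the axis-count limit is omitted for brevity):

```python
def validate_axes(axes: str) -> str:
    """Illustrative sketch of the documented axes rules (length limits omitted)."""
    if any(c not in "STCZYX" for c in axes):
        raise ValueError(f"Axes must be a combination of 'STCZYX', got {axes!r}.")
    if len(set(axes)) != len(axes):
        raise ValueError(f"Axes must not contain duplicates, got {axes!r}.")
    if "YX" not in axes:
        raise ValueError(f"Axes must contain contiguous Y and X, got {axes!r}.")
    if "S" in axes and "T" in axes:
        raise ValueError(f"Axes cannot contain both S and T, got {axes!r}.")
    return axes
```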

batch_size_not_in_dataloader_params(dataloader_params) classmethod

Validate that batch_size is not set in the dataloader parameters.

batch_size must be set through batch_size field, not through the dataloader parameters.

Parameters:

Name Type Description Default
dataloader_params dict of {str: Any}

The dataloader parameters.

required

Returns:

Type Description
dict of {str: Any}

The validated dataloader parameters.

Raises:

Type Description
ValueError

If batch_size is present in the dataloader parameters.

convert_mode(new_mode, new_patch_size=None, overlap_size=None, new_batch_size=None, new_data_type=None, new_axes=None, new_channels=None, new_in_memory=None, new_dataloader_params=None)

Convert a training dataset configuration to a different mode.

This method is intended to facilitate creating validation or prediction configurations from a training configuration.

To perform tile prediction when switching to predicting mode, please provide both new_patch_size and overlap_size. Switching mode to predicting without specifying new_patch_size and overlap_size will apply the default patching strategy, namely whole image strategy. new_patch_size and overlap_size are only used when switching to predicting.

channels=None will retain the same channels as in the current configuration. To select all channels, please specify all channels explicitly or pass channels='all'.

New dataloader parameters will be placed in the appropriate dataloader params field depending on the new mode.

To create a new training configuration, please use careamics.config.create_ng_data_configuration.

This method compares the new parameters with the current ones and raises errors if incompatible changes are requested, such as switching between 2D and 3D axes, or changing the number of channels. Incompatibility across parameters may be delegated to Pydantic validation.

Parameters:

Name Type Description Default
new_mode Literal['validating', 'predicting']

The new dataset mode, one of validating or predicting.

required
new_patch_size Sequence of int

New patch size. If None for predicting, uses default whole image strategy.

None
overlap_size Sequence of int

New overlap size. Necessary when switching to predicting with tiled patching.

None
new_batch_size int

New batch size.

None
new_data_type Literal['array', 'tiff', 'zarr', 'czi', 'custom']

New data type.

None
new_axes str

New axes.

None
new_channels Sequence of int or "all"

New channels.

None
new_in_memory bool

New in_memory value.

None
new_dataloader_params dict of {str: Any}

New dataloader parameters. These will be placed in the appropriate dataloader params field depending on the new mode.

None

Returns:

Type Description
NGDataConfig

New NGDataConfig with the updated mode and parameters.

Raises:

Type Description
ValueError

If conversion to training mode is requested, or if incompatible changes are requested.

is_3D()

Check if the data is 3D based on the axes.

The data is 3D if "Z" is in the axes and the patching patch_size has 3 dimensions, or, for CZI data, if "Z" or "T" is in the axes and the patching patch_size has 3 dimensions.

This method is used during NGConfiguration validation to cross-check dimensions with the algorithm configuration.

Returns:

Type Description
bool

True if the data is 3D, False otherwise.

propagate_seed_to_augmentations()

Propagate the main seed to all augmentations that support seeds.

This ensures that all augmentations use the same seed for reproducibility, unless they already have a seed explicitly set.

Returns:

Type Description
Self

Data model with propagated seeds.

propagate_seed_to_filters()

Propagate the main seed to patch and coordinate filters that support seeds.

This ensures that all filters use the same seed for reproducibility, unless they already have a seed explicitly set.

Returns:

Type Description
Self

Data model with propagated seeds.

propagate_seed_to_patching()

Propagate the main seed to the patching strategy if it supports seeds.

This ensures that the patching strategy uses the same seed for reproducibility, unless it already has a seed explicitly set.

Returns:

Type Description
Self

Data model with propagated seed.
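The three propagation hooks above share one pattern: copy the main seed into any sub-configuration that supports a seed but does not have one explicitly set. A minimal sketch with a hypothetical SubConfig stand-in (not the actual CAREamics models):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SubConfig:
    """Stand-in for an augmentation, filter, or patching configuration."""
    seed: Optional[int] = None

def propagate_seed(main_seed: int, components: list) -> None:
    """Set the main seed on components whose seed is not explicitly set."""
    for component in components:
        if hasattr(component, "seed") and component.seed is None:
            component.seed = main_seed

flip, rotate = SubConfig(), SubConfig(seed=7)
propagate_seed(42, [flip, rotate])
# flip inherits the main seed (42); rotate keeps its explicit seed (7)
```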

set_default_pin_memory(dataloader_params) classmethod

Set default pin_memory for dataloader parameters if not provided.

  • If 'pin_memory' is not set, it defaults to True if CUDA is available.

Parameters:

Name Type Description Default
dataloader_params dict of {str: Any}

The dataloader parameters.

required

Returns:

Type Description
dict of {str: Any}

The dataloader parameters with pin_memory default applied.

set_default_train_workers(dataloader_params) classmethod

Set default num_workers for training dataloader if not provided.

  • If 'num_workers' is not set, it defaults to the number of available CPU cores.

Parameters:

Name Type Description Default
dataloader_params dict of {str: Any}

The training dataloader parameters.

required

Returns:

Type Description
dict of {str: Any}

The dataloader parameters with num_workers default applied.
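Taken together, the two defaulting hooks above (pin_memory and num_workers) behave like the following sketch; cuda_available is passed in explicitly here so the example does not depend on torch:

```python
import os

def with_dataloader_defaults(params: dict, cuda_available: bool) -> dict:
    """Apply the documented defaults without overriding user-provided values."""
    out = dict(params)
    out.setdefault("pin_memory", cuda_available)        # True only if CUDA is available
    out.setdefault("num_workers", os.cpu_count() or 1)  # default to available CPU cores
    return out
```

In the real configuration these run as Pydantic validators, and explicitly provided values always take precedence.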

set_val_workers_to_match_train()

Set validation dataloader num_workers to match training dataloader.

If num_workers is not specified in val_dataloader_params, it will be set to the same value as train_dataloader_params["num_workers"].

Returns:

Type Description
Self

Validated data model with synchronized num_workers.

shuffle_train_dataloader(train_dataloader_params) classmethod

Validate that "shuffle" is included in the training dataloader params.

A warning will be raised if shuffle=False.

Parameters:

Name Type Description Default
train_dataloader_params dict of {str: Any}

The training dataloader parameters.

required

Returns:

Type Description
dict of {str: Any}

The validated training dataloader parameters.

Raises:

Type Description
ValueError

If "shuffle" is not included in the training dataloader params.

validate_channels(channels, info) classmethod

Validate channels.

Channels must be a sequence of non-negative integers without duplicates. If channels are not None, then C must be present in the axes.

Parameters:

Name Type Description Default
channels Sequence of int or None

Channels to validate.

required
info ValidationInfo

Validation information.

required

Returns:

Type Description
Sequence of int or None

Validated channels.

Raises:

Type Description
ValueError

If channels are not valid.
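The channel rules can be sketched as follows (illustrative only, not the actual validator):

```python
from typing import Optional, Sequence

def validate_channels(
    channels: Optional[Sequence[int]], axes: str
) -> Optional[list]:
    """Channels must be unique non-negative integers, and require C in the axes."""
    if channels is None:
        return None
    if "C" not in axes:
        raise ValueError("Selecting channels requires C in the axes.")
    if any(c < 0 for c in channels):
        raise ValueError("Channels must be non-negative integers.")
    if len(set(channels)) != len(channels):
        raise ValueError("Channels must not contain duplicates.")
    return list(channels)
```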

validate_dimensions()

Validate 2D/3D dimensions between axes and patch size.

Returns:

Type Description
Self

Validated data model.

Raises:

Type Description
ValueError

If the patch size dimension is not compatible with the axes.

validate_filters_against_mode(filter_obj, info) classmethod

Validate that the filters are only used during training.

Parameters:

Name Type Description Default
filter_obj PatchFilters or CoordFilters or None

Filter to validate.

required
info ValidationInfo

Validation information.

required

Returns:

Type Description
PatchFilters or CoordFilters or None

Validated filter.

Raises:

Type Description
ValueError

If a filter is used in a mode other than training.

validate_in_memory_with_data_type(in_memory, info) classmethod

Validate that in_memory is compatible with data_type.

in_memory can only be True for 'array', 'tiff' and 'custom' data types.

Parameters:

Name Type Description Default
in_memory bool

Whether to load data into memory.

required
info Any

Additional information about the field being validated.

required

Returns:

Type Description
bool

Validated in_memory value.

Raises:

Type Description
ValueError

If in_memory is True for unsupported data types.

validate_patching_strategy_against_mode(patching, info) classmethod

Validate that the patching strategy is compatible with the dataset mode.

  • If mode is training, patching strategy must be random or stratified.
  • If mode is validating, patching must be fixed_random.
  • If mode is predicting, patching strategy must be tiled or whole.

Parameters:

Name Type Description Default
patching PatchingStrategies

Patching strategy to validate.

required
info ValidationInfo

Validation information.

required

Returns:

Type Description
PatchingStrategies

Validated patching strategy.

Raises:

Type Description
ValueError

If the patching strategy is not compatible with the dataset mode.
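The compatibility rules above amount to a small lookup table; a sketch (illustrative, not the actual validator):

```python
# Allowed patching strategies per dataset mode, as documented above.
ALLOWED_PATCHING = {
    "training": {"random", "stratified"},
    "validating": {"fixed_random"},
    "predicting": {"tiled", "whole"},
}

def check_patching(mode: str, strategy: str) -> str:
    """Raise if the patching strategy is incompatible with the dataset mode."""
    if strategy not in ALLOWED_PATCHING[mode]:
        raise ValueError(
            f"Patching strategy {strategy!r} is not compatible with mode {mode!r}."
        )
    return strategy
```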

PN2VAlgorithm

Bases: UNetBasedAlgorithm

PN2V Algorithm configuration.

algorithm = 'pn2v' class-attribute instance-attribute

PN2V Algorithm name.

loss = 'pn2v' class-attribute instance-attribute

PN2V loss function (uses N2V loss with noise model).

lr_scheduler = LrSchedulerConfig() class-attribute instance-attribute

Learning rate scheduler to use, defined in SupportedLrScheduler.

noise_model instance-attribute

Noise model configuration for probabilistic denoising.

optimizer = OptimizerConfig() class-attribute instance-attribute

Optimizer to use, defined in SupportedOptimizer.

get_algorithm_citations()

Return a list of citation entries of the current algorithm.

This is used to generate the model description for the BioImage Model Zoo.

Returns:

Type Description
List[CiteEntry]

List of citation entries.

get_algorithm_description()

Return a description of the algorithm.

This method is used to generate the README of the BioImage Model Zoo export.

Returns:

Type Description
str

Description of the algorithm.

get_algorithm_friendly_name()

Get the friendly name of the algorithm.

Returns:

Type Description
str

Friendly name.

get_algorithm_keywords()

Get algorithm keywords.

Returns:

Type Description
list[str]

List of keywords.

get_algorithm_references()

Get the algorithm references.

This is used to generate the README of the BioImage Model Zoo export.

Returns:

Type Description
str

Algorithm references.

get_compatible_algorithms() classmethod

Get the list of compatible algorithms.

Returns:

Type Description
list of str

List of compatible algorithms.

is_struct_n2v()

Check if the configuration is using structPN2V.

Returns:

Type Description
bool

Whether the configuration is using structPN2V.

set_n2v2(use_n2v2)

Set the configuration to use PN2V2 or the vanilla Probabilistic Noise2Void.

This method ensures that PN2V2 is set correctly and remains coherent, as opposed to setting the different parameters individually.

Parameters:

Name Type Description Default
use_n2v2 bool

Whether to use PN2V2.

required

validate_n2v2()

Validate that the N2V2 strategy and models are set correctly.

Returns:

Type Description
Self

The validated configuration.

Raises:

Type Description
ValueError

If N2V2 is used with the wrong pixel manipulation strategy.

TrainingConfig

Bases: BaseModel

Parameters related to the training.

Mandatory parameters are:

  • num_epochs: number of epochs, greater than 0.
  • batch_size: batch size, greater than 0.
  • augmentation: whether to use data augmentation or not (True or False).

Attributes:

Name Type Description
num_epochs int

Number of epochs, greater than 0.

checkpoint_callback = CheckpointConfig() class-attribute instance-attribute

Checkpoint callback configuration, following PyTorch Lightning Checkpoint callback.

early_stopping_callback = Field(default=None, validate_default=True) class-attribute instance-attribute

Early stopping callback configuration, following the PyTorch Lightning EarlyStopping callback.

lightning_trainer_config = Field(default={}) class-attribute instance-attribute

Configuration for the PyTorch Lightning Trainer, following the PyTorch Lightning Trainer class.

logger = None class-attribute instance-attribute

Logger to use during training. If None, no logger will be used. Available loggers are defined in SupportedLogger.

has_logger()

Check if the logger is defined.

Returns:

Type Description
bool

Whether the logger is defined or not.

UNetBasedAlgorithm

Bases: BaseModel

General UNet-based algorithm configuration.

This Pydantic model validates the parameters governing the components of the training algorithm: which algorithm, loss function, model architecture, optimizer, and learning rate scheduler to use.

Currently, we only support N2V, CARE, N2N, and PN2V algorithms. In order to train these algorithms, use the corresponding configuration child classes (e.g. N2VAlgorithm) to ensure coherent parameters (e.g. specific losses).

Attributes:

Name Type Description
algorithm {n2v, care, n2n, pn2v}

Algorithm to use.

loss {n2v, mae, mse}

Loss function to use.

model UNetConfig

Model architecture to use.

optimizer (OptimizerConfig, optional)

Optimizer to use.

lr_scheduler (LrSchedulerConfig, optional)

Learning rate scheduler to use.

Raises:

Type Description
ValueError

Algorithm parameter type validation errors.

ValueError

If the algorithm, loss and model are not compatible.

algorithm instance-attribute

Algorithm name, as defined in SupportedAlgorithm.

loss instance-attribute

Loss function to use, as defined in SupportedLoss.

lr_scheduler = LrSchedulerConfig() class-attribute instance-attribute

Learning rate scheduler to use, defined in SupportedLrScheduler.

model instance-attribute

UNet model configuration.

optimizer = OptimizerConfig() class-attribute instance-attribute

Optimizer to use, defined in SupportedOptimizer.

get_compatible_algorithms() classmethod

Get the list of compatible algorithms.

Returns:

Type Description
list of str

List of compatible algorithms.

UNetConfig

Bases: ArchitectureConfig

Pydantic model for a N2V(2)-compatible UNet.

Attributes:

Name Type Description
depth int

Depth of the model, between 1 and 10 (default 2).

num_channels_init int

Number of filters of the first level of the network, should be even and minimum 8 (default 32).

architecture instance-attribute

Name of the architecture.

conv_dims = Field(default=2, validate_default=True) class-attribute instance-attribute

Dimensions (2D or 3D) of the convolutional layers.

depth = Field(default=2, ge=1, le=10, validate_default=True) class-attribute instance-attribute

Number of levels in the UNet.

final_activation = Field(default='None', validate_default=True) class-attribute instance-attribute

Final activation function.

in_channels = Field(default=1, ge=1, validate_default=True) class-attribute instance-attribute

Number of channels in the input to the model.

independent_channels = Field(default=True, validate_default=True) class-attribute instance-attribute

Whether information is processed independently in each channel, used to train channels independently.

n2v2 = Field(default=False, validate_default=True) class-attribute instance-attribute

Whether to use N2V2 architecture modifications, with blur pool layers and fewer skip connections.

num_channels_init = Field(default=32, ge=8, le=1024, validate_default=True) class-attribute instance-attribute

Number of convolutional filters in the first layer of the UNet.

num_classes = Field(default=1, ge=1, validate_default=True) class-attribute instance-attribute

Number of classes or channels in the model output.

residual = Field(default=False, validate_default=True) class-attribute instance-attribute

Whether to add a residual connection from the input to the output.

use_batch_norm = Field(default=True, validate_default=True) class-attribute instance-attribute

Whether to use batch normalization in the model.

is_3D()

Return whether the model is 3D or not.

This method is used in the NG configuration validation to check that the model dimensions match the data dimensions.

Returns:

Type Description
bool

Whether the model is 3D or not.

model_dump(**kwargs)

Dump the model as a dictionary, ignoring the architecture keyword.

Parameters:

Name Type Description Default
**kwargs Any

Additional keyword arguments from Pydantic BaseModel model_dump method.

{}

Returns:

Type Description
{str: Any}

Model as a dictionary.

set_3D(is_3D)

Set 3D model by setting the conv_dims parameters.

Parameters:

Name Type Description Default
is_3D bool

Whether the algorithm is 3D or not.

required

validate_num_channels_init(num_channels_init) classmethod

Validate that num_channels_init is even.

Parameters:

Name Type Description Default
num_channels_init int

Number of channels.

required

Returns:

Type Description
int

Validated number of channels.

Raises:

Type Description
ValueError

If the number of channels is odd.
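The constraint boils down to a parity check (sketch, not the actual validator):

```python
def validate_num_channels_init(num_channels_init: int) -> int:
    """Raise if the first-level filter count is odd."""
    if num_channels_init % 2 != 0:
        raise ValueError(
            f"num_channels_init must be even, got {num_channels_init}."
        )
    return num_channels_init
```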

VAEBasedAlgorithm

Bases: BaseModel

VAE-based algorithm configuration.

TODO

Examples:

TODO add once finalized

optimizer = OptimizerConfig() class-attribute instance-attribute

Optimizer to use, defined in SupportedOptimizer.

algorithm_cross_validation()

Validate the algorithm model based on algorithm.

Returns:

Type Description
Self

The validated model.

get_compatible_algorithms() classmethod

Get the list of compatible algorithms.

Returns:

Type Description
list of str

List of compatible algorithms.

output_channels_validation()

Validate the consistency between number of out channels and noise models.

Returns:

Type Description
Self

The validated model.

predict_logvar_validation()

Validate the consistency of predict_logvar throughout the model.

Returns:

Type Description
Self

The validated model.

algorithm_factory(algorithm)

Create an algorithm model for training CAREamics.

Parameters:

Name Type Description Default
algorithm dict

Algorithm dictionary.

required

Returns:

Type Description
N2VAlgorithm or N2NAlgorithm or CAREAlgorithm or PN2VAlgorithm

Algorithm model for training CAREamics.

create_care_configuration(experiment_name, data_type, axes, patch_size, batch_size, num_epochs=100, num_steps=None, augmentations=None, independent_channels=True, loss='mae', n_channels_in=None, n_channels_out=None, logger='none', trainer_params=None, model_params=None, optimizer='Adam', optimizer_params=None, lr_scheduler='ReduceLROnPlateau', lr_scheduler_params=None, train_dataloader_params=None, val_dataloader_params=None, checkpoint_params=None)

Create a configuration for training CARE.

If "Z" is present in axes, then patch_size must be a list of length 3, otherwise 2.

If "C" is present in axes, then you need to set n_channels_in to the number of channels. Likewise, if you set the number of channels, then "C" must be present in axes.

To set the number of output channels, use the n_channels_out parameter. If it is not specified, it will be assumed to be equal to n_channels_in.

By default, all channels are trained independently. To train all channels together, set independent_channels to False.

By setting augmentations to None, the default augmentations (flip in X and Y, rotations by 90 degrees in the XY plane) are applied. Rather than the default augmentations, a list of augmentations can be passed to the augmentations parameter. To disable the augmentations, simply pass an empty list.

Parameters:

Name Type Description Default
experiment_name str

Name of the experiment.

required
data_type Literal['array', 'tiff', 'czi', 'custom']

Type of the data.

required
axes str

Axes of the data (e.g. SYX).

required
patch_size List[int]

Size of the patches along the spatial dimensions (e.g. [64, 64]).

required
batch_size int

Batch size.

required
num_epochs int

Number of epochs to train for. If provided, this will be added to trainer_params.

100
num_steps int

Number of batches in 1 epoch. If provided, this will be added to trainer_params. Translates to limit_train_batches in PyTorch Lightning Trainer. See relevant documentation for more details.

None
augmentations list of augmentations

List of augmentations to apply, either both or one of XYFlipConfig and XYRandomRotate90Config. By default, it applies both XYFlip (on X and Y) and XYRandomRotate90 (in XY) to the images.

None
independent_channels bool

Whether to train all channels independently, by default True.

True
loss Literal['mae', 'mse']

Loss function to use.

"mae"
n_channels_in int or None

Number of channels in.

None
n_channels_out int or None

Number of channels out.

None
logger Literal['wandb', 'tensorboard', 'none']

Logger to use.

"none"
trainer_params dict

Parameters for the trainer class, see PyTorch Lightning documentation.

None
model_params dict

UNetModel parameters.

None
optimizer Literal['Adam', 'Adamax', 'SGD']

Optimizer to use.

"Adam"
optimizer_params dict

Parameters for the optimizer, see PyTorch documentation for more details.

None
lr_scheduler Literal['ReduceLROnPlateau', 'StepLR']

Learning rate scheduler to use.

"ReduceLROnPlateau"
lr_scheduler_params dict

Parameters for the learning rate scheduler, see PyTorch documentation for more details.

None
train_dataloader_params dict

Parameters for the training dataloader, see the PyTorch docs for DataLoader. If left as None, the dict {"shuffle": True} will be used; this is set in the GeneralDataConfig.

None
val_dataloader_params dict

Parameters for the validation dataloader, see the PyTorch docs for DataLoader. If left as None, the empty dict {} will be used; this is set in the GeneralDataConfig.

None
checkpoint_params dict

Parameters for the checkpoint callback, see PyTorch Lightning documentation (ModelCheckpoint) for the list of available parameters.

None

Returns:

Type Description
Configuration

Configuration for training CARE.

Examples:

Minimum example:

>>> config = create_care_configuration(
...     experiment_name="care_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100
... )

You can also limit the number of batches per epoch:

>>> config = create_care_configuration(
...     experiment_name="care_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_steps=100  # limit to 100 batches per epoch
... )

To disable augmentations, simply set augmentations to an empty list:

>>> config = create_care_configuration(
...     experiment_name="care_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     augmentations=[]
... )

A list of augmentations can be passed to the augmentations parameter to replace the default augmentations:

>>> from careamics.config.augmentations import XYFlipConfig
>>> config = create_care_configuration(
...     experiment_name="care_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     augmentations=[
...         # No rotation and only Y flipping
...         XYFlipConfig(flip_x = False, flip_y = True)
...     ]
... )

If you are training multiple channels, they will be trained independently by default; you simply need to specify the number of input channels (and optionally, the number of output channels):

>>> config = create_care_configuration(
...     experiment_name="care_experiment",
...     data_type="array",
...     axes="YXC", # channels must be in the axes
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     n_channels_in=3, # number of input channels
...     n_channels_out=1 # if applicable
... )

If instead you want to train multiple channels together, you need to turn off the independent_channels parameter:

>>> config = create_care_configuration(
...     experiment_name="care_experiment",
...     data_type="array",
...     axes="YXC", # channels must be in the axes
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     independent_channels=False,
...     n_channels_in=3,
...     n_channels_out=1 # if applicable
... )

If you would like to train on CZI files, use "czi" as data_type and "SCYX" as axes for 2-D or "SCZYX" for 3-D denoising. Note that "SCYX" can also be used for 3-D data but spatial context along the Z dimension will then not be taken into account.

>>> config_2d = create_care_configuration(
...     experiment_name="care_experiment",
...     data_type="czi",
...     axes="SCYX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     n_channels_in=1,
... )
>>> config_3d = create_care_configuration(
...     experiment_name="care_experiment",
...     data_type="czi",
...     axes="SCZYX",
...     patch_size=[16, 64, 64],
...     batch_size=16,
...     num_epochs=100,
...     n_channels_in=1,
... )

create_hdn_configuration(experiment_name, data_type, axes, patch_size, batch_size, num_epochs=100, num_steps=None, encoder_conv_strides=(2, 2), decoder_conv_strides=(2, 2), multiscale_count=1, z_dims=(128, 128), output_channels=1, encoder_n_filters=32, decoder_n_filters=32, encoder_dropout=0.0, decoder_dropout=0.0, nonlinearity='ReLU', analytical_kl=False, predict_logvar=None, logvar_lowerbound=None, logger='none', trainer_params=None, augmentations=None, train_dataloader_params=None, val_dataloader_params=None)

Create a configuration for training HDN.

If "Z" is present in axes, then patch_size must be a list of length 3, otherwise 2.

To set the number of output channels, use the output_channels parameter (1 by default).

By setting augmentations to None, the default augmentations (flip in X and Y, rotations by 90 degrees in the XY plane) are applied. Rather than the default augmentations, a list of augmentations can be passed to the augmentations parameter. To disable the augmentations, simply pass an empty list.

TODO revisit the necessity of model_params

Parameters:

Name Type Description Default
experiment_name str

Name of the experiment.

required
data_type Literal['array', 'tiff', 'custom']

Type of the data.

required
axes str

Axes of the data (e.g. SYX).

required
patch_size List[int]

Size of the patches along the spatial dimensions (e.g. [64, 64]).

required
batch_size int

Batch size.

required
num_epochs int

Number of epochs to train for. If provided, this will be added to trainer_params.

100
num_steps int

Number of batches in 1 epoch. If provided, this will be added to trainer_params. Translates to limit_train_batches in PyTorch Lightning Trainer. See relevant documentation for more details.

None
encoder_conv_strides tuple[int, ...]

Strides for the encoder convolutional layers, by default (2, 2).

(2, 2)
decoder_conv_strides tuple[int, ...]

Strides for the decoder convolutional layers, by default (2, 2).

(2, 2)
multiscale_count int

Number of scales in the multiscale architecture, by default 1.

1
z_dims tuple[int, ...]

Dimensions of the latent space, by default (128, 128).

(128, 128)
output_channels int

Number of output channels, by default 1.

1
encoder_n_filters int

Number of filters in the encoder, by default 32.

32
decoder_n_filters int

Number of filters in the decoder, by default 32.

32
encoder_dropout float

Dropout rate for the encoder, by default 0.0.

0.0
decoder_dropout float

Dropout rate for the decoder, by default 0.0.

0.0
nonlinearity Literal

Nonlinearity function to use, by default "ReLU".

'ReLU'
analytical_kl bool

Whether to use analytical KL divergence, by default False.

False
predict_logvar Literal[None, 'pixelwise']

Type of log variance prediction, by default None.

None
logvar_lowerbound Union[float, None]

Lower bound for the log variance, by default None.

None
logger Literal['wandb', 'tensorboard', 'none']

Logger to use for training, by default "none".

'none'
trainer_params dict

Parameters for the trainer class, see PyTorch Lightning documentation.

None
augmentations list[XYFlipConfig | XYRandomRotate90Config] | None

List of augmentations to apply, by default None.

None
train_dataloader_params Optional[dict[str, Any]]

Parameters for the training dataloader, by default None.

None
val_dataloader_params Optional[dict[str, Any]]

Parameters for the validation dataloader, by default None.

None

Returns:

Type Description
Configuration

The configuration object for training HDN.

Examples:

Minimum example:

>>> config = create_hdn_configuration(
...     experiment_name="hdn_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100
... )

You can also limit the number of batches per epoch:

>>> config = create_hdn_configuration(
...     experiment_name="hdn_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_steps=100  # limit to 100 batches per epoch
... )
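The augmentation behavior described above also applies to HDN. A sketch of disabling the default flips and rotations by passing an empty list (the other argument values are illustrative):

```python
>>> config = create_hdn_configuration(
...     experiment_name="hdn_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     augmentations=[]  # disable the default flips and rotations
... )
```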

create_microsplit_configuration(experiment_name, data_type, axes, patch_size, batch_size, lr=0.001, num_epochs=100, num_steps=None, encoder_conv_strides=(2, 2), decoder_conv_strides=(2, 2), encoder_n_filters=32, decoder_n_filters=32, multiscale_count=3, grid_size=32, z_dims=(128, 128), output_channels=1, encoder_dropout=0.1, decoder_dropout=0.1, nonlinearity='ELU', analytical_kl=False, predict_logvar='pixelwise', logvar_lowerbound=-5.0, loss_type='denoisplit_musplit', kl_type='kl_restricted', reconstruction_weight=1.0, kl_weight=1.0, musplit_weight=0.1, denoisplit_weight=0.9, mmse_count=10, optimizer='Adamax', lr_scheduler_patience=30, logger='none', trainer_params=None, augmentations=None, nm_paths=None, data_stats=None, train_dataloader_params=None, val_dataloader_params=None)

Create a configuration for training MicroSplit.

Parameters:

Name Type Description Default
experiment_name str

Name of the experiment.

required
data_type Literal['array', 'tiff', 'custom']

Type of the data.

required
axes str

Axes of the data (e.g. SYX).

required
patch_size Sequence[int]

Size of the patches along the spatial dimensions (e.g. [64, 64]).

required
batch_size int

Batch size.

required
lr float

Learning rate, by default 1e-3.

0.001
num_epochs int

Number of epochs to train for. If provided, this will be added to trainer_params.

100
num_steps int

Number of batches in 1 epoch. If provided, this will be added to trainer_params. Translates to limit_train_batches in PyTorch Lightning Trainer. See relevant documentation for more details.

None
encoder_conv_strides tuple[int, ...]

Strides for the encoder convolutional layers, by default (2, 2).

(2, 2)
decoder_conv_strides tuple[int, ...]

Strides for the decoder convolutional layers, by default (2, 2).

(2, 2)
encoder_n_filters int

Number of filters in the encoder, by default 32.

32
decoder_n_filters int

Number of filters in the decoder, by default 32.

32
multiscale_count int

Number of multiscale levels, by default 3.

3
grid_size int

Size of the grid for multiscale training, by default 32.

32
z_dims tuple[int, ...]

List of latent dims for each hierarchy level in the LVAE, default (128, 128).

(128, 128)
output_channels int

Number of output channels for the model, by default 1.

1
encoder_dropout float

Dropout rate for the encoder, by default 0.1.

0.1
decoder_dropout float

Dropout rate for the decoder, by default 0.1.

0.1
nonlinearity Literal

Nonlinearity to use in the model, by default "ELU".

'ELU'
analytical_kl bool

Whether to use analytical KL divergence, by default False.

False
predict_logvar Literal['pixelwise']

Type of log-variance prediction, by default "pixelwise".

'pixelwise'
logvar_lowerbound float | None

Lower bound for the log variance, by default -5.0.

-5.0
loss_type Literal['musplit', 'denoisplit', 'denoisplit_musplit']

Type of loss function, by default "denoisplit_musplit".

'denoisplit_musplit'
kl_type Literal['kl', 'kl_restricted']

Type of KL divergence, by default "kl_restricted".

'kl_restricted'
reconstruction_weight float

Weight for reconstruction loss, by default 1.0.

1.0
kl_weight float

Weight for KL loss, by default 1.0.

1.0
musplit_weight float

Weight for muSplit loss, by default 0.1.

0.1
denoisplit_weight float

Weight for denoiSplit loss, by default 0.9.

0.9
mmse_count int

Number of MMSE samples to use, by default 10.

10
optimizer Literal['Adam', 'SGD', 'Adamax']

Optimizer to use, by default "Adamax".

'Adamax'
lr_scheduler_patience int

Patience for learning rate scheduler, by default 30.

30
logger Literal['wandb', 'tensorboard', 'none']

Logger to use for training, by default "none".

'none'
trainer_params dict

Parameters for the trainer class, see PyTorch Lightning documentation.

None
augmentations list[Union[XYFlipConfig, XYRandomRotate90Config]] | None

List of augmentations to apply, by default None.

None
nm_paths list[str] | None

Paths to the noise model files, by default None.

None
data_stats tuple[float, float] | None

Data statistics (mean, std), by default None.

None
train_dataloader_params dict[str, Any] | None

Parameters for the training dataloader, by default None.

None
val_dataloader_params dict[str, Any] | None

Parameters for the validation dataloader, by default None.

None

Returns:

Type Description
Configuration

A configuration object for the microsplit algorithm.

Examples:

Minimum example:

>>> config = create_microsplit_configuration(
...     experiment_name="microsplit_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100
... )

You can also limit the number of batches per epoch:

>>> config = create_microsplit_configuration(
...     experiment_name="microsplit_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_steps=100  # limit to 100 batches per epoch
... )
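The noise-model losses use the nm_paths and data_stats parameters documented above. An illustrative sketch — the file paths are placeholders and the statistics are arbitrary, one noise model per channel:

```python
>>> config = create_microsplit_configuration(
...     experiment_name="microsplit_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     loss_type="denoisplit_musplit",
...     nm_paths=["path/to/noise_model_ch1.npz", "path/to/noise_model_ch2.npz"],
...     data_stats=(0.0, 1.0)  # (mean, std) of the training data; placeholder values
... )
```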

create_n2n_configuration(experiment_name, data_type, axes, patch_size, batch_size, num_epochs=100, num_steps=None, augmentations=None, independent_channels=True, loss='mae', n_channels_in=None, n_channels_out=None, logger='none', trainer_params=None, model_params=None, optimizer='Adam', optimizer_params=None, lr_scheduler='ReduceLROnPlateau', lr_scheduler_params=None, train_dataloader_params=None, val_dataloader_params=None, checkpoint_params=None)

Create a configuration for training Noise2Noise.

If "Z" is present in axes, then patch_size must be a list of length 3, otherwise 2.

If "C" is present in axes, then you need to set n_channels_in to the number of channels. Likewise, if you set the number of channels, then "C" must be present in axes.

To set the number of output channels, use the n_channels_out parameter. If it is not specified, it will be assumed to be equal to n_channels_in.

By default, all channels are trained independently. To train all channels together, set independent_channels to False.

By setting augmentations to None, the default augmentations (flip in X and Y, rotations by 90 degrees in the XY plane) are applied. Rather than the default augmentations, a list of augmentations can be passed to the augmentations parameter. To disable the augmentations, simply pass an empty list.

Parameters:

Name Type Description Default
experiment_name str

Name of the experiment.

required
data_type Literal['array', 'tiff', 'czi', 'custom']

Type of the data.

required
axes str

Axes of the data (e.g. SYX).

required
patch_size List[int]

Size of the patches along the spatial dimensions (e.g. [64, 64]).

required
batch_size int

Batch size.

required
num_epochs int

Number of epochs to train for. If provided, this will be added to trainer_params.

100
num_steps int

Number of batches in 1 epoch. If provided, this will be added to trainer_params. Translates to limit_train_batches in PyTorch Lightning Trainer. See relevant documentation for more details.

None
augmentations list of augmentations

List of augmentations to apply, either both or one of XYFlipConfig and XYRandomRotate90Config. By default, it applies both XYFlip (on X and Y) and XYRandomRotate90 (in XY) to the images.

None
independent_channels bool

Whether to train all channels independently, by default True.

True
loss Literal['mae', 'mse']

Loss function to use, by default "mae".

'mae'
n_channels_in int or None

Number of channels in.

None
n_channels_out int or None

Number of channels out.

None
logger Literal['wandb', 'tensorboard', 'none']

Logger to use, by default "none".

'none'
trainer_params dict

Parameters for the trainer class, see PyTorch Lightning documentation.

None
model_params dict

UNetModel parameters.

None
optimizer Literal['Adam', 'Adamax', 'SGD']

Optimizer to use.

"Adam"
optimizer_params dict

Parameters for the optimizer, see PyTorch documentation for more details.

None
lr_scheduler Literal['ReduceLROnPlateau', 'StepLR']

Learning rate scheduler to use.

"ReduceLROnPlateau"
lr_scheduler_params dict

Parameters for the learning rate scheduler, see PyTorch documentation for more details.

None
train_dataloader_params dict

Parameters for the training dataloader, see the PyTorch docs for DataLoader. If left as None, the dict {"shuffle": True} will be used; this is set in the GeneralDataConfig.

None
val_dataloader_params dict

Parameters for the validation dataloader, see the PyTorch docs for DataLoader. If left as None, the empty dict {} will be used; this is set in the GeneralDataConfig.

None
checkpoint_params dict

Parameters for the checkpoint callback, see PyTorch Lightning documentation (ModelCheckpoint) for the list of available parameters.

None

Returns:

Type Description
Configuration

Configuration for training Noise2Noise.

Examples:

Minimum example:

>>> config = create_n2n_configuration(
...     experiment_name="n2n_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100
... )

You can also limit the number of batches per epoch:

>>> config = create_n2n_configuration(
...     experiment_name="n2n_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_steps=100  # limit to 100 batches per epoch
... )

To disable augmentations, simply set augmentations to an empty list:

>>> config = create_n2n_configuration(
...     experiment_name="n2n_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     augmentations=[]
... )

A list of augmentations can be passed to the augmentations parameter:

>>> from careamics.config.augmentations import XYFlipConfig
>>> config = create_n2n_configuration(
...     experiment_name="n2n_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     augmentations=[
...         # No rotation and only Y flipping
...         XYFlipConfig(flip_x = False, flip_y = True)
...     ]
... )

If you are training multiple channels, they will be trained independently by default; you simply need to specify the number of input channels (and optionally, the number of output channels):

>>> config = create_n2n_configuration(
...     experiment_name="n2n_experiment",
...     data_type="array",
...     axes="YXC", # channels must be in the axes
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     n_channels_in=3, # number of input channels
...     n_channels_out=1 # if applicable
... )

If instead you want to train multiple channels together, you need to turn off the independent_channels parameter:

>>> config = create_n2n_configuration(
...     experiment_name="n2n_experiment",
...     data_type="array",
...     axes="YXC", # channels must be in the axes
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     independent_channels=False,
...     n_channels_in=3,
...     n_channels_out=1 # if applicable
... )

If you would like to train on CZI files, use "czi" as data_type and "SCYX" as axes for 2-D or "SCZYX" for 3-D denoising. Note that "SCYX" can also be used for 3-D data but spatial context along the Z dimension will then not be taken into account.

>>> config_2d = create_n2n_configuration(
...     experiment_name="n2n_experiment",
...     data_type="czi",
...     axes="SCYX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     n_channels_in=1,
... )
>>> config_3d = create_n2n_configuration(
...     experiment_name="n2n_experiment",
...     data_type="czi",
...     axes="SCZYX",
...     patch_size=[16, 64, 64],
...     batch_size=16,
...     num_epochs=100,
...     n_channels_in=1,
... )
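The optimizer and learning rate scheduler can be tuned through the corresponding *_params dictionaries, whose keys are forwarded to the PyTorch classes. A sketch with illustrative values (lr is a torch.optim.Adam argument; factor and patience belong to ReduceLROnPlateau):

```python
>>> config = create_n2n_configuration(
...     experiment_name="n2n_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     optimizer="Adam",
...     optimizer_params={"lr": 1e-4},  # forwarded to torch.optim.Adam
...     lr_scheduler="ReduceLROnPlateau",
...     lr_scheduler_params={"factor": 0.5, "patience": 10}
... )
```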

create_n2v_configuration(experiment_name, data_type, axes, patch_size, batch_size, num_epochs=100, num_steps=None, augmentations=None, independent_channels=True, use_n2v2=False, n_channels=None, roi_size=11, masked_pixel_percentage=0.2, struct_n2v_axis='none', struct_n2v_span=5, trainer_params=None, logger='none', model_params=None, optimizer='Adam', optimizer_params=None, lr_scheduler='ReduceLROnPlateau', lr_scheduler_params=None, train_dataloader_params=None, val_dataloader_params=None, checkpoint_params=None, seed=None)

Create a configuration for training Noise2Void.

N2V uses a UNet model to denoise images in a self-supervised manner. To use its variants structN2V and N2V2, set the struct_n2v_axis and struct_n2v_span (structN2V) parameters, or set use_n2v2 to True (N2V2).

N2V2 modifies the UNet architecture by adding blur pool layers and removing the skip connections, thus removing checkerboard artefacts. StructN2V is used when vertical or horizontal correlations are present in the noise; it applies an additional mask to the manipulated pixel neighbors.

If "Z" is present in axes, then patch_size must be a list of length 3, otherwise 2.

If "C" is present in axes, then you need to set n_channels to the number of channels.

By default, all channels are trained independently. To train all channels together, set independent_channels to False.

By default, the augmentations applied are a random flip along X or Y, and a random 90 degrees rotation in the XY plane. Normalization is always applied, as well as the N2V manipulation.

By setting augmentations to None, the default augmentations (flip in X and Y, rotations by 90 degrees in the XY plane) are applied. Rather than the default augmentations, a list of augmentations can be passed to the augmentations parameter. To disable the augmentations, simply pass an empty list.

The roi_size parameter specifies the size of the area around each pixel that will be manipulated by N2V. The masked_pixel_percentage parameter specifies how many pixels per patch will be manipulated.

The parameters of the UNet can be specified in the model_params (passed as a parameter-value dictionary). Note that use_n2v2 and n_channels override the corresponding parameters passed in model_params.

If you pass "horizontal" or "vertical" to struct_n2v_axis, then the structN2V mask will be applied to each manipulated pixel.

Parameters:

Name Type Description Default
experiment_name str

Name of the experiment.

required
data_type Literal['array', 'tiff', 'czi', 'custom']

Type of the data.

required
axes str

Axes of the data (e.g. SYX).

required
patch_size List[int]

Size of the patches along the spatial dimensions (e.g. [64, 64]).

required
batch_size int

Batch size.

required
num_epochs int

Number of epochs to train for. If provided, this will be added to trainer_params.

100
num_steps int

Number of batches in 1 epoch. If provided, this will be added to trainer_params. Translates to limit_train_batches in PyTorch Lightning Trainer. See relevant documentation for more details.

None
augmentations list of augmentations

List of augmentations to apply, either both or one of XYFlipConfig and XYRandomRotate90Config. By default, it applies both XYFlip (on X and Y) and XYRandomRotate90 (in XY) to the images.

None
independent_channels bool

Whether to train all channels independently, by default True.

True
use_n2v2 bool

Whether to use N2V2, by default False.

False
n_channels int or None

Number of channels (in and out).

None
roi_size int

N2V pixel manipulation area, by default 11.

11
masked_pixel_percentage float

Percentage of pixels masked in each patch, by default 0.2.

0.2
struct_n2v_axis Literal['horizontal', 'vertical', 'none']

Axis along which to apply structN2V mask, by default "none".

'none'
struct_n2v_span int

Span of the structN2V mask, by default 5.

5
trainer_params dict

Parameters for the trainer, see the relevant documentation.

None
logger Literal['wandb', 'tensorboard', 'none']

Logger to use, by default "none".

'none'
model_params dict

UNetModel parameters.

None
optimizer Literal['Adam', 'Adamax', 'SGD']

Optimizer to use.

"Adam"
optimizer_params dict

Parameters for the optimizer, see PyTorch documentation for more details.

None
lr_scheduler Literal['ReduceLROnPlateau', 'StepLR']

Learning rate scheduler to use.

"ReduceLROnPlateau"
lr_scheduler_params dict

Parameters for the learning rate scheduler, see PyTorch documentation for more details.

None
train_dataloader_params dict

Parameters for the training dataloader, see the PyTorch docs for DataLoader. If left as None, the dict {"shuffle": True} will be used; this is set in the GeneralDataConfig.

None
val_dataloader_params dict

Parameters for the validation dataloader, see the PyTorch docs for DataLoader. If left as None, the empty dict {} will be used; this is set in the GeneralDataConfig.

None
checkpoint_params dict

Parameters for the checkpoint callback, see PyTorch Lightning documentation (ModelCheckpoint) for the list of available parameters.

None
seed int or None

Random seed for reproducibility of N2V pixel manipulation, by default None.

None

Returns:

Type Description
Configuration

Configuration for training N2V.

Examples:

Minimum example:

>>> config = create_n2v_configuration(
...     experiment_name="n2v_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100
... )

You can also limit the number of batches per epoch:

>>> config = create_n2v_configuration(
...     experiment_name="n2v_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_steps=100  # limit to 100 batches per epoch
... )

To disable augmentations, simply set augmentations to an empty list:

>>> config = create_n2v_configuration(
...     experiment_name="n2v_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     augmentations=[]
... )

A list of augmentations can be passed to the augmentations parameter:

>>> from careamics.config.augmentations import XYFlipConfig
>>> config = create_n2v_configuration(
...     experiment_name="n2v_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     augmentations=[
...         # No rotation and only Y flipping
...         XYFlipConfig(flip_x = False, flip_y = True)
...     ]
... )

To use N2V2, simply pass the use_n2v2 parameter:

>>> config = create_n2v_configuration(
...     experiment_name="n2v2_experiment",
...     data_type="tiff",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     use_n2v2=True
... )

For structN2V, there are two parameters to set, struct_n2v_axis and struct_n2v_span:

>>> config = create_n2v_configuration(
...     experiment_name="structn2v_experiment",
...     data_type="tiff",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     struct_n2v_axis="horizontal",
...     struct_n2v_span=7
... )

If you are training multiple channels, they will be trained independently by default; you simply need to specify the number of channels:

>>> config = create_n2v_configuration(
...     experiment_name="n2v_experiment",
...     data_type="array",
...     axes="YXC",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     n_channels=3
... )

If instead you want to train multiple channels together, you need to turn off the independent_channels parameter:

>>> config = create_n2v_configuration(
...     experiment_name="n2v_experiment",
...     data_type="array",
...     axes="YXC",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     independent_channels=False,
...     n_channels=3
... )

If you would like to train on CZI files, use "czi" as data_type and "SCYX" as axes for 2-D or "SCZYX" for 3-D denoising. Note that "SCYX" can also be used for 3-D data but spatial context along the Z dimension will then not be taken into account.

>>> config_2d = create_n2v_configuration(
...     experiment_name="n2v_experiment",
...     data_type="czi",
...     axes="SCYX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     n_channels=1,
... )
>>> config_3d = create_n2v_configuration(
...     experiment_name="n2v_experiment",
...     data_type="czi",
...     axes="SCZYX",
...     patch_size=[16, 64, 64],
...     batch_size=16,
...     num_epochs=100,
...     n_channels=1,
... )
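As noted above, UNet parameters can be passed through model_params, with use_n2v2 and n_channels overriding the matching keys. A sketch assuming depth and num_channels_init are among the UNetModel fields (verify against the UNetModel documentation):

```python
>>> config = create_n2v_configuration(
...     experiment_name="n2v_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_epochs=100,
...     model_params={
...         "depth": 3,              # assumed UNetModel field
...         "num_channels_init": 64  # assumed UNetModel field
...     }
... )
```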

create_ng_data_configuration(data_type, axes, patch_size, batch_size, augmentations=None, normalization=None, channels=None, in_memory=None, n_val_patches=8, num_workers=0, train_dataloader_params=None, val_dataloader_params=None, pred_dataloader_params=None, seed=None)

Create a training NGDatasetConfig.

Note that num_workers is applied to all dataloaders unless explicitly overridden in the respective dataloader parameters.

Parameters:

Name Type Description Default
data_type (array, tiff, zarr, czi, custom)

Type of the data.

"array"
axes str

Axes of the data.

required
patch_size list of int

Size of the patches along the spatial dimensions.

required
batch_size int

Batch size.

required
augmentations list of transforms or None

List of transforms to apply. If None, default augmentations are applied (flip in X and Y, rotations by 90 degrees in the XY plane).

None
normalization dict

Normalization configuration dictionary. If None, defaults to mean_std normalization with automatically computed statistics.

None
channels Sequence of int

List of channels to use. If None, all channels are used.

None
in_memory bool

Whether to load all data into memory. This is only supported for 'array', 'tiff' and 'custom' data types. If None, defaults to True for 'array', 'tiff' and 'custom', and False for 'zarr' and 'czi' data types. Must be True for 'array'.

None
n_val_patches int

The number of patches to set aside for validation during training. This parameter will be ignored if separate validation data is specified for training.

8
num_workers int

Number of workers for data loading.

0
train_dataloader_params dict

Parameters for the training dataloader, see PyTorch notes, by default None.

None
val_dataloader_params dict

Parameters for the validation dataloader, see PyTorch notes, by default None.

None
pred_dataloader_params dict

Parameters for the test dataloader, see PyTorch notes, by default None.

None
seed int

Random seed for reproducibility. If None, seed is generated automatically.

None

Returns:

Type Description
NGDataConfig

Next-Generation Data model with the specified parameters.
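A minimal usage sketch, assuming create_ng_data_configuration is importable alongside the other factory functions; num_workers is applied to all dataloaders as described above:

```python
>>> data_config = create_ng_data_configuration(
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     num_workers=4  # applied to all dataloaders unless overridden
... )
```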

create_pn2v_configuration(experiment_name, data_type, axes, patch_size, batch_size, nm_path, num_epochs=100, num_steps=None, augmentations=None, independent_channels=True, use_n2v2=False, num_in_channels=1, num_out_channels=100, roi_size=11, masked_pixel_percentage=0.2, struct_n2v_axis='none', struct_n2v_span=5, trainer_params=None, logger='none', model_params=None, optimizer='Adam', optimizer_params=None, lr_scheduler='ReduceLROnPlateau', lr_scheduler_params=None, train_dataloader_params=None, val_dataloader_params=None, checkpoint_params=None, seed=None)

Create a configuration for training Probabilistic Noise2Void (PN2V).

PN2V extends N2V by incorporating a probabilistic noise model to estimate the posterior distribution of each pixel more precisely.

If "Z" is present in axes, then patch_size must be a list of length 3, otherwise 2.

If "C" is present in axes, then you need to set num_in_channels to the number of channels.

By default, all channels are trained independently. To train all channels together, set independent_channels to False. When training independently, each input channel will have num_out_channels outputs (default 400). When training together, all input channels will share num_out_channels outputs.

By default, the augmentations applied are a random flip along X or Y, and a random 90 degrees rotation in the XY plane. Normalization is always applied, as well as the N2V manipulation.

By setting augmentations to None, the default augmentations (flip in X and Y, rotations by 90 degrees in the XY plane) are applied. Rather than the default augmentations, a list of augmentations can be passed to the augmentations parameter. To disable the augmentations, simply pass an empty list.

The roi_size parameter specifies the size of the area around each pixel that will be manipulated by N2V. The masked_pixel_percentage parameter specifies how many pixels per patch will be manipulated.

The parameters of the UNet can be specified in the model_params (passed as a parameter-value dictionary). Note that use_n2v2, num_in_channels, and num_out_channels override the corresponding parameters passed in model_params.

If you pass "horizontal" or "vertical" to struct_n2v_axis, then the structN2V mask will be applied to each manipulated pixel.

Parameters:

Name Type Description Default
experiment_name str

Name of the experiment.

required
data_type Literal['array', 'tiff', 'czi', 'custom']

Type of the data.

required
axes str

Axes of the data (e.g. SYX).

required
patch_size List[int]

Size of the patches along the spatial dimensions (e.g. [64, 64]).

required
batch_size int

Batch size.

required
nm_path str

Path to the noise model file.

required
num_epochs int

Number of epochs to train for. If provided, this will be added to trainer_params.

100
num_steps int

Number of batches in 1 epoch. If provided, this will be added to trainer_params. Translates to limit_train_batches in PyTorch Lightning Trainer. See relevant documentation for more details.

None
augmentations list of augmentations

List of augmentations to apply, either both or one of XYFlipModel and XYRandomRotate90Model. By default, it applies both XYFlip (on X and Y) and XYRandomRotate90 (in XY) to the images.

None
independent_channels bool

Whether to train all channels independently, by default True. If True, each input channel will correspond to num_out_channels output channels (e.g., 3 input channels with num_out_channels=400 results in 1200 total output channels).

True
use_n2v2 bool

Whether to use N2V2, by default False.

False
num_in_channels int

Number of input channels.

1
num_out_channels int

Number of output channels per input channel when independent_channels is True, or total number of output channels when independent_channels is False.

400
roi_size int

N2V pixel manipulation area, by default 11.

11
masked_pixel_percentage float

Percentage of pixels masked in each patch, by default 0.2.

0.2
struct_n2v_axis Literal['horizontal', 'vertical', 'none']

Axis along which to apply structN2V mask, by default "none".

'none'
struct_n2v_span int

Span of the structN2V mask, by default 5.

5
trainer_params dict

Parameters for the trainer, see the relevant documentation.

None
logger Literal['wandb', 'tensorboard', 'none']

Logger to use, by default "none".

'none'
model_params dict

UNetModel parameters.

None
optimizer Literal['Adam', 'Adamax', 'SGD']

Optimizer to use.

"Adam"
optimizer_params dict

Parameters for the optimizer, see PyTorch documentation for more details.

None
lr_scheduler Literal['ReduceLROnPlateau', 'StepLR']

Learning rate scheduler to use.

"ReduceLROnPlateau"
lr_scheduler_params dict

Parameters for the learning rate scheduler, see PyTorch documentation for more details.

None
train_dataloader_params dict

Parameters for the training dataloader, see the PyTorch docs for DataLoader. If left as None, the dict {"shuffle": True} will be used; this is set in the GeneralDataConfig.

None
val_dataloader_params dict

Parameters for the validation dataloader, see the PyTorch docs for DataLoader. If left as None, the empty dict {} will be used; this is set in the GeneralDataConfig.

None
checkpoint_params dict

Parameters for the checkpoint callback, see PyTorch Lightning documentation (ModelCheckpoint) for the list of available parameters.

None
seed int or None

Random seed for reproducibility of N2V pixel manipulation, by default None.

None

Returns:

Type Description
Configuration

Configuration for training PN2V.

Examples:

Minimum example:

>>> config = create_pn2v_configuration(
...     experiment_name="pn2v_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     nm_path="path/to/noise_model.npz",
...     num_epochs=100
... )

You can also limit the number of batches per epoch:

>>> config = create_pn2v_configuration(
...     experiment_name="pn2v_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     nm_path="path/to/noise_model.npz",
...     num_steps=100  # limit to 100 batches per epoch
... )

To disable augmentations, simply set augmentations to an empty list:

>>> config = create_pn2v_configuration(
...     experiment_name="pn2v_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     nm_path="path/to/noise_model.npz",
...     num_epochs=100,
...     augmentations=[]
... )

A list of augmentations can be passed to the augmentations parameter:

>>> from careamics.config.augmentations import XYFlipModel
>>> config = create_pn2v_configuration(
...     experiment_name="pn2v_experiment",
...     data_type="array",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     nm_path="path/to/noise_model.npz",
...     num_epochs=100,
...     augmentations=[
...         # No rotation and only Y flipping
...         XYFlipModel(flip_x = False, flip_y = True)
...     ]
... )

To use N2V2, simply pass the use_n2v2 parameter:

>>> config = create_pn2v_configuration(
...     experiment_name="pn2v2_experiment",
...     data_type="tiff",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     nm_path="path/to/noise_model.npz",
...     num_epochs=100,
...     use_n2v2=True
... )

For structN2V, there are two parameters to set, struct_n2v_axis and struct_n2v_span:

>>> config = create_pn2v_configuration(
...     experiment_name="structpn2v_experiment",
...     data_type="tiff",
...     axes="YX",
...     patch_size=[64, 64],
...     batch_size=32,
...     nm_path="path/to/noise_model.npz",
...     num_epochs=100,
...     struct_n2v_axis="horizontal",
...     struct_n2v_span=7
... )

If you are training multiple channels, they will be trained independently by default; you simply need to specify the number of input channels. Each input channel will correspond to num_out_channels outputs (1200 total for 3 channels with default num_out_channels=400):

>>> config = create_pn2v_configuration(
...     experiment_name="pn2v_experiment",
...     data_type="array",
...     axes="YXC",
...     patch_size=[64, 64],
...     batch_size=32,
...     nm_path="path/to/noise_model.npz",
...     num_epochs=100,
...     num_in_channels=3
... )

If instead you want to train multiple channels together, you need to turn off the independent_channels parameter (resulting in 400 total output channels regardless of the number of input channels):

>>> config = create_pn2v_configuration(
...     experiment_name="pn2v_experiment",
...     data_type="array",
...     axes="YXC",
...     patch_size=[64, 64],
...     batch_size=32,
...     nm_path="path/to/noise_model.npz",
...     num_epochs=100,
...     independent_channels=False,
...     num_in_channels=3
... )

If you would like to train on CZI files, use "czi" as data_type and "SCYX" as axes for 2-D or "SCZYX" for 3-D denoising. Note that "SCYX" can also be used for 3-D data but spatial context along the Z dimension will then not be taken into account.

>>> config_2d = create_pn2v_configuration(
...     experiment_name="pn2v_experiment",
...     data_type="czi",
...     axes="SCYX",
...     patch_size=[64, 64],
...     batch_size=32,
...     nm_path="path/to/noise_model.npz",
...     num_epochs=100,
...     num_in_channels=1,
... )
>>> config_3d = create_pn2v_configuration(
...     experiment_name="pn2v_experiment",
...     data_type="czi",
...     axes="SCZYX",
...     patch_size=[16, 64, 64],
...     batch_size=16,
...     nm_path="path/to/noise_model.npz",
...     num_epochs=100,
...     num_in_channels=1,
... )