Configuring CAREamics
To start with CAREamics, we need to create a configuration object that holds most of the useful parameters. The configuration ensures cross-validation and coherence of the parameters, in particular avoiding sets of parameters that could trigger errors deep in the library.
A configuration can be created using any of the algorithm-specific convenience functions below. We provide a simple function with a minimum set of parameters, and an advanced function giving access to many more.
from careamics.config.factories import (
create_n2v_config,
create_care_config,
create_n2n_config,
create_advanced_n2v_config,
create_advanced_care_config,
create_advanced_n2n_config,
)
CARE and Noise2Noise
CARE and Noise2Noise configurations have the exact same set of parameters, contrary
to Noise2Void. In this section and the following, we only show the CARE configuration, but the same
applies to Noise2Noise by simply swapping create_care_config with
create_n2n_config.
Simple configuration
The simple configuration functions are designed to only expose the parameters most commonly used. This is a good starting point for most experiments.
from careamics.config.factories import create_n2v_config
# create a configuration
config = create_n2v_config(
experiment_name="n2v_training",
data_type="array",
axes="ZYX",
patch_size=[16, 64, 64], # (1)!
batch_size=8,
num_epochs=30,
)
- The length of the patch size is conditioned on the presence of the Z axis.
from careamics.config.factories import create_care_config
# create a configuration
config = create_care_config(
experiment_name="care_training",
data_type="array",
axes="ZYX",
patch_size=[16, 64, 64], # (1)!
batch_size=8,
num_epochs=30,
)
- The length of the patch size is conditioned on the presence of the Z axis.
- experiment_name: The experiment name is used in the logging and automatic model saving. It should only contain letters, numbers, underscores, dashes and spaces.
- data_type: The data type impacts other parameters and which features may be available. CAREamics supports array (when passing numpy arrays directly), tiff, zarr, czi and custom.
- axes: The axes of the data, in the order they have on disk (or in memory). This is important to identify correctly the spatial and channel dimensions. Refer to the data section for tips on how to identify axes.
- patch_size: The size of the patches to extract from the data during training. Note that the patch size only refers to spatial axes (X, Y and optionally Z). They are even and usually the same for X and Y axes. patch_size should be 2D for axes without Z, and 3D for axes with Z.
- batch_size: The number of patches to use in each training step.
- num_epochs: The number of epochs to train for. Note that in the case of large datasets, you might want to also set the number of steps parameter.
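The patch_size rules listed above can be checked before a configuration is built. The following is a minimal sketch and not CAREamics code: the helper check_patch_size is hypothetical and only encodes the two rules stated in this section (one entry per spatial axis, even values). The library performs its own validation, which may be stricter.

```python
def check_patch_size(axes: str, patch_size: list[int]) -> None:
    """Hypothetical helper mirroring the patch_size rules stated above."""
    # patch_size only covers spatial axes: 3 entries with Z, 2 without
    expected_dims = 3 if "Z" in axes else 2
    if len(patch_size) != expected_dims:
        raise ValueError(
            f"axes={axes!r} requires a {expected_dims}D patch size, "
            f"got {len(patch_size)} value(s)."
        )
    # patch sizes are expected to be even
    odd = [p for p in patch_size if p % 2 != 0]
    if odd:
        raise ValueError(f"Patch sizes must be even, got {odd}.")


check_patch_size("ZYX", [16, 64, 64])  # passes silently
check_patch_size("YX", [64, 64])  # passes silently
```

Calling the helper with a 3D patch size and axes="YX" raises a ValueError, which is the kind of mismatch the configuration is meant to catch early.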
Training with T as depth axis
If you want to use your T axis as the depth axis, simply relabel it as Z. If you are using data_type="czi", then you can also use
T as a depth axis (axes="SCTYX"), see data section
for more details.
Reducing the number of steps
Each training epoch cycles through all patches. Therefore, for large datasets, an epoch
can be quite long, so the training metrics (train loss, validation loss) are only reported infrequently. In this case, it is useful to limit the
number of steps per epoch with num_steps.
from careamics.config.factories import create_n2v_config
# create a configuration
config = create_n2v_config(
experiment_name="n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
num_steps=500, # (1)!
)
- Use a number smaller than the total number of steps (given the batch size), see notes below.
from careamics.config.factories import create_care_config
# create a configuration
config = create_care_config(
experiment_name="care_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
num_steps=500, # (1)!
)
- Use a number smaller than the total number of steps (given the batch size), see notes below.
How many steps per epoch?
Each epoch consists of n_patches / batch_size steps. The total number of steps is
shown in the console during training (here 300 steps):
While there is a programmatic way to know how many patches CAREamics would extract from the data, it is easier to simply run a training for an epoch and check the console output.
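As a rough illustration of the relationship between patches, batch size and steps, here is a minimal sketch. The helper steps_per_epoch and the figure of 2400 patches are hypothetical, chosen only so that the result matches the 300 steps mentioned above. The last, possibly incomplete, batch counts as a step unless the dataloader drops it.

```python
import math


def steps_per_epoch(n_patches: int, batch_size: int, drop_last: bool = False) -> int:
    """Number of training steps needed to see every patch once."""
    if drop_last:
        # an incomplete final batch is discarded
        return n_patches // batch_size
    # an incomplete final batch still counts as one step
    return math.ceil(n_patches / batch_size)


n_steps = steps_per_epoch(n_patches=2400, batch_size=8)
print(n_steps)  # 300
```

Any num_steps value below this number shortens the epoch.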
Advanced num_epochs and num_steps
The num_epochs and num_steps correspond to the max_epochs and
limit_train_batches parameters of the PyTorch Lightning Trainer. Refer to the
Trainer API for
details about these parameters.
Augmentations
CAREamics applies augmentations to the training patches, by default random flips in X or
Y, and random rotations by 90 degrees. In certain cases, these augmentations may not be
desirable, for example when the result of the augmentation is not a possible occurrence
in the data. In microscopy, this can happen when there are structures that always have
the same orientation, or noise with a spatial correlation. To control the augmentations, you can
use the augmentations parameter.
from careamics.config.factories import create_n2v_config
# create a configuration
config = create_n2v_config(
experiment_name="n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
augmentations=["x_flip", "y_flip", "rotate_90"], # (1)!
)
- These are all the possible choices.
from careamics.config.factories import create_care_config
# create a configuration
config = create_care_config(
experiment_name="care_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
augmentations=["x_flip", "y_flip", "rotate_90"], # (1)!
)
- These are all the possible choices.
To disable augmentations, set augmentations=[].
How are augmentations applied?
Each augmentation has a 0.5 probability of being applied to each patch. The XY flip flips the patch in either the X or the Y direction. The random 90 degree rotation applies a rotation of 90, 180 or 270 degrees (if applied). The augmentations are applied sequentially, such that a patch can be flipped in X and then rotated by 180 degrees.
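The behaviour described above can be sketched with NumPy. This is an illustration under the stated assumptions (a flip step and a rotation step, each applied with probability 0.5, one after the other), not the CAREamics implementation. The function augment is hypothetical.

```python
import numpy as np


def augment(patch: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply a random flip and a random 90-degree rotation, each with probability 0.5."""
    # flip along one randomly chosen spatial axis (Y is axis -2, X is axis -1)
    if rng.random() < 0.5:
        axis = int(rng.choice([-2, -1]))
        patch = np.flip(patch, axis=axis)
    # rotate by 90, 180 or 270 degrees in the YX plane
    if rng.random() < 0.5:
        k = int(rng.integers(1, 4))  # 1, 2 or 3 quarter turns
        patch = np.rot90(patch, k=k, axes=(-2, -1))
    return patch


rng = np.random.default_rng(42)
patch = np.arange(16).reshape(4, 4)
augmented = augment(patch, rng)
```

Both operations only rearrange pixels, so the augmented patch contains exactly the same values as the original. This is also why they can produce unrealistic images when structures have a preferred orientation.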
Channels
Channels are a particular type of axes, and they influence the way the deep-learning
model is built. As a result, when C is present in the axes, additional parameters
need to be set. These parameters vary from algorithm to algorithm.
from careamics.config.factories import create_n2v_config
# create a configuration
config = create_n2v_config(
experiment_name="n2v_training",
data_type="array",
axes="CYX", # (1)!
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
n_channels=3, # (2)!
)
- Channels are considered to be present as soon as C is in axes.
- For Noise2Void, the number of input and output channels are the same, so we only need to set n_channels.
from careamics.config.factories import create_care_config
# create a configuration
config = create_care_config(
experiment_name="care_training",
data_type="array",
axes="CYX", # (1)!
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
n_channels_in=3, # (2)!
n_channels_out=2,
)
- Channels are considered to be present as soon as C is in axes.
- For CARE/N2N, the number of input and output channels can be different, so we need to set n_channels_in and optionally n_channels_out.
Note that if n_channels_out is not set, it will be set to the same value as
n_channels_in.
Advanced channels parameters
The advanced CAREamics configuration gives access to more channel related parameters, such as sub-setting or channel independence during training. Refer to the advanced configuration section for more details.
Number of validation patches
When no validation data is provided, CAREamics will automatically split some patches
from the training data to use as validation. The number of validation patches is
then governed by the n_val_patches parameter. By default, it is set to 8.
from careamics.config.factories import create_n2v_config
# create a configuration
config = create_n2v_config(
experiment_name="n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
n_val_patches=15, # (1)!
)
- Choose an appropriate number of validation patches, depending on the size of the training data, to avoid pulling too many patches from the training data.
from careamics.config.factories import create_care_config
# create a configuration
config = create_care_config(
experiment_name="care_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
n_val_patches=15, # (1)!
)
- Choose an appropriate number of validation patches, depending on the size of the training data, to avoid pulling too many patches from the training data, while maintaining meaningful validation.
What happens when validation data is passed?
In the presence of validation data, the n_val_patches parameter is ignored and
the effective number of validation patches is determined by the size of the
validation data.
You can however limit the number of validation steps using PyTorch Lightning parameters, refer to the advanced training parameters section.
Advanced configuration
More parameters are available by using the advanced configuration convenience functions. In this section, we explore these additional parameters.
Training in memory
Where the training data resides influences the speed at which patches can be extracted,
and in turn total training time. The fastest way to train is to hold all the data in
memory. However, this is only possible when the data is small enough to fit in RAM.
Data can be loaded in memory by setting the in_memory parameter to True in the
configuration.
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="adv_n2v_training",
data_type="tiff", # (1)!
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
in_memory=True,
)
- Only array, tiff and custom are compatible with in-memory training.
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="tiff", # (1)!
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
in_memory=True,
)
- Only array, tiff and custom are compatible with in-memory training.
data_type and in_memory parameters
Only tiff and custom are compatible with in_memory=True. For array, this is
automatically set to True and cannot be set to False. For czi and zarr,
training is done by using random access to the data on disk, and in-memory
training is currently not implemented.
For more details on custom data type, refer to the data section.
Subsetting channels
When the data has channels, it is possible to train on only a subset of them by
passing a list of channel indices to the channels parameter.
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="adv_n2v_training",
data_type="array",
axes="CYX", # (1)!
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
channels=[0, 2], # (2)!
)
- For channels to be considered present, C needs to be in axes.
- Training would only be performed using two channels, the first and third, since channels are indexed starting from 0.
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="array",
axes="CYX", # (1)!
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
channels=[0, 2], # (2)!
)
- For channels to be considered present, C needs to be in axes.
- Training would only be performed using two channels, the first and third, since channels are indexed starting from 0.
Number of channels
In these examples, you might notice that n_channels/n_channels_in are not set,
although they are required when C is in axes. The reason is that when channels
is set, the number of channels is automatically inferred from channels.
In the case of CARE/N2N, n_channels_out is also set automatically to the size
of channels, but can also be set to a different value.
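To make the link between channels and the number of channels concrete, here is a small NumPy sketch. The array shape is made up for illustration, and the indexing only mimics what selecting channels 0 and 2 from a CYX array means. It is not how CAREamics extracts the channels internally.

```python
import numpy as np

data = np.zeros((3, 128, 128))  # hypothetical CYX array with 3 channels
channels = [0, 2]  # zero-based indices: first and third channels

subset = data[channels]  # index along the C axis
n_channels = len(channels)  # the value inferred when channels is set

print(subset.shape)  # (2, 128, 128)
print(n_channels)  # 2
```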
Channel independence
By default, channels are trained independently (independent_channels=True). This means that the channels do not inform each other during training. For algorithms such as Noise2Void, this may be desirable as the noise is uncorrelated between channels. To disable channel independence, set independent_channels=False.
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="adv_n2v_training",
data_type="array",
axes="CYX", # (1)!
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
n_channels=3, # (2)!
independent_channels=False, # (3)!
)
- C must be in axes.
- We need to specify the number of channels.
- Set the channels to inform each other.
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="array",
axes="CYX", # (1)!
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
n_channels_in=3, # (2)!
independent_channels=False, # (3)!
)
- C must be in axes.
- We need to specify the number of channels. You may also set n_channels_out to a different value if you want the output channels to be different from the input channels.
- Set the channels to inform each other.
What does channel independence mean?
In effect, training the channels independently is equivalent to training a separate model for each channel.
Normalization
CAREamics offers various normalization methods that can be set using the normalization
and normalization_params parameters. The normalization is applied to any patch or
image before applying the model, therefore it is applied in both training and prediction.
For more details on the available normalization methods and their parameters, refer to the
Code reference section.
The various normalizations are the following:
| Method | Optional parameters |
|---|---|
| mean_std | per_channel=True, input_means, input_stds, target_means, target_stds |
| quantile | per_channel=True, lower_quantile=[0.01], upper_quantile=[0.99], input_lower_quantile_values, input_upper_quantile_values, target_lower_quantile_values, target_upper_quantile_values |
| minmax | per_channel=True, input_mins, input_maxes, target_mins, target_maxes |
| none | No parameters |
Noise2Void and targets
While normalization methods have parameters for the targets, there is no target in Noise2Void and these parameters will not be used.
The quantile normalization has two parameters, lower_quantile and upper_quantile, that are set by default to 0.01 and 0.99 respectively. These parameters can be changed by passing them in normalization_params as a dictionary. By default, mean_std normalization is applied, which corresponds to zero-mean, unit-variance normalization. To disable normalization, set normalization="none".
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="adv_n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
normalization="quantile", # (1)!
normalization_params={"lower_quantiles": [0.01], "upper_quantiles": [0.99]}, # (2)!
)
- The normalization method is chosen by passing normalization.
- It is possible to change default parameters (here for quantile). Alternatively, pre-computed normalization parameters can be passed here.
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
normalization="quantile", # (1)!
normalization_params={"lower_quantiles": [0.01], "upper_quantiles": [0.99]}, # (2)!
)
- The normalization method is chosen by passing normalization.
- It is possible to change default parameters (here for quantile). Alternatively, pre-computed normalization parameters can be passed here.
Passing pre-computed normalization parameters
If the normalization parameters (e.g. input_means and input_stds) are None in the configuration, then the dataset will compute them over the entire training dataset.
However, if you have pre-computed normalization parameters, you can pass them in the configuration using the normalization_params parameter. This can save time during training, as the dataset will not need to compute them.
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="adv_n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
normalization="mean_std", # (1)!
normalization_params={"input_means": [158.95], "input_stds": [10.21]}, # (2)!
)
- Here we are selecting the zero-mean unit-variance normalization, which is the default.
- We pass pre-computed parameters.
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
normalization="mean_std", # (1)!
normalization_params={
"input_means": [158.95], # (2)!
"input_stds": [10.21],
"target_means": [121.35], # (3)!
"target_stds": [15.78],
},
)
- Here we are selecting the zero-mean unit-variance normalization, which is the default.
- We pass pre-computed parameters.
- In CARE and N2N, we can also pass pre-computed target statistics. For N2N they should actually be the same as the input ones.
In the presence of channels, all parameters that are passed to normalization_params should have a value for each channel, unless per_channel is set to False. By default, per_channel is set to True.
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="adv_n2v_training",
data_type="array",
axes="CYX", # (1)!
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
n_channels=2, # (2)!
normalization="quantile",
normalization_params={
"lower_quantiles": [0.01, 0.03], # (3)!
"upper_quantiles": [0.99, 0.99],
"per_channel": True, # (4)!
},
)
- We have channels.
- We need to set the number of channels.
- Here we have to pass as many values as there are channels for each parameter.
- Note that if we set per_channel to False, a single value is expected in the other parameters.
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="array",
axes="CYX", # (1)!
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
n_channels_in=2, # (2)!
normalization="quantile",
normalization_params={
"lower_quantiles": [0.01, 0.03], # (3)!
"upper_quantiles": [0.99, 0.99],
"per_channel": True, # (4)!
},
)
- We have channels.
- We need to set the number of channels.
- Here we have to pass as many values as there are channels for each parameter.
- Note that if we set per_channel to False, a single value is expected in the other parameters.
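Pre-computed per-channel statistics such as input_means and input_stds can be obtained with NumPy before creating the configuration. The sketch below uses synthetic CYX data and is only an illustration of per-channel, zero-mean, unit-variance normalization. How CAREamics computes and applies the statistics internally may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic CYX training data with 2 channels (values are made up)
train = rng.normal(loc=150.0, scale=10.0, size=(2, 64, 64))

# one value per channel: reduce over the spatial axes only
input_means = train.mean(axis=(1, 2)).tolist()
input_stds = train.std(axis=(1, 2)).tolist()

# dictionary in the shape expected by normalization_params for mean_std
normalization_params = {
    "input_means": input_means,
    "input_stds": input_stds,
    "per_channel": True,
}

# zero-mean, unit-variance normalization applied channel by channel
means = np.array(input_means).reshape(-1, 1, 1)
stds = np.array(input_stds).reshape(-1, 1, 1)
normalized = (train - means) / stds
```

After this step each channel of normalized has a mean close to 0 and a standard deviation close to 1, which is the meaning of the mean_std method.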
Choosing a logger
By default, CAREamics uses the CSV logger from PyTorch Lightning, saving all the losses and metrics to a CSV file. In addition, we can use more advanced logging tools, such as WandB or TensorBoard. To use these loggers, simply set the logger parameter to "wandb" or "tensorboard".
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="adv_n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
logger="wandb", # (1)!
)
- We choose WandB in addition to the CSV logger.
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
logger="wandb", # (1)!
)
- We choose WandB in addition to the CSV logger.
Number of workers
The num_workers parameter controls the number of workers used to load the data during training. It can be used to optimize data loading performance.
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="adv_n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
num_workers=4, # (1)!
)
- We set the number of workers for data loading.
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
num_workers=4, # (1)!
)
- We set the number of workers for data loading.
Which value to choose?
A general rule of thumb is to set the number of workers to the number of CPU cores available. num_workers=0 means that the data loading will be done in the main process, which can be a bottleneck but can also be necessary in certain environments (e.g. Windows without the possibility to use multi-processing).
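Following this rule of thumb, the number of available cores can be queried from the standard library. This is a minimal sketch. The fallback to 0 (loading in the main process) when the count is unknown is a choice made for this example.

```python
import os

# os.cpu_count() may return None when the count cannot be determined
num_workers = os.cpu_count() or 0
print(num_workers)
```

The resulting value can then be passed as num_workers in the configuration.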
Setting a seed
Setting a seed fixes the series of random choices happening during training, making it possible to reproduce the same training run. To set a seed, simply set the seed parameter to an integer value.
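The effect of a seed can be illustrated with the standard library alone. The sketch below is not CAREamics code: the function draw_augmentations is hypothetical and simply draws one "apply or skip" decision per patch, the kind of random choice that a seed makes repeatable.

```python
import random


def draw_augmentations(seed: int, n_patches: int = 5) -> list[bool]:
    """Draw one apply/skip decision per patch from a seeded generator."""
    rng = random.Random(seed)
    return [rng.random() < 0.5 for _ in range(n_patches)]


run_a = draw_augmentations(seed=42)
run_b = draw_augmentations(seed=42)

# the same seed yields the same sequence of decisions
print(run_a == run_b)  # True
```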
Noise2Void flavours and parameters
Noise2Void has additional parameters that in some cases can make a difference. Furthermore, N2V2 and structN2V are two compatible variants of Noise2Void that address particular shortcomings and limitations.
Running N2V2
N2V2 is a variant of Noise2Void that mitigates checkerboard artefacts that arise in Noise2Void (e.g. in the presence of salt and pepper noise, or hot pixels). N2V2 can be enabled by simply setting use_n2v2=True in the configuration.
from careamics.config.factories import create_n2v_config
# create a configuration
config = create_n2v_config(
experiment_name="n2v2_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
use_n2v2=True,
)
structN2V
structN2V is a variant of Noise2Void that is designed to better handle structured noise, such as line artefacts. It does so by masking out a line of pixels instead of a single pixel during training. To use structN2V, set the struct_n2v_axis and struct_n2v_span parameters.
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="struct_n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
struct_n2v_axis="horizontal", # (1)!
struct_n2v_span=5, # (2)!
)
- Choices are horizontal or vertical.
- The number of pixels to mask out on each side of the pixel masked by Noise2Void.
Noise2Void parameters
Noise2Void has two parameters that can be set in the configuration: roi_size and masked_pixel_percentage. A good understanding of the Noise2Void algorithm is necessary to understand the effect of these parameters. Refer to the Noise2Void algorithm section for more details.
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
roi_size=13, # (1)!
masked_pixel_percentage=0.2, # (2)!
)
- The roi size is the region around the masked pixels from which replacement values are pulled.
- The percentage of pixels to mask in each patch during training.
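To get a feeling for masked_pixel_percentage, the expected number of masked pixels per patch can be estimated with simple arithmetic. The sketch assumes that the value is expressed in percent (so 0.2 means 0.2 % of the pixels). The exact number of pixels masked by the library in a given patch may differ from this estimate.

```python
import math

patch_size = [64, 64]
masked_pixel_percentage = 0.2  # assumed to be in percent

n_pixels = math.prod(patch_size)  # 64 * 64 = 4096 pixels per patch
expected_masked = n_pixels * masked_pixel_percentage / 100

print(round(expected_masked, 2))  # 8.19
```

With these values, only around eight pixels per patch contribute to the loss at each step, which is one reason why Noise2Void typically needs many training steps.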
PyTorch and Lightning Parameters
Since CAREamics uses PyTorch Lightning under the hood, many of the parameters that can be used to tune the behaviour of the training can be set through our API.
Model parameters
While not directly Lightning parameters, the parameters of the UNet model used by CAREamics can be passed to the configuration via model_params. Refer to the Code reference for more details about which parameters are available.
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
model_params={
"depth": 4,
},
)
Some parameters of the model are set automatically based on the parameters given to the configuration. Here is a list of parameters that can only be set via model_params:
- depth: number of levels in the UNet, by default 2.
- num_channels_init: number of convolutional filters in the first layer of the UNet, by default 32.
- residual: whether to add a residual connection from the input to the output, by default False.
- use_batch_norm: whether to use batch normalization in the model.
Trainer parameters
The trainer parameters allow setting complex behaviours during training, from when to stop to advanced gradient manipulation. Refer to the Trainer documentation for a list of parameters, their meaning and accepted values.
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="adv_n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
trainer_params={}, # (1)!
)
- Pass parameters as a dictionary to trainer_params.
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
trainer_params={}, # (1)!
)
- Pass parameters as a dictionary to trainer_params.
max_epochs and limit_train_batches parameters
The CAREamics configuration defines num_epochs and num_steps parameters that are passed to the Trainer as max_epochs and limit_train_batches. If max_epochs and limit_train_batches are passed to trainer_params, their values will be overwritten by num_epochs and num_steps.
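The precedence rule can be pictured as a dictionary update. The helper resolve_trainer_params below is hypothetical and only mirrors the behaviour described above. gradient_clip_val is used as an example of a Trainer parameter that passes through unchanged.

```python
from typing import Optional


def resolve_trainer_params(
    num_epochs: int,
    num_steps: Optional[int],
    trainer_params: dict,
) -> dict:
    """Hypothetical helper: num_epochs and num_steps take precedence."""
    resolved = dict(trainer_params)  # copy, leave the input untouched
    resolved["max_epochs"] = num_epochs
    if num_steps is not None:
        resolved["limit_train_batches"] = num_steps
    return resolved


resolved = resolve_trainer_params(
    num_epochs=30,
    num_steps=500,
    trainer_params={"max_epochs": 100, "gradient_clip_val": 0.5},
)
print(resolved)
# {'max_epochs': 30, 'gradient_clip_val': 0.5, 'limit_train_batches': 500}
```

In other words, max_epochs=100 passed through trainer_params has no effect here. The number of epochs should be changed through num_epochs instead.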
Optimizer parameters
The optimizer governs how the model parameters are updated during training. By default, CAREamics uses the Adam optimizer. To change the optimizer or its parameters, use the optimizer and optimizer_params parameters in the configuration. Refer to the PyTorch optimizer page for a list of optimizers and their parameters.
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="adv_n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
optimizer="Adam", # (1)!
optimizer_params={ # (2)!
"lr": 1e-4,
},
)
- Choose the optimizer.
- Pass parameters as a dictionary to optimizer_params.
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
optimizer="Adam", # (1)!
optimizer_params={ # (2)!
"lr": 1e-4,
},
)
- Choose the optimizer.
- Pass parameters as a dictionary to optimizer_params.
Supported optimizers
Note that CAREamics currently only supports Adam, SGD and Adamax.
Learning rate schedulers
Learning rate schedulers allow changing the learning rate during training, which can be useful to improve training performance. To use a learning rate scheduler, set the lr_scheduler and lr_scheduler_params parameters in the configuration. Refer to the PyTorch learning rate scheduler page for a list of learning rate schedulers and their parameters.
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="adv_n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
lr_scheduler="StepLR", # (1)!
lr_scheduler_params={ # (2)!
"step_size": 10,
"gamma": 0.1,
},
)
- Choose the learning rate scheduler.
- Pass parameters as a dictionary to lr_scheduler_params.
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
lr_scheduler="StepLR", # (1)!
lr_scheduler_params={ # (2)!
"step_size": 10,
"gamma": 0.1,
},
)
- Choose the learning rate scheduler.
- Pass parameters as a dictionary to lr_scheduler_params.
Supported learning rate schedulers
Note that CAREamics currently only supports ReduceLROnPlateau and StepLR.
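To see what the StepLR parameters used in the examples imply, the schedule can be written in closed form: the learning rate is multiplied by gamma every step_size epochs. The function step_lr below is a plain-Python sketch of that rule, not a call into PyTorch. The initial learning rate of 1e-4 is taken from the optimizer example as an assumption.

```python
def step_lr(initial_lr: float, epoch: int, step_size: int, gamma: float) -> float:
    """Learning rate after `epoch` epochs under a StepLR-style schedule."""
    # the rate is multiplied by gamma once every step_size epochs
    return initial_lr * gamma ** (epoch // step_size)


# learning rate at a few epochs of a 30-epoch training
for epoch in (0, 9, 10, 20, 29):
    print(epoch, step_lr(1e-4, epoch, step_size=10, gamma=0.1))
```

With step_size=10 and gamma=0.1, the learning rate stays at its initial value for epochs 0 to 9, drops by a factor of 10 at epoch 10 and again at epoch 20.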
Dataloader parameters
PyTorch dataloaders have various parameters that can be set to optimize data loading performance. To set these parameters, use the train_dataloader_params or val_dataloader_params parameters in the configuration. Refer to the PyTorch dataloader page for a list of dataloader parameters and their meaning.
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="adv_n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
train_dataloader_params={
"shuffle": True,
"drop_last": True,
},
)
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
train_dataloader_params={
"shuffle": True,
"drop_last": True,
},
)
shuffle in train_dataloader_params
If passing train_dataloader_params, then shuffle needs to be present. That is not the case for the validation dataloader. The reason is that CAREamics automatically uses shuffle=True (the advised setting) for the training dataloader. If you wish to override the CAREamics training dataloader parameters, you need to specify which shuffle value you desire.
Checkpoint callback
A checkpoint is the state of the training and of the model at a particular time point during training. In particular, the final model is also saved as a checkpoint. There are several Lightning checkpoint parameters that govern the behavior of the checkpointing.
If you want to override the CAREamics defaults, set checkpoint_params.
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="adv_n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
checkpoint_params={
"monitor": "val_loss",
"mode": "min",
"save_top_k": 3,
},
)
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
checkpoint_params={
"monitor": "val_loss",
"mode": "min",
"save_top_k": 3,
},
)
Default checkpointing behavior
By default, CARE saves the top 3 best models according to the validation loss, and the last model at the end of training.
Noise2Void, on the other hand, saves a checkpoint every 10 epochs, as well as the last one. The reason behind this choice is that Noise2Void trains by comparing predicted denoised pixels with their original noisy values. As a consequence, a lower validation loss does not ensure a better model.
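The "top 3 plus last" behaviour can be sketched in a few lines. The validation losses below are made up, and the selection logic is a simplified illustration of what monitor="val_loss", mode="min" and save_top_k=3 mean. It does not reproduce how Lightning manages checkpoint files on disk.

```python
# epoch -> validation loss (made-up values)
val_losses = {0: 0.52, 1: 0.41, 2: 0.45, 3: 0.38, 4: 0.40}
save_top_k = 3

# mode="min": keep the epochs with the lowest validation loss
best_epochs = sorted(val_losses, key=val_losses.get)[:save_top_k]
last_epoch = max(val_losses)

# checkpoints kept at the end of training: the best k and the last one
kept = sorted(set(best_epochs) | {last_epoch})
print(kept)  # [1, 3, 4]
```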
Early stopping callback
The early stopping callback allows stopping the training when a certain metric has not improved for a certain number of epochs. To set the early stopping parameters, use the early_stopping_params parameter in the configuration. Refer to the Lightning early stopping callback page for a list of parameters and their meaning.
from careamics.config.factories import create_advanced_care_config
# create a configuration
config = create_advanced_care_config(
experiment_name="adv_care_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
early_stopping_params={
"monitor": "val_loss",
"mode": "min",
"patience": 5,
},
)
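The role of patience in the example above can be illustrated with a simplified loop. The function stopping_epoch is hypothetical and ignores other options of the Lightning callback (such as a minimum improvement threshold). It only shows that training stops once the monitored loss has failed to improve for patience consecutive epochs.

```python
from typing import Optional


def stopping_epoch(val_losses: list[float], patience: int) -> Optional[int]:
    """Return the epoch at which training would stop, or None if it never does."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:  # mode="min": lower is better
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# made-up validation losses: the best value (0.40) is reached at epoch 2
losses = [0.50, 0.42, 0.40, 0.41, 0.43, 0.40, 0.42, 0.41]
print(stopping_epoch(losses, patience=5))  # 7
```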
Early stopping and Noise2Void
Early stopping is not recommended for Noise2Void, as the validation loss does not necessarily reflect the quality of the model. By default, early stopping is disabled for Noise2Void.
Noise2Void without validation
Since validation is not strictly necessary for Noise2Void, it is possible to train without validation data and without automatic splitting of the training data, thus making the overall training faster. To do so, a few parameters need to be set in the configuration.
from careamics.config.factories import create_advanced_n2v_config
# create a configuration
config = create_advanced_n2v_config(
experiment_name="adv_n2v_training",
data_type="array",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
n_val_patches=0, # (1)!
monitor_metric="train_loss_epoch", # (2)!
)
- We tell the train/val splitting module to split 0 validation patches.
- We set the monitoring of the learning rate scheduler to train_loss_epoch to avoid an error.
Removing validation in other algorithms
Do not remove validation for CARE! In supervised training, validation is critical to assess whether the network has trained meaningfully.
Saving and loading
Configurations are automatically saved with the checkpoints, but we can nonetheless manually save and load them.
from careamics.config.factories import create_n2v_config
from careamics.config.utils.configuration_io import (
save_configuration,
load_configuration,
)
# create a configuration
config = create_n2v_config(
experiment_name="n2v",
data_type="tiff",
axes="YX",
patch_size=[64, 64],
batch_size=8,
num_epochs=30,
)
# save the configuration
config_path = save_configuration(config, "careamics_config.yml")
# load the configuration
loaded_config = load_configuration(config_path)