Training CAREamics

After having created a configuration and assembled the training data, you are ready to train CAREamics. The preferred way to train with CAREamics is to create a CAREamist.

from careamics.careamist import CAREamist

careamist = CAREamist(config)  # (1)!
  1. The configuration can be passed either as an object, as shown in the configuration section, or as a path to a configuration saved to disk.

Working directory

By default, CAREamics saves the training logs and the checkpoints in the directory from which it is called. However, you can pass work_dir to the CAREamist to specify a different directory.

from careamics.careamist import CAREamist

careamist = CAREamist(
    config,
    work_dir="path/to/work_dir",  # (1)!
)
  1. Pass a relative or absolute path.
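
If you prefer to control where a relative path ends up, you can resolve it and create the directory before passing it on. The sketch below is a minimal illustration using only the standard library; `prepare_work_dir` is a hypothetical helper, not part of CAREamics, and whether the CAREamist creates missing directories on its own is not assumed here.

```python
from pathlib import Path


def prepare_work_dir(path):
    """Return an absolute work directory path, creating it if needed.

    Hypothetical helper for illustration; not part of CAREamics.
    """
    work_dir = Path(path).expanduser().resolve()
    # parents=True creates intermediate folders, exist_ok=True tolerates reruns
    work_dir.mkdir(parents=True, exist_ok=True)
    return work_dir
```

The returned path can then be passed as work_dir when creating the CAREamist.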

Disabling progress bar

The PyTorch Lightning progress bar can be verbose. It can be disabled by passing enable_progress_bar=False to the CAREamist.

careamist = CAREamist(config, enable_progress_bar=False)

Training basics

Split train and validation data

Once you have a CAREamist object, you can train CAREamics with the train method. The data must be consistent with the choices made in the configuration (see the data section).

If no validation data is provided, CAREamics splits the data into training and validation sets internally. The amount of validation data can be set in the configuration.
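
To give an intuition of what such an internal split amounts to, the sketch below holds out a fraction of patches for validation. It is only an illustration: `split_train_val` and its parameters `val_fraction`, `min_val` and `seed` are hypothetical names, not the CAREamics API, and the actual behaviour is governed by the configuration.

```python
import random


def split_train_val(patches, val_fraction=0.1, min_val=1, seed=0):
    """Hold out a random subset of patches for validation (illustrative only)."""
    if not 0 < val_fraction < 1:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    # Keep at least `min_val` patches for validation
    n_val = max(min_val, int(len(patches) * val_fraction))
    if n_val >= len(patches):
        raise ValueError("not enough patches to hold out a validation set")
    indices = list(range(len(patches)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    val_idx = set(indices[:n_val])
    train = [p for i, p in enumerate(patches) if i not in val_idx]
    val = [p for i, p in enumerate(patches) if i in val_idx]
    return train, val
```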

Training Noise2Void
careamist.train(train_data=train_data)  # (1)!
  1. train_data should be an array, a path to a file, a path to a folder, or a list of arrays or paths.
Training CARE
careamist.train(
    train_data=train_data,  # (1)!
    train_data_target=train_target,  # (2)!
)
  1. train_data should be an array, a path to a file, a path to a folder, or a list of arrays or paths.
  2. For CARE and N2N, a target should be provided. It needs to be of the same type as train_data and pairs are formed by matching the order of the data and target lists.
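
Because pairs are formed by position, it can be worth checking the two lists before training. The following sketch sorts both file lists the same way and pairs them by index; `pair_by_order` and the file names are hypothetical and only illustrate the matching rule described above.

```python
from pathlib import Path


def pair_by_order(train_paths, target_paths):
    """Pair each training file with the target at the same position."""
    if len(train_paths) != len(target_paths):
        raise ValueError(
            f"got {len(train_paths)} training files but {len(target_paths)} targets"
        )
    return list(zip(train_paths, target_paths))


# Sorting both lists the same way keeps the pairing stable across runs.
train_files = sorted(Path(p) for p in ["noisy/img_02.tiff", "noisy/img_01.tiff"])
target_files = sorted(Path(p) for p in ["clean/img_02.tiff", "clean/img_01.tiff"])
pairs = pair_by_order(train_files, target_files)
```

Comparing the file names within each pair, as a quick sanity check, catches most ordering mistakes.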

Passing a directory

If you pass a path to a directory and the data type is tiff (or a custom type with the required reading utilities, see custom data), then all files with the expected file extension in that directory (including sub-directories) are used for training.

Passing a directory is not compatible with CZI or Zarr data.
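
To preview which files a directory would contribute, you can list them yourself. The sketch below walks a folder recursively with the standard library; `list_tiff_files` is a hypothetical helper and the extensions `.tif` and `.tiff` are an assumption, so the files actually picked up by CAREamics may differ.

```python
from pathlib import Path


def list_tiff_files(root, extensions=(".tif", ".tiff")):
    """Recursively collect files under `root` whose suffix is in `extensions`.

    Illustrative only; the extension list is an assumption.
    """
    root = Path(root)
    if not root.is_dir():
        raise NotADirectoryError(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )
```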

With validation data

When passing validation data, the only constraint is that it is of the same type as the training data. All of the provided validation data is used, so its size determines the amount of validation.

Training Noise2Void with validation
careamist.train(
    train_data=train_data,
    val_data=val_data,  # (1)!
)
  1. Validation is passed to val_data.
Training CARE with validation
careamist.train(
    train_data=train_data,
    train_data_target=train_target,
    val_data=val_data,  # (1)!
    val_data_target=val_target,  # (2)!
)
  1. Validation is passed to val_data.
  2. Target validation should be provided as well.
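
The constraints above can be checked before calling train. The sketch below is a hypothetical pre-flight check, not something CAREamics provides: it compares Python types only, which is a crude stand-in for "same type of data", and it requires targets to be given for both training and validation or for neither.

```python
def check_training_inputs(train_data, train_target=None, val_data=None, val_target=None):
    """Raise early if training and validation inputs are inconsistent.

    Hypothetical helper for illustration; it does not inspect file contents.
    """
    if val_data is None:
        if val_target is not None:
            raise ValueError("val_target was given without val_data")
        return
    if type(val_data) is not type(train_data):
        raise TypeError(
            f"val_data is {type(val_data).__name__}, "
            f"but train_data is {type(train_data).__name__}"
        )
    if (train_target is None) != (val_target is None):
        raise ValueError("provide targets for both training and validation, or for neither")
```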

Custom data

In the data section we have seen two ways of specifying how to load custom data, ReadFuncLoading and ImageStackLoading. Once either of these classes has been defined and instantiated, it can be passed to the CAREamist to train on custom data.

Training on custom data
careamist.train(
    train_data=train_data,
    loading=read_func,  # (1)!
)
  1. Both ReadFuncLoading and ImageStackLoading can be passed to loading.
Training on custom data with a target
careamist.train(
    train_data=train_data,
    train_data_target=train_target,
    loading=read_func,  # (1)!
)
  1. Both ReadFuncLoading and ImageStackLoading can be passed to loading.

Advanced training

Masking

Masking can be used to exclude certain regions from training, for example areas with no signal or with zero values.

CAREamics supports two methods of masking data during training:

  • providing a mask of the training data that defines the regions from which training patches are sampled, or
  • built-in filtering functions. See the full tutorial.
Specifying a mask for Noise2Void training
careamist.train(
    train_data=train_data,
    filtering_mask=mask_data,  # (1)!
)
  1. The mask is passed alongside the data.
Specifying a mask for CARE training
careamist.train(
    train_data=train_data,
    train_data_target=train_target,
    filtering_mask=mask_data,  # (1)!
)
  1. The mask is passed alongside the data.

What is masked?

The mask is a set of binary images with the same size as the training data. It should have value 1 for pixels that should be included in the training and 0 for pixels that should be excluded.
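
As an illustration of this convention, the sketch below builds a mask by thresholding a small image and checks that it has the same shape as the image and only contains 0 and 1. Nested lists are used to keep the example dependency-free (in practice you would likely use arrays); `make_binary_mask`, `check_mask` and the threshold value are hypothetical.

```python
def make_binary_mask(image, threshold):
    """Return a mask with 1 where the pixel exceeds `threshold`, else 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


def check_mask(image, mask):
    """Check that the mask matches the image shape and is strictly binary."""
    same_shape = len(mask) == len(image) and all(
        len(m_row) == len(i_row) for m_row, i_row in zip(mask, image)
    )
    binary = all(value in (0, 1) for row in mask for value in row)
    return same_shape and binary


image = [
    [0, 0, 12, 40],
    [0, 3, 55, 61],
    [0, 0, 0, 47],
]
# Pixels at or below the threshold (background) are excluded from training
mask = make_binary_mask(image, threshold=10)
```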

Passing callbacks

PyTorch Lightning provides a number of callbacks, as well as a callback interface, that can be useful to further tune the training process. You can pass callbacks directly when instantiating the CAREamist object. Two callbacks are already specified in the configuration (ModelCheckpoint and EarlyStopping), but you can also pass additional callbacks.

Passing callbacks
from careamics.careamist import CAREamist
from careamics.config.factories import create_advanced_care_config

config = create_advanced_care_config(
    experiment_name="care",
    data_type="array",
    axes="YX",
    patch_size=[64, 64],
    batch_size=8,
    num_epochs=30,
    checkpoint_params={
        "save_top_k": 5,
        "monitor": "val_loss",
    },
)
config.training_config.early_stopping_params = {  # (1)!
    "monitor": "val_loss",
    "patience": 10,
    "mode": "min",
}

careamist = CAREamist(
    config,  # (2)!
    callbacks=[
        CustomCallback(),  # (3)!
    ],
)
  1. The early stopping callback currently cannot be set via the convenience functions, but its parameters can be set directly on the configuration.
  2. The configuration already contains the ModelCheckpoint and EarlyStopping callbacks.
  3. Any additional callback can be passed via the callbacks argument.
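
The example above uses CustomCallback without defining it. A minimal sketch of what such a callback could look like is given below: it subclasses the PyTorch Lightning Callback class and records the index of each finished training epoch. The class body is purely illustrative, and the fallback base class is only there so that the snippet runs when PyTorch Lightning is not installed.

```python
from types import SimpleNamespace

try:
    from pytorch_lightning import Callback
except ImportError:
    # Stand-in so the sketch runs without PyTorch Lightning installed
    class Callback:
        """Minimal placeholder for pytorch_lightning.Callback."""


class CustomCallback(Callback):
    """Record the index of every finished training epoch (illustrative)."""

    def __init__(self):
        super().__init__()
        self.finished_epochs = []

    def on_train_epoch_end(self, trainer, pl_module):
        # Lightning calls this hook at the end of each training epoch
        self.finished_epochs.append(trainer.current_epoch)


# Simulate two epochs with a stand-in trainer object instead of a real Trainer
callback = CustomCallback()
for epoch in range(2):
    callback.on_train_epoch_end(SimpleNamespace(current_epoch=epoch), pl_module=None)
```

During real training the hook is invoked by the trainer, so the manual loop at the end is only needed to try the callback in isolation.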

ModelCheckpoint and EarlyStopping

ModelCheckpoint and EarlyStopping are already specified in the configuration and instantiated by the CAREamist. To set their parameters, use the configuration; passing them again as callbacks raises an error.