Training CAREamics
After having created a configuration and
assembled the training data, you are ready to train CAREamics. The preferred
way to train with CAREamics is to create a CAREamist.
- Here the configuration can be passed either as an object, as we have seen in the configuration section, or as a path to a configuration saved to disk.
Working directory
By default CAREamics will save the training logging data and the checkpoints in the
root directory from which it is called. However, you can pass work_dir to the
CAREamist to specify a different directory.
from careamics.careamist import CAREamist

careamist = CAREamist(
    config,
    work_dir="path/to/work_dir", # (1)!
)
- Pass a relative or absolute path.
Disabling progress bar
The PyTorch Lightning progress bar can be verbose. It can be disabled by passing
enable_progress_bar=False to the CAREamist.
Training basics
Split train and validation data
Once you have a CAREamist object, you can train CAREamics with the train method. The
data is expected to be consistent with the choices made in the configuration
and described in the data section.
Without validation data, CAREamics will split the data into training and validation internally. The amount of validation data can be set in the configuration.
careamist.train(
    train_data=train_data, # (1)!
    train_data_target=train_target, # (2)!
)
- train_data should be an array, a path to a file, a path to a folder, or a list of arrays or paths.
- For CARE and N2N, a target should be provided. It needs to be of the same type as train_data, and pairs are formed by matching the order of the data and target lists.
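Because pairs are formed purely by position, it can help to check the two lists before calling train. The following is a minimal sketch and not part of CAREamics: the helper name pair_by_order and the file names are hypothetical, and it assumes that both inputs are lists of paths.

```python
from pathlib import Path


def pair_by_order(train_files, target_files):
    """Pair inputs and targets by position, as order-based matching would.

    Hypothetical helper for illustration; it is not a CAREamics function.
    """
    if len(train_files) != len(target_files):
        raise ValueError(
            f"Got {len(train_files)} inputs but {len(target_files)} targets."
        )
    return list(zip(train_files, target_files))


# Hypothetical file names; sorting both lists keeps the order deterministic.
train_files = sorted(Path(p) for p in ["noisy/img_02.tiff", "noisy/img_01.tiff"])
target_files = sorted(Path(p) for p in ["clean/img_01.tiff", "clean/img_02.tiff"])

for source, target in pair_by_order(train_files, target_files):
    # Matching file names are a cheap sanity check that the pairs line up.
    assert source.name == target.name, f"Mismatched pair: {source} / {target}"
```

The sorted lists can then be passed as train_data and train_data_target. The name check only works if inputs and targets share file names, which is an assumption of this sketch.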
Passing a directory
If you are passing a path to a directory and the data type is tiff (or a custom
type with the required reading utilities, see custom data), then
all the files with the expected file extension in that directory (including sub-directories) will be used for
training.
Passing a directory is not compatible with CZI or Zarr data.
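To preview which files a recursive search of a directory would pick up, something like the following can be used. This is a sketch using only the standard library, not the file discovery code of CAREamics: the helper name list_tiff_files is hypothetical, and the accepted extensions (.tif and .tiff) are an assumption.

```python
import tempfile
from pathlib import Path


def list_tiff_files(directory):
    """Return all .tif/.tiff files under `directory`, sub-directories included.

    Hypothetical helper for illustration; the extensions are an assumption.
    """
    extensions = {".tif", ".tiff"}
    return sorted(
        path
        for path in Path(directory).rglob("*")
        if path.is_file() and path.suffix.lower() in extensions
    )


# Build a small throwaway directory tree to demonstrate the search.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "session_1").mkdir()
    (root / "img_a.tiff").touch()
    (root / "session_1" / "img_b.tif").touch()
    (root / "notes.txt").touch()  # ignored: not a tiff file

    found = [p.relative_to(root).as_posix() for p in list_tiff_files(root)]
    print(found)
```

Running such a check before training makes it easier to spot stray files in sub-directories that would otherwise end up in the training set.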
With validation data
When passing validation data, the only constraint is that it is of the same type as the training data. No internal split is performed in that case, so the amount of validation data is simply the amount you provide.
Custom data
In the data section we have seen two ways of specifying how to load custom
data, ReadFuncLoading and ImageStackLoading. Once either of these classes has been
defined and instantiated, they can be passed to the CAREamist to train on custom data.
- Both ReadFuncLoading and ImageStackLoading can be passed to loading.
Advanced training
Masking
Masking can be used to exclude certain regions from training, for example areas with no signal or with zero values.
CAREamics supports two methods of masking data during training:
- providing a mask of the training data that defines the regions from which training patches are sampled, or
- built-in filtering functions. See the full tutorial.
What is masked?
The mask is a set of binary images with the same size as the training data. It
should have value 1 for pixels that should be included in training and 0
for pixels that should be excluded.
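As an illustration of the expected mask format, the sketch below builds a binary mask for a synthetic image with NumPy. The image, the threshold value, and the variable names are assumptions made for this example; how the mask is then handed to CAREamics is not shown here and is covered in the full tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2D image: zero background with a brighter square in the centre.
image = np.zeros((128, 128), dtype=np.float32)
image[32:96, 32:96] = rng.normal(loc=100.0, scale=10.0, size=(64, 64))

# Binary mask with the same shape as the image: 1 = include, 0 = exclude.
# The threshold is arbitrary and chosen for this synthetic example only.
mask = (image > 50.0).astype(np.uint8)

# Basic checks: same size as the data, and only the values 0 and 1.
assert mask.shape == image.shape
assert set(np.unique(mask).tolist()) <= {0, 1}

# Fraction of pixels from which patches could be sampled.
print(f"Included fraction: {mask.mean():.2f}")
```

For real data, a threshold on raw intensity is only one possible choice; any procedure that yields 1 for pixels to keep and 0 for pixels to exclude produces a mask in the format described above.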
Passing callbacks
PyTorch Lightning provides different callbacks, and a callback interface, that can be
useful to further tune the training process. You can pass callbacks directly upon
instantiating the CAREamist object. Two callbacks are already specified in the
configuration (ModelCheckpoint
and EarlyStopping),
but you can also pass additional callbacks.
from careamics.careamist import CAREamist
from careamics.config.factories import create_advanced_care_config

config = create_advanced_care_config(
    experiment_name="care",
    data_type="array",
    axes="YX",
    patch_size=[64, 64],
    batch_size=8,
    num_epochs=30,
    checkpoint_params={
        "save_top_k": 5,
        "monitor": "val_loss",
    },
)
config.training_config.early_stopping_params = { # (1)!
    "monitor": "val_loss",
    "patience": 10,
    "mode": "min",
}

# CustomCallback stands for any user-defined PyTorch Lightning callback.
careamist = CAREamist(
    config, # (2)!
    callbacks=[
        CustomCallback(), # (3)!
    ],
)
- The early stopping callback is currently not defined via the convenience functions, but its parameters can be accessed directly.
- The configuration already contains the ModelCheckpoint and EarlyStopping callbacks.
- Any additional callback can be passed via the callbacks argument.
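CustomCallback in the example above is a placeholder. A minimal sketch of what such a callback could look like is given below, assuming the PyTorch Lightning Callback interface imported from pytorch_lightning. The class name EpochLogger and what it records are hypothetical, and the import fallback and the stand-in trainer exist only so that the sketch can be run outside of a real training session.

```python
from types import SimpleNamespace

try:
    from pytorch_lightning.callbacks import Callback
except ImportError:
    # Stand-in base class so the sketch runs without Lightning installed.
    class Callback:
        pass


class EpochLogger(Callback):
    """Record the metrics logged at the end of each training epoch.

    Hypothetical example callback; it is not provided by CAREamics.
    """

    def __init__(self):
        super().__init__()
        self.history = []

    def on_train_epoch_end(self, trainer, pl_module):
        # `callback_metrics` holds the values logged during training.
        metrics = {k: float(v) for k, v in trainer.callback_metrics.items()}
        self.history.append((trainer.current_epoch, metrics))


# Stand-in trainer used to exercise the hook outside of a real training run.
fake_trainer = SimpleNamespace(current_epoch=0, callback_metrics={"val_loss": 0.5})
logger = EpochLogger()
logger.on_train_epoch_end(fake_trainer, pl_module=None)
print(logger.history)
```

An instance of such a class could then be passed as callbacks=[EpochLogger()] when creating the CAREamist, in place of CustomCallback().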
ModelCheckpoint and EarlyStopping
ModelCheckpoint and EarlyStopping are already specified in the configuration
and instantiated by the CAREamist. To set their parameters, use the
configuration; passing them as additional callbacks will raise an error.