Logging with WandB and TensorBoard

CAREamics always writes a CSV log of the training and validation metrics (see the Logging guide). You can additionally send these metrics to Weights & Biases (WandB) or TensorBoard by passing logger="wandb" or logger="tensorboard" to one of the advanced configuration factories, such as create_advanced_n2v_config.

This tutorial covers enabling and configuring each backend.

Weights & Biases

WandB provides cloud-based experiment tracking with collaborative features. When enabled, CAREamics logs the full Configuration (via model_dump()) at run initialisation, on top of the metrics that Lightning logs automatically.

Installation

WandB requires the wandb extra:

pip install "careamics[wandb]"

Authentication

The first time you train with WandB enabled, WandB will prompt for authentication. To avoid the prompt, either run wandb login once in your shell, or set credentials and run metadata through environment variables before instantiating the CAREamist:

Configuring WandB through environment variables

import os

os.environ["WANDB_MODE"] = "offline"  # (1)!
os.environ["WANDB_PROJECT"] = "careamics-experiments"  # (2)!
os.environ["WANDB_ENTITY"] = "your-team"  # (3)!

(1) offline writes runs locally and lets you sync them later with wandb sync. Use disabled to turn WandB off entirely without changing the configuration.
(2) The WandB project under which the run is grouped.
(3) Your WandB username or team name.

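A full reference of the supported environment variables is in the WandB documentation. The API key, which WandB reads from the WANDB_API_KEY environment variable, can be obtained from https://wandb.ai/authorize.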
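The environment-variable route can be wrapped in a small helper. The following is a minimal sketch, not part of CAREamics: the function name configure_wandb_env is hypothetical, while WANDB_API_KEY, WANDB_MODE, WANDB_PROJECT and WANDB_ENTITY are variable names that WandB reads. It assumes credentials are supplied only through the environment (for example on a cluster where wandb login was never run), so it falls back to offline mode when no API key is set. Values already exported in the shell are left untouched.

```python
import os


def configure_wandb_env(project, entity=None, env=None):
    """Fill in WandB environment variables without overriding existing ones.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    env.setdefault("WANDB_PROJECT", project)
    if entity is not None:
        env.setdefault("WANDB_ENTITY", entity)
    # Assumption: no API key in the environment means no credentials at all,
    # so default to offline mode; the run can be uploaded later with `wandb sync`.
    if "WANDB_API_KEY" not in env:
        env.setdefault("WANDB_MODE", "offline")
    return env


# Call this before instantiating the CAREamist.
configure_wandb_env(project="careamics-experiments", entity="your-team")
```

If you have already authenticated with wandb login, the offline fallback is unnecessary; set WANDB_MODE explicitly or remove that branch.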

Enabling WandB

Build the configuration with the advanced factory and pass logger="wandb":

Training with WandB enabled

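import numpy as np
from careamics.careamist import CAREamist
from careamics.config.factories import create_advanced_n2v_config

# Placeholder data for illustration; replace with your own 2D (YX) image.
train_data = np.random.rand(512, 512).astype(np.float32)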

config = create_advanced_n2v_config(
    experiment_name="n2v_wandb",
    data_type="array",
    axes="YX",
    patch_size=[64, 64],
    batch_size=8,
    num_epochs=2,
    logger="wandb",  # (1)!
)
careamist = CAREamist(config)
careamist.train(train_data=train_data)

(1) WandB is added on top of the CSV logger, not in place of it. The same pattern applies to CARE and N2N through create_advanced_care_config and create_advanced_n2n_config.

TensorBoard

TensorBoard writes events locally, making it well suited to offline and HPC workflows.

Installation

TensorBoard requires the tensorboard extra:

pip install "careamics[tensorboard]"

Enabling TensorBoard

Training with TensorBoard enabled

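import numpy as np
from careamics.careamist import CAREamist
from careamics.config.factories import create_advanced_n2v_config

# Placeholder data for illustration; replace with your own 2D (YX) image.
train_data = np.random.rand(512, 512).astype(np.float32)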

config = create_advanced_n2v_config(
    experiment_name="n2v_tb",
    data_type="array",
    axes="YX",
    patch_size=[64, 64],
    batch_size=8,
    num_epochs=2,
    logger="tensorboard",  # (1)!
)
careamist = CAREamist(config)
careamist.train(train_data=train_data)

(1) TensorBoard is added on top of the CSV logger, not in place of it. The same pattern applies to CARE and N2N through create_advanced_care_config and create_advanced_n2n_config.

After training, launch the TensorBoard server and point it at the tb_logs folder inside the working directory:

tensorboard --logdir <work_dir>/tb_logs

then open http://localhost:6006/ in your browser. A custom port can be set with --port.
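When TensorBoard is launched from a script or a notebook rather than a shell, it can help to assemble the command programmatically. This is a minimal sketch under stated assumptions: tensorboard_command is a hypothetical helper, the tb_logs sub-folder is taken from the command above, and --logdir and --port are the TensorBoard command-line flags already mentioned.

```python
import shlex
from pathlib import Path


def tensorboard_command(work_dir, port=6006):
    """Return the argument list for launching TensorBoard on a run's logs."""
    # Sub-folder name taken from the command shown in this guide.
    logdir = Path(work_dir) / "tb_logs"
    return ["tensorboard", "--logdir", str(logdir), "--port", str(port)]


cmd = tensorboard_command("my_experiment", port=6007)
print(shlex.join(cmd))
# To actually start the server: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the working directory contains spaces.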

Comparing several runs

If each run is given its own work_dir under a common parent, pointing TensorBoard at the parent will show all of them together:

Sweeping a parameter with TensorBoard

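from pathlib import Path

import numpy as np
from careamics.careamist import CAREamist
from careamics.config.factories import create_advanced_n2v_config

# Placeholder data for illustration; replace with your own 2D (YX) image.
train_data = np.random.rand(512, 512).astype(np.float32)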

experiments = [
    {"name": "baseline", "patch_size": [64, 64], "batch_size": 8},
    {"name": "large_patches", "patch_size": [128, 128], "batch_size": 4},
]

base_dir = Path("tb_comparison")
for exp in experiments:
    config = create_advanced_n2v_config(
        experiment_name=exp["name"],
        data_type="array",
        axes="YX",
        patch_size=exp["patch_size"],
        batch_size=exp["batch_size"],
        num_epochs=2,
        logger="tensorboard",
    )
    careamist = CAREamist(config, work_dir=base_dir / exp["name"])  # (1)!
    careamist.train(train_data=train_data)

(1) Putting each run under a distinct work_dir is what lets TensorBoard treat them as separate, comparable experiments.

tensorboard --logdir tb_comparison
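Before opening TensorBoard on the parent folder, it can be useful to check that every run actually produced event files. The sketch below is illustrative only: find_tensorboard_runs is a hypothetical helper, and it assumes event files follow TensorBoard's usual events.out.tfevents.* naming somewhere below each run folder, without relying on the exact sub-folder layout.

```python
from pathlib import Path


def find_tensorboard_runs(parent_dir):
    """Map each run folder directly under parent_dir to its event-file count."""
    runs = {}
    for run_dir in sorted(Path(parent_dir).iterdir()):
        if not run_dir.is_dir():
            continue
        # Search recursively, since the logger may nest files in sub-folders.
        n_events = sum(1 for _ in run_dir.rglob("events.out.tfevents.*"))
        runs[run_dir.name] = n_events
    return runs


parent = Path("tb_comparison")
if parent.exists():
    for name, n_events in find_tensorboard_runs(parent).items():
        status = "ok" if n_events else "no event files found"
        print(f"{name}: {n_events} event file(s) ({status})")
```

A run reported with zero event files most likely was trained without logger="tensorboard" or written to a different work_dir.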