LVAE
LadderVAE
Bases: Module
Constructor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_shape` | `int` | The size of the input image. | required |
| `output_channels` | `int` | The number of output channels. | required |
| `multiscale_count` | `int` | The number of scales for multiscale processing. | required |
| `z_dims` | `list[int]` | The dimensions of the latent space for each layer. | required |
| `encoder_n_filters` | `int` | The number of filters in the encoder. | required |
| `decoder_n_filters` | `int` | The number of filters in the decoder. | required |
| `encoder_conv_strides` | `list[int]` | The strides for the convolutional layers of the encoder. | required |
| `decoder_conv_strides` | `list[int]` | The strides for the convolutional layers of the decoder. | required |
| `encoder_dropout` | `float` | The dropout rate for the encoder. | required |
| `decoder_dropout` | `float` | The dropout rate for the decoder. | required |
| `nonlinearity` | `str` | The nonlinearity function to use. | required |
| `predict_logvar` | `bool` | Whether to predict the log variance. | required |
| `analytical_kl` | `bool` | Whether to use the analytical KL divergence. | required |
Raises:
| Type | Description |
|---|---|
| `NotImplementedError` | If a convolution dimensionality other than 2D is requested; only 2D convolutions are supported. |
image_size = input_shape
instance-attribute
Input image size. (Z, Y, X) or (Y, X) if the data is 2D.
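To show how these arguments fit together, here is a hypothetical set of keyword arguments for a two-layer 2D model. All values are illustrative only; note that the table above lists `input_shape` as `int`, while the `image_size` attribute describes a `(Y, X)` tuple, so check the actual signature before use.

```python
# Illustrative constructor arguments for a two-layer 2D LadderVAE.
# Values are examples only, not recommended defaults.
lvae_kwargs = {
    "input_shape": (64, 64),        # (Y, X) for 2D data; may be an int in some versions
    "output_channels": 2,
    "multiscale_count": 1,          # no lateral-context inputs beyond the first layer
    "z_dims": [128, 128],           # one latent dimensionality per ladder layer
    "encoder_n_filters": 64,
    "decoder_n_filters": 64,
    "encoder_conv_strides": [2, 2],
    "decoder_conv_strides": [2, 2],
    "encoder_dropout": 0.1,
    "decoder_dropout": 0.1,
    "nonlinearity": "ELU",
    "predict_logvar": False,
    "analytical_kl": False,
}
```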
bottomup_pass(inp)
Wrapper around `_bottomup_pass()`.
create_bottom_up_layers(lowres_separate_branch)
Create the stack of bottom-up layers of the Encoder that are used to generate the so-called `bu_values`.
NOTE:
If `self._multiscale_count < self.n_layers`, then LC (lateral context) is done only in the first
`self._multiscale_count` bottom-up layers (starting from the bottom).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `lowres_separate_branch` | `bool` | Whether the residual block(s) used for encoding the low-res input are shared (…). | required |
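The NOTE above can be restated as a tiny sketch; `layers_with_lateral_context` is an illustrative helper, not part of the class:

```python
def layers_with_lateral_context(multiscale_count, n_layers):
    """Indices of bottom-up layers that receive lateral context (LC):
    only the first `multiscale_count` layers, counting from the bottom,
    when multiscale_count < n_layers. Illustrative sketch only."""
    return list(range(min(multiscale_count, n_layers)))
```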
create_final_topdown_layer(upsample)
Create the final top-down layer of the Decoder.
NOTE: In this layer, (optional) upsampling is performed by bilinear interpolation instead of transposed convolution (like in other TD layers).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `upsample` | `bool` | Whether to upsample the input of the final top-down layer by bilinear interpolation with … | required |
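To illustrate how this differs from a transposed convolution: bilinear interpolation has no learned weights, so new samples are plain interpolations of their neighbours. A toy 1-D sketch, not the actual implementation:

```python
def linear_upsample_1d(x, factor=2):
    """Toy 1-D analogue of bilinear upsampling by a factor of 2:
    keep each sample and insert the midpoint between neighbours
    (the last sample is repeated). No parameters are learned,
    unlike a transposed convolution."""
    assert factor == 2, "sketch only handles factor 2"
    out = []
    for i, v in enumerate(x):
        out.append(v)
        nxt = x[i + 1] if i + 1 < len(x) else v
        out.append((v + nxt) / 2)
    return out
```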
create_first_bottom_up(init_stride, num_res_blocks=1)
Create the first bottom-up block of the Encoder.
Its role is to perform an initial image compression step. It is composed of a sequence of `nn.Conv2d` + non-linearity + `BottomUpDeterministicResBlock` (1 or more, default is 1).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `init_stride` | `int` | The stride used by the initial `Conv2d` block. | required |
| `num_res_blocks` | `int` | The number of `BottomUpDeterministicResBlock`s; default is 1. | 1 |
create_top_down_layers()
Create the stack of top-down layers of the Decoder.
In these layers the `bu_values` from the Encoder are merged with the `p_params` from the previous layer
of the Decoder to get `q_params`. Then, a stochastic layer generates a sample from the latent distribution
with parameters `q_params`. Finally, this sample is fed through a `TopDownDeterministicResBlock` to
compute the `p_params` for the layer below.
NOTE 1: The algorithm for generative inference approximately works as follows:
- `p_params` = output of the top-down layer above
- `bu` = inferred bottom-up value at this layer
- `q_params` = merge(`bu`, `p_params`)
- `z` = stochastic_layer(`q_params`)
- (optional) get and merge skip connection from the previous top-down layer
- top-down deterministic ResNet
NOTE 2: When doing unconditional generation, `bu_value` is not available. Hence the merge layer is not used, and `z` is sampled directly from `p_params`.
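The per-layer logic in NOTE 1 and NOTE 2 can be sketched as follows; all names here are illustrative placeholders, not the actual LVAE API:

```python
def top_down_step(p_params, bu_value, merge, sample, td_block):
    """One schematic layer of the top-down pass. With a bottom-up value
    (conditional inference), merge it with the prior parameters to get
    the posterior parameters; without one (unconditional generation),
    sample directly from the prior parameters."""
    if bu_value is not None:
        q_params = merge(bu_value, p_params)  # inference: posterior params
    else:
        q_params = p_params                   # generation: use the prior
    z = sample(q_params)                      # stochastic layer
    return td_block(z)                        # p_params for the layer below
```

With toy stand-ins for the merge, sampling, and deterministic blocks, the same function covers both the conditional and the unconditional branch.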
forward(x)
Forward pass through the LVAE model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | The input tensor of shape (B, C, H, W). | required |
get_latent_spatial_size(level_idx)
Get the spatial size of the latent space at the given level. `level_idx` 0 is the bottommost layer, the highest-resolution one.
get_padded_size(size)
Return the smallest size (H, W), with H and W powers of 2, that is at least as large as the actual input size.

Parameters: `size` — input size, a tuple, either (N, C, H, W) or (H, W).

Returns: a 2-tuple (H, W).
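A pure-Python sketch of this rounding rule (illustrative, not the actual implementation):

```python
def padded_size(size):
    """Round the trailing (H, W) of `size` up to the next powers of 2.
    `size` is either (N, C, H, W) or (H, W). Illustrative sketch only."""
    def next_pow2(n):
        # Smallest power of 2 >= n (n >= 1).
        return 1 << max(n - 1, 0).bit_length()
    h, w = size[-2:]
    return (next_pow2(h), next_pow2(w))
```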
reset_for_inference(tile_size=None)
Should be called if we want to predict for a different input/output size.
topdown_pass(bu_values=None, n_img_prior=None, constant_layers=None, forced_latent=None, top_down_layers=None, final_top_down_layer=None)
Define the forward pass through the LVAE Decoder, the so-called Top-Down pass.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `bu_values` | `Union[Tensor, None]` | Output of the bottom-up pass. It will have values from multiple layers of the ladder. | None |
| `n_img_prior` | `Union[Tensor, None]` | When … | None |
| `constant_layers` | `Union[Iterable[int], None]` | A sequence of indexes associated to the layers in which a single instance's z is copied over the entire batch (the bottom-up path is not used, so only the prior is used here). Set to … | None |
| `forced_latent` | `Union[list[Tensor], None]` | A list of tensors that are used as fixed latent variables (hence, sampling doesn't take place in this case). | None |
| `top_down_layers` | `Union[ModuleList, None]` | A list of top-down layers to use in the top-down pass. If … | None |
| `final_top_down_layer` | `Union[Sequential, None]` | The last top-down layer of the top-down pass. If … | None |