Unpooling Operations in PyTorch

Average Unpooling

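PyTorch does not provide an AverageUnpooling layer directly. However, a pooling operation has no learnable parameters, so such a layer would be nothing more than a thin wrapper around a function. F.interpolate can be used to achieve an effect similar to AverageUnpooling. See the related issue.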

The F.interpolate function is defined as:

torch.nn.functional.interpolate(input, size=None, scale_factor=None, mode='nearest', align_corners=None)
'''
Args:
  input (Tensor): the input tensor
  size (int or Tuple[int] or Tuple[int, int] or Tuple[int, int, int]):
    output spatial size.
  scale_factor (float or Tuple[float]): multiplier for spatial size. Has to match input size if it is a tuple.
  mode (str): algorithm used for upsampling:
    ``'nearest'`` | ``'linear'`` | ``'bilinear'`` | ``'bicubic'`` |
    ``'trilinear'`` | ``'area'``. Default: ``'nearest'``
  align_corners (bool, optional): Geometrically, we consider the pixels of the
    input and output as squares rather than points.
    If set to ``True``, the input and output tensors are aligned by the
    center points of their corner pixels, preserving the values at the corner pixels.
    If set to ``False``, the input and output tensors are aligned by the corner
    points of their corner pixels, and the interpolation uses edge value padding
    for out-of-boundary values, making this operation *independent* of input size
    when :attr:`scale_factor` is kept the same. This only has an effect when :attr:`mode`
    is ``'linear'``, ``'bilinear'``, ``'bicubic'`` or ``'trilinear'``.
    Default: ``False``

'''

>>> import torch
>>> data = torch.randn([1,1,8,9])
>>> pooled = torch.nn.functional.avg_pool2d(data,(2,3))
>>> pooled.size()
torch.Size([1, 1, 4, 3])
>>> pooled
tensor([[[[-0.6849, -0.1410,  0.3709],
          [ 0.3756, -0.0544,  0.1330],
          [-0.1566, -0.0414,  0.1511],
          [-0.0715,  0.4222, -0.1394]]]])

First, we apply AvgPool with a (2, 3) kernel to obtain the pooled matrix, as shown above.

>>> unpool = torch.nn.functional.interpolate(pooled, size=(8,9))
>>> unpool
tensor([[[[-0.6849, -0.6849, -0.6849, -0.1410, -0.1410, -0.1410,...],
          [-0.6849, -0.6849, -0.6849, -0.1410, -0.1410, -0.1410,...],
          [ 0.3756,  0.3756,  0.3756, -0.0544, -0.0544, -0.0544,...],
          [ 0.3756,  0.3756,  0.3756, -0.0544, -0.0544, -0.0544,...],
          [-0.1566, -0.1566, -0.1566, -0.0414, -0.0414, -0.0414,...],
          [-0.1566, -0.1566, -0.1566, -0.0414, -0.0414, -0.0414,...],
          [-0.0715, -0.0715, -0.0715,  0.4222,  0.4222,  0.4222,...],
          [-0.0715, -0.0715, -0.0715,  0.4222,  0.4222,  0.4222,...]
          ]]])

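Then we use interpolate to restore the original shape. With the default 'nearest' mode, every 2×3 block is filled with a single value, and the blocks line up with the windows used during pooling. This completes the reversal of the pooling operation: the shape is recovered, while the individual values inside each block are replaced by their average.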
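The round trip described above can be checked with a minimal sketch. It assumes a kernel of (2, 3) that evenly divides the input, no padding, and the default stride (equal to the kernel size). The variable names and the comparison against repeat_interleave are illustrative additions, not part of the original example.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
data = torch.randn(1, 1, 8, 9)
kernel = (2, 3)

# Average pooling with a (2, 3) kernel: (1, 1, 8, 9) -> (1, 1, 4, 3)
pooled = F.avg_pool2d(data, kernel)

# "Unpool" by nearest-neighbour upsampling back to the original spatial size
unpooled = F.interpolate(pooled, size=tuple(data.shape[-2:]), mode="nearest")

# When the kernel divides the input evenly, this should be the same as
# repeating every pooled value kernel[0] times along H and kernel[1] along W
repeated = pooled.repeat_interleave(kernel[0], dim=-2).repeat_interleave(kernel[1], dim=-1)
same_as_repeat = torch.equal(unpooled, repeated)

# Pooling the unpooled tensor again gives back the pooled values
round_trip_ok = torch.allclose(F.avg_pool2d(unpooled, kernel), pooled)
```

Both same_as_repeat and round_trip_ok are expected to be True here. The original data is not recovered, only its block averages.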

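In some cases the pooling operation also involves padding, or the kernel size does not evenly divide the input size. The result of the procedure above then needs some post-processing, such as cropping, before it matches the shape of the matrix before pooling.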
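One possible way to handle a non-divisible input is sketched below. It assumes the pooling was done with ceil_mode=True and no padding, so that the edge windows are partial. The 7×10 input size and the crop-after-upsample strategy are assumptions chosen for illustration; other pooling settings would need a different crop.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
data = torch.randn(1, 1, 7, 10)   # 7 and 10 are not multiples of 2 and 3
kernel = (2, 3)

# ceil_mode=True keeps the partial windows at the bottom and right edges
pooled = F.avg_pool2d(data, kernel, ceil_mode=True)   # (1, 1, 4, 4)

# Upsample to a whole number of kernel-sized blocks: (8, 12)
up_size = (pooled.shape[-2] * kernel[0], pooled.shape[-1] * kernel[1])
upsampled = F.interpolate(pooled, size=up_size, mode="nearest")

# Crop the overhang so the result matches the pre-pooling shape
restored = upsampled[..., :data.shape[-2], :data.shape[-1]]
```

After cropping, restored has the same shape as data, and each block still carries the average of the window it came from.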

Max Unpooling

The official PyTorch documentation provides an implementation of this layer: [see](https://pytorch.org/docs/stable/nn.html?highlight=max unpool#torch.nn.MaxUnpool2d)

torch.nn.MaxUnpool2d(kernel_size, stride=None, padding=0)

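Because information is lost during pooling, MaxPool cannot be fully inverted either. Only the positions of the maxima are restored correctly, which requires the pooling step to return its indices (return_indices=True). All other positions are filled with zeros. See the official example: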

>>> import torch
>>> import torch.nn as nn
>>> pool = nn.MaxPool2d(2, stride=2, return_indices=True)
>>> unpool = nn.MaxUnpool2d(2, stride=2)
>>> input = torch.tensor([[[[ 1.,  2,  3,  4],
                            [ 5,  6,  7,  8],
                            [ 9, 10, 11, 12],
                            [13, 14, 15, 16]]]])
>>> output, indices = pool(input)
>>> unpool(output, indices)
tensor([[[[  0.,   0.,   0.,   0.],
          [  0.,   6.,   0.,   8.],
          [  0.,   0.,   0.,   0.],
          [  0.,  14.,   0.,  16.]]]])

>>> # specify a different output size than input size
>>> unpool(output, indices, output_size=torch.Size([1, 1, 5, 5]))
tensor([[[[  0.,   0.,   0.,   0.,   0.],
          [  6.,   0.,   8.,   0.,   0.],
          [  0.,   0.,   0.,  14.,   0.],
          [ 16.,   0.,   0.,   0.,   0.],
          [  0.,   0.,   0.,   0.,   0.]]]])
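A common use of output_size is restoring an input whose size is not a multiple of the stride. The sketch below is an illustration under that assumption: a 5×5 input is pooled with a 2×2 window (the last row and column are dropped), and passing output_size=x.size() lets the maxima land at their original positions. The tensor x and the mask check are illustrative and not taken from the official example.

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(2, stride=2)

# 5x5 input holding the values 1..25; 5 is not a multiple of the stride
x = torch.arange(1., 26.).reshape(1, 1, 5, 5)

pooled, indices = pool(x)   # (1, 1, 2, 2); indices are flat positions in the 5x5 plane

# Without output_size the result would be 4x4; request the original size instead
restored = unpool(pooled, indices, output_size=x.size())

# Non-zero entries sit exactly where the maxima were; everything else is zero
mask = restored != 0
maxima_match = torch.equal(restored[mask], x[mask])
```

Only the four window maxima survive the round trip. The remaining 21 positions are zero, which is the information loss described above.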