Average Unpooling

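PyTorch does not ship an AverageUnpooling layer. Pooling has no learnable parameters, though, so such a layer would be nothing more than a thin wrapper around a function, and F.interpolate can be used to get behavior similar to AverageUnpooling. See the related issue.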

F.interpolate is defined as follows:

torch.nn.functional.interpolate(input, size=None, scale_factor=None, mode='nearest', align_corners=None)
'''
Args:
input (Tensor): the input tensor
size (int or Tuple[int] or Tuple[int, int] or Tuple[int, int, int]):
output spatial size.
scale_factor (float or Tuple[float]): multiplier for spatial size. Has to match input size if it is a tuple.
mode (str): algorithm used for upsampling:
``'nearest'`` | ``'linear'`` | ``'bilinear'`` | ``'bicubic'`` |
``'trilinear'`` | ``'area'``. Default: ``'nearest'``
align_corners (bool, optional): Geometrically, we consider the pixels of the
input and output as squares rather than points.
If set to ``True``, the input and output tensors are aligned by the
center points of their corner pixels, preserving the values at the corner pixels.
If set to ``False``, the input and output tensors are aligned by the corner
points of their corner pixels, and the interpolation uses edge value padding
for out-of-boundary values, making this operation *independent* of input size
when :attr:`scale_factor` is kept the same. This only has an effect when :attr:`mode`
is ``'linear'``, ``'bilinear'``, ``'bicubic'`` or ``'trilinear'``.
Default: ``False``

'''
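As a small illustration of the signature above, here is a minimal sketch showing that `size` and `scale_factor` are two ways of asking for the same output shape. The tensor shape and the factor of 2 are arbitrary choices made for this example.

```python
import torch
import torch.nn.functional as F

# A small N x C x H x W tensor; the shape is an arbitrary choice for this sketch.
x = torch.arange(12, dtype=torch.float32).reshape(1, 1, 3, 4)

# Give the target spatial size explicitly ...
by_size = F.interpolate(x, size=(6, 8), mode='nearest')
# ... or give a per-dimension multiplier instead.
by_scale = F.interpolate(x, scale_factor=(2.0, 2.0), mode='nearest')

print(by_size.shape)                   # torch.Size([1, 1, 6, 8])
print(torch.equal(by_size, by_scale))  # True
```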

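The mode argument accepts six values: 'nearest' | 'linear' | 'bilinear' | 'bicubic' | 'trilinear' | 'area'. To reverse an AvgPool, the area algorithm is the matching choice; when the output size is an integer multiple of the input size it produces the same block replication as the default 'nearest'. In practice it looks like this: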

>>> import torch
>>> data = torch.randn([1,1,8,9])
>>> pooled = torch.nn.functional.avg_pool2d(data,(2,3))
>>> pooled.size()
torch.Size([1, 1, 4, 3])
>>> pooled
tensor([[[[-0.6849, -0.1410, 0.3709],
[ 0.3756, -0.0544, 0.1330],
[-0.1566, -0.0414, 0.1511],
[-0.0715, 0.4222, -0.1394]]]])

In the snippet above, an AvgPool with a kernel size of (2, 3) first produces the pooled tensor.

>>> unpool = torch.nn.functional.interpolate(pooled, size=(8,9), mode='area')
>>> unpool
tensor([[[[-0.6849, -0.6849, -0.6849, -0.1410, -0.1410, -0.1410,...],
[-0.6849, -0.6849, -0.6849, -0.1410, -0.1410, -0.1410,...],
[ 0.3756, 0.3756, 0.3756, -0.0544, -0.0544, -0.0544,...],
[ 0.3756, 0.3756, 0.3756, -0.0544, -0.0544, -0.0544,...],
[-0.1566, -0.1566, -0.1566, -0.0414, -0.0414, -0.0414,...],
[-0.1566, -0.1566, -0.1566, -0.0414, -0.0414, -0.0414,...],
[-0.0715, -0.0715, -0.0715, 0.4222, 0.4222, 0.4222,...],
[-0.0715, -0.0715, -0.0715, 0.4222, 0.4222, 0.4222,...]
]]])

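The interpolate call then restores the original shape: every 2×3 block is filled with a single value, and the blocks line up exactly with the windows used by the pooling step. This reverses the pooling in terms of shape and per-block means; the individual values inside each block cannot be recovered.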
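The idea can be wrapped in a small module. The sketch below is only an illustration: `AvgUnpool2d` is a hypothetical helper name, not something PyTorch provides, and it assumes the pooling stride equals the kernel size with no padding.

```python
import torch
import torch.nn.functional as F


class AvgUnpool2d(torch.nn.Module):
    """Hypothetical helper (not part of PyTorch): copy each pooled value over its window."""

    def __init__(self, kernel_size):
        super().__init__()
        # (kh, kw); assumed to equal the stride used by the pooling step.
        self.kernel_size = kernel_size

    def forward(self, pooled):
        kh, kw = self.kernel_size
        h, w = pooled.shape[-2:]
        return F.interpolate(pooled, size=(h * kh, w * kw), mode='area')


data = torch.randn(1, 1, 8, 9)
pooled = F.avg_pool2d(data, (2, 3))
restored = AvgUnpool2d((2, 3))(pooled)

# The shape is recovered, and pooling again gives back the same block means.
print(restored.shape)
repooled_matches = torch.allclose(F.avg_pool2d(restored, (2, 3)), pooled)
print(repooled_matches)

# For an integer factor, 'area' and the default 'nearest' fill the blocks identically.
nearest = F.interpolate(pooled, size=(8, 9), mode='nearest')
same_as_nearest = torch.allclose(restored, nearest)
print(same_as_nearest)
```

Pooling the restored tensor again returns the pooled tensor, which is a quick way to check that the blocks are aligned with the original windows.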

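In some cases the pooling step also involves padding, or the kernel size does not evenly divide the input size. The result of the procedure above then needs some post-processing, such as cropping, before it matches the shape of the tensor before pooling.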
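A minimal sketch of that post-processing is shown below, assuming the stride equals the kernel size. The sizes, the padding of 1 and the use of `ceil_mode=True` for the non-divisible case are choices made for this example only.

```python
import torch
import torch.nn.functional as F

k, p = 2, 1  # kernel size (also the stride) and padding, chosen for illustration

# Case 1: pooling with padding.
data = torch.randn(1, 1, 8, 8)
pooled = F.avg_pool2d(data, k, padding=p)                  # 5 x 5
h, w = pooled.shape[-2:]
upsampled = F.interpolate(pooled, size=(h * k, w * k), mode='area')  # 10 x 10
H, W = data.shape[-2:]
# Drop the rows and columns that correspond to the padding.
restored = upsampled[..., p:p + H, p:p + W]                # 8 x 8
print(pooled.shape, upsampled.shape, restored.shape)

# Case 2: kernel size does not divide the input size (ceil_mode keeps the partial window).
odd = torch.randn(1, 1, 7, 7)
pooled_odd = F.avg_pool2d(odd, k, ceil_mode=True)          # 4 x 4
up_odd = F.interpolate(pooled_odd, size=(8, 8), mode='area')
# Crop the overhang so the result matches the original 7 x 7 shape.
restored_odd = up_odd[..., :7, :7]
print(pooled_odd.shape, restored_odd.shape)
```

With the default `count_include_pad=True`, the border values in the first case were averaged together with the zero padding, so only the shape is restored faithfully there.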

Max Unpooling

The official PyTorch documentation provides this layer. [see](https://pytorch.org/docs/stable/nn.html?highlight=max%20unpool#torch.nn.MaxUnpool2d)

torch.nn.MaxUnpool2d(kernel_size, stride=None, padding=0)

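Because pooling discards information, MaxPool cannot be fully reversed either. Only the positions of the maxima are filled in correctly, which requires keeping the indices returned by the pooling step (return_indices=True); every other position is filled with zero. The official example shows this: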

>>> import torch
>>> from torch import nn
>>> pool = nn.MaxPool2d(2, stride=2, return_indices=True)
>>> unpool = nn.MaxUnpool2d(2, stride=2)
>>> input = torch.tensor([[[[ 1., 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16]]]])
>>> output, indices = pool(input)
>>> unpool(output, indices)
tensor([[[[ 0., 0., 0., 0.],
[ 0., 6., 0., 8.],
[ 0., 0., 0., 0.],
[ 0., 14., 0., 16.]]]])

>>> # specify a different output size than input size
>>> unpool(output, indices, output_size=torch.Size([1, 1, 5, 5]))
tensor([[[[ 0., 0., 0., 0., 0.],
[ 6., 0., 8., 0., 0.],
[ 0., 0., 0., 14., 0.],
[ 16., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]]]])
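To make the role of the indices concrete, the sketch below reproduces the first result of the example with a plain scatter into a zero tensor. It is only meant to illustrate the idea and is not how the library implements the layer; the variable names are chosen for this example.

```python
import torch
from torch import nn

pool = nn.MaxPool2d(2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(2, stride=2)

x = torch.arange(1., 17.).reshape(1, 1, 4, 4)  # same values as the example input
output, indices = pool(x)

# Each index points into the flattened H*W plane of its channel,
# so unpooling amounts to scattering the maxima into a tensor of zeros.
n, c, h, w = x.shape
manual = torch.zeros(n, c, h * w)
manual.scatter_(2, indices.flatten(2), output.flatten(2))
manual = manual.view(n, c, h, w)

print(manual)
print(torch.equal(manual, unpool(output, indices)))  # True
```

Because the indices refer to positions in the flattened plane, passing a different `output_size` places the same maxima at different 2-D coordinates, which is what the 5×5 result above shows.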