Image Quality
The most common evaluator of image quality is the human eye. This also holds for SR problems: the final purpose remains to obtain images that look better to a human observer, the so-called visual loss. We can, however, provide mathematical formulas that allow a quantitative evaluation of image quality. In both cases we need to establish a relation between the original image and the produced one, so a quality score can be formulated only with respect to a reference image. In SR problems, or more generally in up-sampling problems, we can compare the original HR image with the image produced by our model. In this way the quality score is a measure of similarity between the two images.
The simplest similarity score is the peak signal-to-noise ratio (PSNR). This quantity is commonly used to quantify the loss introduced by image compression, and it can be computed as

$$
PSNR = 20 \cdot \log_{10}\left(\max(I)\right) - 10 \cdot \log_{10}\left(MSE\right)
$$

where $\max(I)$ is the maximum value that a pixel in the image can take (in general 1 or 255, depending on the chosen image format) and MSE is the Mean Square Error (ref. Cost) between the original image and the reconstructed one.
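The MSE for an image can be computed as:

$$
MSE = \frac{1}{W H} \sum_{i=1}^{W} \sum_{j=1}^{H} \left( I(i, j) - K(i, j) \right)^2
$$

where $W$, $H$ are the width and height of the two images and $I$, $K$ are the original and reconstructed images, respectively.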
In other words, the PSNR is the ratio between the maximum power of the signal and the power of the background noise. It is expressed in decibels (dB) because image values can range over a wide interval and the logarithm compresses that range. High PSNR values are therefore associated with a good reconstruction of the original image.
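The PSNR and MSE definitions can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation used for the results in this section: it assumes single-channel images stored as nested lists of equal size, and the function names `mse` and `psnr` and the `max_value` parameter are hypothetical names chosen for this example.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equally sized 2D images (lists of rows)."""
    height = len(original)
    width = len(original[0])
    total = 0.0
    for row_i, row_k in zip(original, reconstructed):
        for pixel_i, pixel_k in zip(row_i, row_k):
            total += (pixel_i - pixel_k) ** 2
    return total / (width * height)


def psnr(original, reconstructed, max_value=255.0):
    """PSNR in dB; max_value is the largest value a pixel can take (1 or 255)."""
    error = mse(original, reconstructed)
    if error == 0.0:
        # Identical images: there is no noise, so the ratio is unbounded.
        return math.inf
    return 20.0 * math.log10(max_value) - 10.0 * math.log10(error)


# Toy 2 x 2 example: every pixel is off by 10 gray levels, so MSE = 100.
original = [[0, 255], [255, 0]]
reconstructed = [[10, 245], [245, 10]]
print(f"{psnr(original, reconstructed):.2f} dB")  # 28.13 dB
```

The explicit check for a zero MSE avoids taking the logarithm of zero when the two images coincide.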
The PSNR is probably the most common quality score [PSNR_SSIM], but it does not always correlate with perceived visual quality. Despite this, it is commonly used as a loss function for SR models.
| Score | Nearest | Bicubic | Lanczos |
|---|---|---|---|
| PSNR | 25.118 | 27.254 | 26.566 |
| SSIM | 0.847 | 0.894 | 0.871 |
Considering the series of images shown in Fig. -1, we can evaluate the PSNR score starting from a down-sampled image. Taking the down-sampled image obtained with the Lanczos algorithm, we can compare the original image with the up-sampled versions given by the three methods (ref. Table). As expected, the lowest PSNR value is achieved by the nearest interpolation method, while the best performance is obtained by the bicubic algorithm. This confirms the wide use of the bicubic method in image processing applications. Moreover, we have to take into account that an increase of 0.25 in the PSNR value corresponds to a visible improvement for the human eye.
A more advanced quality score, commonly used to evaluate super-resolution images, is the Structural SIMilarity index (SSIM). The SSIM aims to evaluate mathematically the structural similarity between two images, also taking into account the improvement perceived by the human eye. The SSIM function can be expressed as

$$
SSIM(I, K) = \frac{1}{N} \sum_{i=1}^{N} SSIM(x_i, y_i)
$$

where $N$ is the number of arbitrary patches into which the image is divided (see footnote 1) and $x_i$, $y_i$ are the $i$-th patches of $I$ and $K$.
For each patch the SSIM is computed as

$$
SSIM(x, y) = \frac{\left(2 \mu_x \mu_y + c_1\right)\left(2 \sigma_{xy} + c_2\right)}{\left(\mu_x^2 + \mu_y^2 + c_1\right)\left(\sigma_x^2 + \sigma_y^2 + c_2\right)}
$$

where $\mu$ and $\sigma^2$ are the means and variances of the image patches, respectively, and $\sigma_{xy}$ represents the covariance. The $c_1$ and $c_2$ parameters are fixed to avoid mathematical divergence. Also in this case, a higher SSIM value corresponds to a higher similarity between the original image and the reconstructed one.
Based on the previous equation, we can highlight a link with the pooling functions discussed in Pooling. Also in this case, in fact, we work with a window/kernel that is moved along the image and applies a mathematical function to the underlying pixels. This equivalence suggests an easy implementation of this method with slight modifications of the previous code.
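The window-based computation can be sketched as follows. This is a simplified illustration under explicit assumptions, not the pooling-based implementation referred to in the text: the window is moved in non-overlapping steps (stride equal to the patch size), the images are single-channel nested lists, and the stabilizing constants are set to the conventional values $c_1 = (0.01 L)^2$ and $c_2 = (0.03 L)^2$, with $L$ the maximum pixel value. The names `patch_ssim`, `ssim`, `patch` and `max_value` are hypothetical.

```python
def patch_ssim(x, y, c1, c2):
    """SSIM of two flattened patches x and y (equal-length lists of pixels)."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2.0 * mu_x * mu_y + c1) * (2.0 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim(original, reconstructed, patch=8, max_value=255.0):
    """Mean SSIM over non-overlapping patch x patch windows (an assumption)."""
    # Conventional stabilizing constants; assumed here, not taken from the text.
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    height, width = len(original), len(original[0])
    scores = []
    # Move the window along the image, as a pooling layer would do.
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            rows = range(top, top + patch)
            cols = range(left, left + patch)
            x = [original[r][c] for r in rows for c in cols]
            y = [reconstructed[r][c] for r in rows for c in cols]
            scores.append(patch_ssim(x, y, c1, c2))
    if not scores:
        raise ValueError("image is smaller than the patch size")
    return sum(scores) / len(scores)


# Toy example: an 8 x 8 gradient compared with itself and with a flat image.
gradient = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
flat = [[128] * 8 for _ in range(8)]
print(ssim(gradient, gradient))  # close to 1: identical images
print(ssim(gradient, flat))      # close to 0: the structure is lost
```

Border pixels that do not fill a complete window are ignored in this sketch; an overlapping window with stride 1 would only require changing the step of the two `range` calls.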
The evaluation of the SSIM quality score on the previous up-sampled images (ref. Fig. -1 and Table) confirms the results obtained with the PSNR. Also in this case the worst reconstruction is obtained by the nearest algorithm, while the highest SSIM is obtained by the bicubic algorithm. The gap between the SSIM values is smaller than the gap between the PSNR ones, but this is due to the different ranges of the two functions.
1. Commonly used patch dimensions are 11 x 11 or 8 x 8.