[Paper] Image - DWT Coding Algorithm using HVS Characteristics (2016)

Trabajo Inglés
Universidad Universidad Politécnica de Cataluña (UPC)
Grado Ingeniería de Sistemas Audiovisuales - 3º curso
Asignatura Codificación de Contenidos Audiovisuales
Año del apunte 2016
Páginas 4
Fecha de subida 27/12/2016
Descargas 3
Subido por

Vista previa del texto

A WAVELET IMAGE CODING ALGORITHM BASED ON HUMAN VISUAL SYSTEM CHARACTERISTICS Javier de la Rica Polytechnic University of Catalonia, Barcelona ABSTRACT In this paper, an image-coding algorithm based on the discrete wavelet transform and the human visual system characteristics is proposed. After applying the DWT to an input image and dividing the resulting sub-blocks, different perceptual weights based on the activity of each block are applied to each one of them. Finally, the SPIHT algorithm is computed to obtain a bit stream and decode it to obtain the output image.
To follow the system implemented the following blocks diagram is offered; firstly, the DWT is applied to the original input image, obtaining the decomposition in the wavelet domain. The classification follows the DWT, studying those sub-blocks that have more activity, so the system applies a higher weighting factor to those sub-blocks.
The SPIHT algorithm is the one that codes the image into a bit stream, which would be sent, and the receiver would decode using the SPIHT decoder, reconstructing the image with the inverse DWT to obtain the original image.
Keywords— Wavelet Transform DWT; SPIHT; Texture Analysis; Human Visual System, Variance 1. INTRODUCTION Figure 1. Blocks diagram of the whole implemented system.
Image compression is minimizing the size in bytes of a graphics file without degrading the quality of the image to an unacceptable level. The reduction file size allows more images to be stored in a given amount of disk or memory space. It also reduces the time required for images to be sent over the Internet or downloaded from Web pages.
2. DISCRETE WAVELET TRANSFORM ANALYSIS Image files can be compressed in several different ways. For Internet use, the two most common compressed graphic image formats are the JPEG format and the GIF format. Other techniques for image compression include the use of fractals and wavelets. These methods have not gained widespread acceptance for use on the Internet as of this writing. However, both offer promise because they offer higher compression ratios than the JPEG or GIF methods for some types of images. That is why this paper is going to explain and try the Wavelets compression method using the SPIHT algorithm, applying some algorithms to use human visual system characteristics to reconstruct the image that best matches the original one to the human visual system.
Two fundamental components of compression are redundancy and irrelevancy reduction. Redundancy reduction aims at removing duplication from the signal source.
Irrelevancy reduction omits parts of the signal that will not be noticed by the signal receiver.
A discrete wavelet transform (DWT) is any wavelet transform for which the wavelets are discretely sampled. One of its advantages is temporal resolution, as it captures both frequency and location information. It is a separable processing which uses the same amount of input than output samples, using an iterated filter bank with High-Pass, H1, and Low-Pass, H0, filters and a decimation factor of 2, as shown in the following image.
Figure 2. Block diagram representation of one level of DWT-2D.
This is the first step of the implementation, as the input image, for example, the cameraman from the Figure 3, is transformed into the Wavelet domain, obtaining the image in Figure 4, using the biorthogonal 4.4 filters. The transform is made for a single level, although the results could be improved increasing the amount of levels, which will be tested later.
With the Discrete Wavelet Transform Analysis, four kind of sub-images are found; the diagonal, vertical, and horizontal detail images and the approximation image.
3. HUMAN VISUAL SYSTEM Once the Discrete Wavelet Transform Analysis is computed, the different sub-blocks obtained are split and studied depending on their own activity.
Figure 6. Blocks diagram of the human visual system irrelevancy reduction.
The first step is to compute the variance of each sub-block, and compare the absolute value of each coefficient in each sub-block with the variance of the same sub-block. If the absolute value of the coefficient is lower than the variance, it is set to 0, otherwise it is set to 1, obtaining then a binary image.
Figure 3. Input image of the DWT system.
Later on, the sub-blocks are weighted depending on their activity, and therefore, their importance in the human visual system, so those sub-blocks who have more coefficients set to 1 on the previous step, are weighted with a weighting value of 0.52, and those who have less with a weighting value of 0.48.
4. SPIHT ALGORITHM Figure 4. DWT output, decomposition into 1 level, 4 subblocks.
The Set Partitioning in Hierarchical Trees algorithm transmits wavelet transformed image in bit plane order with the most significant bits first. Compression happens when only some of the bit planes are transmitted. SPIHT offers an alternative explanation of the principles of the Embedded Zerotree Wavelet operation, so that the reasons for its excellent performance can be better understood, providing a better performance than EZW.
The SPIHT technique is based on three concepts: a.
Partial ordering of the transformed image elements by magnitude, with transmission of order by a subset partitioning algorithm that is duplicated at the decoder.
b. Ordered bit plane transmission of refinement bits.
c.
Exploitation of the self-similarity of the image wavelet transform across different scales.
4.1. SPIHT Coder Figure 5. High-Pass and Low-Pass biorthogonal filters.
Since the order in which the subsets are tested for significance is important, in a practical implementation the significance information is stored in three ordered lists, called list of insignificant sets (LIST), list of insignificant pixels (LIP) and list of significant pixels (LSP). In all lists each entry is identified by a coordinate (i,j), which in the LIP and LSP represents individual pixels, while in the LIST represents either the set D(i,j) or L(i,j).
A hierarchical quad-tree data structure for the wavelet transformed coefficients is formed. The tree is defined in such a way that each node has either no leaves or four offspring, which are from 2x2 adjacent pixels.
To differentiate between them, a LIS entry is of type A if it represents D(i,j), and of type B if it represents L(i,j). During the sorting pass, the pixels in the LIP –which were insignificant in the previous pass– are tested, and those that become significant are moved to the LSP.
At the end of the decoder, the system obtains a bitstream in which the first value is the input image size, and the rest is the coded image in binary values. This bitstream is the one that would be transferred and received in the decoder, as follows.
LIP is used to analyze the wavelet coefficients respectively to decide the significance of tree root node. LIS checks the coefficient sets which are composed of four wavelet coefficients to estimate the significance of whole coefficient set.
4.2. SPIHT Decoder The SPIHT coder consists in the following four sequential and iterative steps.
To examine the zero-tree the coder receives a bit telling whether the tree is significant or not. If the tree is significant it proceeds to the decomposition, while if the tree is not significant deduces a number of zeros in the current bit plane.
1) Initialization: The LSP is set as an empty list, and the coordinates (i,j) are added to the LIP, and only those with descendants are also added to the LIS, as type A entries.
2) Sorting Pass: During the sorting pass, four binary bits are applied in encoding if four root nodes in LIPare insignificant.
However, only one binary bit is needed to show its unimportance if these nodes are organized as one coefficient set in LIS. Therefore, a small coefficient in LIP leads to a number of binary bits transfer in order to represent its unimportance relative to multi thresholds.
3) Refinement Pass: For each entry (i,j) in the LSP, except those included in the last sorting pass, output the nth most significant bit.
4) Quantization-Step Update: Decrements n by 1 and goes back to step 2 (Sorting Pass).
The SPIHT decoder emulates de coder, as it maintains exactly the same data structure as the coder, using symmetric sorting and refinement steps.
At the end of the decoder, an image in the Wavelet domain is obtained, ready to be reconstructed using the Discrete Wavelet Transform Synthesis to obtain the original image.
5. DISCRETE WAVELET TRANSFORM SYNTHESIS The Discrete Wavelet Transform Synthesis reconstructs the received image from the SPIHT Decoder in the wavelet domain, with the same High-Pass and Low-Pass filters than in the Analysis, this time with an interpolation factor of 2.
The levels used in the synthesis must be exactly the same as in the analysis, as it is a symmetric system with the same amount of filters and interpolation steps.
6. RESULTS To evaluate the performance of the proposed imageencoding algorithm, experiments are conducted to different international standard images, as the Cameraman and Lena, all of 256x256.
For the first image, zerotree method will demonstrate a great performance, although it may depend on the bitrate and DWT levels applied to the algorithm. For instance, using the Cameraman image, with three DWT levels and a bitrate of 0.1 in the SPIHT algorithm, the results in Figure 8 were obtained.
Figure 7. Examples of parent-offspring dependencies in the spatial-orientation tree.
To evaluate the output image obtained, a MOS scale has been used, as it obtains the human user’s view of the quality of the system. It is obtained by averaging the results of a set of standard, subjective tests where a number of viewers rate the image quality from 1 (worst) to 5 (best) of both input and output images.
Asking 95 people the valuation of both input and output images the following values are the results obtained, obtaining a very different MOS value for each image.
Finally, a simple Matlab interface has been programmed in order to observe the different results using different images as the Lena, Fruit or People images, attached in the annex.
7. CONCLUSIONS This paper established an image-coding algorithm, in which the sensitivity of human vision system is considered.
Different perceptual weights are applied into different DWT sub-blocks of the input image.
The algorithm that operates through set partitioning in hierarchical trees (SPIHT) accomplishes completely embedded coding. This algorithm uses the principles of partial ordering by magnitude, set partitioning by significance of magnitudes with respect to a sequence of decreasing thresholds, ordered bit plane transmission and self-similarity across scale in an image wavelet transform.
The SPIHT algorithm obtains a great result with the DWT itself, without the use of HVS characteristics. The increment of the PSNR with the HVS is slightly significant, although the images are quite different.
MOS = 3.76 MOS = 2.02 Figure 8. Input and output images from left to right with their representative MOS percentages.
Furthermore, the following table shows the representation of the PSNR for the system with and without taking into account the HVS characteristics depending on the bitrate applied into the SPIHT coder. It shows a considerable increase of the PSNR the higher the bitrate is, although the PSNR does not differ much between the algorithms with the HVS characteristics than the one that does not consider it.
Figure 9. PSNR values obtained for the system with and without the HVS characteristics depending on the bitrate in the SPIHT algorithm.
The results are surprisingly great, obtaining an intelligible output image with enough PSNR even using a low bitrate, although as said before, the PSNR with HVS is slightly better than without the HVS.
The system could be improved using better activity/texture decisions, as the weighting values were chosen by trial and error. Furthermore, the DWT could be computed with different wavelet filters as Haar filters, although the biorthogonal filters have been used because of the great results obtained.
In the annex, the code of the whole project can be observed, with some explanations of each function in order to better understand each block.
8. REFERENCES [1] Li-Xiong Liu, Wei-Wei Wang, “A Wavelet Image Coding Algorithm based on Human Visual System Characteristics”, International Conference on Wavelet Analysis and Pattern Recognition, Hong-Kong, 30-3 Aug. 2008.
[2] Amir Said, William A. Pearlman, “A new and efficient image codec based on set partitioning in hierarchical trees”, IEEE Transaction on Circuits and Systems for Video Technology, vol.6, pp. 243-250, June 1996.
[3] Yo-Sung Ho, “Zerotree Wavelet Image Coding Based on the Human Visual System Model”, China, 1998.
[4] “Introduction to the Discrete Wavelet Transform”, Image and Video Processing, ETSETB-UPC.
...