|
|
|
Introduction: the ideal image compression.
Lossy compression common practices.
Most lossy compression concepts are very complicated and involve complex
mathematic transformation. All lossy algorithms work on the same principle:
try and change the original data just engough so the compression is more
effective, but not too much. Despite the complexity of those algorithms,
only a few have been successful in achieving very good compression rates while
keeping a fair representation of the original data at a low cpu usage. But
these algorithms are so complex they are difficult to implement, difficult
to improve, and it's difficult to add new features for the resulting
images.
Lossy compression of sounds use the very clever concept, easy to understand, of
"psycho-acoustic model". This means researches have been done to understand
enough of the human ears, to compute how much modification of a sound sample
can be done without being noticed by a listener. The mathematical
transformations involved here are still complex because of the nature of the
sound, but most people can understand what the "model" is about.
There is no such thing as a "aesthetico-visual model" for lossy compression
of images. The traditional mesurement of image degradation is done by
pixel-to-pixel differences, in order to compute a so-called peak "signal-to-noise"
ratio (PSNR). This is supposed to give an estimate of how much damage has
been done to the original image. The PSNR is mesured in dB (decibell), which
is a mesure on a logarithm scale: each 6dB lost doubles the mean
pixel-to-pixel difference between the two images. A PSNR of 48dB means that
there is a mean difference of 1 on each pixel of a 8bit per pixel image. A
JPEG image of quality 75 has a PSNR of about 33dB compared to its original
image, which is about the same as using 5 or 6 bits per pixel.
Very good quality is usually around 40dB, where degradation are not
visible, which is about the same as encoding each pixel on 6 or 7 bits.
Lossy compression common weaknesses.
However handy for estimating the efficiency of an algorithm, these mesures
are unfortunatly very remotely representative of what a human being will
feel as an aesthetic or annoying alteration of an image. For example, the
JPEG compression is famous for its "wavy" artefacts in color gradients.
Wavelet-based compression programs are more respectful of the gradients, but
they deteriorate the borders of highly contrasted areas of the image.
Fractal-based compression algorithms have a little bit of both problems.
This is why the image compression "gurus" are trying to enhance lossy
compression algorithms by improving their mathematical representations, up
to a point where there is no more relation with the compressed data
representing an actual image.
We think that a simpler algorithm for lossy compression of images would have
many advantages over usual algorithms: easy implementation on almost any
plateforms, valuable added features, more contributions from users for
improvements, more controlable results. The obvious requirement for that
simpler algorithm to be really useful is that it gives similar or better
results than usual algoritms, namely JPEG.
There is one easy model to follow to degrade an image in a not too annoying
manner, though: by making it blurry. The algorithm proposed in this paper
tries to leverage this idea by blurrying the "irrelevant" areas of the
image, while keeping sharp the "relevant" areas.
Staying on the path.
To be really useful, a compression algorithm must be efficient at all
stages. For instance, the JPEG algorithm is using a Huffman lossless
compression over the DCT and quantization steps. The proposed algorithm is only intended
to do a 2D decorelation of the data and does not try to define a way to
compress resulting the data stream. There are many algorithms of lossless
compression of binary data that can be used by applying the stream to an
external program: gzip, bzip2, arithmetic, etc.
The proposed algorithm also tries to address several flaws, besides the visual
degradations, commonly present in most image compression algorithms.
1) Most importantly, the weight (as in kilobytes) of the compressed image
must be mainly dependant on the amount of details present in the original
image, not dependant on the size of the original image.
Although these two values seem to be related, they are not. A simple
experience can be to double the width and height of an image with a fair
pixel-doubling algorithm, such as a bicubic interpolation (not a "smart"
algorithm that tries to invent textures). Since no new detail has been
added to the image, the two images, once compressed, should have roughly the
same weight (depending on how much the compression algorithm "likes" the
interpolation). Most compression algorithms keep the 4:1 ratio between the
two compressed images (except for fractal-based algorithms).
2) The degradation should not create details that need to be compressed: when
an already compressed image is compressed further, the resulting image should
be the same as if it was compressed at the same ratio from the same original
image. Most compression algorithm put some artefacts in the image, creating
fake details that will be considered as actual details if another
compression pass is done, thus degrading the image even more at the second
pass by wasting useful compressed data resources.
3) Also, when one wants to display any subpart of an image, it should not
be necessary to decompress the whole image and crop the wanted portion.
4) Finally, due to the composite nature of images, it should be possible to use
different compression parameters on different areas of an image.
prev
toc
next
last modification: March 2002
(C) Eric GAUDET, 2002
|
|