Picture   0   1   2   3   4   5   6   7   8   9
Output    0   1   2   8   4   5   6   1   8   9
5 Conclusion
Projections onto other axes are easy to implement. Better results might be obtained by projecting onto some other axis, for example y = -x, or by combining several projections; in future work we will test those options. Some histogram parameters had more influence on the learning process than others. If weight coefficients were added to the parameters, classification results would very likely improve considerably. In this algorithm we ignored parameters that are the same for all digits (usually the beginning and ending parameters). It is also possible to ignore parameters that are less important or that even have a negative impact on the learning process. As shown, this method can be used as the starting phase of a handwritten digit recognition algorithm, so further recognition steps can be built on top of it.
Eva TUBA
University of Belgrade
Faculty of Mathematics
Studentski trg 16, Belgrade
SERBIA
E-mail: eva.tuba@gmail.com
Fifth International Students Conference on Informatics
Imagination, Creativity, Design, Development
ICDD 2015, May 21-23
Sibiu, Romania
JPEG algorithm compression adjustment
Ira Tuba
Teacher Coordinator: Milan Tuba
Abstract
This paper describes the JPEG algorithm with a focus on quantization. The JPEG algorithm applies the discrete cosine transform to 8x8 blocks of the image to transform light intensity values into frequency coefficients. The main compression is achieved by discarding less important coefficients. This is enabled by integer division of the DCT coefficients by the corresponding values from the quantization matrix; after that step many coefficients are rounded to zero. We have developed an application that allows the level of compression to be controlled by choosing the values in the quantization table.
1 Introduction
Digital images are part of our everyday life. They are used in journalism, medicine, police investigations, archaeology, etc. Storing them digitally has many benefits, such as faster database search and easy processing [1], [2], [3]. An important feature of digital images is that they can easily be processed by mathematical methods [5], [6]. However, one of the problems with digital images is the memory needed to save them. A representation of a digital image can require tens of megabytes. The solution to this problem is compression. Lossy compression algorithms provide a very high degree of compression while causing minimal quality loss.

A well-known and widely used lossy compression algorithm is JPEG (Joint Photographic Experts Group). The degree of compression, as well as the quality of the digital image, is determined by the quantization matrix.

The software for JPEG compression adjustment is developed in C#. The application has a graphical user interface and is very easy to work with.

The next two sections of this paper describe digital images and the JPEG compression algorithm. Section 4 describes the application and shows some results for different compression rates.
2 Digital images
A digital image represents the projection of the 3D world onto a 2D rectangle. The rectangle is discretized twice, horizontally and vertically, and the result is a rectangle divided into small rectangles, usually squares. These squares are called pixels. Resolution is the number of horizontal and vertical pixels. Digitized images are also called raster images.

A raster image stores information about brightness and colour (for colour images) for each pixel. For black and white images, one bit is needed for each pixel (0 - white, 1 - black). With n bits, 2^n shades can be described. RGB is a common color model for digital images. Since the human eye has receptors for red, green and blue, the RGB model corresponds naturally to the human eye.
In this color model, each pixel is represented by three numbers (the shades of the red, green and blue components). Color depth is the number of bits needed to describe the color of one pixel.

The main characteristic that makes compression possible is the fact that an image is not a set of randomly written numbers: adjacent pixels have similar values. Compression is possible because of this redundancy in images.
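To make the memory problem concrete, here is a small Python sketch (an illustration for this discussion, not part of the application described later, which is written in C#) that computes the raw storage an uncompressed RGB image needs at 24-bit color depth.

```python
# Raw (uncompressed) storage needed for an RGB image with 8 bits per channel.
width, height = 3000, 2000          # a typical 6-megapixel photograph (assumed example size)
bits_per_pixel = 3 * 8              # 24-bit color depth (R, G, B)

size_bytes = width * height * bits_per_pixel // 8
print(f"{size_bytes / 2**20:.1f} MiB")   # about 17.2 MiB before compression
```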
3 JPEG algorithm
In the JPEG algorithm we can choose the level of compression. The higher the compression, the lower the image quality, and vice versa.
The first step of the algorithm is block preparation. RGB is the most commonly used color model, but for image processing there are more appropriate models, such as, for example, YCbCr. The Y component represents intensity, while Cb and Cr are the hue (chroma) components.

As mentioned earlier, the human eye is less sensitive to shades of color, so we reduce the color components: blocks of four pixels are replaced with their average value. This reduces the size of the matrices for those components and gives a file size reduced by about 50%. This step is irreversible; we cannot reconstruct the original image.
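The chroma reduction step can be sketched as follows (an illustrative Python snippet written for this discussion, assuming the Cb or Cr component is given as a NumPy array with even dimensions).

```python
import numpy as np

def subsample_chroma(channel: np.ndarray) -> np.ndarray:
    """Replace every 2x2 block of a chroma channel (Cb or Cr) by its average.

    Assumes the channel height and width are even, as in 8x8-block JPEG input.
    """
    h, w = channel.shape
    blocks = channel.reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))

cb = np.random.randint(0, 256, (512, 512)).astype(np.float64)
print(subsample_chroma(cb).shape)   # (256, 256): a quarter of the original samples
```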
The next step is dividing the image into 8x8 matrices. Then a two-dimensional discrete cosine transform (DCT) is applied to each block of the image. The DCT is a Fourier-related transform appropriate for image processing.
The DCT is defined by Eq. 1:

D(i,j) = \frac{1}{4} C(i) C(j) \sum_{x=0}^{7} \sum_{y=0}^{7} d(x,y) \cos\frac{(2x+1) i \pi}{16} \cos\frac{(2y+1) j \pi}{16}    (1)

The inverse transform (IDCT) is given by Eq. 2:

d(x,y) = \frac{1}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} C(i) C(j) D(i,j) \cos\frac{(2x+1) i \pi}{16} \cos\frac{(2y+1) j \pi}{16}    (2)

where C(u) are constants: C(u) = \frac{1}{\sqrt{2}} for u = 0, and C(u) = 1 for u > 0.
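Eq. 1 translates directly into code. The following unoptimized Python sketch, written for this discussion, evaluates the forward DCT of one 8x8 block exactly as the formula states; real encoders use fast factorizations instead of the quadruple loop.

```python
import numpy as np

def C(u: int) -> float:
    """Normalization constants from Eq. 1 and Eq. 2."""
    return 1.0 / np.sqrt(2.0) if u == 0 else 1.0

def dct_8x8(d: np.ndarray) -> np.ndarray:
    """Forward 2D DCT of an 8x8 block, evaluated directly from Eq. 1."""
    D = np.zeros((8, 8))
    for i in range(8):
        for j in range(8):
            s = 0.0
            for x in range(8):
                for y in range(8):
                    s += (d[x, y]
                          * np.cos((2 * x + 1) * i * np.pi / 16)
                          * np.cos((2 * y + 1) * j * np.pi / 16))
            D[i, j] = 0.25 * C(i) * C(j) * s
    return D

block = np.random.randint(0, 256, (8, 8)).astype(np.float64) - 128  # shift to [-128, 127]
D = dct_8x8(block)
print(round(D[0, 0], 2))   # DC component: proportional to the block's average intensity
```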
The DCT coefficients contain information about the frequency composition of the image. The first coefficient is called the DC component. It corresponds to the average intensity value of the 8x8 block and carries most of the image information. The other 63 coefficients are called AC components. The DC component has the lowest frequency, and coefficients closer to the lower-right corner have higher frequencies, which are usually close to zero.
The matrix form of the transform is obtained from the matrix T defined by Eq. 3:

T(i,j) = \frac{1}{\sqrt{N}} for i = 0, and T(i,j) = \sqrt{\frac{2}{N}} \cos\frac{(2j+1) i \pi}{2N} for i > 0    (3)

For N = 8 this results in the matrix T:
T =
    .3536  .3536  .3536  .3536  .3536  .3536  .3536  .3536
    .4904  .4157  .2778  .0975 -.0975 -.2778 -.4157 -.4904
    .4619  .1913 -.1913 -.4619 -.4619 -.1913  .1913  .4619
    .4157 -.0975 -.4904 -.2778  .2778  .4904  .0975 -.4157
    .3536 -.3536 -.3536  .3536  .3536 -.3536 -.3536  .3536
    .2778 -.4904  .0975  .4157 -.4157 -.0975  .4904 -.2778
    .1913 -.4619  .4619 -.1913 -.1913  .4619 -.4619  .1913
    .0975 -.2778  .4157 -.4904  .4904 -.4157  .2778 -.0975
Matrix T is orthogonal, which means that its inverse is its transpose T'. Now we can apply the DCT to a matrix M (an 8x8 block of the image, previously shifted to the range [-128, 127]):

D = T M T'    (4)

This step is reversible, i.e. applying the inverse DCT to D recovers the starting matrix M. Small errors appear only as a result of rounding real values to the nearest integer.
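A sketch of the matrix form under the assumptions above: T is built from Eq. 3, D is computed by Eq. 4, and the round trip T' D T confirms the reversibility just mentioned (illustration only, not the application's code).

```python
import numpy as np

N = 8
# Build T from Eq. 3: first row is 1/sqrt(N), the remaining rows are scaled cosines.
T = np.zeros((N, N))
T[0, :] = 1.0 / np.sqrt(N)
for i in range(1, N):
    for j in range(N):
        T[i, j] = np.sqrt(2.0 / N) * np.cos((2 * j + 1) * i * np.pi / (2 * N))

M = np.random.randint(0, 256, (N, N)).astype(np.float64) - 128  # block shifted to [-128, 127]
D = T @ M @ T.T                 # Eq. 4: forward DCT in matrix form
M_back = T.T @ D @ T            # inverse: T is orthogonal, so its inverse is T'
print(np.allclose(M, M_back))   # True: the transform itself is lossless
```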
3.1 Quantization
Quantization is the most important part of the compression. It is performed using quantization tables. The values in these tables are predefined and determined according to the human visual system. Quantization is defined as the integer division of each DCT coefficient by the corresponding coefficient of the quantization table (Eq. 5):
B(i,j) = \mathrm{round}\left( \frac{D(i,j)}{Q(i,j)} \right), for i = 0, 1, \ldots, 7; j = 0, 1, \ldots, 7    (5)
Experiments based on the human visual system have resulted in the JPEG standard quantization matrix, which corresponds to quality level 50 [4].
Q50 =
    16  11  10  16  24  40  51  61
    12  12  14  19  26  58  60  55
    14  13  16  24  40  57  69  56
    14  17  22  29  51  87  80  62
    18  22  37  56  68 109 103  77
    24  35  55  64  81 104 113  92
    49  64  78  87 103 121 120 101
    72  92  95  98 112 100 103  99
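Eq. 5 with the standard table can be sketched like this; the test block is a simple gradient chosen so that the effect of quantization is visible (this is an illustration written for this discussion, not the application's code).

```python
import numpy as np

# Standard JPEG luminance quantization table for quality level 50.
Q50 = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

# 8x8 DCT matrix T (as in Eq. 3) and a smooth test block shifted to [-128, 127].
T = np.vstack([np.full(8, 1 / np.sqrt(8)),
               [[np.sqrt(2 / 8) * np.cos((2 * j + 1) * i * np.pi / 16) for j in range(8)]
                for i in range(1, 8)]])
block = np.tile(np.arange(8) * 16.0, (8, 1)) - 128   # horizontal gradient
D = T @ block @ T.T                                  # Eq. 4

# Eq. 5: division by the quantization table, then rounding.
B = np.round(D / Q50).astype(int)
print(B)   # only the DC and a few low-frequency coefficients in the first row stay nonzero
```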
After quantization, the procedure continues with a zigzag scan of the 63 AC coefficients of each block. The output of quantization consists of matrices whose nonzero elements are concentrated in the upper-left corner, while the remaining elements are zero. Using the zigzag scan we obtain a one-dimensional array (vector). The first element is always the DC component, followed by the AC components different from zero; the AC coefficients equal to zero appear at the end of the sequence.
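The zigzag traversal can be sketched as follows; the index-generation routine is written for this discussion, while the traversal order itself is the standard JPEG one.

```python
import numpy as np

def zigzag_indices(n: int = 8):
    """Standard JPEG zigzag traversal order for an n x n block."""
    order = []
    for s in range(2 * n - 1):                       # s = i + j indexes the anti-diagonals
        ij = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        order.extend(ij if s % 2 else reversed(ij))  # alternate direction per diagonal
    return order

B = np.zeros((8, 8), dtype=int)
B[0, 0], B[0, 1], B[1, 0], B[2, 0] = -36, -27, 5, 2   # hypothetical quantized coefficients
zz = [int(B[i, j]) for i, j in zigzag_indices()]
print(zz[:8])   # [-36, -27, 5, 2, 0, 0, 0, 0] -- DC first, nonzero ACs, then zeros
```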
Entropy coding is a special form of lossless compression and the last step of JPEG compression. This encoding is based on the statistics of the quantized coefficient values. The JPEG standard specifies two ways of coding: Huffman coding and arithmetic coding. The idea of Huffman coding is to represent values that appear often with shorter codes than those that appear less often.
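To illustrate the idea behind Huffman coding, here is a small generic sketch that computes code lengths from symbol frequencies with Python's heapq; the actual JPEG entropy coder additionally works with run lengths and size categories, which are not shown here.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return code length per symbol: frequent symbols end up with shorter codes."""
    freq = Counter(symbols)
    heap = [(f, i, {s: 0}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**a, **b}.items()}   # one level deeper in the tree
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

# Zero is by far the most common quantized coefficient, so it gets the shortest code.
data = [0] * 50 + [-1, 1] * 6 + [-36, -27, 5, 2]
print(huffman_code_lengths(data))
```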
Arithmetic coding gives 5-10% better compression than Huffman coding, but it is much more complex.
To decode an image we need to apply all the previously mentioned steps in reverse order. As a result of decoding we get the decompressed JPEG version of the image.

The first step is converting the Huffman codes back into the symbol sequence. The symbols are then expanded into one 64-element array of quantized DCT coefficients for each 8x8 block. Every quantized DCT coefficient is multiplied by the corresponding value from the quantization table. After that the coefficients are rearranged from the zigzag sequence back into their original order. The last step is applying the inverse DCT, which reconstructs the image samples.
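The last two decoding steps (dequantization and inverse DCT) can be sketched as follows, assuming B is a quantized 8x8 block and Q is the table that was used to encode it (illustration only).

```python
import numpy as np

def decode_block(B: np.ndarray, Q: np.ndarray) -> np.ndarray:
    """Dequantize and apply the inverse DCT (matrix form), returning pixel values."""
    N = 8
    T = np.zeros((N, N))
    T[0, :] = 1 / np.sqrt(N)
    for i in range(1, N):
        for j in range(N):
            T[i, j] = np.sqrt(2 / N) * np.cos((2 * j + 1) * i * np.pi / (2 * N))
    D = B * Q                          # undo Eq. 5 (up to the rounding loss)
    M = T.T @ D @ T                    # inverse DCT: T is orthogonal
    return np.clip(np.round(M) + 128, 0, 255).astype(np.uint8)   # back to [0, 255]

# Example: a block that kept only its DC coefficient decodes to a flat 8x8 patch.
B = np.zeros((8, 8), dtype=int)
B[0, 0] = -36
Q = np.full((8, 8), 255, dtype=int)
Q[0, 0] = 16
print(decode_block(B, Q)[0, :3])   # uniform values: [56 56 56]
```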
4 Compression adjustment
The degree of compression and the level of loss are determined by the quantization matrix. As a result of dividing each coefficient in the frequency domain by the corresponding value in the quantization table, many coefficients are rounded to zero and many of the others become small numbers, so we can represent an 8x8 matrix with many fewer bits. By choosing the quantization table we can control the level of compression.
If a number in the quantization matrix is larger than 128, the result after quantization can be 0 or 1, so we need only one bit to encode that coefficient. If a number in the quantization matrix is between 64 and 127, the result after quantization can be 0, 1, 2 or 3, i.e. we need only two bits to encode that coefficient, and so on.
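The accounting in the previous paragraph can be sketched as follows, under the paper's simplifying assumption that each coefficient fits in 8 bits before quantization; this is only an illustration of the reasoning above, not an exact model of the encoder.

```python
import math

def bits_after_quantization(q: int, coeff_bits: int = 8) -> int:
    """Bits needed for a coefficient after integer division by q.

    Simplified model: the original coefficient fits in `coeff_bits` bits,
    so the quotient needs roughly coeff_bits - log2(q) bits.
    """
    return max(1, math.ceil(coeff_bits - math.log2(q)))

for q in (1, 32, 64, 128, 255):
    print(q, bits_after_quantization(q))
# 1 -> 8 bits, 32 -> 3 bits, 64 -> 2 bits, 128 -> 1 bit, 255 -> 1 bit
```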
In this application, two differently compressed images are shown under the original image (Fig. 1). Above them are the quantization matrices used for compression, which can be changed by the user. The images are 512x512 bitmaps.
Fig. 1: Screen shot of the application
The interesting cases for our experiments are the ones with a very high degree of compression. A high degree of compression is reached by excluding the majority of the coefficients; to exclude a coefficient, its corresponding value in the quantization matrix must be 255.
On the other hand, if a coefficient needs to be preserved, its quantization value will be 1. We can also reduce the number of bits required to record a coefficient by dividing it by a number between 1 and 255.
For the first example we take an extreme case of compression. The quantization table is set so that, after dividing each DCT coefficient by the corresponding value from the quantization matrix, the DC component stays the same and all the others are rounded to zero. Each 8x8 block is then recorded with 8 bits, and the result is 64:1 compression. Despite the high level of compression the image is still recognizable, but the effective resolution is lower.
Fig. 2: The original
Fig. 3: Compression 64:1
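The quantization table for this extreme example can be reconstructed as in the following sketch (our illustration of the setting described above, not the application's code).

```python
import numpy as np

# Keep only the DC coefficient: divisor 1 for D(0,0), 255 for everything else.
Q_dc_only = np.full((8, 8), 255, dtype=int)
Q_dc_only[0, 0] = 1

# Per the paper's accounting: 8 bits per 8x8 block instead of 64 pixels * 8 bits.
original_bits = 64 * 8
compressed_bits = 8
print(f"compression ratio {original_bits // compressed_bits}:1")   # 64:1
```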
By modifying the previous example we can get an even higher degree of compression. The first coefficient in the quantization table will not be 1 but, for example, a power of 2. If the DC component is divided by 2^5, it can be represented with 3 bits and the level of compression is 128:1 (Figure 4). If we choose 2^7 for the corresponding quantization value, the first coefficient requires only 1 bit and we get compression of 512:1 (Figure 5). With these levels of compression the image quality is poor but still acceptable.
Fig. 4: Compression 128:1
Fig. 5: Compression 512:1
It is also interesting to show an image that is again recorded with 8 bits per block, but with the bits arranged differently. Lower frequencies contain the important information of the image and are placed in the upper-left corner, so our 8 bits will be spent in that area. Each coefficient in the 2x2 matrix located in that corner is divided by 128. The result is much better image quality for the same level of compression (Figure 6).

Next we represent the image with the first four frequency components kept as they are (divided by 1) and compare it with the previous image. The degree of compression is lower, but the resulting image is much better and very close to the original (Figure 7).
Fig. 6: Compression 64:1
Fig. 7: Compression 16:1
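The two quantization tables compared in Figures 6 and 7 can be sketched in the same way, using the same accounting as before (illustration only).

```python
import numpy as np

def corner_table(size: int, divisor: int) -> np.ndarray:
    """Quantization table: `divisor` in the upper-left size x size corner, 255 elsewhere."""
    Q = np.full((8, 8), 255, dtype=int)
    Q[:size, :size] = divisor
    return Q

Q_fig6 = corner_table(2, 128)   # 2x2 corner divided by 128: 4 coefficients, 2 bits each
Q_fig7 = corner_table(2, 1)     # first four frequency components kept intact: 4 x 8 bits
print(512 // (4 * 8))           # 16 -> the 16:1 ratio quoted for Figure 7
```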
For the next experiment we keep the 3x3 matrix located in the upper-left corner, but the coefficients are divided as follows: the DC component is divided by 32 and the others by 128, so the total number of bits required per block is 16. The quality of the resulting image is lower than with the compression from the preceding example (Figure 8). If we try to soften this compression by keeping the entire first coefficient, 24 bits are needed (8 more) but the image quality is no better (Figure 9).
Fig. 8: Compression 32:1
Fig. 9: Compression 64:3
5 Conclusion
This application is useful for comparing two images compressed with different quantization matrices. Depending on how we change the quantization values, we can observe various changes in the image and draw conclusions.

In further development we could extract quantization matrices that prove to be "good". A quantization matrix is considered "good" if it gives a high level of compression without much loss of image quality. To estimate how good a matrix is, we can develop different metrics, for example the relation between the number of bits required for the compressed image and the deviation of that image from the original.
References
[1] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Prentice Hall, New Jersey, 2002.
[2] W.K. Pratt, Digital Image Processing, PixelSoft, Los Altos, CA, 2007.
[3] S. Jayaraman, S. Esakkirajan and T. Veerakumar. Digital Image Processing. TMH, New Delhi, 2010.
[4] C. Y. Wang, S. M. Lee and L. W. Chang. Designing JPEG quantization tables based on human visual
system, Signal Processing: Image Communication 16, 501-506, 2001.
[5] R. Szeliski, Computer Vision: Algorithms and Applications, Springer-Verlag, London, 812, 2011.
[6] W. Burger, M. J. Burge, Principles of Digital Image Processing, Springer-Verlag, London, 261, 2009.
Ira TUBA
Megatrend University
Faculty of Computer Science
Bulevar umetnosti 29, Belgrade
SERBIA
E-mail: ira.tuba@gmail.com
Fifth International Students Conference on Informatics
Imagination, Creativity, Design, Development
ICDD 2015, May 21-23
Sibiu, Romania
Blur detection in digital images
Viktor Tuba
Teacher Coordinator: Milan Tuba
Abstract
One of the common irregularities in digital images is blur. It is usually caused by motion or by the image being out of focus. In this paper we present an algorithm and application for detecting images with blur, or blurred regions within an image. Blur is detected by different edge detection algorithms or with high-pass filters in the frequency domain. An additional classification is introduced to determine blurred regions more precisely.
1 Introduction
For decades after the invention of the camera, making quality images was very difficult and time consuming. It required highly skilled people and a lot of expensive equipment to properly develop an image from film to paper. Things started to change rapidly with the invention of digital photo sensors. After photo sensors and image processing techniques developed enough, digital cameras took over almost completely. Now we can see the result of taking an image almost immediately, which makes it much easier to produce better images. Moreover, it is possible to process images after taking them and remove various imperfections afterwards. Many algorithms are being developed for various areas [1], [2], [3]. One such area is blur; a common use of blur detection algorithms is the discovery of forged documents [4], [5].

We say that an image is blurred when it does not have sharp edges on at least some part of the image. Sometimes blur is deliberately created by the photographer for artistic purposes, and sometimes it is created by mistake. In this paper we are concerned with blur that is made by mistake.
2 Blur
There are two causes of blur in images: motion and being out of focus. Motion blur appears when the captured scene changes due to rapid movement during the recording of a single frame; it may also occur because the camera shakes while the picture is taken. Fig. 1 shows examples of images with motion blur.

Blurred regions can also appear because they are out of focus. Today this very rarely blurs the whole image; usually one part of the image is sharp and a few regions are out of focus. In Fig. 2 we can see that the left side of the image, the flowers, is in focus while the other parts of the image are blurry.