外文翻譯--利用離散余弦變換進(jìn)行圖像壓縮_第1頁
已閱讀1頁,還剩23頁未讀, 繼續(xù)免費(fèi)閱讀

下載本文檔

版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請進(jìn)行舉報(bào)或認(rèn)領(lǐng)

文檔簡介

1、<p><b>  中文4500字</b></p><p>  Image Compression Using the Discrete Cosine Transform</p><p><b>  Abstract</b></p><p>  The discrete cosine transform (DCT

2、) is a technique for converting a signal into elementary frequency components. It is widely used in image compression. Here we develop some simple functions to compute the DCT and to compress images. </p><p>

3、;  These functions illustrate the power of Mathematica in the prototyping of image processing algorithms.</p><p>  The rapid growth of digital imaging applications, including desktop publishing, multimedia,

4、teleconferencing, and high-definition television (HDTV) has increased the need for effective and standardized image compression techniques. Among the emerging standards are JPEG, for compression of still images [Wallace

5、1991]; MPEG, for compression of motion video [Puri 1992]; and CCITT H.261 (also known as Px64), for compression of video telephony and teleconferencing.</p><p>  All three of these standards employ a basic t

6、echnique known as the discrete cosine transform (DCT). Developed by Ahmed, Natarajan, and Rao [1974], the DCT is a close relative of the discrete Fourier transform (DFT). Its application to image compression was pioneere

7、d by Chen and Pratt [1984]. In this article, I will develop some simple functions to compute the DCT and show how it is used for image compression. We have used these functions in our laboratory to explore methods of opt

8、imizing image c</p><p>  The One-Dimensional Discrete Cosine Transform</p><p>  The discrete cosine transform of a list of n real numbers s(x), x = 0, ..., n-1, is the list of length n given by:

9、</p><p>  s(u)= C(u) u=0,…n</p><p>  where for u=0</p><p>  =1 otherwise</p><p>  Each element of the transformed list S(u) is the inner (dot) product of the

10、 input list s(x) and basis vector. The constant factors are chosen so that the basis vectors are orthogonal and normalized. The eight basis vectors for n = 8 are shown in Figure 1. The DCT can be writthe product of a vec

11、tor (the input list) and the n x n orthogonal matrix whose rows are the vectors. This matrix, for n = 8, can be computed as follows:</p><p>  DCTMatrix =</p><p>  Table[ If k==0,</p><

12、p>  Sqrt[1/8],</p><p>  Sqrt[2/8] Cos[Pi (2j+1) k/16] ],</p><p>  {k,0,7},{j,0,7}] // N;</p><p>  We can check that the matrix is orthogonal:</p><p>  DCTMatrix .

13、 Transpose[DCTMatrix] // Chop // MatrixForm</p><p>  1. 0 0 0 0 0 0 0</p><p>  0 1. 0 0 0 0 0 0</p><p>  0 0 1. 0 0 0 0 0

14、</p><p>  0 0 0 1. 0 0 0 0</p><p>  0 0 0 0 1. 0 0 0</p><p>  0 0 0 0 0 1. 0 0</p><p>  0 0 0 0

15、0 0 1. 0</p><p>  0 0 0 0 0 0 0 1.</p><p>  Each basis vector corresponds to a sinusoid of a certain frequency:</p><p>  Show[GraphicsArray[Partition[&l

16、t;/p><p>  ListPlot[#, PlotRange -> {-.5, .5}, PlotJoined -> True, </p><p>  DisplayFunction -> Identity]& </p><p>  /@ DCTMatrix, 2] ]];</p><p>  Figure 1

17、. The eight basis vectors for the discrete cosine transform of length eight.</p><p>  The list s(x) can be recovered from its transform S(u) by applying the inverse cosine transform </p><p><

18、b>  (IDCT):</b></p><p>  = x=0,…,n</p><p>  where for u=0</p><p>  =1 otherwise</p><p>  This equation expresses s as a linear combination of the

19、 basis vectors. The coefficients are the elements of the transform S, which may be regarded as reflecting the amount of each frequency present in the inputs.</p><p>  We generate a list of random numbers to

20、serve as a test input:</p><p>  input1 = Table[Random[Real, {-1, 1}], {8}]</p><p>  {0.203056, 0.980407, 0.35312, -0.106651, 0.0399382, 0.871475, -0.648355, 0.501067}</p><p>  The D

21、CT is computed by matrix multiplication:</p><p>  output1 = DCTMatrix . input1</p><p>  {0.775716, 0.3727, 0.185299, 0.0121461, -0.325, -0.993021, 0.559794, -0.625127}</p><p>  As n

22、oted above, the DCT is closely related to the discrete Fourier transform (DFT). In fact, it is possible to compute the DCT via the DFT (see [Jain 1989, p. 152]): First create a new list by extracting the even elements, f

23、ollowed by the reversed odd elements. Then multiply the DFT of this re-ordered list by so-called "twiddle factors" and take the real part. We can carry out this process for n = 8 using Mathematica's DFT fun

24、ction.</p><p>  DCTTwiddleFactors = N @ Join[{1}, </p><p>  Table[Sqrt[2] Exp[-I Pi k /16], {k, 7}]]</p><p>  {1., 1.38704 - 0.275899 I, 1.30656 - 0.541196 I, 1.17588 - 0.785695 I,

25、 1. - 1. I, 0.785695 - 1.17588 I, 0.541196 - 1.30656 I, 0.275899 - 1.38704 I}</p><p>  The function to compute the DCT of a list of length n = 8 is then:</p><p>  DCT[list_] := Re[ DCTTwiddleFac

26、tors *</p><p>  InverseFourier[N[list[[{1, 3, 5, 7, 8, 6, 4, 2}]]]]]</p><p>  Note that we use the function InverseFourier to implement what is usually in engineering called the forward DFT. Lik

27、ewise, we use Fourier to implement what is usually called the inverse DFT. The function N is used to convert integers to reals because (in Version 2.2) Fourier and InverseFourier are not evaluated numerically when their

28、 arguments are all integers. The special case of a list of zeros needs to be handled separately by overloading the functions, since N of the integer 0 is an integer</p><p>  Unprotect[Fourier, InverseFourier

29、];</p><p>  Fourier[x:{0 ..}]:= x;</p><p>  InverseFourier[x:{0 ..}]:= x;</p><p>  Protect[Fourier, InverseFourier];</p><p>  We apply DCT to our test input and compare

30、 it to the earlier result computed by matrix multiplication. To compare the results, we subtract them and apply the Chop function to suppress values very close to zero:</p><p>  DCT[input1]</p><p&

31、gt;  {0.775716, 0.3727, 0.185299, 0.0121461, -0.325, -0.993021, 0.559794, -0.625127}% - output1 // Chop{0, 0, 0, 0, 0, 0, 0, 0}</p><p>  The inverse DCT can be computed by multiplication with the inverse of

32、 the DCT matrix. We illustrate this with our previous example:</p><p>  Inverse[DCTMatrix] . output1</p><p>  {0.203056, 0.980407, 0.35312, -0.106651, 0.0399382, 0.871475, -0.648355, 0.501067}%

33、- input1 // Chop{0, 0, 0, 0, 0, 0, 0, 0}</p><p>  As you might expect, the IDCT can also be computed via the inverse DFT. The "twiddle factors" are the complex conjugates of the DCT factors and the

34、 reordering is applied at the end rather than the beginning:</p><p>  IDCTTwiddleFactors = Conjugate[DCTTwiddleFactors]</p><p>  {1., 1.38704 + 0.275899 I, 1.30656 + 0.541196 I, 1.17588 + 0.7856

35、95 I, 1. + 1. I, 0.785695 + 1.17588 I, 0.541196 + 1.30656 I, 0.275899 + 1.38704 I}</p><p>  IDCT[list_] := Re[Fourier[</p><p>  IDCTTwiddleFactors list] ][[{1, 8, 2, 7, 3, 6, 4, 5}]]</p>

36、<p>  For example:</p><p>  IDCT[DCT[input1]] - input1 // Chop</p><p>  {0, 0, 0, 0, 0, 0, 0, 0}</p><p>  The Two-Dimensional DCT</p><p>  The one-dimensional DCT

37、 is useful in processing one-dimensional signals such as speech waveforms. For analysis of two-dimensional (2D) signals such as images, we need a 2D version of the DCT. For an n x m matrix s, the 2D DCT is computed in a

38、simple way: The 1D DCT is applied to each row of s and then to each column of the result. Thus, the transform of s is given by</p><p>  u=0,…n v=0,…m</p><p>  where for u=0</p><p&g

39、t;  =1 otherwise</p><p>  Since the 2D DCT can be computed by applying 1D transforms separately to the rows and columns, we say that the 2D DCT is separable in the two dimensions.As in the one-dimensio

40、nal case, each element S(u, v) of the transform is the inner product of the input and a basis function, but in this case, the basis functions are n x m matrices. Each two-dimensional basis matrix is the outer product of

41、two of the one-dimensional basis vectors. For n = m = 8, the following expression creates an 8 x 8 array </p><p>  DCTTensor = Array[</p><p>  Outer[Times, DCTMatrix[[#1]], DCTMatrix[[#2]]]&

42、, {8, 8}];</p><p>  Each basis matrix can be thought of as an image. The 64 basis images in the array are shown in Figure 2. </p><p>  The package GraphicsImage.m, included in the electronic s

43、upplement, contains the functions GraphicsImage and ShowImage to create and display a graphics object from a given matrix. GraphicsImage uses the built-in function Raster to translate a matrix into an array of gray cells

44、. The matrix elements are scaled so that the image spans the full range of graylevels. An optional second argument specifies a range of values to occupy the full grayscale; values outside the range are clipped. The funct

45、ion</p><p>  << GraphicsImage.m</p><p>  Show[GraphicsArray[</p><p>  Map[GraphicsImage[#, {-.25, .25}]&, </p><p>  Reverse[DCTTensor], {2}] ]];</p>

46、<p>  Figure 2. The 8 x 8 array of basis images for the two-dimensional discrete cosine transform.</p><p>  Each basis matrix is characterized by a horizontal and a vertical spatial frequency. The matri

47、ces shown here are arranged left to right and bottom to top in order of increasing frequencies.To illustrate the 2D transform, we apply it to an 8 x 8 image of the letter A:</p><p>  ShowImage[ input2 =</

48、p><p>  {{0, 1, 0, 0, 0, 1, 0, 0}, {0, 1, 0, 0, 0, 1, 0, 0},</p><p>  {0, 1, 1, 1, 1, 1, 0, 0}, {0, 1, 0, 0, 0, 1, 0, 0},</p><p>  {0, 0, 1, 0, 1, 0, 0, 0}, {0, 0, 1, 0, 1, 0, 0, 0},&l

49、t;/p><p>  {0, 0, 1, 0, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0, 0, 0}}]</p><p>  As in the 1D case, it is possible to express the 2D DCT as an array of inner products (a tensor contraction):</p><

50、;p>  output2 = Array[</p><p>  (Plus @@ Flatten[DCTTensor[[#1, #2]] input2])&,{8, 8}];</p><p>  ShowImage[output2]</p><p>  The pixels in this DCT image describe the proporti

51、on of each two-dimensional basis function present in the input image. The pixels are arranged as above, with horizontal and vertical frequency increasing from left to right and bottom to top, respectively. The brightest

52、pixel in the lower left corner is known as the DC term, with frequency {0, 0}. It is the average of the pixels in the input, and is typically the largest coefficient in the DCT of "natural" images.An inverse 2D

53、 IDCT can also be co</p><p>  Since the two-dimensional DCT is separable, we can extend our function DCT to the case of two-dimensional input as follows:</p><p>  DCT[array_?MatrixQ] :=</p&g

54、t;<p>  Transpose[DCT /@ Transpose[DCT /@ array] ]</p><p>  This function assumes that its input is an 8 x 8 matrix. It takes the 1D DCT of each row, transposes the result, takes the DCT of each new r

55、ow, and transposes again. This function is more efficient than computing the tensor contraction shown above, since it exploits the built-in function InverseFourier.</p><p>  We compare the result of this fun

56、ction to that obtained using contraction of tensors :</p><p>  DCT[input2] - output2 // Chop // Abs // Max 0</p><p>  The definition of the inverse 2D DCT is straightforward:</p><p>

57、;  IDCT[array_?MatrixQ] :=</p><p>  Transpose[IDCT /@ Transpose[IDCT /@ array] ]</p><p>  As an example, we invert the transform of the letter A:</p><p>  ShowImage[Chop[IDCT[output

58、2]]];</p><p>  As noted earlier, the components of the DCT output indicate the magnitude of image components at various 2D spatial frequencies. To illustrate, we can set the last row and column of the DCT o

59、f the letter A equal to zero:</p><p>  output2[[8]] = Table[0, {8}];</p><p>  Do[output2[[i, 8]] = 0, {i, 8}];</p><p>  Now take the inverse transform:</p><p>  ShowIma

60、ge[Chop[IDCT[output2]]];</p><p>  The result is a blurred letter A: the highest horizontal and vertical frequencies have been removed. This is easiest to see when the image is reduced in size so that individ

61、ual pixels are not as visible.</p><p>  Quantization</p><p>  DCT-based image compression relies on two techniques to reduce the data required to represent the image. The first is quantization o

62、f the image's DCT coefficients; the second is entropy coding of the quantized coefficients. Quantization is the process of reducing the number of possible values of a quantity, thereby reducing the number of bits nee

63、ded to represent it. Entropy coding is a technique for representing the quantized data as compactly as possible. We will develop functions to quantize i</p><p>  A simple example of quantization is the round

64、ing of reals into integers. To represent a real number between 0 and 7 to some specified precision takes many bits. Rounding the number to the nearest integer gives a quantity that can be represented by just three bits.&

65、lt;/p><p>  x = Random[Real, {0, 7}] 2.78452</p><p>  Round[x] 3</p><p>  In this process, we reduce the number of possible values of the quantity (and thus the number of bits needed t

66、o represent it) at the cost of losing information. A "finer" quantization, that allows more values and loses less information, can be obtained by dividing the number by a weight factor before rounding:</p>

67、;<p><b>  w = 1/4;</b></p><p>  Round[x/w] 11</p><p>  Taking a larger value for the weight gives a "coarser" quantization.</p><p>  Dequantization, whi

68、ch maps the quantized value back into its original range (but not its original precision) is acheived by multiplying the value by the weight:</p><p>  w * % // N 2.75</p><p>  The quantization e

69、rror is the change in a quantity after quantization and dequantization. The largest possible quantization error is half the value of the quantization weight.</p><p>  In the JPEG image compression standard,

70、each DCT coefficient is quantized using a weight that depends on the frequencies for that coefficient. The coefficients in each 8 x 8 block are divided by a corresponding entry of an 8 x 8 quantization matrix, and the re

71、sult is rounded to the nearest integer.</p><p>  In general, higher spatial frequencies are less visible to the human eye than low frequencies. Therefore, the quantization factors are usually chosen to be l

72、arger for the higher frequencies. The following quantization matrix is widely used for monochrome images and for the luminance component of a color image. It is given in the JPEG standards documents, yet is not part of t

73、he standard, so we call it the "de facto" matrix:</p><p><b>  qLum = </b></p><p>  {{16, 11, 10, 16, 24, 40, 51, 61},</p><p>  {12, 12, 14, 19, 26, 58, 60, 55}

74、,</p><p>  {14, 13, 16, 24, 40, 57, 69, 56},</p><p>  {14, 17, 22, 29, 51, 87, 80, 62},</p><p>  {18, 22, 37, 56, 68,109,103, 77},</p><p>  {24, 35, 55, 64, 81,104,113,

75、 92},</p><p>  {49, 64, 78, 87,103,121,120,101},</p><p>  {72, 92, 95, 98,112,100,103, 99}};</p><p>  Displaying the matrix as a grayscale image shows the dependence of the quantiza

76、tion factors on the frequencies:</p><p>  ShowImage[qLum];</p><p>  To implement the quantization process, we must partition the transformed image into 8 x 8 blocks:</p><p>  BlockI

77、mage[image_, blocksize_:{8, 8}] :=</p><p>  Partition[image, blocksize] /; </p><p>  And @@ IntegerQ /@ (Dimensions[image]/blocksize)</p><p>  The function UnBlockImage reassembles

78、the blocks into a single image:</p><p>  UnBlockImage[blocks_] :=</p><p>  Partition[ Flatten[Transpose[blocks, {1, 3, 2}]], </p><p>  {Times @@ Dimensions[blocks][[{2, 4}]]}]</p

79、><p>  For example:</p><p>  Table[i + 8(j-1), {j, 4}, {i, 6}] // MatrixForm</p><p>  1 2 3 4 5 6</p><p>  9 10 11 12 13 14</p><p> 

80、 17 18 19 20 21 22</p><p>  25 26 27 28 29 30</p><p>  BlockImage[%, {2, 3}] // MatrixForm</p><p>  1 2 3 4 5 6</p><p>  9 10 11 12 13

81、 14</p><p>  17 18 19 20 21 22</p><p>  25 26 27 28 29 30</p><p>  UnBlockImage[%] // MatrixForm</p><p>  1 2 3 4 5 6</p><p>  9

82、 10 11 12 13 14</p><p>  17 18 19 20 21 22</p><p>  25 26 27 28 29 30</p><p>  Our quantization function blocks the image, divides each block (element-b

83、y-element) by the quantization matrix, reassembles the blocks, and then rounds the entries to the nearest integer:</p><p>  DCTQ[image_, qMatrix_] := Map[(#/qMatrix)&, </p><p>  BlockImage[

84、image, Dimensions[qMatrix]], {2}] // UnBlockImage // Round</p><p>  The dequantization function blocks the matrix, multiplies each block by the quantization factors, and reassembles the matrix:</p>

85、<p>  IDCTQ[image_, qMatrix_] :=Map[(# qMatrix)&, </p><p>  BlockImage[image, Dimensions[qMatrix]], {2}] // UnBlockImage</p><p>  To show the effect of quantization, we will transform,

86、 quantize, and reconstruct our image of the shuttle using the quantization matrix introduced above:</p><p>  qshuttle = shuttle // </p><p>  DCT // DCTQ[#, qLum]& // IDCTQ[#, qLum]& //

87、IDCT;</p><p>  For comparison, we show the original image together with the quantized version:</p><p>  Show[GraphicsArray[</p><p>  GraphicsImage[#, {0, 255}]& /@ {shuttle, qsh

88、uttle}]];</p><p>  Note that some artifacts are visible, particularly around high-contrast edges. In the next section, we will compare the visual effects and the amount of compression obtained from different

89、 degrees of quantization.</p><p><b>  Entropy</b></p><p>  To measure how much compression is obtained from a quantization matrix, we use a famous theorem of Claude Shannon [Shannon

90、and Weaver 1949]. The theorem states that for a sequence of symbols with no correlations beyond first order, no code can be devised to represent the sequence that uses fewer bits per symbol than the first-order entropy,

91、which is given by </p><p>  where p is the relative frequency of the ith symbol.</p><p>  To compute the first-order entropy of a list of numbers, we use the function Frequencies, from the

92、standard package Statistics`DataManipulation`. This function computes the relative frequencies of elements in a list:</p><p>  (shac poisson) In[1]:= (7/12/94 at 8:58:26 AM)</p><p>  Frequencie

93、s[list_List] := Map[{Count[list, #], #}&, Union[list]]</p><p>  Characters["mississippi"]{m, i, s, s, i, s, s, i, p, p, i}</p><p>  Frequencies[%]{{4, i}, {1, m}, {2, p}, {4, s}}&l

94、t;/p><p>  Calculating the first-order entropy is straightforward:</p><p>  Entropy[list_] := - Plus @@ N[# Log[2, #]]& @ </p><p>  (First[Transpose[Frequencies[list]]]/Length[lis

95、t])</p><p>  For example, the entropy of a list of four distinct symbols is 2, so 2 bits are required to code each symbol:</p><p>  Entropy[{"a", "b", "c", "d&

96、quot;}]2.</p><p>  Similarly, 1.82307 bits are required for this longer list with four symbols:</p><p>  Entropy[Characters["mississippi"]] 1.82307</p><p>  A list with mo

97、re symbols and fewer repetitions requires more bits per symbol:</p><p>  Entropy[Characters["california"]]2.92193</p><p>  The appearance of fractional bits may be puzzling to some rea

98、ders, since we think of a bit as a minimal, indivisible unit of information. Fractional bits are a natural outcome of the use of what are called "variable word-length" codes. Consider an image containing 63 pix

99、els with a greylevel of 255, and one pixel with a graylevel of 0. If we employed a code which used a symbol of length 1 bit to represent 255, and a symbol of length 2 bits to represent 0, then we would need 65 bits to re

100、present t</p><p>  The compression ratio is another frequently used measure of how effectively an image has been compressed. It is simply the ratio of the size of the image file before and after compression.

101、 It is equal to the ratio of bit-rates, in bits/pixel, before and after compression. Since the initial bit-rateis usually 8 bits/pixel,and the entropy is our estimate of the compressed bit-rate, the compression ratio is

102、estimated by 8/entropy.</p><p>  We will use the following function to examine the effects of quantization:</p><p>  f[image_, qMatrix_] :=</p><p>  {Entropy[Flatten[#]], IDCT[IDCTQ

103、[#, qMatrix]]}& @</p><p>  DCTQ[DCT[image], qMatrix]</p><p>  This function transforms and quantizes an image, computes the entropy, and dequantizes and reconstructs the image. It returns th

104、e entropy and the resulting image. A simple way to experiment with different degrees of quantization is to divide the "de facto" matrix qLum by a scalar and look at the results for various values of this parame

105、ter:</p><p>  test = f[shuttle, qLum/#]& /@ {1/4, 1/2, 1, 4}; </p><p>  Here are the reconstructed images and the corresponding entropies:</p><p>  Show[GraphicsArray[ Partition

溫馨提示

  • 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
  • 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
  • 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會(huì)有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
  • 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
  • 5. 眾賞文庫僅提供信息存儲(chǔ)空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負(fù)責(zé)。
  • 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請與我們聯(lián)系,我們立即糾正。
  • 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時(shí)也不承擔(dān)用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

評論

0/150

提交評論