Lesson7
Continued from Lesson 6.
We already covered the decoding process in the previous lesson. To repeat it here:
entropy decoding -> dequantization -> IDCT -> data merging / upsampling -> YUV to RGB
This lesson explains the key concepts behind decoding as background knowledge.
Guides on the web show how an image is encoded and how the data is stored. To decode the data we obtained above, the next thing we should care about is how the compressed data is organized.
RGB vs YUV / YCrCb
As we know, a pixel usually needs three components, RGB, for display. For storage, however, the YUV colorspace is normally used. The SOF segment tells us the number of components the image has. With one component there is only Y, and the picture is a greyscale image. A colorful image usually has 3 components. An image may also have 4 components; in that case the colorspace is CMYK, which is not popular on the web anyway.
The SOF segment also gives us the horizontal sampling factor (H) and vertical sampling factor (V) for each component. These parameters tell us how many data units each component contributes.
See ITU-T T.81, B.2.3. In the example there, we have three components whose H and V factors are listed below:
Component 0. H0 = 4, V0 = 1
Component 1. H1 = 1, V1 = 2
Component 2. H2 = 2, V2 = 2
Then the summation of Hj × Vj is (4 × 1) + (1 × 2) + (2 × 2) = 10.
So 4 Y, 2 U, and 4 V data units are stored for each MCU. This is not a realistic configuration, though; usually U and V share the same H and V values. The real-world layouts look like this:
- YUV444 : For all YUV H = 2, V = 2, no downsampling
- YUV422 : For Y H = 2, V = 2, For UV H = 1, V = 2
- YUV420 : For Y H = 2, V = 2, For UV H = 1, V = 1
- YUV411 : For Y H = 4, V = 1, For UV H = 1, V = 1
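To make the bookkeeping concrete, here is a small sketch (the `mcu_layout` helper and the tuple format are illustrative, not from any JPEG library) that maps sampling factors to the number of 8 × 8 data units each component contributes to one MCU:

```python
# Each component contributes H * V data units (8x8 blocks) to one MCU.
# The (name, H, V) tuple format is just for this illustration.
def mcu_layout(components):
    return {name: h * v for name, h, v in components}

# YUV420: Y at full resolution, Cb/Cr halved in both directions,
# so one MCU carries 4 Y data units plus 1 Cb and 1 Cr.
print(mcu_layout([("Y", 2, 2), ("Cb", 1, 1), ("Cr", 1, 1)]))

# The ITU-T T.81 example above: (4*1) + (1*2) + (2*2) = 10 data units total.
print(sum(mcu_layout([("C0", 4, 1), ("C1", 1, 2), ("C2", 2, 2)]).values()))
```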
MCU
An image is divided into 8 × 8 squares; this 8 × 8 block is the data unit in JPEG, no bigger and no smaller, and it is the unit in which data is stored for each component. The minimum coded unit (MCU) groups these data units together, but as mentioned above the components are not sampled equally: one MCU may hold 4 Y data units against 1 U and 1 V data unit. Inside an MCU, the data units of one component are decoded like this:
for (int vi = 0; vi < v; vi++) {
    for (int hi = 0; hi < h; hi++) {
        // entropy-decode one data unit of 64 coefficients
    }
}
Inside each data unit we get 64 values of one colorspace component, not pixels.
All data is stored in the zig-zag sequence, where each number below is the position of that coefficient within the 8 × 8 array.
See ITU-T T.81, Figure A.6 (Zig-zag sequence of quantized DCT coefficients), which looks like:
| 0 | 1 | 5 | 6 | 14 | 15 | 27 | 28 |
| 2 | 4 | 7 | 13 | 16 | 26 | 29 | 42 |
| 3 | 8 | 12 | 17 | 25 | 30 | 41 | 43 |
| 9 | 11 | 18 | 24 | 31 | 40 | 44 | 53 |
| 10 | 19 | 23 | 32 | 39 | 45 | 52 | 54 |
| 20 | 22 | 33 | 38 | 46 | 51 | 55 | 60 |
| 21 | 34 | 37 | 47 | 50 | 56 | 59 | 61 |
| 35 | 36 | 48 | 49 | 57 | 58 | 62 | 63 |
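The zig-zag table does not have to be typed in by hand; it can be generated by walking the anti-diagonals. A small sketch (the `zigzag_table` name is mine):

```python
# Walk the 8x8 positions diagonal by diagonal (d = row + col); on even
# diagonals we move up-right (order by column), on odd ones down-left
# (order by row). The result matches Figure A.6 of ITU-T T.81.
def zigzag_table(n=8):
    cells = sorted(((r, c) for r in range(n) for c in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    table = [0] * (n * n)
    for seq, (r, c) in enumerate(cells):
        table[r * n + c] = seq
    return table

print(zigzag_table()[:8])  # first row: 0, 1, 5, 6, 14, 15, 27, 28
```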
We can get the dequantization table from the DQT segment. Any data decoded by the entropy decoder and placed into a data unit consists of small values; we need to dequantize it with the values in the DQT.
Dequantization is simply multiplying the DQT values with the decoded data. Just remember that the decoded data is in zig-zag sequence.
So if we access the DQT in zig-zag order as well, we get the same sequence as the decoded data and can multiply them element by element.
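A minimal sketch of that step, assuming both the coefficients and the quantization table are held as flat 64-element lists in zig-zag order:

```python
# Dequantization is an element-wise product; because the coefficients and
# the DQT entries are both stored in zig-zag order, the indices line up.
def dequantize(coeffs, qtable):
    assert len(coeffs) == len(qtable) == 64
    return [c * q for c, q in zip(coeffs, qtable)]
```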
Joseph Fourier proved that any periodic function can be expanded into a sum of trigonometric functions, usually written as a sum of sines and cosines.
Most signals in practice are discrete-time samples with only a limited number of samples, N. We can treat such a signal as a single cycle of a periodic signal.
Then we get the DFT, which is short for Discrete Fourier Transform:

X[k] = sum(n = 0 .. N-1) x[n] * e^(-i * 2 * pi * k * n / N), for k = 0 .. N-1

The most common way to see it is as a transform from a sequence of discrete-time samples into a sequence of frequency samples, which is a complex-valued function of frequency; both sequences have the same number of samples.
If we expand the complex exponential into sines and cosines, the DFT looks like:

X[k] = sum(n = 0 .. N-1) x[n] * (cos(2 * pi * k * n / N) - i * sin(2 * pi * k * n / N))
With the DFT we can express a sequence of time samples as a sequence of frequency samples, but the complex values are not convenient for us.
Many choices of sines and cosines fit the expansion, so we can stay within the real numbers by choosing a basis in which the sine terms are zero.
Written this way, the transform is usually called the DCT-II:

X[k] = sqrt(2/N) * e(k) * sum(n = 0 .. N-1) x[n] * cos((2*n + 1) * k * pi / (2*N))

where the value

e(k) = 1/sqrt(2) if k = 0, otherwise 1
The inverse calculation of the DCT is almost the DCT-III, but it needs a multiplication by sqrt(2/N):

x[n] = sqrt(2/N) * sum(k = 0 .. N-1) e(k) * X[k] * cos((2*n + 1) * k * pi / (2*N))

where the value e(k) is the same as above.
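To convince ourselves that these two formulas really undo each other, here is a naive, unoptimized sketch (not the decoder's fixed-point code) of the DCT-II and its inverse:

```python
import math

def e(k):
    # 1/sqrt(2) for the DC term, 1 otherwise
    return math.sqrt(0.5) if k == 0 else 1.0

def dct2(x):
    # Forward DCT-II with the sqrt(2/N) normalization
    N = len(x)
    s = math.sqrt(2.0 / N)
    return [s * e(k) * sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                           for n in range(N))
            for k in range(N)]

def idct(X):
    # DCT-III with the same sqrt(2/N) factor: the inverse of dct2
    N = len(X)
    s = math.sqrt(2.0 / N)
    return [s * sum(e(k) * X[k] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                    for k in range(N))
            for n in range(N)]

samples = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
rebuilt = idct(dct2(samples))
print(all(abs(a - b) < 1e-9 for a, b in zip(samples, rebuilt)))  # True
```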
That's all the theory; we can write the IDCT functions for decoding now, with 8 × 8 blocks of numbers as our samples.
A 2-D N × N IDCT/DCT can be computed as a combination of 1-D IDCT/DCT operations of size N: one pass over the rows, then one pass over the columns.
We can also use a trick here: store the precomputed coefficients not as double floats but as scaled integers, which speeds up the multiplications.
We can precompute the cosine values to accelerate the IDCT functions, using Python to do the calculation:

#!/usr/bin/env python3
import math
cos = math.cos
pi = math.pi
sqrt = math.sqrt

def e(n) -> float:
    return sqrt(0.5) if n == 0 else 1.0

def idct(x: int, u: int) -> int:
    # 13-bit precision scaled integer instead of double float
    res_float = e(u) * cos((2 * x + 1) * u * pi / 16.0) * sqrt(2)
    return int(res_float * (1 << 13))

for x in range(8):
    coefs = ['{:8}'.format(idct(x, u)) for u in range(8)]
    print(', '.join(coefs + ['']))
The resulting matrix is
8192, 11362, 10703, 9632, 8192, 6436, 4433, 2260,
8192, 9632, 4433, -2260, -8192, -11362, -10703, -6436,
8192, 6436, -4433, -11362, -8192, 2260, 10703, 9632,
8192, 2260, -10703, -6436, 8191, 9632, -4433, -11362,
8192, -2260, -10703, 6436, 8192, -9632, -4433, 11362,
8192, -6436, -4433, 11362, -8191, -2260, 10703, -9632,
8192, -9632, 4433, 2260, -8191, 11362, -10703, 6436,
8192, -11362, 10703, -9632, 8191, -6436, 4433, -2260,
As noted above, we are doing a 2-D IDCT, which can be implemented with 1-D IDCTs:

int buff[64]; // intermediate buffer

// 1-D IDCT for each column (consecutive elements are 8 apart)
for (int x = 0; x < 8; ++x) {
    do_1d_idct(in + x, buff + x);
}
// 1-D IDCT for each row
for (int x = 0; x < 8; ++x) {
    do_1d_idct(buff + 8 * x, out + 8 * x);
}
Keep in mind that the results were scaled up by (1 << 13), so they must be scaled down by (1 << 13) too. But we did not account for the sqrt(N) in the calculation; put it back with a multiplication by sqrt(8)/(1 << 13), or just fold everything into the shift and use >> 11. The rounding simply discards the remaining precision.
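As a cross-check of the separable approach (a floating-point sketch, not the fixed-point table above), applying the 1-D IDCT to the rows and then to the columns reproduces the full 2-D transform:

```python
import math

def e(k):
    return math.sqrt(0.5) if k == 0 else 1.0

def idct_1d(X):
    # Normalized 1-D inverse DCT (DCT-III with the sqrt(2/N) factor)
    N = len(X)
    s = math.sqrt(2.0 / N)
    return [s * sum(e(k) * X[k] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                    for k in range(N))
            for n in range(N)]

def idct_2d(S):
    # 1-D pass over the rows, transpose, 1-D pass again, transpose back.
    rows = [idct_1d(r) for r in S]
    cols = [idct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# A block with only the DC coefficient set decodes to a flat block:
# 16 * (1/sqrt(2)) * (1/sqrt(2)) * (1/2) * (1/2) = 2.0 in every cell.
block = [[0.0] * 8 for _ in range(8)]
block[0][0] = 16.0
flat = idct_2d(block)
print(all(abs(v - 2.0) < 1e-9 for row in flat for v in row))  # True
```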
Now, after we do the IDCT for each component in the MCUs, we get the Y, U, and V data separately.
Convert YUV to RGB
See JFIF, page 2, "Conversion to and from RGB".
According to the spec, the conversion uses CCIR Recommendation 601. You may see different conversion algorithms in other tutorials; it depends on which standard they use. JPEG with JFIF uses Rec. 601:
R = Y + 1.402 (Cr-128)
G = Y - 0.34414 (Cb-128) - 0.71414 (Cr-128)
B = Y + 1.772 (Cb-128)
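A direct per-pixel sketch of these formulas (the function and clamping helper names are mine; real decoders usually use fixed-point tables instead):

```python
def clamp(v):
    # RGB output must stay in the 0..255 byte range
    return 0 if v < 0 else 255 if v > 255 else int(round(v))

def ycbcr_to_rgb(y, cb, cr):
    # CCIR Rec. 601 full-range conversion as given in the JFIF spec
    r = y + 1.402 * (cr - 128)
    g = y - 0.34414 * (cb - 128) - 0.71414 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp(r), clamp(g), clamp(b)

print(ycbcr_to_rgb(128, 128, 128))  # (128, 128, 128): neutral grey
```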
Now feed the RGB data to SDL, and we can see the image.
- Claim no rights on posts. CC @junka
- Have fun with low level details.
- File an issue if you like.
