Skip to content

Lesson4

junka edited this page Jan 11, 2024 · 9 revisions

see PNG spec

Try not to repeat the PNG spec here.

format structure

The PNG data stream consists of a PNG signature (see 5.2: PNG signature) followed by a sequence of chunks (see clause 11: Chunk specifications). Each chunk has a chunk type which specifies its function.

The first eight bytes of a PNG datastream always contain the following (decimal) values:

137 80 78 71 13 10 26 10

There are 18 chunk types defined in this International Standard. Chunk types are four-byte sequences. The first four are termed critical chunks, which shall be understood and correctly interpreted according to the provisions of this International Standard. These are:

  • IHDR: image header, which is the first chunk in a PNG datastream.
  • PLTE: palette table associated with indexed PNG images.
  • IDAT: image data chunks.
  • IEND: image trailer, which is the last chunk in a PNG datastream.

The IHDR chunk shall be the first chunk in the PNG datastream. It contains:

field length
Width 4 bytes
Height 4 bytes
Bit depth 1 byte
Colour type 1 byte
Compression method 1 byte
Filter method 1 byte
Interlace method 1 byte

Bit depth is a single-byte integer giving the number of bits per sample or per palette index (not per pixel).

All png integers are in network byte orders. byteorder

Won't list all. parse the structure like


    fread(&length, 1, sizeof(uint32_t), f);
    fread(&chunk_type, 1, sizeof(uint32_t), f);
    length = ntohl(length);

    while(length && chunk_type != CHARS2UINT("IEND")) {
        switch (chunk_type) {
            case CHARS2UINT("IHDR"):
                fread(&b->ihdr, 1, sizeof(struct png_ihdr), f);
                b->ihdr.width = ntohl(b->ihdr.width);
                b->ihdr.height = ntohl(b->ihdr.height);
                break;
            case CHARS2UINT("PLTE"):
                b->palette = malloc(length);
                fread(b->palette, sizeof(struct color), length/sizeof(struct color), f);
                break;
            case CHARS2UINT("IDAT"):

            ...
        }
        
        fread(&crc, 1, sizeof(uint32_t), f);
        fread(&length, 1, sizeof(uint32_t), f);
        fread(&chunk_type, 1, sizeof(uint32_t), f);
        length = ntohl(length);
    }
    fread(&crc, 1, sizeof(uint32_t), f);

In a PNG file, the concatenation of the contents of all the IDAT chunks makes up a zlib datastream.

decompress

PNG compression method 0 is deflate/inflate compression with a sliding window (which is an upper bound on the distances appearing in the deflate stream) of at most 32768 bytes. Deflate compression is an LZ77 derivative.

We can use libz to uncompress the iDAT segments. Or we should write a simple lz77 decoder.

For lz77 algorithm, the paper is A Universal Algorithm for Sequential Data Compression

LZ77 maintains a sliding window during compression. A quick reference from wiki

LZ77 preseudo code

while input is not empty do
    match := longest repeated occurrence of input that begins in window
    
    if match exists then
        d := distance to start of match
        l := length of match
        c := char following match in input
    else
        d := 0
        l := 0
        c := first char of input
    end if
    
    output (d, l, c)
    
    discard l + 1 chars from front of window
    s := pop l + 1 chars from front of input
    append s to back of window
repeat

PNG will use a LZ77 derivative. see deflate

	  0   1
         +---+---+
         |CMF|FLG|   (more-->)
         +---+---+
(if FLG.FDICT set)

           0   1   2   3
         +---+---+---+---+
         |     DICTID    |   (more-->)
         +---+---+---+---+

         +=====================+---+---+---+---+
         |...compressed data...|    ADLER32    |
         +=====================+---+---+---+---+

CMF (Compression Method and flags):

  • bits 0 to 3 CM Compression method(CM = 8 denotes the "deflate" compression method with a window size up to 32K. )
  • bits 4 to 7 CINFO Compression info (For CM = 8, CINFO is the base-2 logarithm of the LZ77 window size, minus eight (CINFO=7 indicates a 32K window size).

FLG (FLaGs)

  • bits 0 to 4 FCHECK check bits for CMF and FLG (must be (CMF * 256 + FLG))
  • bit 5 FDICT preset dictionary (tell us if DICTID in other words preset dictionary exists, PNG did not require)
  • bits 6 to 7 FLEVEL compression level (not needed for decompression; it is there to indicate if recompression might be worthwhile.)

So we will have to read 2 bytes header and 4 tailer for PNG iDAT stream. The rest data is in deflate compressed data.

Each block of compressed data begins with 3 header bits containing the following data:
    first bit       BFINAL  (set if and only if this is the last block of the data set)
    next 2 bits     BTYPE   (specifies how the data are compressed)
            00 - no compression
            01 - compressed with fixed Huffman codes
            10 - compressed with dynamic Huffman codes
            11 - reserved (error)
    
    do
        read block header from input stream.
        if stored with no compression
            skip any remaining bits in current partially processed byte
            read LEN and NLEN (see next section)
            copy LEN bytes of data to output
        otherwise
            if compressed with dynamic Huffman codes
                read representation of code trees
            loop (until end of block code recognized)
                decode literal/length value from input stream
                if value < 256
                    copy value (literal byte) to output stream
                otherwise
                    if value = end of block (256)
                        break from loop
                    otherwise (value = 257..285)
                        decode distance from input stream

                        move backwards distance bytes in the output
                        stream, and copy length bytes from this
                        position to the output stream.
            end loop
    while not last block

So here we have one more step left to display the data.

This needs a little more work to understand the PNG scan lines.

PNG allows for a number of filter methods. Only filter method 0 is defined by this International Standard. Filter method 0 provides a set of five filter types, and individual scanlines in each reduced image may use different filter types.

Type Name Filter Function Reconstruction Function
0 None Filt(x) = Orig(x) Recon(x) = Filt(x)
1 Sub Filt(x) = Orig(x) - Orig(a) Recon(x) = Filt(x) + Recon(a)
2 Up Filt(x) = Orig(x) - Orig(b) Recon(x) = Filt(x) + Recon(b)
3 Average Filt(x) = Orig(x) - floor((Orig(a) + Orig(b)) / 2) Recon(x) = Filt(x) + floor((Recon(a) + Recon(b)) / 2)
4 Paeth Filt(x) = Orig(x) - PaethPredictor(Orig(a), Orig(b), Orig(c)) Recon(x) = Filt(x) + PaethPredictor(Recon(a), Recon(b), Recon(c))
  • a the byte corresponding to x in the pixel immediately before the pixel containing x (or the byte immediately before x, when the bit depth is less than 8);
  • b the byte corresponding to x in the previous scanline;
  • c the byte corresponding to b in the pixel immediately before the pixel containing b (or the byte immediately before b, when the bit depth is less than 8).

For all filters, the bytes "to the left of" the first pixel in a scanline shall be treated as being zero. For filters that refer to the prior scanline, the entire prior scanline and bytes "to the left of" the first pixel in the prior scanline shall be treated as being zeroes for the first scanline of a reduced image.


void png_scanline(uint8_t *data, uint8_t *unfilter, uint8_t *preline, struct pic *p, uint8_t filtertype)
{
    switch(filtertype) {
    case 0://none
        for (int i = 0; i < p->pitch; i++) {
            unfilter[i] = data[i];
        }
        break;
    case 1://sub
        for (int i = 0; i < p->depth/8; i++) {
            unfilter[i] = data[i];
        }
        for (int i = p->depth/8; i < p->pitch; i++) {
            unfilter[i] = data[i] + unfilter[i-p->depth/8];
        }
        break;
    case 2://up
        if (preline) {
            for (int i = 0; i < p->pitch; i++) {
                unfilter[i] = data[i] + preline[i];
            }
        } else {
            for (int i = 0; i < p->pitch; i++) {
                unfilter[i] = data[i];
            }
        }
        break;
    case 3://average
        if (preline) {
            for (int i = 0; i < p->depth / 8; i++) {
                unfilter[i] = data[i] + preline[i] / 2;
            }
            for (int i = p->depth / 8; i < p->pitch; i++) {
                unfilter[i] = data[i] + (preline[i]+unfilter[i - p->depth / 8])/2;
            }
        } else {
            for (int i = 0; i < p->depth / 8; i++) {
                unfilter[i] = data[i];
            }
            for (int i = p->depth / 8; i < p->pitch; i++) {
                unfilter[i] = data[i]+ (unfilter[i-p->depth/8])/2;
            }
        }
        break;
    case 4://paeth
        if (preline) {
            for (int i = 0; i < p->depth / 8; i++) {
                unfilter[i] = data[i] + paeth_predictor(0, preline[i], 0);
            }
            for (int i = p->depth / 8; i < p->pitch; i++) {
                unfilter[i] = paeth_predictor(unfilter[i-p->depth/8], preline[i], preline[i-p->depth/8]);
            }
        } else {
            for (int i = 0; i < p->depth / 8; i++) {
                unfilter[i] = data[i];
            }
            for (int i = p->depth / 8; i < p->pitch; i++) {
                unfilter[i] = data[i]+paeth_predictor(unfilter[i-p->depth/8], 0, 0);
            }
        }
        break;
    }
}

void png_unfilter(uint8_t *data, struct pic *p)
{
    uint8_t *preline = NULL;
    for (int y = 0; y < p->height; y ++) {
        uint8_t filtertype = data[y * (p->pitch + 1)];
        uint8_t *line = data + y * (p->pitch + 1) + 1;
        uint8_t *unfilter = data + y * p->pitch;
        png_scanline(line, unfilter, preline, p, filtertype);
        preline = unfilter;
    }
}


struct pic *load_png(FILE *f)
{
    struct pic *p = malloc(sizeof(struct pic));
    uint8_t sig[8];
    fread(sig, 8, 1, f);
    // printf("%d %d %d %d %d %d %d %d\n", sig[0], sig[1], sig[2], sig[3], sig[4],
    //        sig[5], sig[6], sig[7]);
    struct png_ihdr ihdr;
    uint32_t gama;
    struct png_chrm chrm;
    union png_bkgd bkgd;
    uint8_t *compress = NULL;
    int data_len = 0;

    uint32_t length, chunk_type;
    fread(&length, 1, sizeof(uint32_t), f);
    fread(&chunk_type, 1, sizeof(uint32_t), f);
    length = ntohl(length);
    while (length && chunk_type != CHARS2UINT("IEND")) {

        printf("%c%c%c%c\n", (chunk_type)&0xFF, (chunk_type >> 8) & 0xFF,
               (chunk_type >> 16) & 0xFF, (chunk_type >> 24) & 0xFF);
        switch(chunk_type) {

            case CHARS2UINT("IHDR"):
                fread(&ihdr, sizeof(struct png_ihdr), 1, f);
                printf("width %d, height %d, depth %d\n", ntohl(ihdr.width), ntohl(ihdr.height), ihdr.depth);
                break;
            case CHARS2UINT("gAMA"):
                fread(&gama, 4, 1, f);
                break;
            case CHARS2UINT("cHRM"):
                fread(&chrm, sizeof(struct png_chrm), 1, f);
                break;
            case CHARS2UINT("bKGD"):
                if (ihdr.color_type == 0 || ihdr.color_type == 4) {
                    fread(&bkgd, 2, 1, f);
                } else if (ihdr.color_type == 2 || ihdr.color_type == 6) {
                    fread(&bkgd, 6, 1, f);
                } else if (ihdr.color_type == 3) {
                    fread(&bkgd, 2, 1, f);
                }
                break;
            // case CHARS2UINT("IEND"):
            //     break;
            case CHARS2UINT("IDAT"):
                data_len += length;
                if (!compress) {
                    compress = malloc(length);
                    fread(compress, length, 1, f);
                } else {
                    compress = realloc(compress, data_len);
                    fread(compress + data_len - length, length, 1, f);
                }
                break;
            case CHARS2UINT("tEXt"):
            case CHARS2UINT("iTXt"):
            case CHARS2UINT("zTXt"):
                fseek(f, length, SEEK_CUR);
                break;
            default:
                return NULL;
                break;
        }
        uint32_t crc;
        fread(&crc, 4, 1, f);
        fread(&length, 4, 1, f);
        fread(&chunk_type, 4, 1, f);
        length = ntohl(length);
    }
    uint32_t crc;
    fread(&crc, 4, 1, f);

    int clen;
    if (ihdr.color_type == 0 || ihdr.color_type == 4) {
        clen = 2;
    } else if (ihdr.color_type == 2 || ihdr.color_type == 6) {
        clen = 3;
    } else if (ihdr.color_type == 3) {
        clen = 1;
    }

    p->width = ntohl(ihdr.width);
    p->height = ntohl(ihdr.height);
    p->depth = ihdr.depth * clen;
    p->pitch = (((p->width + 3)>>2) << 2) * p->depth/8;
    p->format = SDL_PIXELFORMAT_RGB24;

    unsigned long dlen = p->pitch * p->height + p->height;
    uint8_t *data = malloc(dlen);

    int ret = decompress(compress, data_len, data, &dlen);
    if (ret != 0) {
        printf("ret is %d\n", ret);
    }
    printf("dlen %lu\n", dlen);
    free(compress);

    png_unfilter(data, p);

    p->data = data;
    return p;
}

Clone this wiki locally