[IPOL discuss] libpng and RGB->gray conversion

Nicolas Limare nicolas.limare at cmla.ens-cachan.fr
Thu Jul 28 16:14:16 CEST 2011


Here is (long) follow-up to the two following remarks.

> > while (ptr_gray < ptr_end)
> >             *ptr_gray++ = (unsigned char) ((6969.f * *ptr_r++
> >                                            + 23434.f * *ptr_g++
> >                                            + 2365.f * *ptr_b++) /
> > 32768.f);
> Why do you want to use float numbers? the R, G and B values are
> unsigned chars, (6969 R + 23434 G + 2365 B ) / 32768 will be performed
> in the internal integer representation (~ size_t), then casted. I
> don't see the need to use floating-point values.
> > Indeed, I think that to be 100% correct we should apply a round() function
> > before the (unsigned char) cast to avoid just truncating the value instead
> > of rounding it to the nearest integer.
> Maybe. But the result of an integer division is an integer, so there
> is nothing to round after the division. If we want sopmething exact,
> then the RGB->gray conversion must be done with float values, but I
> want to check if it makes sense, because the (6969 R + 23434 G + 2365
> B ) / 32768 formula already is an approximation of the Rec. 709
> weighting of RGB components[1], based on perception estimations. I'll
> check if roundoing instead of truncating makes sense.

sRGB defines the primary chomaticities of most (if not every)
monitor and printer used today; these chromaticities are identical
to the ones defined for HDTV in the Rec. 709 standard. This standard
provides a formula to compute the luminance of a sRGB signal:
    Y = 0.2126 R + 0.7152 G + 0.0722 B
Another formula is given with more precision in Charles Poynton's Color
    Y = 0.212671 R + 0.715160 G + 0.072169 B
I suppose the exact formula comes from a matrix inversion, and can't
be expressed with finite precision in decimal numbers.

The PNG file format can contain images with tree channels R, G and B,
and can handle arbitrary color spaces for these channel via its
cHRM or iCCP information chunks. sRGB standard was included in PNG
format specifications, and we expect that most (if not every) PNG file
we handle is expressed in the sRGB color space.

The standard png library libpng, used by io_png code, provides a
RGB->gray conversion function. This conversion computes the luminance
Y, and the result is similar to a picture taken with a black and white
camera instead of a color one. The default weights for the RGB
channels in this conversion are
    Y = (6969 R + 23434 G + 2365 B) / 32768
These values are an approximation of Charles Poynton's values and use
integers and a division by 32768=2^15, expected to be faster than
floating-point operations on most systems. The decimal version rounded
at 10^-6 is
    Y = 0.212677 R + 0.715149 G + 0.072174 B

This is implemented in io_png as
    img[i] = (unsigned char) ((6969 * img_r[i]
                               + 23434 * img_g[i]
                               + 2365 * img_b[i]) / 32768);

The C compiler behaviour for litteral variables is explained in "The C
Programming Language, 2nd edition" (K&R2), Kernighan & Ritchie,
Appendix A2.5.1 "Integer Constants", p 193: "The type of an integer
constant depends on its form, value and suffix[...] If it is
unsuffixed and decimal, it has the first of these types in which its
value can be represented: int, long int, unsigned long int."

So 6969, 23434, 2365 and 32768 are int values on almost every system
and are guaranteed to be within the long int range on any system (K&R2
Appendix B11 "Implementation-defined Limits", p 257). In the io_png
code, the RGB values are unsigned chars.

The C compiler behaviour for type conversions in K&R2, Charpter 2.7
"Type Conversions", p 44: "In general, if an operator like + or * that
takes two operators (a binary operator) has operands of different
types, the "lower" type is promoted to the "higher" type before the
operation proceeds. The result if of the higher type."

So the result of (6969 * R + 23434 * G + 2365 * B) is of int type, and
6969 * 255 + 23434 * 255 + 2365 * 255 = 8355840 < 2^31 so there is no
overflow. The integer division by 32768 also gives an integer. This
result is guaranteed to fit in an unsigned char variable because it is
the weighted sum of unsigned char values with the sum of weights equal
to 1, so the cast operation is performed without loss.

Globally, different approximations and errors happen during this
- converting RGB to luminance is an approximation of the perceptual
- the official Rec. 709 formula is an approximation of the conversion
  from the sRGB colors to CIE luminance; Poynton's weights are a
  better approximation of the same formulab
- floating-point computation involves rounding errorsat each operation
- the libpng integer formula is an integer approximation of Poynton's
- the libpng implementation, reused in io_png, uses truncation instead of

I compared the results for all the 256^3 possible RGB values in a 8bit
color image. My reference results were computed in float with Poynton's
weights, then rounded by
    x = (unsigned char) floor(x + .5)
With the io_png implementation
    y = (unsigned char) (6969 * r + 23434 * g + 2365 * b) / 32768
rearly 50% of the results are different and shifted by 1, because of
the truncation. With a modified implementation
    y = (unsigned char) floor((6969 * r + 23434 * g + 2365 * b) / 32768. + .5)
only 0.084% of the results are different, this time because of the
integer weights. So it seems to make sens to use rounding instead of
truncation. Instead of float computations and rounding, we can also use
integers and bitwise operators with the same result:
    y = 6969 * r + 23434 * g + 2365 * b
    y = (y / 32768) + ((y & 16384) / 16384)
    y = (y >> 15) + ((y & (1 << 14)) >> 14) // architecture-dependant?
But if we get to this point, I'm not sure using integer approximations
makes sense anymore, and we may as well use the floating-point poynton

A solution for us is to completely ignore this minor issue and decide
to handle RGB -> gray conversion using the libpng internal
function without questioning its quality, with the advantage that
images not encoded in the sRGB space will be correctly
converted. Another one is to decide we want the best possible
conversion and use floating-point computations, and carefully handle
the color space information from the PNG file.
I will make some timing tests to see if there is any benefit using
integers. And I will ask libpng implementers why they used an integer
approximation and truncation resulting in 50% wrong results.

- sRGB definition IEC 1966-2-1
- HDTV standard ITU Rec. 709
    sRGB -> Y conversion formula in on page 19
- Charles Poynton's Color FAQ
    sRGB -> Y conversion is explained in items 9. and 18.
- PNG file standards
    the color space information is defined in chapter 11.3.3
- libpng manual (version 1.4)
    RGB -> gray conversion is explained page 21
- libpng (version 1.5.2)
    the RGB -> gray conversion is done in pngrtran.c, line 3127
    the default conversion weights are defined in pngrtran.c, line 994
- io_png
  the RGB -> gray conversion is done in io_png.c, line 345

PS: In the libpng source code, I observed that the actual
    coefficients used by libpng are
    Y = (6968 R + 23434 G + 2366 B) / 32768
instead of the ones in the libpng documentation
    Y = (6969 R + 23434 G + 2365 B) / 32768
I will raise this issue on the libpng mailing-list.

Nicolas LIMARE - CMLA - ENS Cachan    http://www.cmla.ens-cachan.fr/~limare/
IPOL - image processing on line                          http://www.ipol.im/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://tools.ipol.im/mailman/archive/discuss/attachments/20110728/55047b01/attachment.pgp>

More information about the discuss mailing list