Function fast_srgb8::f32_to_srgb8
pub fn f32_to_srgb8(f: f32) -> u8
Converts a linear f32 RGB component to an 8-bit sRGB value.
If you have to do this for many values simultaneously, use f32x4_to_srgb8, which will compute 4 results at once (using SIMD instructions if available).
Input less than 0.0, or greater than 1.0, is clamped to be inside that range. NaN input is treated as identical to 0.0.
Details
Conceptually, this is an optimized (and slightly approximated; see the “Approximation” section below) version of the following “reference implementation”:
// Conceptually equivalent (but see below)
fn to_srgb_reference(f: f32) -> u8 {
    let v = if !(f > 0.0) {
        0.0
    } else if f <= 0.0031308 {
        12.92 * f
    } else if f < 1.0 {
        1.055 * f.powf(1.0 / 2.4) - 0.055
    } else {
        1.0
    };
    (v * 255.0 + 0.5) as u8
}
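The clamping and NaN behavior described earlier can be checked against the reference implementation itself. The sketch below copies that reference and evaluates it at a few edge-case inputs; `edge_cases` is a hypothetical helper added only for illustration, and the block exercises the reference, not this crate's optimized function.

```rust
/// Reference conversion, copied from the listing above.
fn to_srgb_reference(f: f32) -> u8 {
    let v = if !(f > 0.0) {
        // Also catches NaN, because `NaN > 0.0` evaluates to false.
        0.0
    } else if f <= 0.0031308 {
        12.92 * f
    } else if f < 1.0 {
        1.055 * f.powf(1.0 / 2.4) - 0.055
    } else {
        // Inputs of 1.0 or more (including +infinity) clamp to 1.0.
        1.0
    };
    (v * 255.0 + 0.5) as u8
}

/// Hypothetical helper: the reference evaluated at NaN, a negative input,
/// both ends of the valid range, and an input above the range.
fn edge_cases() -> [u8; 5] {
    [
        to_srgb_reference(f32::NAN),
        to_srgb_reference(-1.0),
        to_srgb_reference(0.0),
        to_srgb_reference(1.0),
        to_srgb_reference(2.0),
    ]
}
```

Calling `edge_cases()` yields `[0, 0, 0, 255, 255]`: NaN and negative inputs behave like 0.0, and anything at or above 1.0 maps to 255.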
This crate’s implementation uses a small lookup table (a [u32; 104], around 6.5 cache lines) and avoids needing to call powf (which, as an added bonus, means it works great in no_std). In practice it is many times faster than the alternative.
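The "6.5 cache lines" figure follows from simple arithmetic on the table's type. A minimal sketch, assuming 64-byte cache lines (a common size, but an assumption here, not something the crate states); the constant and function names are hypothetical:

```rust
/// Assumed cache-line size in bytes; common on current x86-64 and many
/// ARM cores, but not stated by the crate documentation.
const CACHE_LINE_BYTES: usize = 64;

/// Size in bytes of a table with the shape `[u32; 104]`.
fn table_bytes() -> usize {
    std::mem::size_of::<[u32; 104]>()
}

/// The same size expressed in (assumed 64-byte) cache lines.
fn table_cache_lines() -> f64 {
    table_bytes() as f64 / CACHE_LINE_BYTES as f64
}
```

Under that assumption, 104 entries of 4 bytes each come to 416 bytes, which is 6.5 cache lines.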
Additionally, it’s fairly amenable to implementing in SIMD (everything is easily parallelized aside from the table lookup), and so a 4-wide implementation is also provided as f32x4_to_srgb8.
Approximation
Note that this is not bitwise identical to the results of the to_srgb_reference function above; it’s just very close. The maximum error is 0.544403 for an input of 0.31152344, where error is computed as the absolute difference between the rounded integer and the “exact” value.
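The error metric described above can be sketched as follows. This is an illustration under stated assumptions, not the crate's own test code: the "exact" value is taken to be the unrounded reference formula evaluated in f64 and scaled to 0..=255, and both function names are hypothetical.

```rust
/// "Exact" (unrounded) sRGB encoding scaled to the 0..=255 range.
/// Assumption: evaluating the reference formula in f64 is exact enough.
fn exact_srgb_scaled(f: f64) -> f64 {
    let v = if !(f > 0.0) {
        0.0
    } else if f <= 0.0031308 {
        12.92 * f
    } else if f < 1.0 {
        1.055 * f.powf(1.0 / 2.4) - 0.055
    } else {
        1.0
    };
    v * 255.0
}

/// Absolute difference between a candidate 8-bit result and the exact
/// value for the given linear input.
fn srgb8_error(input: f32, result: u8) -> f64 {
    (f64::from(result) - exact_srgb_scaled(f64::from(input))).abs()
}
```

For the input 0.31152344 the exact value is roughly 151.456, so `srgb8_error(0.31152344, 151)` is about 0.456 and `srgb8_error(0.31152344, 152)` is about 0.544. The second figure is close to the maximum error quoted above, which suggests (as an inference, not something stated by the source) that the optimized function returns 152 there while a correctly rounded conversion returns 151.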
This almost certainly meets requirements for graphics. The DirectX spec mandates that compliant implementations of this function have a maximum error of less than “0.6 ULP on the integer side”; ours is ~0.54, which is within the requirement.
This means this function is probably at least as accurate as whatever your GPU driver and/or hardware does for sRGB framebuffers and such. That is very likely true even if it isn’t using DirectX: its spec tends to be descriptive of what’s commonly available, especially in cases like this one (most cases) where it’s the only spec that bothers to state a requirement.
Additionally, because this function converts the result to u8, for the vast majority of inputs it will return an identical result to the reference impl.
To be completely clear (since it was brought up as a concern): despite this approximation, this function and srgb8_to_f32 are inverses of each other, and round trip appropriately.
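The round-trip property can be written as an exhaustive check over all 256 8-bit values. The sketch below is self-contained, so it uses reference formulas as stand-ins for the crate's functions: `decode_reference` is a hypothetical decoder built from the standard sRGB formula, and `encode_reference` repeats the reference encoder shown earlier. With the crate available, the same loop could presumably be run against srgb8_to_f32 and f32_to_srgb8 instead.

```rust
/// Hypothetical stand-in for `srgb8_to_f32`: standard sRGB decode
/// from an 8-bit value to a linear f32 component.
fn decode_reference(c: u8) -> f32 {
    let v = f32::from(c) / 255.0;
    if v <= 0.04045 {
        v / 12.92
    } else {
        ((v + 0.055) / 1.055).powf(2.4)
    }
}

/// Reference encoder, same logic as the reference implementation above.
fn encode_reference(f: f32) -> u8 {
    let v = if !(f > 0.0) {
        0.0
    } else if f <= 0.0031308 {
        12.92 * f
    } else if f < 1.0 {
        1.055 * f.powf(1.0 / 2.4) - 0.055
    } else {
        1.0
    };
    (v * 255.0 + 0.5) as u8
}

/// True if every 8-bit value survives decode-then-encode unchanged.
fn round_trips() -> bool {
    (0..=255u8).all(|c| encode_reference(decode_reference(c)) == c)
}
```

`round_trips()` returns true for these reference formulas. The check only covers the u8 → f32 → u8 direction; the other direction cannot be lossless, since many f32 inputs map to the same 8-bit value.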