pub trait Translator<O> {
    // Required method
    fn translate(&self, value: u32) -> ParquetResult<O>;

    // Provided methods
    fn translate_slice(
        &self,
        target: &mut Vec<O>,
        source: &[u32],
    ) -> ParquetResult<()> { ... }
    fn translate_chunk(
        &self,
        target: &mut Vec<O>,
        source: &<u32 as Unpackable>::Unpacked,
    ) -> ParquetResult<()> { ... }
    fn translate_bitpacked_all(
        &self,
        target: &mut Vec<O>,
        decoder: Decoder<'_, u32>,
    ) -> ParquetResult<()> { ... }
    fn translate_bitpacked_limited<'a>(
        &self,
        target: &mut Vec<O>,
        decoder: Decoder<'a, u32>,
        limit: usize,
    ) -> ParquetResult<BufferedBitpacked<'a>> { ... }
    fn translate_bitpacked<'a>(
        &self,
        target: &mut Vec<O>,
        decoder: Decoder<'a, u32>,
        limit: Option<usize>,
    ) -> ParquetResult<(usize, Option<HybridRleBuffered<'a>>)> { ... }
}
A trait to describe a translation from a HybridRLE encoding to another format.

In essence, this is one method (Translator::translate) that maps a u32 to the desired output type O. There are several other methods that may provide optimized routines for slices, chunks and decoders.
§Motivation
The HybridRleDecoder is used extensively during Parquet decoding because it is used for both Dremel decoding and dictionary decoding. We want to perform a transformation from this space-efficient encoding to a buffer. Here, items might be skipped, might be mapped and only a few items might be needed. There are 3 main ways to do this.

- Element-by-element translation using iterator adapters such as map, filter and skip. This suffers from the problem that it is difficult to apply SIMD to the translation and that a collect might need to constantly poll the next function. Next to that, monomorphization might need to generate many, many variants.
- Buffer almost everything, then filter and translate later. This has high memory consumption and might suffer from cache-eviction problems. This is computationally the most efficient, but probably still has a high runtime. Also, this fails to utilize run-length information and needs to retranslate all repeated elements.
- Batched operations. Here, we try to utilize the run-length information and utilize SIMD to process many bitpacked items. This can provide the best of both worlds.
The HybridRleDecoder decodes using both run-length encoding and bitpacking. In both processes, this Translator trait allows for translation with (i) no heap allocations and (ii) cheap buffering, and it can stop and start at any point. Consequently, the memory consumption while doing these translations can be relatively low while still processing items in batches.
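To see why run-length information matters, consider a run of n identical indices: it needs only a single translate call, after which the result can be cheaply repeated. A minimal, hypothetical helper (not part of the trait) illustrating this idea:

```rust
// Hypothetical sketch: a run of `run_len` identical indices requires only one
// translation; the already-translated value is then cloned into the target.
// `Vec::resize` clones `value` for each new slot.
fn translate_run<O: Clone>(target: &mut Vec<O>, value: O, run_len: usize) {
    target.resize(target.len() + run_len, value);
}
```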
Required Methods§

fn translate(&self, value: u32) -> ParquetResult<O>
Translate from a decoded value to the output format.
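As an illustration, a translator that looks up each u32 index in a dictionary only needs this one method; the provided methods can then fall back to calling it per element. The sketch below is a simplified, self-contained analogue of the trait (a plain Result<_, ()> stands in for ParquetResult, and the DictionaryTranslator shown here is illustrative) so that it compiles on its own:

```rust
// Simplified analogue of the Translator trait, assuming a unit error type
// instead of the crate's ParquetError.
trait Translator<O> {
    /// Translate one decoded value to the output format.
    fn translate(&self, value: u32) -> Result<O, ()>;

    /// Element-by-element fallback, as a default provided method would do.
    fn translate_slice(&self, target: &mut Vec<O>, source: &[u32]) -> Result<(), ()> {
        target.reserve(source.len());
        for &v in source {
            target.push(self.translate(v)?);
        }
        Ok(())
    }
}

/// A dictionary translator: maps each u32 index to the dictionary entry,
/// failing on out-of-bounds indices.
struct DictionaryTranslator<'a, O>(&'a [O]);

impl<'a, O: Clone> Translator<O> for DictionaryTranslator<'a, O> {
    fn translate(&self, value: u32) -> Result<O, ()> {
        self.0.get(value as usize).cloned().ok_or(())
    }
}
```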
Provided Methods§

fn translate_slice(
    &self,
    target: &mut Vec<O>,
    source: &[u32],
) -> ParquetResult<()>
Translate from a slice of decoded values to the output format and write them to a target.

This can be overridden to be more optimized.
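For instance, a hypothetical translator that merely adds a fixed offset could override this with a single extend over a mapped iterator, which the compiler can often auto-vectorize, instead of translating and pushing element by element (the function name and signature here are illustrative, not from the crate):

```rust
// Hypothetical optimized translate_slice body for an offset translator:
// one `extend` over a mapped iterator reserves once and lets the compiler
// vectorize the widening-plus-offset loop.
fn translate_slice_offset(
    target: &mut Vec<u64>,
    source: &[u32],
    offset: u64,
) -> Result<(), ()> {
    target.extend(source.iter().map(|&v| v as u64 + offset));
    Ok(())
}
```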
fn translate_chunk(
    &self,
    target: &mut Vec<O>,
    source: &<u32 as Unpackable>::Unpacked,
) -> ParquetResult<()>
Translate from a chunk of unpacked items to the output format and write them to a target.

This is the same as Translator::translate_slice but with a known slice size. This can allow SIMD routines to better optimize the procedure.

This can be overridden to be more optimized.
fn translate_bitpacked_all(
    &self,
    target: &mut Vec<O>,
    decoder: Decoder<'_, u32>,
) -> ParquetResult<()>
Translate and collect all the items in a Decoder to a target.

This can be overridden to be more optimized.