lance_encoding::encoder

Trait VariablePerValueCompressor

pub trait VariablePerValueCompressor:
    Debug
    + Send
    + Sync {
    // Required method
    fn compress(
        &self,
        data: DataBlock,
    ) -> Result<(VariableWidthBlock, ArrayEncoding)>;
}

Trait for compression algorithms that are suitable for use in the zipped structural encoding.

This encoding is useful for longer strings, binary values, and variable-length lists (i.e. when the average value is >= 128 bytes).

These compressors can be extremely generic. They only need to produce one buffer of bytes and another buffer of offsets into the bytes, one offset for each value. Both of these buffers will be stored.

Note: It is perfectly legal for a value to have 0 bytes. However, we still need to store the offset itself. This means that this compressor, when implemented by something like RLE, will not be as space-efficient as a block version (which could skip the offsets for runs).
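
The bytes-plus-offsets layout described above can be sketched in plain Rust. This is only an illustration, not the Lance implementation: `pack_values` and `value_at` are hypothetical helpers, and the Arrow-style layout with a leading zero offset and `u32` offsets is an assumption made for the example. It shows that a zero-length value adds no bytes but still costs one offset entry.

```rust
/// Hypothetical helper (not part of lance_encoding): packs values into one
/// byte buffer plus an offsets buffer.
/// Assumption: Arrow-style layout with a leading 0, so the offsets buffer
/// holds one end offset per value plus the initial 0.
fn pack_values(values: &[&[u8]]) -> (Vec<u8>, Vec<u32>) {
    let mut bytes = Vec::new();
    let mut offsets = Vec::with_capacity(values.len() + 1);
    offsets.push(0u32);
    for value in values {
        bytes.extend_from_slice(value);
        // A zero-length value adds no bytes but still adds an offset entry.
        offsets.push(bytes.len() as u32);
    }
    (bytes, offsets)
}

/// Hypothetical helper: returns the i-th value by slicing the byte buffer
/// between two adjacent offsets.
fn value_at<'a>(bytes: &'a [u8], offsets: &[u32], i: usize) -> &'a [u8] {
    let start = offsets[i] as usize;
    let end = offsets[i + 1] as usize;
    &bytes[start..end]
}
```

Packing the three values `"hello"`, `""`, and `"lance"` with this sketch yields ten bytes of data and four offsets, so the empty value is recoverable by index even though it contributes nothing to the byte buffer.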

Accessing this data requires 2 IOPS, and accessing it in a random-access fashion additionally requires a repetition index.

Required Methods

fn compress(&self, data: DataBlock) -> Result<(VariableWidthBlock, ArrayEncoding)>

Compresses the data into a single buffer in which each value may be encoded with a different size.

Also returns a description of the compression, which can be used to decompress the data when reading it back.

Implementors