pub trait PrimitivePageDecoder: Send + Sync {
// Required method
fn decode(&self, rows_to_skip: u64, num_rows: u64) -> Result<DataBlock>;
}
A decoder for single-column encodings of primitive data (this includes fixed-size lists of primitive data).
Physical decoders are able to decode into existing buffers for zero-copy operation.
Instances should be stateless and Send / Sync. This is because multiple decode
tasks could reference the same page. For example, imagine a page covers rows 0-2000
and the decoder stream has a batch size of 1024. The decoder will be needed by both
the decode task for batch 0 and the decode task for batch 1.
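The sharing pattern described above can be sketched as follows. This is a minimal illustration and not the crate's implementation: ToyPageDecoder, FlatPage, and decode_in_batches are hypothetical names, and the toy trait returns a Vec<u32> rather than Result<DataBlock> so that the sketch compiles on its own. It also assumes the page holds exactly 2000 rows and ignores the fact that a real batch may continue into the next page.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the real trait: it returns a Vec<u32> instead of
// Result<DataBlock> so the sketch compiles without the crate.
trait ToyPageDecoder: Send + Sync {
    fn decode(&self, rows_to_skip: u64, num_rows: u64) -> Vec<u32>;
}

// A "page" whose values are already in memory. decode takes &self and
// mutates nothing, so concurrent calls cannot interfere with each other.
struct FlatPage {
    values: Vec<u32>,
}

impl ToyPageDecoder for FlatPage {
    fn decode(&self, rows_to_skip: u64, num_rows: u64) -> Vec<u32> {
        let start = rows_to_skip as usize;
        let end = start + num_rows as usize;
        self.values[start..end].to_vec()
    }
}

// Split one page into batches and decode each batch on its own thread,
// with every thread sharing the same decoder through an Arc.
fn decode_in_batches(page_rows: u64, batch_size: u64) -> Vec<Vec<u32>> {
    let decoder: Arc<dyn ToyPageDecoder> = Arc::new(FlatPage {
        values: (0..page_rows as u32).collect(),
    });
    let mut handles = Vec::new();
    let mut offset = 0;
    while offset < page_rows {
        let rows = batch_size.min(page_rows - offset);
        let decoder = Arc::clone(&decoder);
        handles.push(thread::spawn(move || decoder.decode(offset, rows)));
        offset += rows;
    }
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Calling decode_in_batches(2000, 1024) runs two decode tasks against the same decoder: one with rows_to_skip = 0 and one with rows_to_skip = 1024. Because the decoder holds no mutable state, neither task needs a lock.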
See crate::decoder for more information.
Required Methods§
fn decode(&self, rows_to_skip: u64, num_rows: u64) -> Result<DataBlock>
Decode data into buffers
This may be a simple zero-copy from a disk buffer or could involve complex decoding such as decompressing from some compressed representation.
Capacity is stored as a tuple of (num_bytes: u64, is_needed: bool). The is_needed
portion only needs to be updated if the encoding has some concept of an “optional”
buffer.
Encodings can have any number of input or output buffers. For example, a dictionary decoding will convert two buffers (indices + dictionary) into a single buffer.
Binary decodings have two output buffers (one for values, one for offsets).
Other decodings could even expand the number of output buffers. For example, we could decode fixed size strings into variable length strings, going from one input buffer to multiple output buffers.
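As an illustration of the two-buffers-in, one-buffer-out case, here is a minimal sketch of a dictionary decode. The function decode_dictionary is hypothetical and not part of the crate; it assumes u32 indices and i64 dictionary values, and it works on slices rather than on the crate's buffer types.

```rust
// Hypothetical helper: expand dictionary-encoded rows into plain values.
// Inputs are two buffers (indices and dictionary); the output is one buffer.
// Returns None if the requested row range or any index is out of bounds.
fn decode_dictionary(
    indices: &[u32],
    dictionary: &[i64],
    rows_to_skip: usize,
    num_rows: usize,
) -> Option<Vec<i64>> {
    let end = rows_to_skip.checked_add(num_rows)?;
    // Only the requested rows of the index buffer are touched.
    let selected = indices.get(rows_to_skip..end)?;
    selected
        .iter()
        .map(|&i| dictionary.get(i as usize).copied())
        .collect()
}
```

For example, with indices [0, 2, 1, 2] and dictionary [10, 20, 30], skipping one row and decoding three yields [30, 20, 30].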
Each Arrow data type typically has a fixed structure of buffers and the encoding chain will generally end at one of these structures. However, intermediate structures may exist which do not correspond to any Arrow type at all. For example, a bitpacking encoding will deal with buffers whose bits-per-value is not a multiple of 8.
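To make the bitpacking case concrete, the following sketch unpacks values whose width is not a multiple of 8 into ordinary u32 values. The function unpack_bits is hypothetical, and the bit order (least significant bit first, values packed back to back with no padding) is an assumption chosen for illustration; it is not necessarily the layout the crate uses.

```rust
// Hypothetical helper: unpack num_values values of bits_per_value bits each.
// Assumes LSB-first packing with no padding between values.
// Panics if packed is too short for the requested number of values.
fn unpack_bits(packed: &[u8], bits_per_value: u32, num_values: usize) -> Vec<u32> {
    assert!(bits_per_value >= 1 && bits_per_value <= 32);
    let mut out = Vec::with_capacity(num_values);
    for i in 0..num_values {
        let mut value: u32 = 0;
        for b in 0..bits_per_value {
            // Absolute position of this bit within the packed buffer.
            let bit_pos = i * bits_per_value as usize + b as usize;
            let bit = (packed[bit_pos / 8] >> (bit_pos % 8)) & 1;
            value |= (bit as u32) << b;
        }
        out.push(value);
    }
    out
}
```

With 3 bits per value, the values 1, 5, 7, 2 occupy 12 bits, so the packed buffer is two bytes (0xE9, 0x05) and the third value straddles the byte boundary. The packed buffer has no Arrow equivalent; only the unpacked output does.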
The primitive_array_from_buffers method has an expected buffer layout for each arrow
type (order matters) and encodings that aim to decode into arrow types should respect
this layout.
§Arguments
rows_to_skip - how many rows to skip (within the page) before decoding
num_rows - how many rows to decode
all_null - A mutable bool, set to true if a decoder determines all values are null