pub struct PrimitiveStructuralEncoder { /* private fields */ }
An encoder for primitive (leaf) arrays
This encoder is fairly complicated and follows one of several paths depending on the data.
First, we convert the validity and offsets information into repetition and definition levels. Then we compress the data itself into a single buffer.
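As a rough illustration of the first step, the sketch below derives repetition and definition levels for a single list column with nullable items. The function name and the level assignment (repetition 1 marks the first item of a row, definition 1 marks a null item) are assumptions chosen for the example and are not necessarily the convention this encoder uses. Null and empty lists are ignored for brevity.

```rust
/// Hypothetical helper: derive (repetition, definition) levels from list
/// offsets and item validity. The level values are illustrative only.
fn levels_from_offsets(offsets: &[usize], item_validity: &[bool]) -> (Vec<u16>, Vec<u16>) {
    let mut rep = Vec::with_capacity(item_validity.len());
    let mut def = Vec::with_capacity(item_validity.len());
    // Each adjacent pair of offsets delimits the items of one row.
    for window in offsets.windows(2) {
        let (start, end) = (window[0], window[1]);
        for i in start..end {
            // 1 = this item starts a new row, 0 = it continues the current row.
            rep.push(if i == start { 1 } else { 0 });
            // 0 = valid item, 1 = null item.
            def.push(if item_validity[i] { 0 } else { 1 });
        }
    }
    (rep, def)
}
```

With offsets `[0, 2, 3]` and validity `[true, false, true]` (two rows, the second item null), this yields repetition levels `[1, 0, 1]` and definition levels `[0, 1, 0]`.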
If the data is narrow then we encode the data in small chunks (each chunk should span a few disk sectors and contains a buffer of repetition levels, a buffer of definition levels, and a buffer of value data). This approach is called “mini-block”. These mini-blocks are stored in a single data buffer.
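The mini-block layout can be sketched as follows. This is a simplified illustration, not the encoder's actual code: the `MiniBlocks` type and `to_mini_blocks` function are hypothetical, no compression is applied, and blocks are sized by value count rather than by a byte target such as a few disk sectors.

```rust
/// Hypothetical result type: one concatenated data buffer plus the byte
/// size of each block.
struct MiniBlocks {
    data: Vec<u8>,
    block_sizes: Vec<u32>,
}

/// Split parallel level and value slices into fixed-count blocks. Each block
/// stores its repetition levels, then its definition levels, then its values.
fn to_mini_blocks(rep: &[u16], def: &[u16], values: &[i32], values_per_block: usize) -> MiniBlocks {
    assert!(values_per_block > 0);
    assert!(rep.len() == values.len() && def.len() == values.len());
    let mut data = Vec::new();
    let mut block_sizes = Vec::new();
    let mut start = 0;
    while start < values.len() {
        let end = usize::min(start + values_per_block, values.len());
        let block_start = data.len();
        for r in &rep[start..end] {
            data.extend_from_slice(&r.to_le_bytes());
        }
        for d in &def[start..end] {
            data.extend_from_slice(&d.to_le_bytes());
        }
        for v in &values[start..end] {
            data.extend_from_slice(&v.to_le_bytes());
        }
        // Record how many bytes this block occupies in the data buffer.
        block_sizes.push((data.len() - block_start) as u32);
        start = end;
    }
    MiniBlocks { data, block_sizes }
}
```

A reader that knows the block sizes can seek to any block in the data buffer without decoding the blocks before it.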
If the data is wide then we zip together the repetition and definition levels with the value data into a single buffer. This approach is called “zipped”.
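A minimal sketch of the zipped idea, assuming fixed-width values and an uncompressed pair of 16-bit levels in front of each value. The function name and the exact byte layout are assumptions made for illustration and may differ from what the encoder writes.

```rust
/// Hypothetical helper: interleave each value's repetition and definition
/// level with the value's bytes so that one contiguous read yields all three.
fn zip_levels_with_values(rep: &[u16], def: &[u16], values: &[u8], value_width: usize) -> Vec<u8> {
    assert!(value_width > 0);
    assert_eq!(rep.len(), def.len());
    assert_eq!(values.len(), rep.len() * value_width);
    let mut out = Vec::with_capacity(values.len() + rep.len() * 4);
    for (i, value) in values.chunks_exact(value_width).enumerate() {
        out.extend_from_slice(&rep[i].to_le_bytes());
        out.extend_from_slice(&def[i].to_le_bytes());
        out.extend_from_slice(value);
    }
    out
}
```

Because every entry has the same width in this sketch (4 bytes of levels plus `value_width` bytes of data), the position of the i-th entry can be computed directly, which is one reason to prefer this layout for wide values.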
If there is any repetition information then we create a repetition index (TODO).
In addition, the compression process may create zero or more metadata buffers. For example, dictionary compression will create dictionary metadata. Any mini-block approach has a metadata buffer of block sizes. This metadata is stored in a separate buffer on disk and read at initialization time.
TODO: We should concatenate metadata buffers from all pages into a single buffer at (roughly) the end of the file so there is, at most, one read per column of metadata per file.
Implementations
impl PrimitiveStructuralEncoder
pub fn try_new( options: &EncodingOptions, compression_strategy: Arc<dyn CompressionStrategy>, column_index: u32, field: Field, ) -> Result<Self>
Trait Implementations
impl FieldEncoder for PrimitiveStructuralEncoder
fn maybe_encode(
&mut self,
array: ArrayRef,
_external_buffers: &mut OutOfLineBuffers,
repdef: RepDefBuilder,
row_number: u64,
num_rows: u64,
) -> Result<Vec<EncodeTask>>
fn flush(
&mut self,
_external_buffers: &mut OutOfLineBuffers,
) -> Result<Vec<EncodeTask>>
fn num_columns(&self) -> u32
fn finish(
&mut self,
_external_buffers: &mut OutOfLineBuffers,
) -> BoxFuture<'_, Result<Vec<EncodedColumn>>>
Auto Trait Implementations
impl Freeze for PrimitiveStructuralEncoder
impl !RefUnwindSafe for PrimitiveStructuralEncoder
impl Send for PrimitiveStructuralEncoder
impl Sync for PrimitiveStructuralEncoder
impl Unpin for PrimitiveStructuralEncoder
impl !UnwindSafe for PrimitiveStructuralEncoder
Blanket Implementations
impl<T> BorrowMut<T> for T
where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.