Data layouts to represent encoded data in a sub-Arrow format
These DataBlock structures represent physical layouts. They fill a gap somewhere between arrow_data::data::ArrayData (which, as a collection of buffers, is too generic because it doesn’t give us enough information about what those buffers represent) and arrow_array::array::Array (which is too specific, because it cares about the logical data type).
In addition, the layouts represented here are slightly stricter than Arrow’s layout rules. For example, offset buffers MUST start with 0. These additional restrictions impose a slight penalty on encode (to normalize Arrow data) but make the development of encoders and decoders easier (since they can rely on a normalized representation).
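To make the "offsets MUST start with 0" rule concrete, here is a minimal sketch of the kind of normalization an encoder might perform on a sliced Arrow array. The function name `normalize_offsets` is hypothetical and is not part of this module's API; it only illustrates the idea of rebasing offsets and trimming the values buffer to match.

```rust
use std::ops::Range;

/// Hypothetical helper (not part of this crate): rebase Arrow-style offsets
/// so that the first offset is 0.
///
/// Returns the rebased offsets together with the byte range of the original
/// values buffer that the offsets actually cover, so the caller can trim the
/// values buffer to match.
fn normalize_offsets(offsets: &[i32]) -> (Vec<i32>, Range<usize>) {
    // Arrow offset buffers normally hold at least one entry; treat an empty
    // slice as "no data" rather than panicking.
    let first = offsets.first().copied().unwrap_or(0);
    let last = offsets.last().copied().unwrap_or(0);

    // Subtracting the first offset from every entry makes the buffer start at 0
    // while preserving the length of every value.
    let rebased: Vec<i32> = offsets.iter().map(|o| o - first).collect();

    (rebased, first as usize..last as usize)
}
```

For example, a string array sliced out of a larger one might carry offsets `[5, 10, 13]` into the values buffer `b"helloworldfoo"`. After normalization the offsets become `[0, 5, 8]` and the returned range selects `b"worldfoo"`. This copy is the "slight penalty on encode" mentioned above; in exchange, decoders never need to account for a non-zero starting offset.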
Structs§
- AllNullDataBlock - A data block with no buffers where everything is null
- BlockInfo
- ConstantDataBlock - A block representing the same constant value repeated many times
- DataBlockBuilder
- DictionaryDataBlock - A data block for dictionary encoded data
- FixedSizeListBlock - A data block to represent a fixed size list
- FixedWidthDataBlock - A data block for a single buffer of data where each element has a fixed number of bits
- NullableDataBlock - Wraps a data block and adds nullability information to it
- OpaqueBlock - A data block with no regular structure. There is no available spot to attach validity / repdef information and it cannot be converted to Arrow without being decoded
- StructDataBlock - A data block representing a struct
- VariableWidthBlock - A data block for variable-width data (e.g. strings, packed rows, etc.)
- VariableWidthDataBlockBuilder
Enums§
- DataBlock - A DataBlock is a collection of buffers that represents an “array” of data in very generic terms
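As a rough illustration of how an enum of physical layouts can sit between a bag of buffers and a logically typed array, the sketch below defines a toy enum with a few variants loosely modeled on the descriptions above. Every type, field, and method name here (`ToyBlock`, `ToyFixedWidth`, `ToyVariableWidth`, `num_values`) is an assumption made for illustration; the real structs in this module may have different fields and methods.

```rust
/// Toy stand-in for a fixed-width layout: one buffer, a fixed number of bits
/// per value. Field names are assumptions, not the crate's definitions.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct ToyFixedWidth {
    data: Vec<u8>,
    bits_per_value: u64,
    num_values: u64,
}

/// Toy stand-in for a variable-width layout: a values buffer plus offsets
/// that start at 0 (the normalized form described above).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct ToyVariableWidth {
    data: Vec<u8>,
    offsets: Vec<u32>,
}

/// Toy stand-in for the layout enum. Each variant says what its buffers mean,
/// but nothing about the logical data type (string, timestamp, and so on).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ToyBlock {
    AllNull { num_values: u64 },
    Constant { value: Vec<u8>, num_values: u64 },
    FixedWidth(ToyFixedWidth),
    VariableWidth(ToyVariableWidth),
    Nullable { data: Box<ToyBlock>, validity: Vec<bool> },
}

impl ToyBlock {
    /// Number of values in the block, derived from the layout alone.
    fn num_values(&self) -> u64 {
        match self {
            ToyBlock::AllNull { num_values } => *num_values,
            ToyBlock::Constant { num_values, .. } => *num_values,
            ToyBlock::FixedWidth(block) => block.num_values,
            // N values need N + 1 offsets.
            ToyBlock::VariableWidth(block) => block.offsets.len().saturating_sub(1) as u64,
            // Nullability wraps another block and does not change its length.
            ToyBlock::Nullable { data, .. } => data.num_values(),
        }
    }
}
```

The point of the sketch is the shape, not the details: an encoder can match on the variant to learn what each buffer represents without knowing the Arrow data type that produced it.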