lance_encoding::format::pb

Struct List

Source
pub struct List {
    pub offsets: Option<Box<ArrayEncoding>>,
    pub null_offset_adjustment: u64,
    pub num_items: u64,
}
Expand description

An array encoding for variable-length list fields

Fields§

§offsets: Option<Box<ArrayEncoding>>

An array containing the offsets into an items array.

This array will have num_rows items and will never have nulls.

If the list at index i is not null then offsets[i] will contain base + len(list) where base is defined as: i == 0: 0 i > 0: (offsets[i-1] % null_offset_adjustment)

To help understand we can consider the following example list: [ [A, B], null, [], [C, D, E] ]

The offsets will be [2, ?, 2, 5]

If the incoming list at index i IS null then offsets[i] will contain base + len(list) + null_offset_adjustment where base is defined the same as above.

To complete the above example let’s assume that null_offset_adjustment is 7. Then the offsets will be [2, 9, 2, 5]

If there are no nulls then the offsets we write here are exactly the same as the offsets in an Arrow list array (except we omit the leading 0 which is redundant)

The reason we do this is so that reading a single list at index i only requires us to load the indices at i and i-1.

If the offset at index i is greater than `null_offset_adjustment`` then the list at index i is null.

Otherwise the length of the list is offsets\[i\] - base where base is defined the same as above.

Let’s consider our example offsets: [2, 9, 2, 5]

We can take any range of lists and determine how many list items are referenced by the sublist.

0..3: [, 5] -> items 0..5 (base = 0* and end is 5) 0..2: [, 2] -> items 0..2 (base = 0* and end is 2) 0..1: [_, 9] -> items 0..2 (base = 0* and end is 9 % 7) 1..3: [2, 5] -> items 2..5 (base = 2 and end is 5) 1..2: [2, 2] -> items 2..2 (base = 2 and end is 2) 2..3: [9, 5] -> items 2..5 (base = 9 % 7 and end is 5)

  • When the start of our range is the 0th item the base is always 0 and we only need to load a single index from disk to determine the range.

The data type of the offsets array is flexible and does not need to match the data type of the destination array. Please note that the offsets array is very likely to be efficiently encoded by bit packing deltas.

§null_offset_adjustment: u64

If a list is null then we add this value to the offset

This value must be greater than the length of the items so that (offset + null_offset_adjustment) is never used by a non-null list.

Note that this value cannot be equal to the length of the items because then a page with a single list would store [ X ] and we couldn’t know if that is a null list or a list with X items.

Therefore, the best choice for this value is 1 + # of items. Choosing this will maximize the bit packing that we can apply to the offsets.

§num_items: u64

How many items are referenced by these offsets. This is needed in order to determine which items pages map to this offsets page.

Trait Implementations§

Source§

impl Clone for List

Source§

fn clone(&self) -> List

Returns a copy of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl Debug for List

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl Default for List

Source§

fn default() -> Self

Returns the “default value” for a type. Read more
Source§

impl Message for List

Source§

fn encoded_len(&self) -> usize

Returns the encoded length of the message without a length delimiter.
Source§

fn clear(&mut self)

Clears the message, resetting all fields to their default.
Source§

fn encode(&self, buf: &mut impl BufMut) -> Result<(), EncodeError>
where Self: Sized,

Encodes the message to a buffer. Read more
Source§

fn encode_to_vec(&self) -> Vec<u8>
where Self: Sized,

Encodes the message to a newly allocated buffer.
Source§

fn encode_length_delimited( &self, buf: &mut impl BufMut, ) -> Result<(), EncodeError>
where Self: Sized,

Encodes the message with a length-delimiter to a buffer. Read more
Source§

fn encode_length_delimited_to_vec(&self) -> Vec<u8>
where Self: Sized,

Encodes the message with a length-delimiter to a newly allocated buffer.
Source§

fn decode(buf: impl Buf) -> Result<Self, DecodeError>
where Self: Default,

Decodes an instance of the message from a buffer. Read more
Source§

fn decode_length_delimited(buf: impl Buf) -> Result<Self, DecodeError>
where Self: Default,

Decodes a length-delimited instance of the message from the buffer.
Source§

fn merge(&mut self, buf: impl Buf) -> Result<(), DecodeError>
where Self: Sized,

Decodes an instance of the message from a buffer, and merges it into self. Read more
Source§

fn merge_length_delimited(&mut self, buf: impl Buf) -> Result<(), DecodeError>
where Self: Sized,

Decodes a length-delimited instance of the message from buffer, and merges it into self.
Source§

impl Name for List

Source§

const NAME: &'static str = "List"

Simple name for this Message. This name is the same as it appears in the source .proto file, e.g. FooBar.
Source§

const PACKAGE: &'static str = "lance.encodings"

Package name this message type is contained in. They are domain-like and delimited by ., e.g. google.protobuf.
Source§

fn full_name() -> String

Fully-qualified unique name for this Message. It’s prefixed with the package name and names of any parent messages, e.g. google.rpc.BadRequest.FieldViolation. By default, this is the package name followed by the message name. Fully-qualified names must be unique within a domain of Type URLs.
Source§

fn type_url() -> String

Type URL for this Message, which by default is the full name with a leading slash, but may also include a leading domain name, e.g. type.googleapis.com/google.profile.Person. This can be used when serializing into the google.protobuf.Any type.
Source§

impl PartialEq for List

Source§

fn eq(&self, other: &List) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
Source§

impl StructuralPartialEq for List

Auto Trait Implementations§

§

impl Freeze for List

§

impl RefUnwindSafe for List

§

impl Send for List

§

impl Sync for List

§

impl Unpin for List

§

impl UnwindSafe for List

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dst: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

impl<T> Pointable for T

Source§

const ALIGN: usize

The alignment of pointer.
Source§

type Init = T

The type for initializers.
Source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer. Read more
Source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer. Read more
Source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer. Read more
Source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer. Read more
Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V

Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,

Source§

impl<T> ErasedDestructor for T
where T: 'static,

Source§

impl<T> MaybeSendSync for T