pub struct ReaderProjection {
pub schema: Arc<Schema>,
pub column_indices: Vec<u32>,
}
Selecting columns from a Lance file requires specifying both the index and the data type of each column.
Partly, this is because it is not strictly required that columns be read into the same type. For example, a string column may be read as a string, large_string or string_view type.
A read will only succeed if the decoder for a column is capable of decoding into the requested type.
Note that this should generally be limited to different in-memory representations of the same semantic type. An encoding could theoretically support “casting” (e.g. int to string, etc.) but there is little advantage in doing so here.
Note: in order to specify a projection the user will need some way to figure out the column indices. In the table format we do this using field IDs and keeping track of the field id->column index mapping.
If users are not using the table format then they will need to figure out some way to do this themselves.
Fields§
schema: Arc<Schema>
The data types (schema) of the selected columns. The field names in the schema are arbitrary and ignored.
column_indices: Vec<u32>
The indices of the columns to load.
The mapping should be as follows:
- Primitive: the index of the column in the schema
- List: the index of the list column in the schema followed by the column indices of the children
- FixedSizeList (of primitive): the index of the column in the schema (this case is not nested)
- FixedSizeList (of non-primitive): not yet implemented
- Dictionary: same as primitive
- Struct: the index of the struct column in the schema followed by the column indices of the children
In other words, this should be a DFS listing of the desired schema.
For example, if the goal is to load:
x: int32
y: struct<z: int32, w: string>
z: list
and the schema originally used to store the data was:
a: struct<x: int32>
b: int64
y: struct<z: int32, c: int64, w: string>
z: list
Then the column_indices should be [1, 3, 4, 6, 7, 8]
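The example above can be reproduced with a small sketch. It does not use any Lance types: `StoredField`, `assign_columns`, and `example_column_indices` are hypothetical names introduced here, and the list's child is given the made-up name `item`. The sketch numbers the stored schema depth-first, assuming every struct and list takes one column ahead of its children, and then looks up the stored paths that back the desired fields.

```rust
use std::collections::HashMap;

/// Toy stand-in for a field of the stored schema (not a Lance type).
/// Data types are omitted because only the tree shape affects numbering.
enum StoredField {
    Primitive(&'static str),
    Struct(&'static str, Vec<StoredField>),
    List(&'static str, Box<StoredField>),
}

/// Assigns depth-first column indices, recording each field under its
/// dotted path (for example "y.w").
fn assign_columns(
    field: &StoredField,
    prefix: &str,
    next: &mut u32,
    out: &mut HashMap<String, u32>,
) {
    let name = match field {
        StoredField::Primitive(n)
        | StoredField::Struct(n, _)
        | StoredField::List(n, _) => *n,
    };
    let path = if prefix.is_empty() {
        name.to_string()
    } else {
        format!("{}.{}", prefix, name)
    };
    // The field itself takes the next index, then its children follow.
    out.insert(path.clone(), *next);
    *next += 1;
    match field {
        StoredField::Primitive(_) => {}
        StoredField::Struct(_, children) => {
            for child in children {
                assign_columns(child, &path, next, out);
            }
        }
        StoredField::List(_, item) => assign_columns(item, &path, next, out),
    }
}

/// Builds the stored schema from the example and returns the column
/// indices backing the desired schema, in DFS order.
fn example_column_indices() -> Vec<u32> {
    let stored = vec![
        StoredField::Struct("a", vec![StoredField::Primitive("x")]),
        StoredField::Primitive("b"),
        StoredField::Struct(
            "y",
            vec![
                StoredField::Primitive("z"),
                StoredField::Primitive("c"),
                StoredField::Primitive("w"),
            ],
        ),
        // "item" is a hypothetical name for the list's child column.
        StoredField::List("z", Box::new(StoredField::Primitive("item"))),
    ];
    let mut next = 0;
    let mut by_path = HashMap::new();
    for field in &stored {
        assign_columns(field, "", &mut next, &mut by_path);
    }
    // Desired x is backed by stored a.x; names in the projection are ignored.
    ["a.x", "y", "y.z", "y.w", "z", "z.item"]
        .iter()
        .map(|path| by_path[*path])
        .collect()
}
```

Under these assumptions the numbering is `a`=0, `a.x`=1, `b`=2, `y`=3, `y.z`=4, `y.c`=5, `y.w`=6, `z`=7, and the list's child=8, so `example_column_indices()` returns `[1, 3, 4, 6, 7, 8]`.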
Implementations§
impl ReaderProjection
pub fn from_field_ids(reader: &FileReader, schema: &Schema, field_id_to_column_index: &BTreeMap<u32, u32>) -> Result<Self>
Creates a projection using a mapping from field IDs to column indices.
You can obtain such a mapping when the file is written using the crate::v2::writer::FileWriter::field_id_to_column_indices method.
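As a rough illustration of what such a mapping is for, the sketch below resolves a list of projected field IDs to column indices through a `BTreeMap<u32, u32>`. The helper `resolve_columns` is hypothetical and is not part of Lance; it only shows the lookup and the failure mode when a field ID has no column in the file.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not a Lance API): looks up each projected field ID
/// in the field id -> column index mapping, failing on the first missing ID.
fn resolve_columns(
    field_ids: &[u32],
    field_id_to_column_index: &BTreeMap<u32, u32>,
) -> Result<Vec<u32>, String> {
    field_ids
        .iter()
        .map(|id| {
            field_id_to_column_index
                .get(id)
                .copied()
                .ok_or_else(|| format!("field id {} has no column in this file", id))
        })
        .collect()
}
```

With a mapping of `{10: 0, 11: 1, 12: 2}`, resolving field IDs `[12, 10]` gives column indices `[2, 0]`, while an unknown ID produces an error instead of a silently wrong projection.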
pub fn from_whole_schema(schema: &Schema, version: LanceFileVersion) -> Self
Creates a projection that reads the entire file.
If the schema provided is not the schema of the entire file then the projection will be invalid and the read will fail.
If a field is a struct datatype with packed set to true in the field metadata, the whole struct has one column index. To support nested packed-struct encoding, this method needs to be further adjusted.
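The effect of the packed flag on column numbering can be sketched with a toy model. `ToyField` and `column_count` are hypothetical names, not Lance types; the sketch follows the mapping described for `column_indices` (a struct takes one column followed by its children) and ignores the `version` parameter, which this page does not describe.

```rust
/// Toy field model (not a Lance type); only the shape matters here.
enum ToyField {
    Primitive,
    Struct { packed: bool, children: Vec<ToyField> },
}

/// Number of column indices a field occupies in a whole-schema projection,
/// assuming a packed struct is stored as a single column.
fn column_count(field: &ToyField) -> u32 {
    match field {
        ToyField::Primitive => 1,
        // A packed struct collapses to one column regardless of its children.
        ToyField::Struct { packed: true, .. } => 1,
        // An unpacked struct takes one column plus those of its children.
        ToyField::Struct { children, .. } => {
            1 + children.iter().map(column_count).sum::<u32>()
        }
    }
}
```

In this model an unpacked struct with two primitive children occupies three column indices, and the same struct with packed set to true occupies one.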
pub fn from_column_names(schema: &Schema, column_names: &[&str]) -> Result<Self>
Creates a projection that reads the specified columns provided by name.
The syntax for column names is the same as lance_core::datatypes::Schema::project.
If the schema provided is not the schema of the entire file then the projection will be invalid and the read will fail.
Trait Implementations§
impl Clone for ReaderProjection
fn clone(&self) -> ReaderProjection
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations§
impl Freeze for ReaderProjection
impl !RefUnwindSafe for ReaderProjection
impl Send for ReaderProjection
impl Sync for ReaderProjection
impl Unpin for ReaderProjection
impl !UnwindSafe for ReaderProjection
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.