lance_datafusion::utils

Trait StreamingWriteSource

pub trait StreamingWriteSource: Send {
    // Required methods
    fn arrow_schema(&self) -> SchemaRef;
    fn into_stream(self) -> SendableRecordBatchStream;

    // Provided method
    fn into_stream_and_schema<'async_trait>(
        self,
    ) -> Pin<Box<dyn Future<Output = Result<(SendableRecordBatchStream, Schema)>> + Send + 'async_trait>>
       where Self: Sized + 'async_trait { ... }
}

A trait for [RecordBatch] iterators, readers, and streams that can be converted to the concrete stream type [SendableRecordBatchStream].

It can also read the schema from the first batch and then update the schema to reflect the dictionary columns.
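
For illustration only, a minimal sketch of using the trait with a RecordBatchIterator (assuming the arrow_array, arrow_schema, and lance_datafusion crates as dependencies, and a runtime in which the background conversion can run):

use std::sync::Arc;

use arrow_array::{ArrayRef, Int32Array, RecordBatch, RecordBatchIterator};
use arrow_schema::{DataType, Field, Schema as ArrowSchema};
use lance_datafusion::utils::StreamingWriteSource;

fn example() {
    let schema = Arc::new(ArrowSchema::new(vec![Field::new("id", DataType::Int32, false)]));
    let batch = RecordBatch::try_new(
        schema.clone(),
        vec![Arc::new(Int32Array::from(vec![1, 2, 3])) as ArrayRef],
    )
    .unwrap();
    let reader = RecordBatchIterator::new(vec![Ok(batch)], schema.clone());

    // The Arrow schema is available without consuming the source.
    assert_eq!(reader.arrow_schema(), schema);

    // `into_stream` consumes the source and yields a SendableRecordBatchStream;
    // per the docs, the conversion happens on a background thread.
    let _stream = reader.into_stream();
}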

Required Methods§

fn arrow_schema(&self) -> SchemaRef

Returns the Arrow schema.

fn into_stream(self) -> SendableRecordBatchStream

Convert to a stream.

The conversion will be conducted in a background thread.

Provided Methods§

fn into_stream_and_schema<'async_trait>(
    self,
) -> Pin<Box<dyn Future<Output = Result<(SendableRecordBatchStream, Schema)>> + Send + 'async_trait>>
where
    Self: Sized + 'async_trait,

Infer the Lance schema from the first batch of the stream.

This will peek the first batch to get the dictionaries for dictionary columns.

NOTE: this does not validate the schema. For appends, for example, the schema should be checked to make sure it matches the existing dataset schema before writing.
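
As a hedged sketch (the peek_schema helper and its error handling are illustrative, assuming the futures crate and a Tokio runtime), into_stream_and_schema can be awaited to obtain both the stream and the inferred Lance schema:

use futures::StreamExt;
use lance_datafusion::utils::StreamingWriteSource;

async fn peek_schema(source: impl StreamingWriteSource + 'static) {
    // Peeks at the first batch so that dictionary columns in the inferred
    // Lance schema carry their dictionaries.
    let (mut stream, lance_schema) = source
        .into_stream_and_schema()
        .await
        .expect("failed to infer schema");

    // Per the NOTE above, the schema is not validated here; for an append it
    // should be checked against the existing dataset schema before writing.
    println!("inferred {} top-level fields", lance_schema.fields.len());

    // The stream can then be drained and written out.
    while let Some(batch) = stream.next().await {
        let _batch = batch.expect("error while reading batch");
    }
}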

Implementations on Foreign Types§

impl StreamingWriteSource for Box<dyn RecordBatchReader + Send>

impl StreamingWriteSource for ArrowArrayStreamReader

impl StreamingWriteSource for SendableRecordBatchStream

impl<I> StreamingWriteSource for RecordBatchIterator<I>
where Self: Send, I: IntoIterator<Item = Result<RecordBatch, ArrowError>> + Send + 'static,

impl<T> StreamingWriteSource for Box<T>
where T: StreamingWriteSource,
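
For illustration (the write_source function below is hypothetical, not part of this crate), these impls mean any of the listed source types can be passed wherever a StreamingWriteSource is accepted:

use arrow_array::RecordBatchReader;
use lance_datafusion::utils::StreamingWriteSource;

// Hypothetical generic sink that accepts any implementing type listed above.
fn write_source(source: impl StreamingWriteSource) {
    // Read the Arrow schema first, since `into_stream` consumes the source.
    let _arrow_schema = source.arrow_schema();
    let _stream = source.into_stream();
}

fn from_boxed_reader(reader: Box<dyn RecordBatchReader + Send>) {
    // Box<dyn RecordBatchReader + Send> implements StreamingWriteSource directly.
    write_source(reader);
}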

Implementors§