pub trait FieldScheduler:
Send
+ Sync
+ Debug {
// Required methods
fn initialize<'a>(
&'a self,
filter: &'a FilterExpression,
context: &'a SchedulerContext,
) -> BoxFuture<'a, Result<()>>;
fn schedule_ranges<'a>(
&'a self,
ranges: &[Range<u64>],
filter: &FilterExpression,
) -> Result<Box<dyn SchedulingJob + 'a>>;
fn num_rows(&self) -> u64;
}
Expand description
A scheduler for a field’s worth of data
Each field in a reader’s output schema maps to one field scheduler. This scheduler may map to more than one column. For example, one field of struct data may cover many columns of child data. In fact, the entire file is treated as one top-level struct field.
The scheduler is responsible for calculating the necessary I/O. One schedule_range request could trigger multiple batches of I/O across multiple columns. The scheduler should emit decoders into the sink as quickly as possible.
As soon as the scheduler encounters a batch of data that can decoded then the scheduler should emit a decoder in the “unloaded” state. The decode stream will pull the decoder and start decoding.
The order in which decoders are emitted is important. Pages should be emitted in row-major order allowing decode of complete rows as quickly as possible.
The FieldScheduler
should be stateless and Send
and Sync
. This is
because it might need to be shared. For example, a list page has a reference to
the field schedulers for its items column. This is shared with the follow-up I/O
task created when the offsets are loaded.
See crate::decoder
for more information
Required Methods§
Sourcefn initialize<'a>(
&'a self,
filter: &'a FilterExpression,
context: &'a SchedulerContext,
) -> BoxFuture<'a, Result<()>>
fn initialize<'a>( &'a self, filter: &'a FilterExpression, context: &'a SchedulerContext, ) -> BoxFuture<'a, Result<()>>
Called at the beginning of scheduling to initialize the scheduler
Sourcefn schedule_ranges<'a>(
&'a self,
ranges: &[Range<u64>],
filter: &FilterExpression,
) -> Result<Box<dyn SchedulingJob + 'a>>
fn schedule_ranges<'a>( &'a self, ranges: &[Range<u64>], filter: &FilterExpression, ) -> Result<Box<dyn SchedulingJob + 'a>>
Schedules I/O for the requested portions of the field.
Note: ranges
must be ordered and non-overlapping
TODO: Support unordered or overlapping ranges in file scheduler