datafusion_physical_plan::execution_plan

Trait ExecutionPlan

pub trait ExecutionPlan:
    Debug
    + DisplayAs
    + Send
    + Sync {
    // Required methods
    fn name(&self) -> &str;
    fn as_any(&self) -> &dyn Any;
    fn properties(&self) -> &PlanProperties;
    fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>;
    fn with_new_children(
        self: Arc<Self>,
        children: Vec<Arc<dyn ExecutionPlan>>,
    ) -> Result<Arc<dyn ExecutionPlan>>;
    fn execute(
        &self,
        partition: usize,
        context: Arc<TaskContext>,
    ) -> Result<SendableRecordBatchStream>;

    // Provided methods
    fn static_name() -> &'static str
       where Self: Sized { ... }
    fn schema(&self) -> SchemaRef { ... }
    fn required_input_distribution(&self) -> Vec<Distribution> { ... }
    fn required_input_ordering(&self) -> Vec<Option<LexRequirement>> { ... }
    fn maintains_input_order(&self) -> Vec<bool> { ... }
    fn benefits_from_input_partitioning(&self) -> Vec<bool> { ... }
    fn repartitioned(
        &self,
        _target_partitions: usize,
        _config: &ConfigOptions,
    ) -> Result<Option<Arc<dyn ExecutionPlan>>> { ... }
    fn metrics(&self) -> Option<MetricsSet> { ... }
    fn statistics(&self) -> Result<Statistics> { ... }
    fn supports_limit_pushdown(&self) -> bool { ... }
    fn with_fetch(
        &self,
        _limit: Option<usize>,
    ) -> Option<Arc<dyn ExecutionPlan>> { ... }
    fn fetch(&self) -> Option<usize> { ... }
    fn cardinality_effect(&self) -> CardinalityEffect { ... }
}

Represent nodes in the DataFusion Physical Plan.

Calling execute produces an async SendableRecordBatchStream of RecordBatch that incrementally computes a partition of the ExecutionPlan’s output from its input. See Partitioning for more details on partitioning.

Methods such as schema and properties communicate properties of the output to the DataFusion optimizer, and methods such as required_input_distribution and required_input_ordering express the requirements this ExecutionPlan places on its input.

ExecutionPlan can be displayed in a simplified form using the return value from displayable in addition to the (normally quite verbose) Debug output.

Required Methods§

fn name(&self) -> &str

Short name for the ExecutionPlan, such as ‘ParquetExec’.

Implementation note: this method can simply proxy to static_name if no special action is needed. No default implementation of that kind is provided because this method does not require the Sized constraint, which allows a wider range of use cases.

fn as_any(&self) -> &dyn Any

Returns the execution plan as Any so that it can be downcast to a specific implementation.
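
The downcasting pattern can be illustrated with a minimal, std-only sketch. The names Plan, ScanNode, FilterNode, and scan_table_name are hypothetical stand-ins used only for illustration; they are not DataFusion types.

```rust
use std::any::Any;

// Hypothetical stand-in for a trait that exposes `as_any`, as `ExecutionPlan` does.
trait Plan {
    fn as_any(&self) -> &dyn Any;
}

// Hypothetical leaf node that carries a table name.
struct ScanNode {
    table: String,
}

// Hypothetical node with no table name.
struct FilterNode;

impl Plan for ScanNode {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Plan for FilterNode {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Returns the table name if `plan` is a `ScanNode`, otherwise `None`.
fn scan_table_name(plan: &dyn Plan) -> Option<&str> {
    plan.as_any()
        .downcast_ref::<ScanNode>()
        .map(|scan| scan.table.as_str())
}
```

Code that only holds a trait object can use this pattern to recognise one concrete node type and fall back to generic handling when downcast_ref returns None.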

fn properties(&self) -> &PlanProperties

Return properties of the output of the ExecutionPlan, such as output ordering(s) and partitioning information.

This information is available via methods on the ExecutionPlanProperties trait, which is implemented for all ExecutionPlans.

fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>

Get a list of children ExecutionPlans that act as inputs to this plan. The returned list will be empty for leaf nodes such as scans, will contain a single value for unary nodes, or two values for binary nodes (such as joins).

fn with_new_children( self: Arc<Self>, children: Vec<Arc<dyn ExecutionPlan>>, ) -> Result<Arc<dyn ExecutionPlan>>

Returns a new ExecutionPlan in which all existing children are replaced by children, in order.
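
The relationship between children and with_new_children can be sketched with a simplified, std-only model. Node, Leaf, and Unary are hypothetical stand-ins, and the String error type is an assumption made to keep the sketch self-contained; the real trait uses DataFusion's Result.

```rust
use std::sync::Arc;

// Hypothetical stand-in that models only the tree-related methods.
trait Node {
    fn name(&self) -> &str;
    fn children(&self) -> Vec<&Arc<dyn Node>>;
    fn with_new_children(
        self: Arc<Self>,
        children: Vec<Arc<dyn Node>>,
    ) -> Result<Arc<dyn Node>, String>;
}

// Leaf node: no inputs.
struct Leaf;

// Unary node: exactly one input.
struct Unary {
    input: Arc<dyn Node>,
}

impl Node for Leaf {
    fn name(&self) -> &str {
        "Leaf"
    }
    fn children(&self) -> Vec<&Arc<dyn Node>> {
        vec![]
    }
    fn with_new_children(
        self: Arc<Self>,
        children: Vec<Arc<dyn Node>>,
    ) -> Result<Arc<dyn Node>, String> {
        // A leaf has nothing to replace, so it can return itself unchanged.
        if children.is_empty() {
            Ok(self)
        } else {
            Err(format!("Leaf expects 0 children, got {}", children.len()))
        }
    }
}

impl Node for Unary {
    fn name(&self) -> &str {
        "Unary"
    }
    fn children(&self) -> Vec<&Arc<dyn Node>> {
        vec![&self.input]
    }
    fn with_new_children(
        self: Arc<Self>,
        mut children: Vec<Arc<dyn Node>>,
    ) -> Result<Arc<dyn Node>, String> {
        if children.len() != 1 {
            return Err(format!("Unary expects 1 child, got {}", children.len()));
        }
        // Build a fresh node around the new input; the old node is left untouched.
        Ok(Arc::new(Unary { input: children.swap_remove(0) }))
    }
}
```

In this model the number and order of the new children match what children reports, which is the contract the method description above implies.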

fn execute( &self, partition: usize, context: Arc<TaskContext>, ) -> Result<SendableRecordBatchStream>

Begin execution of partition, returning a Stream of RecordBatches.

§Notes

The execute method itself is not async but it returns an async futures::stream::Stream. This Stream should incrementally compute the output, RecordBatch by RecordBatch (in a streaming fashion). Most ExecutionPlans should not do any work before the first RecordBatch is requested from the stream.

RecordBatchStreamAdapter can be used to convert an async Stream into a SendableRecordBatchStream.

Using async Streams allows for network I/O during execution and takes advantage of Rust’s built-in support for async continuations and its crate ecosystem.

§Error handling

Any error that occurs during execution is sent as an Err in the output stream.

ExecutionPlan implementations in DataFusion cancel additional work immediately once an error occurs. The rationale is that if the overall query will return an error, any additional work such as continued polling of inputs will be wasted as it will be thrown away.
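
The stop-at-first-error behaviour can be modelled with a synchronous, std-only sketch. It uses an Iterator of Result values as a stand-in for an async Stream, and collect_until_error is a hypothetical helper, not a DataFusion function.

```rust
// Drains `input`, returning every item or the first error encountered.
// Nothing further is pulled from `input` once an `Err` has been seen.
fn collect_until_error<I>(input: I) -> Result<Vec<i32>, String>
where
    I: Iterator<Item = Result<i32, String>>,
{
    let mut out = Vec::new();
    for item in input {
        // `?` returns early, so the remaining input is never polled.
        out.push(item?);
    }
    Ok(out)
}
```

The early return mirrors the rationale above: once the overall result is known to be an error, continuing to poll the input would only produce work that is discarded.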

§Cancellation / Aborting Execution

The Stream that is returned must ensure that any allocated resources are freed when the stream itself is dropped. This is particularly important for spawned tasks or threads. Unless care is taken to “abort” such tasks, they may continue to consume resources even after the plan is dropped, generating intermediate results that are never used. For this reason, calling spawn directly is disallowed; use SpawnedTask instead.

For more details see SpawnedTask, JoinSet and RecordBatchReceiverStreamBuilder for structures to help ensure all background tasks are cancelled.
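
The idea of tying background work to the lifetime of a value can be sketched with the standard library alone. CancelOnDrop is a hypothetical type that uses an OS thread and a cooperative stop flag; it only illustrates the cancel-on-drop principle and is not how SpawnedTask is implemented.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

// Hypothetical guard that owns a background worker and stops it on drop.
struct CancelOnDrop {
    cancelled: Arc<AtomicBool>,
    handle: Option<JoinHandle<()>>,
}

impl CancelOnDrop {
    // Spawns a worker that loops until the shared flag is set.
    fn spawn() -> Self {
        let cancelled = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&cancelled);
        let handle = thread::spawn(move || {
            while !flag.load(Ordering::Acquire) {
                // Placeholder for producing intermediate results.
                thread::yield_now();
            }
        });
        Self { cancelled, handle: Some(handle) }
    }

    // Exposes the flag so callers can observe whether cancellation was requested.
    fn cancel_flag(&self) -> Arc<AtomicBool> {
        Arc::clone(&self.cancelled)
    }
}

impl Drop for CancelOnDrop {
    fn drop(&mut self) {
        // Ask the worker to stop, then wait so that no work outlives the guard.
        self.cancelled.store(true, Ordering::Release);
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}
```

A stream that holds such a guard as a field stops its worker automatically when the stream is dropped, without the consumer having to call anything.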

§Implementation Examples

While async Streams have a non-trivial learning curve, the futures crate provides StreamExt and TryStreamExt, which simplify many common operations.

Here are some common patterns:

§Return Precomputed `RecordBatch`

We can return a precomputed RecordBatch as a Stream:

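use std::sync::Arc;
use arrow_array::RecordBatch;
use datafusion_common::Result;
use datafusion_execution::{SendableRecordBatchStream, TaskContext};
use datafusion_physical_plan::stream::RecordBatchStreamAdapter;

struct MyPlan {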
    batch: RecordBatch,
}

impl MyPlan {
    fn execute(
        &self,
        partition: usize,
        context: Arc<TaskContext>
    ) -> Result<SendableRecordBatchStream> {
        // use functions from the futures crate to convert the batch into a stream
        let fut = futures::future::ready(Ok(self.batch.clone()));
        let stream = futures::stream::once(fut);
        Ok(Box::pin(RecordBatchStreamAdapter::new(self.batch.schema(), stream)))
    }
}

§Lazily (async) Compute `RecordBatch`

We can also lazily compute a RecordBatch when the returned Stream is polled:

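use std::sync::Arc;
use arrow_array::RecordBatch;
use arrow_schema::SchemaRef;
use datafusion_common::Result;
use datafusion_execution::{SendableRecordBatchStream, TaskContext};
use datafusion_physical_plan::stream::RecordBatchStreamAdapter;

struct MyPlan {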
    schema: SchemaRef,
}

/// Returns a single batch when the returned stream is polled
async fn get_batch() -> Result<RecordBatch> {
    todo!()
}

impl MyPlan {
    fn execute(
        &self,
        partition: usize,
        context: Arc<TaskContext>
    ) -> Result<SendableRecordBatchStream> {
        let fut = get_batch();
        let stream = futures::stream::once(fut);
        Ok(Box::pin(RecordBatchStreamAdapter::new(self.schema.clone(), stream)))
    }
}

§Lazily (async) create a Stream

If you need to create the returned Stream using an async function, you can do so by flattening the result:

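use std::sync::Arc;
use arrow_schema::SchemaRef;
use datafusion_common::Result;
use datafusion_execution::{SendableRecordBatchStream, TaskContext};
use datafusion_physical_plan::stream::RecordBatchStreamAdapter;
use futures::TryStreamExt;

struct MyPlan {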
    schema: SchemaRef,
}

/// async function that returns a stream
async fn get_batch_stream() -> Result<SendableRecordBatchStream> {
    todo!()
}

impl MyPlan {
    fn execute(
        &self,
        partition: usize,
        context: Arc<TaskContext>
    ) -> Result<SendableRecordBatchStream> {
        // A future that yields a stream
        let fut = get_batch_stream();
        // Use TryStreamExt::try_flatten to flatten the stream of streams
        let stream = futures::stream::once(fut).try_flatten();
        Ok(Box::pin(RecordBatchStreamAdapter::new(self.schema.clone(), stream)))
    }
}

Provided Methods§

fn static_name() -> &'static str
where Self: Sized,

Short name for the ExecutionPlan, such as ‘ParquetExec’. Like name but can be called without an instance.

fn schema(&self) -> SchemaRef

Get the schema for this execution plan.

fn required_input_distribution(&self) -> Vec<Distribution>

Specifies the data distribution requirements for all the children of this ExecutionPlan. By default, it is Distribution::UnspecifiedDistribution for each child.

fn required_input_ordering(&self) -> Vec<Option<LexRequirement>>

Specifies the ordering required for all of the children of this ExecutionPlan.

For each child, it’s the local ordering requirement within each partition rather than the global ordering.

NOTE that checking !is_empty() does not check for a required input ordering. Instead, the correct check is that at least one entry must be Some.
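
The check described in the note can be sketched as follows. The helper name has_required_input_ordering is hypothetical, and the type parameter R stands in for LexRequirement so that the sketch needs only the standard library.

```rust
// Returns true only if at least one child has an ordering requirement.
// `R` is a placeholder for the real requirement type.
fn has_required_input_ordering<R>(required: &[Option<R>]) -> bool {
    required.iter().any(Option::is_some)
}
```

A node with two children and no ordering requirement returns a vector of two None entries, which is non-empty; that is why !is_empty() gives the wrong answer.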

fn maintains_input_order(&self) -> Vec<bool>

Returns, for each child, false if this ExecutionPlan’s implementation may reorder rows within or between partitions.

For example, Projection, Filter, and Limit maintain the order of inputs: they may transform values (Projection) or not produce the same number of rows that went in (Filter and Limit), but the rows that are produced come out in the same order in which they went in.

DataFusion uses this metadata to apply certain optimizations such as automatically repartitioning correctly.

The default implementation returns false for each child.

WARNING: if you override this default, you MUST ensure that the ExecutionPlan maintains the ordering invariant, or else DataFusion may produce incorrect results.

fn benefits_from_input_partitioning(&self) -> Vec<bool>

Specifies whether the ExecutionPlan benefits from increased parallelization at its input for each child.

If it returns true for a child, the ExecutionPlan would benefit from partitioning that child (and thus from more parallelism). For ExecutionPlans that do very little work, the overhead of extra parallelism may outweigh any benefits.

The default implementation returns true unless this ExecutionPlan has signalled it requires a single child input partition.
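
The default rule described above can be sketched with a simplified, std-only model. The Distribution enum below is a reduced stand-in with only two variants, and default_benefits_from_input_partitioning is a hypothetical free function that takes the per-child requirements as an argument instead of reading them from the plan.

```rust
// Reduced stand-in for the real `Distribution` enum.
#[derive(Debug, Clone, PartialEq)]
enum Distribution {
    UnspecifiedDistribution,
    SinglePartition,
}

// One flag per child: true unless that child must arrive as a single partition.
fn default_benefits_from_input_partitioning(required: &[Distribution]) -> Vec<bool> {
    required
        .iter()
        .map(|dist| !matches!(dist, Distribution::SinglePartition))
        .collect()
}
```

Deriving the answer from the distribution requirement keeps the two methods consistent: a node that demands a single input partition cannot also claim to benefit from more input partitions.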

fn repartitioned( &self, _target_partitions: usize, _config: &ConfigOptions, ) -> Result<Option<Arc<dyn ExecutionPlan>>>

If supported, attempt to increase the partitioning of this ExecutionPlan to produce target_partitions partitions.

If the ExecutionPlan does not support changing its partitioning, returns Ok(None) (the default).

If the ExecutionPlan can increase its partitioning, but not to target_partitions, it may return an ExecutionPlan with fewer partitions. This might happen, for example, if each new partition would be too small to be efficiently processed individually.

The DataFusion optimizer attempts to use as many threads as possible by repartitioning its inputs to match the target number of threads available (target_partitions). Some data sources, such as the built-in CSV and Parquet readers, implement this method because they can read from their input files in parallel, regardless of how the source data is split amongst files.
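
One way a data source might decide how far to repartition is sketched below. This is an illustrative policy only, not DataFusion's actual logic: choose_partition_count, total_rows, and min_rows_per_partition are all hypothetical names introduced for the example.

```rust
// Returns the new partition count, or `None` if repartitioning would not
// increase parallelism (the analogue of returning `Ok(None)`).
fn choose_partition_count(
    current: usize,
    target_partitions: usize,
    total_rows: usize,
    min_rows_per_partition: usize,
) -> Option<usize> {
    // Largest count that still leaves every partition with enough rows.
    let max_useful = (total_rows / min_rows_per_partition.max(1)).max(1);
    // Never exceed the target, but settle for fewer if partitions would be too small.
    let chosen = target_partitions.min(max_useful);
    (chosen > current).then_some(chosen)
}
```

With 3,000 rows and a minimum of 1,000 rows per partition, a request for 8 partitions yields 3, which corresponds to the "fewer partitions than target_partitions" case described above.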

fn metrics(&self) -> Option<MetricsSet>

Return a snapshot of the set of Metrics for this ExecutionPlan. If no Metrics are available, return None.

While the values of the metrics in the returned MetricsSets may change as execution progresses, the specific metrics will not.

Once self.execute() has returned (technically, once the future has resolved) for all available partitions, the set of metrics should be complete. If this function is called prior to execute(), new metrics may appear in subsequent calls.

fn statistics(&self) -> Result<Statistics>

Returns statistics for this ExecutionPlan node. If statistics are not available, should return Statistics::new_unknown (the default), not an error.

For TableScan executors, which support filter pushdown, special attention must be paid to whether the statistics returned by this method are exact.

fn supports_limit_pushdown(&self) -> bool

Returns true if a limit can be safely pushed down through this ExecutionPlan node.

If this method returns true, and the query plan contains a limit at the output of this node, DataFusion will push the limit to the input of this node.

fn with_fetch(&self, _limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>

Returns a fetching variant of this ExecutionPlan node, if it supports fetch limits. Returns None otherwise.

fn fetch(&self) -> Option<usize>

Gets the fetch count for the operator; None means there is no fetch.
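
The pairing of with_fetch and fetch can be sketched with a simplified, std-only model. SortNode and its sort_key field are hypothetical, and the method returns Arc<SortNode> rather than a trait object to keep the example short.

```rust
use std::sync::Arc;

// Hypothetical operator that can stop after producing `fetch` rows.
#[derive(Debug, Clone, PartialEq)]
struct SortNode {
    sort_key: String,
    fetch: Option<usize>,
}

impl SortNode {
    // Returns a copy of this node with the given limit; the original is unchanged.
    // A node without fetch support would return `None` here instead.
    fn with_fetch(&self, limit: Option<usize>) -> Option<Arc<SortNode>> {
        Some(Arc::new(SortNode {
            sort_key: self.sort_key.clone(),
            fetch: limit,
        }))
    }

    // Reports the current limit, if any.
    fn fetch(&self) -> Option<usize> {
        self.fetch
    }
}
```

Because with_fetch takes &self and returns a new node, a plan can be rewritten with a limit without mutating nodes that other plans may still share.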

fn cardinality_effect(&self) -> CardinalityEffect

Gets the effect on cardinality, if known.

Trait Implementations§

impl DynTreeNode for dyn ExecutionPlan

fn arc_children(&self) -> Vec<&Arc<Self>>

fn with_new_arc_children( &self, arc_self: Arc<Self>, new_children: Vec<Arc<Self>>, ) -> Result<Arc<Self>>

impl ExecutionPlanProperties for &dyn ExecutionPlan

fn output_partitioning(&self) -> &Partitioning

fn execution_mode(&self) -> ExecutionMode

fn output_ordering(&self) -> Option<LexOrderingRef<'_>>

fn equivalence_properties(&self) -> &EquivalenceProperties

Implementors§

impl ExecutionPlan for AggregateExec

impl ExecutionPlan for AnalyzeExec

impl ExecutionPlan for CoalesceBatchesExec

impl ExecutionPlan for CoalescePartitionsExec

impl ExecutionPlan for EmptyExec

impl ExecutionPlan for ExplainExec

impl ExecutionPlan for FilterExec

impl ExecutionPlan for DataSinkExec

impl ExecutionPlan for CrossJoinExec

impl ExecutionPlan for HashJoinExec

impl ExecutionPlan for NestedLoopJoinExec

impl ExecutionPlan for SortMergeJoinExec

impl ExecutionPlan for SymmetricHashJoinExec

impl ExecutionPlan for GlobalLimitExec

impl ExecutionPlan for LocalLimitExec

impl ExecutionPlan for MemoryExec

impl ExecutionPlan for PlaceholderRowExec

impl ExecutionPlan for ProjectionExec

impl ExecutionPlan for RecursiveQueryExec

impl ExecutionPlan for RepartitionExec

impl ExecutionPlan for PartialSortExec

impl ExecutionPlan for SortExec

impl ExecutionPlan for SortPreservingMergeExec

impl ExecutionPlan for StreamingTableExec

impl ExecutionPlan for InterleaveExec

impl ExecutionPlan for UnionExec

impl ExecutionPlan for UnnestExec

impl ExecutionPlan for ValuesExec

impl ExecutionPlan for BoundedWindowAggExec

impl ExecutionPlan for WindowAggExec

impl ExecutionPlan for WorkTableExec
