Trait ExecutionPlan

pub trait ExecutionPlan: Debug + DisplayAs + Send + Sync {
    // Required methods
    fn name(&self) -> &str;
    fn as_any(&self) -> &dyn Any;
    fn properties(&self) -> &PlanProperties;
    fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>;
    fn with_new_children(
        self: Arc<Self>,
        children: Vec<Arc<dyn ExecutionPlan>>,
    ) -> Result<Arc<dyn ExecutionPlan>>;
    fn execute(
        &self,
        partition: usize,
        context: Arc<TaskContext>,
    ) -> Result<SendableRecordBatchStream>;

    // Provided methods
    fn static_name() -> &'static str
       where Self: Sized { ... }
    fn schema(&self) -> SchemaRef { ... }
    fn check_invariants(&self, _check: InvariantLevel) -> Result<()> { ... }
    fn required_input_distribution(&self) -> Vec<Distribution> { ... }
    fn required_input_ordering(&self) -> Vec<Option<LexRequirement>> { ... }
    fn maintains_input_order(&self) -> Vec<bool> { ... }
    fn benefits_from_input_partitioning(&self) -> Vec<bool> { ... }
    fn repartitioned(
        &self,
        _target_partitions: usize,
        _config: &ConfigOptions,
    ) -> Result<Option<Arc<dyn ExecutionPlan>>> { ... }
    fn metrics(&self) -> Option<MetricsSet> { ... }
    fn statistics(&self) -> Result<Statistics> { ... }
    fn supports_limit_pushdown(&self) -> bool { ... }
    fn with_fetch(
        &self,
        _limit: Option<usize>,
    ) -> Option<Arc<dyn ExecutionPlan>> { ... }
    fn fetch(&self) -> Option<usize> { ... }
    fn cardinality_effect(&self) -> CardinalityEffect { ... }
    fn try_swapping_with_projection(
        &self,
        _projection: &ProjectionExec,
    ) -> Result<Option<Arc<dyn ExecutionPlan>>> { ... }
}

Represents a node in the DataFusion physical plan.

Calling execute produces an async SendableRecordBatchStream of RecordBatch that incrementally computes a partition of the ExecutionPlan’s output from its input. See Partitioning for more details on partitioning.

Methods such as schema and properties communicate properties of the output to the DataFusion optimizer, and methods such as required_input_distribution and required_input_ordering express what the ExecutionPlan requires of its input.

ExecutionPlan can be displayed in a simplified form using the return value from displayable in addition to the (normally quite verbose) Debug output.

Required Methods§


fn name(&self) -> &str

Short name for the ExecutionPlan, such as ‘ParquetExec’.

Implementation note: this method can simply proxy to static_name if no special handling is needed. It has no default implementation that does so because such a default would need the Sized constraint, and this method deliberately avoids that constraint to allow a wider range of use cases (for example, calling it through dyn ExecutionPlan).
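The proxy pattern from the implementation note can be sketched with a simplified stand-in trait. NamedPlan and MyScanExec are hypothetical names, and the static_name body here is an illustrative assumption rather than DataFusion's actual default; the sketch only mirrors the split between an object-safe name and a Sized-only static_name.

```rust
/// Hypothetical, simplified stand-in for ExecutionPlan (not the real trait).
trait NamedPlan {
    /// Required: has no `Sized` bound, so it works through `dyn NamedPlan`.
    fn name(&self) -> &str;

    /// Provided: needs `Self: Sized`, so it cannot be called on a trait object.
    fn static_name() -> &'static str
    where
        Self: Sized,
    {
        // Keep only the last path segment of the type name.
        let full = std::any::type_name::<Self>();
        full.rsplit("::").next().unwrap_or(full)
    }
}

/// Hypothetical plan node used only for this illustration.
struct MyScanExec;

impl NamedPlan for MyScanExec {
    fn name(&self) -> &str {
        // No special handling is needed, so proxy to `static_name`.
        Self::static_name()
    }
}
```

With this shape, name is available on a Box<dyn NamedPlan>, while static_name can be called as MyScanExec::static_name() without constructing a value.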


fn as_any(&self) -> &dyn Any

Returns the execution plan as Any so that it can be downcast to a specific implementation.
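A minimal sketch of the downcasting pattern, using only std::any::Any. AnyPlan, FilterLike, ScanLike, and filter_threshold are hypothetical names for illustration; they are not DataFusion types.

```rust
use std::any::Any;

/// Hypothetical, simplified stand-in for ExecutionPlan (not the real trait).
trait AnyPlan {
    fn as_any(&self) -> &dyn Any;
}

/// Hypothetical node that carries some node-specific state.
struct FilterLike {
    threshold: i64,
}

impl AnyPlan for FilterLike {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Hypothetical node of a different concrete type.
struct ScanLike;

impl AnyPlan for ScanLike {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Returns the threshold if `plan` is a `FilterLike`, otherwise `None`.
fn filter_threshold(plan: &dyn AnyPlan) -> Option<i64> {
    plan.as_any()
        .downcast_ref::<FilterLike>()
        .map(|filter| filter.threshold)
}
```

The caller only holds a trait object; downcast_ref returns None when the concrete type does not match, so the check is safe to apply to any node.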


fn properties(&self) -> &PlanProperties

Return properties of the output of the ExecutionPlan, such as output ordering(s), partitioning information etc.

This information is available via methods on ExecutionPlanProperties trait, which is implemented for all ExecutionPlans.


fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>

Get a list of children ExecutionPlans that act as inputs to this plan. The returned list will be empty for leaf nodes such as scans, will contain a single value for unary nodes, or two values for binary nodes (such as joins).


fn with_new_children( self: Arc<Self>, children: Vec<Arc<dyn ExecutionPlan>>, ) -> Result<Arc<dyn ExecutionPlan>>

Returns a new ExecutionPlan in which all existing children are replaced by children, in order.
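The relationship between children and with_new_children can be sketched on a toy tree. Node is a hypothetical struct, not the real trait, and the arity check is an assumption made for this example rather than documented behavior.

```rust
use std::sync::Arc;

/// Hypothetical, simplified plan node (not the real ExecutionPlan trait).
struct Node {
    label: String,
    children: Vec<Arc<Node>>,
}

impl Node {
    /// Borrowed view of the inputs, in order (empty for leaf nodes).
    fn children(&self) -> Vec<&Arc<Node>> {
        self.children.iter().collect()
    }

    /// Builds a new node with the same label and the given children, in order.
    /// The original node is not modified, so other holders of the `Arc`
    /// still see the old tree.
    fn with_new_children(
        self: Arc<Self>,
        children: Vec<Arc<Node>>,
    ) -> Result<Arc<Node>, String> {
        // Assumption for this sketch: the number of children must not change.
        if children.len() != self.children.len() {
            return Err(format!(
                "{} expects {} children, got {}",
                self.label,
                self.children.len(),
                children.len()
            ));
        }
        Ok(Arc::new(Node {
            label: self.label.clone(),
            children,
        }))
    }
}
```

Because the receiver is Arc<Self>, a caller that wants to keep the old node clones the Arc first, for example Arc::clone(&node).with_new_children(new_children).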


fn execute( &self, partition: usize, context: Arc<TaskContext>, ) -> Result<SendableRecordBatchStream>

Begin execution of partition, returning a Stream of RecordBatches.


The execute method itself is not async but it returns an async futures::stream::Stream. This Stream should incrementally compute the output, RecordBatch by RecordBatch (in a streaming fashion). Most ExecutionPlans should not do any work before the first RecordBatch is requested from the stream.

RecordBatchStreamAdapter can be used to convert an async Stream into a SendableRecordBatchStream.

Using async Streams allows for network I/O during execution and takes advantage of Rust’s built-in support for async continuations and its crate ecosystem.

§Error handling

Any error that occurs during execution is sent as an Err in the output stream.

ExecutionPlan implementations in DataFusion cancel additional work immediately once an error occurs. The rationale is that if the overall query will return an error, any additional work such as continued polling of inputs will be wasted as it will be thrown away.

§Cancellation / Aborting Execution

The Stream that is returned must ensure that any allocated resources are freed when the stream itself is dropped. This is particularly important for spawned tasks or threads. Unless care is taken to “abort” such tasks, they may continue to consume resources even after the plan is dropped, generating intermediate results that are never used. For this reason, calling spawn directly is disallowed; use SpawnedTask instead.

See SpawnedTask, JoinSet, and RecordBatchReceiverStreamBuilder for structures that help ensure all background tasks are cancelled.
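The idea of tying background work to the lifetime of its owner can be sketched with plain std threads. This is only an analogy: AbortOnDrop is a hypothetical type, it uses a cooperative stop flag on an OS thread, and it is not how SpawnedTask or JoinSet are implemented.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical guard: owns a background worker and stops it when dropped.
struct AbortOnDrop {
    stop: Arc<AtomicBool>,
    handle: Option<JoinHandle<()>>,
}

impl AbortOnDrop {
    /// Spawns a worker that calls `step` repeatedly until asked to stop.
    fn spawn(mut step: impl FnMut() + Send + 'static) -> Self {
        let stop = Arc::new(AtomicBool::new(false));
        let worker_stop = Arc::clone(&stop);
        let handle = thread::spawn(move || {
            while !worker_stop.load(Ordering::Acquire) {
                step();
                thread::sleep(Duration::from_millis(1));
            }
        });
        Self {
            stop,
            handle: Some(handle),
        }
    }
}

impl Drop for AbortOnDrop {
    fn drop(&mut self) {
        // Signal the worker, then wait for it so nothing outlives the guard.
        self.stop.store(true, Ordering::Release);
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}
```

A stream that stores such a guard as a field stops its background work as soon as the stream is dropped, which is the property this section asks for.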

§Implementation Examples

While async Streams have a non-trivial learning curve, the futures crate provides StreamExt and TryStreamExt, which help simplify many common operations.

Here are some common patterns:

§Return Precomputed RecordBatch

We can return a precomputed RecordBatch as a Stream:

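use std::sync::Arc;
use arrow::record_batch::RecordBatch;
use datafusion_common::Result;
use datafusion_execution::{SendableRecordBatchStream, TaskContext};
use datafusion_physical_plan::stream::RecordBatchStreamAdapter;

struct MyPlan {
    batch: RecordBatch,
}

impl MyPlan {
    fn execute(
        &self,
        partition: usize,
        context: Arc<TaskContext>
    ) -> Result<SendableRecordBatchStream> {
        // use functions from the futures crate to convert the batch into a stream
        let fut = futures::future::ready(Ok(self.batch.clone()));
        let stream = futures::stream::once(fut);
        Ok(Box::pin(RecordBatchStreamAdapter::new(self.batch.schema(), stream)))
    }
}

§Lazily (async) Compute RecordBatch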

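We can also lazily compute a RecordBatch when the returned Stream is polled:

use std::sync::Arc;
use arrow::datatypes::SchemaRef;
use arrow::record_batch::RecordBatch;
use datafusion_common::Result;
use datafusion_execution::{SendableRecordBatchStream, TaskContext};
use datafusion_physical_plan::stream::RecordBatchStreamAdapter;

struct MyPlan {
    schema: SchemaRef,
}

/// Returns a single batch when the returned stream is polled
async fn get_batch() -> Result<RecordBatch> {
    todo!()
}

impl MyPlan {
    fn execute(
        &self,
        partition: usize,
        context: Arc<TaskContext>
    ) -> Result<SendableRecordBatchStream> {
        let fut = get_batch();
        let stream = futures::stream::once(fut);
        Ok(Box::pin(RecordBatchStreamAdapter::new(self.schema.clone(), stream)))
    }
}

§Lazily (async) create a Stream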

If you need to create the return Stream using an async function, you can do so by flattening the result:

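use std::sync::Arc;
use arrow::datatypes::SchemaRef;
use datafusion_common::Result;
use datafusion_execution::{SendableRecordBatchStream, TaskContext};
use datafusion_physical_plan::stream::RecordBatchStreamAdapter;
use futures::TryStreamExt;

struct MyPlan {
    schema: SchemaRef,
}

/// async function that returns a stream
async fn get_batch_stream() -> Result<SendableRecordBatchStream> {
    todo!()
}

impl MyPlan {
    fn execute(
        &self,
        partition: usize,
        context: Arc<TaskContext>
    ) -> Result<SendableRecordBatchStream> {
        // A future that yields a stream
        let fut = get_batch_stream();
        // Use TryStreamExt::try_flatten to flatten the stream of streams
        let stream = futures::stream::once(fut).try_flatten();
        Ok(Box::pin(RecordBatchStreamAdapter::new(self.schema.clone(), stream)))
    }
}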

Provided Methods§


fn static_name() -> &'static str
where Self: Sized,

Short name for the ExecutionPlan, such as ‘ParquetExec’. Like name but can be called without an instance.


fn schema(&self) -> SchemaRef

Get the schema for this execution plan


fn check_invariants(&self, _check: InvariantLevel) -> Result<()>

Returns an error if this individual node does not conform to its invariants. These invariants are typically only checked in debug mode.

A default set of invariants is provided in the default implementation. Extension nodes can provide their own invariants.


fn required_input_distribution(&self) -> Vec<Distribution>

Specifies the data distribution requirements for all the children of this ExecutionPlan. By default, it is Distribution::UnspecifiedDistribution for each child.


fn required_input_ordering(&self) -> Vec<Option<LexRequirement>>

Specifies the ordering required for all of the children of this ExecutionPlan.

For each child, it’s the local ordering requirement within each partition rather than the global ordering.

NOTE that checking !is_empty() on the returned Vec does not tell you whether an input ordering is required, because the Vec has one entry per child. The correct check is whether at least one entry is Some.
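The difference between the two checks can be sketched with plain Option values. OrderingRequirement and requires_input_ordering are hypothetical stand-ins chosen for this example; the real entries are of type LexRequirement.

```rust
/// Hypothetical stand-in for a per-child ordering requirement.
type OrderingRequirement = Vec<&'static str>;

/// True if at least one child has an ordering requirement.
fn requires_input_ordering(required: &[Option<OrderingRequirement>]) -> bool {
    required.iter().any(Option::is_some)
}
```

For a binary node with no requirements the list is [None, None]: it is not empty, yet requires_input_ordering returns false, which is why the is_empty check gives the wrong answer.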


fn maintains_input_order(&self) -> Vec<bool>

Returns false if this ExecutionPlan’s implementation may reorder rows within or between partitions.

For example, Projection, Filter, and Limit maintain the order of their inputs: they may transform values (Projection) or produce fewer rows than went in (Filter and Limit), but the rows that are produced keep their input order.

DataFusion uses this metadata to apply certain optimizations such as automatically repartitioning correctly.

The default implementation returns false for each child.

WARNING: if you override this default, you MUST ensure that the ExecutionPlan maintains the ordering invariant, or else DataFusion may produce incorrect results.


fn benefits_from_input_partitioning(&self) -> Vec<bool>

Specifies whether the ExecutionPlan benefits from increased parallelization at its input for each child.

If it returns true, the ExecutionPlan would benefit from partitioning its corresponding child (and thus from more parallelism). For ExecutionPlans that do very little work, the overhead of extra parallelism may outweigh any benefits.

The default implementation returns true unless this ExecutionPlan has signalled it requires a single child input partition.
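The default rule described above can be sketched as a free function over a pared-down enum. This Distribution enum is a hypothetical simplification declared locally for the example, not the real DataFusion type, and the function is a sketch of the described rule rather than the library source.

```rust
/// Hypothetical, pared-down version of a distribution requirement.
#[derive(Debug, Clone, PartialEq)]
enum Distribution {
    Unspecified,
    SinglePartition,
    HashPartitioned(Vec<String>),
}

/// One flag per child: true unless that child must be a single partition.
fn benefits_from_input_partitioning(required: &[Distribution]) -> Vec<bool> {
    required
        .iter()
        .map(|dist| !matches!(dist, Distribution::SinglePartition))
        .collect()
}
```

The result has the same length and order as the list of children, so the entry at index i describes child i.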


fn repartitioned( &self, _target_partitions: usize, _config: &ConfigOptions, ) -> Result<Option<Arc<dyn ExecutionPlan>>>

If supported, attempt to increase the partitioning of this ExecutionPlan to produce target_partitions partitions.

If the ExecutionPlan does not support changing its partitioning, returns Ok(None) (the default).

If the ExecutionPlan can increase its partitioning, but not to the target_partitions, it may return an ExecutionPlan with fewer partitions. This might happen, for example, if each new partition would be too small to be efficiently processed individually.

The DataFusion optimizer attempts to use as many threads as possible by repartitioning its inputs to match the target number of threads available (target_partitions). Some data sources, such as the built-in CSV and Parquet readers, implement this method as they are able to read from their input files in parallel, regardless of how the source data is split amongst files.
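One way a data source might decide how far to repartition is sketched here. choose_partition_count and the min_partition_bytes threshold are hypothetical; this is an illustration of returning fewer partitions than requested, or none at all, and not the logic used by the built-in readers.

```rust
/// Hypothetical helper: picks a partition count for `total_bytes` of input.
/// Returns `None` when repartitioning would not increase parallelism,
/// mirroring the `Ok(None)` case of `repartitioned`.
fn choose_partition_count(
    total_bytes: u64,
    current_partitions: usize,
    target_partitions: usize,
    min_partition_bytes: u64,
) -> Option<usize> {
    // Largest count that keeps every partition at or above the minimum size.
    let max_useful = (total_bytes / min_partition_bytes.max(1)).max(1) as usize;
    // Never exceed the target, but settle for fewer if partitions would be tiny.
    let chosen = target_partitions.min(max_useful);
    if chosen > current_partitions {
        Some(chosen)
    } else {
        None
    }
}
```

With a 10 MiB minimum, 100 MiB of input and a target of 16 yields 10 partitions, while 5 MiB of input yields None because splitting would only create undersized partitions.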


fn metrics(&self) -> Option<MetricsSet>

Return a snapshot of the set of Metrics for this ExecutionPlan. If no Metrics are available, return None.

While the values of the metrics in the returned MetricsSets may change as execution progresses, the specific metrics will not.

Once self.execute() has returned (technically, once the future is resolved) for all available partitions, the set of metrics should be complete. If this function is called prior to execute(), new metrics may appear in subsequent calls.


fn statistics(&self) -> Result<Statistics>

Returns statistics for this ExecutionPlan node. If statistics are not available, this method should return Statistics::new_unknown (the default), not an error.

For TableScan executors, which support filter pushdown, special attention needs to be paid to whether the stats returned by this method are exact or not.


fn supports_limit_pushdown(&self) -> bool

Returns true if a limit can be safely pushed down through this ExecutionPlan node.

If this method returns true, and the query plan contains a limit at the output of this node, DataFusion will push the limit to the input of this node.


fn with_fetch(&self, _limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>

Returns a fetching variant of this ExecutionPlan node, if it supports fetch limits. Returns None otherwise.


fn fetch(&self) -> Option<usize>

Gets the fetch count for the operator; None means there is no fetch.


fn cardinality_effect(&self) -> CardinalityEffect

Gets the effect on cardinality, if known


fn try_swapping_with_projection( &self, _projection: &ProjectionExec, ) -> Result<Option<Arc<dyn ExecutionPlan>>>

Attempts to push down the given projection into the input of this ExecutionPlan.

If the operator supports this optimization, the plan changes from projection <- self <- source to self_new <- projection <- source. Otherwise, the plan is left as-is.

Returns Ok(Some(...)) if pushdown is applied, Ok(None) if it is not supported or not possible, or Err on failure.

Trait Implementations§


impl DynTreeNode for dyn ExecutionPlan


fn arc_children(&self) -> Vec<&Arc<Self>>

Returns all children of the specified TreeNode.

fn with_new_arc_children( &self, arc_self: Arc<Self>, new_children: Vec<Arc<Self>>, ) -> Result<Arc<Self>>

Constructs a new node with the specified children.

impl ExecutionPlanProperties for &dyn ExecutionPlan


fn output_partitioning(&self) -> &Partitioning

Specifies how the output of this ExecutionPlan is split into partitions.

fn output_ordering(&self) -> Option<&LexOrdering>

If the output of this ExecutionPlan within each partition is sorted, returns Some(keys) describing the ordering. A None return value indicates no assumptions should be made on the output ordering.

fn boundedness(&self) -> Boundedness

Boundedness information of the stream corresponding to this ExecutionPlan. For more details, see Boundedness.

fn pipeline_behavior(&self) -> EmissionType

Indicates how the stream of this ExecutionPlan emits its results. For more details, see EmissionType.

fn equivalence_properties(&self) -> &EquivalenceProperties

Get the EquivalenceProperties within the plan.
