datafusion_physical_expr::equivalence

Struct EquivalenceProperties

Source
pub struct EquivalenceProperties {
    pub eq_group: EquivalenceGroup,
    pub oeq_class: OrderingEquivalenceClass,
    pub constants: Vec<ConstExpr>,
    /* private fields */
}
Expand description

A EquivalenceProperties object stores information known about the output of a plan node, that can be used to optimize the plan.

Currently, it keeps track of:

  • Sort expressions (orderings)
  • Equivalent expressions: expressions that are known to have same value.
  • Constants expressions: expressions that are known to contain a single constant value.

§Example equivalent sort expressions

Consider table below:

┌-------┐
| a | b |
|---|---|
| 1 | 9 |
| 2 | 8 |
| 3 | 7 |
| 5 | 5 |
└---┴---┘

In this case, both a ASC and b DESC can describe the table ordering. EquivalenceProperties, tracks these different valid sort expressions and treat a ASC and b DESC on an equal footing. For example if the query specifies the output sorted by EITHER a ASC or b DESC, the sort can be avoided.

§Example equivalent expressions

Similarly, consider the table below:

┌-------┐
| a | b |
|---|---|
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 5 | 5 |
└---┴---┘

In this case, columns a and b always have the same value, which can of such equivalences inside this object. With this information, Datafusion can optimize operations such as. For example, if the partition requirement is Hash(a) and output partitioning is Hash(b), then DataFusion avoids repartitioning the data as the existing partitioning satisfies the requirement.

§Code Example

use datafusion_physical_expr_common::sort_expr::{LexOrdering, PhysicalSortExpr};
// This object represents data that is sorted by a ASC, c DESC
// with a single constant value of b
let mut eq_properties = EquivalenceProperties::new(schema)
  .with_constants(vec![ConstExpr::from(col_b)]);
eq_properties.add_new_ordering(LexOrdering::new(vec![
  PhysicalSortExpr::new_default(col_a).asc(),
  PhysicalSortExpr::new_default(col_c).desc(),
]));

assert_eq!(eq_properties.to_string(), "order: [[a@0 ASC, c@2 DESC]], const: [b@1]")

Fields§

§eq_group: EquivalenceGroup

Collection of equivalence classes that store expressions with the same value.

§oeq_class: OrderingEquivalenceClass

Equivalent sort expressions for this table.

§constants: Vec<ConstExpr>

Expressions whose values are constant throughout the table. TODO: We do not need to track constants separately, they can be tracked inside eq_groups as Literal expressions.

Implementations§

Source§

impl EquivalenceProperties

Source

pub fn new(schema: SchemaRef) -> Self

Creates an empty EquivalenceProperties object.

Source

pub fn new_with_orderings(schema: SchemaRef, orderings: &[LexOrdering]) -> Self

Creates a new EquivalenceProperties object with the given orderings.

Source

pub fn schema(&self) -> &SchemaRef

Returns the associated schema.

Source

pub fn oeq_class(&self) -> &OrderingEquivalenceClass

Returns a reference to the ordering equivalence class within.

Source

pub fn eq_group(&self) -> &EquivalenceGroup

Returns a reference to the equivalence group within.

Source

pub fn constants(&self) -> &[ConstExpr]

Returns a reference to the constant expressions

Source

pub fn output_ordering(&self) -> Option<LexOrdering>

Returns the output ordering of the properties.

Source

pub fn normalized_oeq_class(&self) -> OrderingEquivalenceClass

Returns the normalized version of the ordering equivalence class within. Normalization removes constants and duplicates as well as standardizing expressions according to the equivalence group within.

Source

pub fn extend(self, other: Self) -> Self

Extends this EquivalenceProperties with the other object.

Source

pub fn clear_orderings(&mut self)

Clears (empties) the ordering equivalence class within this object. Call this method when existing orderings are invalidated.

Source

pub fn clear_per_partition_constants(&mut self)

Removes constant expressions that may change across partitions. This method should be used when data from different partitions are merged.

Source

pub fn add_ordering_equivalence_class( &mut self, other: OrderingEquivalenceClass, )

Extends this EquivalenceProperties by adding the orderings inside the ordering equivalence class other.

Source

pub fn add_new_orderings( &mut self, orderings: impl IntoIterator<Item = LexOrdering>, )

Adds new orderings into the existing ordering equivalence class.

Source

pub fn add_new_ordering(&mut self, ordering: LexOrdering)

Adds a single ordering to the existing ordering equivalence class.

Source

pub fn add_equivalence_group(&mut self, other_eq_group: EquivalenceGroup)

Incorporates the given equivalence group to into the existing equivalence group within.

Source

pub fn add_equal_conditions( &mut self, left: &Arc<dyn PhysicalExpr>, right: &Arc<dyn PhysicalExpr>, ) -> Result<()>

Adds a new equality condition into the existing equivalence group. If the given equality defines a new equivalence class, adds this new equivalence class to the equivalence group.

Source

pub fn add_constants( self, constants: impl IntoIterator<Item = ConstExpr>, ) -> Self

👎Deprecated since 43.0.0: Use [with_constants] instead

Track/register physical expressions with constant values.

Source

pub fn remove_constant(self, c: &ConstExpr) -> Self

Remove the specified constant

Source

pub fn with_constants( self, constants: impl IntoIterator<Item = ConstExpr>, ) -> Self

Track/register physical expressions with constant values.

Source

pub fn with_reorder(self, sort_exprs: LexOrdering) -> Self

Updates the ordering equivalence group within assuming that the table is re-sorted according to the argument sort_exprs. Note that constants and equivalence classes are unchanged as they are unaffected by a re-sort.

Source

pub fn ordering_satisfy(&self, given: LexOrderingRef<'_>) -> bool

Checks whether the given ordering is satisfied by any of the existing orderings.

Source

pub fn ordering_satisfy_requirement(&self, reqs: LexRequirementRef<'_>) -> bool

Checks whether the given sort requirements are satisfied by any of the existing orderings.

Source

pub fn requirements_compatible( &self, given: LexRequirementRef<'_>, reference: LexRequirementRef<'_>, ) -> bool

Checks whether the given`` sort requirements are equal or more specific than the reference` sort requirements.

Source

pub fn get_finer_ordering( &self, lhs: LexOrderingRef<'_>, rhs: LexOrderingRef<'_>, ) -> Option<LexOrdering>

Returns the finer ordering among the orderings lhs and rhs, breaking any ties by choosing lhs.

The finer ordering is the ordering that satisfies both of the orderings. If the orderings are incomparable, returns None.

For example, the finer ordering among [a ASC] and [a ASC, b ASC] is the latter.

Source

pub fn get_finer_requirement( &self, req1: LexRequirementRef<'_>, req2: LexRequirementRef<'_>, ) -> Option<LexRequirement>

Returns the finer ordering among the requirements lhs and rhs, breaking any ties by choosing lhs.

The finer requirements are the ones that satisfy both of the given requirements. If the requirements are incomparable, returns None.

For example, the finer requirements among [a ASC] and [a ASC, b ASC] is the latter.

Source

pub fn substitute_ordering_component( &self, mapping: &ProjectionMapping, sort_expr: LexOrderingRef<'_>, ) -> Result<Vec<LexOrdering>>

we substitute the ordering according to input expression type, this is a simplified version In this case, we just substitute when the expression satisfy the following condition: I. just have one column and is a CAST expression TODO: Add one-to-ones analysis for monotonic ScalarFunctions. TODO: we could precompute all the scenario that is computable, for example: atan(x + 1000) should also be substituted if x is DESC or ASC After substitution, we may generate more than 1 LexOrdering. As an example, [a ASC, b ASC] will turn into [a ASC, b ASC], [CAST(a) ASC, b ASC] when projection expressions a, b, CAST(a) is applied.

Source

pub fn substitute_oeq_class( &mut self, mapping: &ProjectionMapping, ) -> Result<()>

In projection, supposed we have a input function ‘A DESC B DESC’ and the output shares the same expression with A and B, we could surely use the ordering of the original ordering, However, if the A has been changed, for example, A-> Cast(A, Int64) or any other form, it is invalid if we continue using the original ordering Since it would cause bug in dependency constructions, we should substitute the input order in order to get correct dependency map, happen in issue 8838: https://github.com/apache/datafusion/issues/8838

Source

pub fn project_expr( &self, expr: &Arc<dyn PhysicalExpr>, projection_mapping: &ProjectionMapping, ) -> Option<Arc<dyn PhysicalExpr>>

Projects argument expr according to projection_mapping, taking equivalences into account.

For example, assume that columns a and c are always equal, and that projection_mapping encodes following mapping:

a -> a1
b -> b1

Then, this function projects a + b to Some(a1 + b1), c + b to Some(a1 + b1) and d to None, meaning that it cannot be projected.

Source

pub fn project( &self, projection_mapping: &ProjectionMapping, output_schema: SchemaRef, ) -> Self

Projects the equivalences within according to projection_mapping and output_schema.

Source

pub fn find_longest_permutation( &self, exprs: &[Arc<dyn PhysicalExpr>], ) -> (LexOrdering, Vec<usize>)

Returns the longest (potentially partial) permutation satisfying the existing ordering. For example, if we have the equivalent orderings [a ASC, b ASC] and [c DESC], with exprs containing [c, b, a, d], then this function returns ([a ASC, b ASC, c DESC], [2, 1, 0]). This means that the specification [a ASC, b ASC, c DESC] is satisfied by the existing ordering, and [a, b, c] resides at indices: 2, 1, 0 inside the argument exprs (respectively). For the mathematical definition of “partial permutation”, see:

https://en.wikipedia.org/wiki/Permutation#k-permutations_of_n

Source

pub fn is_expr_constant(&self, expr: &Arc<dyn PhysicalExpr>) -> bool

This function determines whether the provided expression is constant based on the known constants.

§Arguments
  • expr: A reference to a Arc<dyn PhysicalExpr> representing the expression to be checked.
§Returns

Returns true if the expression is constant according to equivalence group, false otherwise.

Source

pub fn get_expr_properties(&self, expr: Arc<dyn PhysicalExpr>) -> ExprProperties

Retrieves the properties for a given physical expression.

This function constructs an ExprProperties object for the given expression, which encapsulates information about the expression’s properties, including its SortProperties and Interval.

§Parameters
  • expr: An Arc<dyn PhysicalExpr> representing the physical expression for which ordering information is sought.
§Returns

Returns an ExprProperties object containing the ordering and range information for the given expression.

Source

pub fn with_new_schema(self, schema: SchemaRef) -> Result<Self>

Transforms this EquivalenceProperties into a new EquivalenceProperties by mapping columns in the original schema to columns in the new schema by index.

Trait Implementations§

Source§

impl Clone for EquivalenceProperties

Source§

fn clone(&self) -> EquivalenceProperties

Returns a copy of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl Debug for EquivalenceProperties

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl Display for EquivalenceProperties

More readable display version of the EquivalenceProperties.

Format:

order: [[a ASC, b ASC], [a ASC, c ASC]], eq: [[a = b], [a = c]], const: [a = 1]
Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dst: *mut T)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T> ToString for T
where T: Display + ?Sized,

Source§

default fn to_string(&self) -> String

Converts the given value to a String. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.