datafusion_expr

Trait WindowUDFImpl

Source
pub trait WindowUDFImpl:
    Debug
    + Send
    + Sync {
Show 14 methods // Required methods fn as_any(&self) -> &dyn Any; fn name(&self) -> &str; fn signature(&self) -> &Signature; fn partition_evaluator( &self, partition_evaluator_args: PartitionEvaluatorArgs<'_>, ) -> Result<Box<dyn PartitionEvaluator>>; fn field(&self, field_args: WindowUDFFieldArgs<'_>) -> Result<Field>; // Provided methods fn expressions( &self, expr_args: ExpressionArgs<'_>, ) -> Vec<Arc<dyn PhysicalExpr>> { ... } fn aliases(&self) -> &[String] { ... } fn simplify(&self) -> Option<WindowFunctionSimplification> { ... } fn equals(&self, other: &dyn WindowUDFImpl) -> bool { ... } fn hash_value(&self) -> u64 { ... } fn sort_options(&self) -> Option<SortOptions> { ... } fn coerce_types(&self, _arg_types: &[DataType]) -> Result<Vec<DataType>> { ... } fn reverse_expr(&self) -> ReversedUDWF { ... } fn documentation(&self) -> Option<&Documentation> { ... }
}
Expand description

Trait for implementing WindowUDF.

This trait exposes the full API for implementing user defined window functions and can be used to implement any function.

See advanced_udwf.rs for a full example with complete implementation and WindowUDF for other available options.

§Basic Example


#[derive(Debug, Clone)]
struct SmoothIt {
  signature: Signature,
}

impl SmoothIt {
  fn new() -> Self {
    Self {
      signature: Signature::uniform(1, vec![DataType::Int32], Volatility::Immutable),
     }
  }
}

static DOCUMENTATION: OnceLock<Documentation> = OnceLock::new();

fn get_doc() -> &'static Documentation {
    DOCUMENTATION.get_or_init(|| {
        Documentation::builder()
            .with_doc_section(DOC_SECTION_ANALYTICAL)
            .with_description("smooths the windows")
            .with_syntax_example("smooth_it(2)")
            .with_argument("arg1", "The int32 number to smooth by")
            .build()
            .unwrap()
    })
}

/// Implement the WindowUDFImpl trait for SmoothIt
impl WindowUDFImpl for SmoothIt {
   fn as_any(&self) -> &dyn Any { self }
   fn name(&self) -> &str { "smooth_it" }
   fn signature(&self) -> &Signature { &self.signature }
   // The actual implementation would smooth the window
   fn partition_evaluator(
       &self,
       _partition_evaluator_args: PartitionEvaluatorArgs,
   ) -> Result<Box<dyn PartitionEvaluator>> {
       unimplemented!()
   }
   fn field(&self, field_args: WindowUDFFieldArgs) -> Result<Field> {
     if let Some(DataType::Int32) = field_args.get_input_type(0) {
       Ok(Field::new(field_args.name(), DataType::Int32, false))
     } else {
       plan_err!("smooth_it only accepts Int32 arguments")
     }
   }
   fn documentation(&self) -> Option<&Documentation> {
     Some(get_doc())
   }
}

// Create a new WindowUDF from the implementation
let smooth_it = WindowUDF::from(SmoothIt::new());

// Call the function `add_one(col)`
// smooth_it(speed) OVER (PARTITION BY car ORDER BY time ASC)
let expr = smooth_it.call(vec![col("speed")])
    .partition_by(vec![col("car")])
    .order_by(vec![col("time").sort(true, true)])
    .window_frame(WindowFrame::new(None))
    .build()
    .unwrap();

Required Methods§

Source

fn as_any(&self) -> &dyn Any

Returns this object as an Any trait object

Source

fn name(&self) -> &str

Returns this function’s name

Source

fn signature(&self) -> &Signature

Returns the function’s Signature for information about what input types are accepted and the function’s Volatility.

Source

fn partition_evaluator( &self, partition_evaluator_args: PartitionEvaluatorArgs<'_>, ) -> Result<Box<dyn PartitionEvaluator>>

Invoke the function, returning the PartitionEvaluator instance

Source

fn field(&self, field_args: WindowUDFFieldArgs<'_>) -> Result<Field>

The Field of the final result of evaluating this window function.

Call field_args.name() to get the fully qualified name for defining the Field. For a complete example see the implementation in the Basic Example section.

Provided Methods§

Source

fn expressions( &self, expr_args: ExpressionArgs<'_>, ) -> Vec<Arc<dyn PhysicalExpr>>

Returns the expressions that are passed to the PartitionEvaluator.

Source

fn aliases(&self) -> &[String]

Returns any aliases (alternate names) for this function.

Note: aliases should only include names other than Self::name. Defaults to [] (no aliases)

Source

fn simplify(&self) -> Option<WindowFunctionSimplification>

Optionally apply per-UDWF simplification / rewrite rules.

This can be used to apply function specific simplification rules during optimization. The default implementation does nothing.

Note that DataFusion handles simplifying arguments and “constant folding” (replacing a function call with constant arguments such as my_add(1,2) --> 3 ). Thus, there is no need to implement such optimizations manually for specific UDFs.

Example: [simplify_udwf_expression.rs]: https://github.com/apache/arrow-datafusion/blob/main/datafusion-examples/examples/simplify_udwf_expression.rs

§Returns

None if simplify is not defined or,

Or, a closure with two arguments:

Source

fn equals(&self, other: &dyn WindowUDFImpl) -> bool

Return true if this window UDF is equal to the other.

Allows customizing the equality of window UDFs. Must be consistent with Self::hash_value and follow the same rules as Eq:

  • reflexive: a.equals(a);
  • symmetric: a.equals(b) implies b.equals(a);
  • transitive: a.equals(b) and b.equals(c) implies a.equals(c).

By default, compares Self::name and Self::signature.

Source

fn hash_value(&self) -> u64

Returns a hash value for this window UDF.

Allows customizing the hash code of window UDFs. Similarly to Hash and Eq, if Self::equals returns true for two UDFs, their hash_values must be the same.

By default, hashes Self::name and Self::signature.

Source

fn sort_options(&self) -> Option<SortOptions>

Allows the window UDF to define a custom result ordering.

By default, a window UDF doesn’t introduce an ordering. But when specified by a window UDF this is used to update ordering equivalences.

Source

fn coerce_types(&self, _arg_types: &[DataType]) -> Result<Vec<DataType>>

Coerce arguments of a function call to types that the function can evaluate.

This function is only called if WindowUDFImpl::signature returns crate::TypeSignature::UserDefined. Most UDWFs should return one of the other variants of TypeSignature which handle common cases

See the type coercion module documentation for more details on type coercion

For example, if your function requires a floating point arguments, but the user calls it like my_func(1::int) (aka with 1 as an integer), coerce_types could return [DataType::Float64] to ensure the argument was cast to 1::double

§Parameters
  • arg_types: The argument types of the arguments this function with
§Return value

A Vec the same length as arg_types. DataFusion will CAST the function call arguments to these specific types.

Source

fn reverse_expr(&self) -> ReversedUDWF

Allows customizing the behavior of the user-defined window function when it is evaluated in reverse order.

Source

fn documentation(&self) -> Option<&Documentation>

Returns the documentation for this Window UDF.

Documentation can be accessed programmatically as well as generating publicly facing documentation.

Trait Implementations§

Source§

impl PartialEq for dyn WindowUDFImpl

Source§

fn eq(&self, other: &Self) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
Source§

impl PartialOrd for dyn WindowUDFImpl

Source§

fn partial_cmp(&self, other: &Self) -> Option<Ordering>

This method returns an ordering between self and other values if one exists. Read more
1.0.0 · Source§

fn lt(&self, other: &Rhs) -> bool

Tests less than (for self and other) and is used by the < operator. Read more
1.0.0 · Source§

fn le(&self, other: &Rhs) -> bool

Tests less than or equal to (for self and other) and is used by the <= operator. Read more
1.0.0 · Source§

fn gt(&self, other: &Rhs) -> bool

Tests greater than (for self and other) and is used by the > operator. Read more
1.0.0 · Source§

fn ge(&self, other: &Rhs) -> bool

Tests greater than or equal to (for self and other) and is used by the >= operator. Read more

Implementors§