pub trait PhysicalExpr: Send + Sync {
    // Required methods
    fn evaluate(
        &self,
        df: &DataFrame,
        _state: &ExecutionState,
    ) -> PolarsResult<Column>;
    fn evaluate_on_groups<'a>(
        &self,
        df: &DataFrame,
        groups: &'a GroupsProxy,
        state: &ExecutionState,
    ) -> PolarsResult<AggregationContext<'a>>;
    fn to_field(&self, input_schema: &Schema) -> PolarsResult<Field>;
    fn is_scalar(&self) -> bool;

    // Provided methods
    fn as_expression(&self) -> Option<&Expr> { ... }
    fn evaluate_inline(&self) -> Option<Column> { ... }
    fn evaluate_inline_impl(&self, _depth_limit: u8) -> Option<Column> { ... }
    fn as_partitioned_aggregator(&self) -> Option<&dyn PartitionedAggregation> { ... }
    fn as_stats_evaluator(&self) -> Option<&dyn StatsEvaluator> { ... }
    fn is_literal(&self) -> bool { ... }
}
Take a DataFrame and evaluate the expression. Implement this for Column, lt, eq, etc.
Required Methods
fn evaluate(&self, df: &DataFrame, _state: &ExecutionState) -> PolarsResult<Column>
Take a DataFrame and evaluate the expression.
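As a rough illustration of what implementing evaluate for a column lookup and for lt might involve, here is a minimal sketch. It deliberately does not use the polars types: ToyColumn, ToyFrame, ToyExpr, ColumnExpr and LtExpr are hypothetical stand-ins invented for this example, and the execution state argument is omitted.

```rust
use std::collections::HashMap;

// Toy stand-ins for polars' `Column` and `DataFrame`; not the real types.
#[derive(Debug, Clone, PartialEq)]
pub enum ToyColumn {
    Int(Vec<i64>),
    Bool(Vec<bool>),
}

pub struct ToyFrame {
    pub columns: HashMap<String, ToyColumn>,
}

// Simplified analogue of `PhysicalExpr::evaluate` (no execution state).
pub trait ToyExpr: Send + Sync {
    fn evaluate(&self, df: &ToyFrame) -> Result<ToyColumn, String>;
}

// Looks a column up by name and returns a copy of it.
pub struct ColumnExpr(pub String);

impl ToyExpr for ColumnExpr {
    fn evaluate(&self, df: &ToyFrame) -> Result<ToyColumn, String> {
        df.columns
            .get(&self.0)
            .cloned()
            .ok_or_else(|| format!("column not found: {}", self.0))
    }
}

// Element-wise `left < right`; both child expressions are evaluated first.
pub struct LtExpr {
    pub left: Box<dyn ToyExpr>,
    pub right: Box<dyn ToyExpr>,
}

impl ToyExpr for LtExpr {
    fn evaluate(&self, df: &ToyFrame) -> Result<ToyColumn, String> {
        match (self.left.evaluate(df)?, self.right.evaluate(df)?) {
            (ToyColumn::Int(l), ToyColumn::Int(r)) if l.len() == r.len() => Ok(
                ToyColumn::Bool(l.iter().zip(r.iter()).map(|(a, b)| a < b).collect()),
            ),
            _ => Err("lt expects two integer columns of equal length".to_string()),
        }
    }
}
```

The point of the sketch is the recursion: a composite expression such as lt evaluates its children against the same frame and combines the resulting columns.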
fn evaluate_on_groups<'a>(&self, df: &DataFrame, groups: &'a GroupsProxy, state: &ExecutionState) -> PolarsResult<AggregationContext<'a>>
Some expressions that are not aggregations can still be evaluated per group; think of sort, slice, filter, shift, etc. By default the groups are ignored.
This method is called by an aggregation function.
In case of a simple expr, like ‘column’, the groups are ignored and the column is returned. In case of an expr where group behavior makes sense, this method is called. For a filter operation, for instance, a Series is created per group and filtered.
An implementation of this method may apply an aggregation on the groups only. For instance, on a shift, the groups are first aggregated to a ListChunked and the shift is applied per group. The implementation then has to return the Series exploded (because a later aggregation will use the group tuples to aggregate). The group tuples also have to be updated, because aggregation to a list sorts the exploded Series by group.
This has some gotchas. An implementation may also change the group tuples instead of the Series.
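The shift example can be sketched with plain vectors. This is only a toy model under stated assumptions: a "Series" is a slice of i64, the group tuples are lists of row indices, and shift_per_group is a hypothetical helper, not a polars function. It follows the three steps described above: aggregate each group to a list, shift within the group, then explode and rebuild the group tuples.

```rust
// Toy model: a "series" is a slice of i64 and group tuples are lists of row
// indices. These are stand-ins for polars' Series and GroupsProxy.

/// Shift every group down by one position (the first slot becomes `None`),
/// then return the exploded values together with updated group tuples.
pub fn shift_per_group(
    values: &[i64],
    groups: &[Vec<usize>],
) -> (Vec<Option<i64>>, Vec<Vec<usize>>) {
    let mut exploded = Vec::with_capacity(values.len());
    let mut new_groups = Vec::with_capacity(groups.len());

    for idx in groups {
        // 1. Aggregate the group to a list (one list per group).
        let list: Vec<i64> = idx.iter().map(|&i| values[i]).collect();

        // 2. Apply the shift within the group.
        let mut shifted: Vec<Option<i64>> = Vec::with_capacity(list.len());
        if !list.is_empty() {
            shifted.push(None);
            shifted.extend(list[..list.len() - 1].iter().copied().map(Some));
        }

        // 3. Explode: append the group's rows contiguously. The rows are now
        //    ordered by group, so the group tuples must point at new positions.
        let start = exploded.len();
        exploded.extend(shifted);
        new_groups.push((start..exploded.len()).collect());
    }
    (exploded, new_groups)
}
```

With values [10, 20, 30, 40, 50] and interleaved groups [[0, 2, 4], [1, 3]], the exploded output is ordered by group, so the original index lists no longer describe it. That is why the updated group tuples are returned alongside the values.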
fn to_field(&self, input_schema: &Schema) -> PolarsResult<Field>
Get the output field of this expr
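To illustrate the idea that the output field is derived from the input schema alone, without touching any data, here is a small sketch. ToyDType, ToyField, ToySchema and ToyPlan are hypothetical types made up for this example, and the rule that a comparison keeps the left-hand name is a convention of the sketch.

```rust
use std::collections::HashMap;

// Toy stand-ins for polars' DataType, Field and Schema; not the real types.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ToyDType {
    Int64,
    Boolean,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ToyField {
    pub name: String,
    pub dtype: ToyDType,
}

pub type ToySchema = HashMap<String, ToyDType>;

pub enum ToyPlan {
    Column(String),
    Lt(Box<ToyPlan>, Box<ToyPlan>),
}

impl ToyPlan {
    /// Resolve the output name and dtype from the schema alone.
    pub fn to_field(&self, input_schema: &ToySchema) -> Result<ToyField, String> {
        match self {
            ToyPlan::Column(name) => input_schema
                .get(name)
                .map(|&dtype| ToyField { name: name.clone(), dtype })
                .ok_or_else(|| format!("column not found: {}", name)),
            // In this sketch a comparison keeps the left-hand name and yields booleans.
            ToyPlan::Lt(left, right) => {
                right.to_field(input_schema)?; // only checks that the right side resolves
                let left = left.to_field(input_schema)?;
                Ok(ToyField { name: left.name, dtype: ToyDType::Boolean })
            }
        }
    }
}
```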
fn is_scalar(&self) -> bool
Provided Methods
fn as_expression(&self) -> Option<&Expr>
fn evaluate_inline(&self) -> Option<Column>
Attempt to cheaply evaluate this expression in-line without a DataFrame context. This is used by StatsEvaluator when skipping files / row groups using a predicate. TODO: Maybe in the future we can do this evaluation in-line at the optimizer stage?
Do not implement this directly; implement evaluate_inline_impl instead.
fn evaluate_inline_impl(&self, _depth_limit: u8) -> Option<Column>
Implementation of evaluate_inline
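The split between evaluate_inline and evaluate_inline_impl can be pictured as a provided wrapper that seeds a depth budget, plus an overridable method that spends it. The sketch below uses a toy trait (InlineEval) and toy expression types (Literal, ColumnRef, Add) that are invented for illustration. The starting budget of 4 and the way the budget is decremented are assumptions of this sketch, not a statement about the polars implementation.

```rust
// `Value` stands in for polars' `Column` in this toy model.
pub type Value = i64;

pub trait InlineEval {
    /// Entry point used by callers; implementors leave it alone.
    fn evaluate_inline(&self) -> Option<Value> {
        // The starting depth of 4 is an assumption made for this sketch.
        self.evaluate_inline_impl(4)
    }

    /// Implementors override this. The default means "cannot evaluate in-line".
    fn evaluate_inline_impl(&self, _depth_limit: u8) -> Option<Value> {
        None
    }
}

/// A literal is always available without a DataFrame.
pub struct Literal(pub Value);

impl InlineEval for Literal {
    fn evaluate_inline_impl(&self, _depth_limit: u8) -> Option<Value> {
        Some(self.0)
    }
}

/// A column reference needs a DataFrame, so it keeps the default `None`.
pub struct ColumnRef;

impl InlineEval for ColumnRef {}

pub struct Add {
    pub left: Box<dyn InlineEval>,
    pub right: Box<dyn InlineEval>,
}

impl InlineEval for Add {
    fn evaluate_inline_impl(&self, depth_limit: u8) -> Option<Value> {
        // Give up once the budget is exhausted so evaluation stays cheap.
        let depth = depth_limit.checked_sub(1)?;
        let l = self.left.evaluate_inline_impl(depth)?;
        let r = self.right.evaluate_inline_impl(depth)?;
        l.checked_add(r)
    }
}
```

Keeping the budget out of the public entry point means callers cannot accidentally pass an unbounded depth, while each implementation still decides how much of the budget a recursive step costs.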
fn as_partitioned_aggregator(&self) -> Option<&dyn PartitionedAggregation>
Convert to a partitioned aggregator.
fn as_stats_evaluator(&self) -> Option<&dyn StatsEvaluator>
Can take &dyn Statistics and determine whether a file should be read (true) or not (false).
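The read-or-skip decision can be illustrated with min/max statistics. The following is a minimal sketch, not the polars StatsEvaluator API: MinMax and Predicate are hypothetical types, and the predicates are restricted to comparisons of one column against a literal.

```rust
// Toy statistics for one column of a file or row group: its min and max.
#[derive(Debug, Clone, Copy)]
pub struct MinMax {
    pub min: i64,
    pub max: i64,
}

// Toy predicates of the form `column <op> literal`.
pub enum Predicate {
    Gt(i64),
    Lt(i64),
    Eq(i64),
}

impl Predicate {
    /// `true`: some row might match, so the file must be read.
    /// `false`: no row can match, so the file can be skipped.
    pub fn should_read(&self, stats: &MinMax) -> bool {
        match *self {
            Predicate::Gt(v) => stats.max > v,
            Predicate::Lt(v) => stats.min < v,
            Predicate::Eq(v) => stats.min <= v && v <= stats.max,
        }
    }
}
```

The answer is conservative in one direction only: true means "cannot rule the file out", so rows still have to be filtered after reading, whereas false is safe to act on without opening the file.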