Enum polars_plan::dsl::Expr

pub enum Expr {
    Alias(Box<Expr>, Arc<str>),
    Column(Arc<str>),
    Columns(Vec<String>),
    DtypeColumn(Vec<DataType>),
    Literal(LiteralValue),
    BinaryExpr {
        left: Box<Expr>,
        op: Operator,
        right: Box<Expr>,
    },
    Cast {
        expr: Box<Expr>,
        data_type: DataType,
        strict: bool,
    },
    Sort {
        expr: Box<Expr>,
        options: SortOptions,
    },
    Take {
        expr: Box<Expr>,
        idx: Box<Expr>,
    },
    SortBy {
        expr: Box<Expr>,
        by: Vec<Expr>,
        reverse: Vec<bool>,
    },
    Agg(AggExpr),
    Ternary {
        predicate: Box<Expr>,
        truthy: Box<Expr>,
        falsy: Box<Expr>,
    },
    Function {
        input: Vec<Expr>,
        function: FunctionExpr,
        options: FunctionOptions,
    },
    Explode(Box<Expr>),
    Filter {
        input: Box<Expr>,
        by: Box<Expr>,
    },
    Window {
        function: Box<Expr>,
        partition_by: Vec<Expr>,
        order_by: Option<Box<Expr>>,
        options: WindowOptions,
    },
    Wildcard,
    Slice {
        input: Box<Expr>,
        offset: Box<Expr>,
        length: Box<Expr>,
    },
    Exclude(Box<Expr>, Vec<Excluded>),
    KeepName(Box<Expr>),
    Count,
    Nth(i64),
    RenameAlias {
        function: SpecialEq<Arc<dyn RenameAliasFn>>,
        expr: Box<Expr>,
    },
    AnonymousFunction {
        input: Vec<Expr>,
        function: SpecialEq<Arc<dyn SeriesUdf>>,
        output_type: GetOutput,
        options: FunctionOptions,
    },
}

Queries consist of multiple expressions.
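To make the idea of a query as a tree of nested expressions concrete, here is a minimal std-only sketch. It is not the polars implementation: `MiniExpr`, `Op`, `eval`, and `example_tree` are hypothetical names, only three of the variants are modelled, their payloads are simplified, and evaluation happens on a single row rather than on whole columns.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily reduced stand-in for the real enum.
#[derive(Debug, Clone, PartialEq)]
enum MiniExpr {
    Column(String),
    Literal(f64),
    BinaryExpr {
        left: Box<MiniExpr>,
        op: Op,
        right: Box<MiniExpr>,
    },
}

/// Hypothetical operator set; the real `Operator` type has many more members.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Plus,
    Multiply,
}

/// Recursively evaluate an expression tree against one row of named values.
/// Returns `None` when a referenced column is missing.
fn eval(expr: &MiniExpr, row: &HashMap<String, f64>) -> Option<f64> {
    match expr {
        MiniExpr::Column(name) => row.get(name).copied(),
        MiniExpr::Literal(v) => Some(*v),
        MiniExpr::BinaryExpr { left, op, right } => {
            let l = eval(left, row)?;
            let r = eval(right, row)?;
            Some(match op {
                Op::Plus => l + r,
                Op::Multiply => l * r,
            })
        }
    }
}

/// Build the tree for `(a + 2) * b`: boxed children nest inside parents.
fn example_tree() -> MiniExpr {
    let a_plus_2 = MiniExpr::BinaryExpr {
        left: Box::new(MiniExpr::Column("a".to_string())),
        op: Op::Plus,
        right: Box::new(MiniExpr::Literal(2.0)),
    };
    MiniExpr::BinaryExpr {
        left: Box::new(a_plus_2),
        op: Op::Multiply,
        right: Box::new(MiniExpr::Column("b".to_string())),
    }
}
```

With a row where `a = 1.0` and `b = 3.0`, `eval(&example_tree(), &row)` yields `Some(9.0)`. The `Box` indirection is what lets the enum contain itself.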

Variants

Alias(Box<Expr>, Arc<str>)

Column(Arc<str>)

Columns(Vec<String>)

DtypeColumn(Vec<DataType>)

Literal(LiteralValue)

BinaryExpr

Fields

left: Box<Expr>
right: Box<Expr>

Cast

Fields

expr: Box<Expr>
data_type: DataType
strict: bool

Sort

Fields

expr: Box<Expr>
options: SortOptions

Take

Fields

expr: Box<Expr>
idx: Box<Expr>

SortBy

Fields

expr: Box<Expr>
by: Vec<Expr>
reverse: Vec<bool>

Agg(AggExpr)

Ternary

Fields

predicate: Box<Expr>
truthy: Box<Expr>
falsy: Box<Expr>

A ternary operation: if the predicate is true then “foo”, else “bar”.

Function

Fields

input: Vec<Expr>

Function arguments.

function: FunctionExpr

Function to apply.

Explode(Box<Expr>)

Filter

Fields

input: Box<Expr>
by: Box<Expr>

Window

Fields

function: Box<Expr>

Also holds the input, e.g. avg(“foo”).

partition_by: Vec<Expr>
order_by: Option<Box<Expr>>
options: WindowOptions

See Postgres window functions.

Wildcard

Slice

Fields

input: Box<Expr>
offset: Box<Expr>

The length is not yet known, so negative offsets are accepted.

length: Box<Expr>

Exclude(Box<Expr>, Vec<Excluded>)

Can be used in a select statement to exclude a column from selection.

KeepName(Box<Expr>)

Set root name as Alias.

Count

Special case that does not need columns.

Nth(i64)

Take the nth column in the DataFrame.

RenameAlias

Fields

function: SpecialEq<Arc<dyn RenameAliasFn>>
expr: Box<Expr>

AnonymousFunction

Fields

input: Vec<Expr>

Function arguments.

function: SpecialEq<Arc<dyn SeriesUdf>>

Function to apply.

output_type: GetOutput

Output dtype of the function.

Implementations

Available on crate feature dot_diagram only.

Get a dot language representation of the Expression.

Get Field result of the expression. The schema is the input data.

Compare Expr with other Expr on equality

Compare Expr with other Expr on non-equality

Check if Expr < Expr

Check if Expr > Expr

Check if Expr >= Expr

Check if Expr <= Expr

Negate Expr

Rename Column.

Run is_null operation on Expr.

Run is_not_null operation on Expr.

Drop null values

Drop NaN values

Reduce groups to minimal value.

Reduce groups to maximum value.

Reduce groups to minimal value.

Reduce groups to maximum value.

Reduce groups to the mean value.

Reduce groups to the median value.

Reduce groups to the sum of all the values.

Get the number of unique values in the groups.

Get the first value in the group.

Get the last value in the group.

Aggregate the group to a Series

Compute the quantile per group.

Get the group indexes of the group by operation.

Alias for explode

Explode the utf8/list column.

Slice the Series. offset may be negative.
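The following sketch shows one way a negative offset can be resolved once the length is known. It is an illustration under assumptions, not the library's code: `resolve_slice` and `slice` are hypothetical helpers, and the clamping behaviour for out-of-range offsets is an assumption of this sketch.

```rust
/// Resolve a possibly negative `offset` plus a `length` against a known
/// array length, returning the half-open index range `(start, end)` to keep.
fn resolve_slice(offset: i64, length: usize, array_len: usize) -> (usize, usize) {
    let len = array_len as i64;
    // A negative offset counts back from the end; clamp to the valid range.
    let start = if offset < 0 {
        (len + offset).max(0)
    } else {
        offset.min(len)
    };
    let start = start as usize;
    // Never read past the end of the data.
    let end = start.saturating_add(length).min(array_len);
    (start, end)
}

/// Copy out the selected window of values.
fn slice<T: Clone>(values: &[T], offset: i64, length: usize) -> Vec<T> {
    let (start, end) = resolve_slice(offset, length, values.len());
    values[start..end].to_vec()
}
```

For example, `slice(&[1, 2, 3, 4, 5], -2, 2)` keeps the last two values, `[4, 5]`.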

Append expressions. This is done by adding the chunks of other to this Series.

Get the first n elements of the Expr result

Get the last n elements of the Expr result

Get unique values of this expression.

Get unique values of this expression, while maintaining order. This requires more work than Expr::unique.

Get the first index of unique values of this expression.

Get the index value that has the minimum value

Get the index value that has the maximum value

Get the index values that would sort this expression.

Find indices where elements should be inserted to maintain order.
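As a rough illustration of what "insertion index that maintains order" means, the sketch below uses a binary search from the standard library. The helper name `search_sorted_left` is hypothetical, and returning the leftmost valid position for ties is an assumption of this sketch.

```rust
/// Index at which `value` can be inserted into the ascending slice `sorted`
/// so that the slice stays sorted. Ties resolve to the leftmost position.
fn search_sorted_left(sorted: &[i64], value: i64) -> usize {
    // `partition_point` binary-searches for the first element that is
    // not strictly smaller than `value`.
    sorted.partition_point(|&x| x < value)
}
```

With `sorted = [1, 3, 3, 7]`, searching for `3` gives index `1` and searching for `8` gives index `4` (one past the end).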

Cast expression to another data type. Returns an error if the conversion overflows.

Cast expression to another data type.

Take the values by idx.

Sort in increasing order. See the eager implementation.

Sort with given options.

Returns the k largest elements.

This has time complexity O(n + k log(n)).
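One standard way to reach an O(n + k log(n)) bound is to build a max-heap in linear time and pop it k times. The sketch below shows that approach with the standard library; whether the library uses exactly this strategy is not stated here, and `top_k` as a free function over `Vec<i64>` is a hypothetical simplification.

```rust
use std::collections::BinaryHeap;

/// Return the `k` largest values in descending order.
fn top_k(values: Vec<i64>, k: usize) -> Vec<i64> {
    // Building a binary heap from a Vec is O(n).
    let mut heap = BinaryHeap::from(values);
    let mut out = Vec::with_capacity(k.min(heap.len()));
    // Each pop costs O(log n) and at most k pops are performed.
    for _ in 0..k {
        match heap.pop() {
            Some(v) => out.push(v),
            None => break, // fewer than k values were available
        }
    }
    out
}
```

For example, `top_k(vec![5, 1, 9, 3, 7], 3)` returns `[9, 7, 5]`.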

Reverse column

Apply a function/closure once the logical plan gets executed.

This function is very similar to Expr::apply, but differs in how it handles aggregations.

  • map should be used for operations that are independent of groups, e.g. multiply * 2, or raise to the power
  • apply should be used for operations that work on a group of data. e.g. sum, count, etc.

It is the responsibility of the caller to ensure the schema is correct by giving the correct output_type. If None is given, the output type of the input expr is used.

Apply a function/closure with many arguments once the logical plan gets executed.

See the Expr::map function for the differences between map and apply.

Apply a function/closure once the logical plan gets executed.

This function is very similar to apply, but differs in how it handles aggregations.

  • map should be used for operations that are independent of groups, e.g. multiply * 2, or raise to the power
  • apply should be used for operations that work on a group of data. e.g. sum, count, etc.
  • map_list should be used when the function expects a list aggregated series.

A function that cannot be expressed with map or apply and requires extra settings.

Apply a function/closure over the groups. This should only be used in a groupby aggregation.

It is the responsibility of the caller to ensure the schema is correct by giving the correct output_type. If None is given, the output type of the input expr is used.

The difference with map is that apply will create a separate Series per group.

  • map should be used for operations that are independent of groups, e.g. multiply * 2, or raise to the power
  • apply should be used for operations that work on a group of data. e.g. sum, count, etc.
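The distinction can be sketched without the library at all. In the std-only model below, `map_column` and `apply_per_group` are hypothetical helpers: the first hands the whole column to the function once, the second first splits the values by group key and then calls the function once per group. It only models the calling pattern, not the real signatures.

```rust
use std::collections::BTreeMap;

/// `map`-style: the function sees the whole column once, groups are ignored.
fn map_column<F>(values: &[i64], f: F) -> Vec<i64>
where
    F: Fn(&[i64]) -> Vec<i64>,
{
    f(values)
}

/// `apply`-style: values are split by key first, then the function is
/// called once per group with only that group's values.
fn apply_per_group<F>(keys: &[i64], values: &[i64], f: F) -> BTreeMap<i64, i64>
where
    F: Fn(&[i64]) -> i64,
{
    let mut groups: BTreeMap<i64, Vec<i64>> = BTreeMap::new();
    for (k, v) in keys.iter().zip(values) {
        groups.entry(*k).or_default().push(*v);
    }
    groups.into_iter().map(|(k, vs)| (k, f(&vs))).collect()
}
```

Doubling every value is group-independent, so it fits `map_column`; a per-group sum needs the group boundaries, so it fits `apply_per_group`.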

Apply a function/closure over the groups with many arguments. This should only be used in a groupby aggregation.

See the Expr::apply function for the differences between map and apply.

Get mask of finite values if dtype is Float

Get mask of infinite values if dtype is Float

Get mask of NaN values if dtype is Float

Get inverse mask of NaN values if dtype is Float

Shift the values in the array by some period. See the eager implementation.

Shift the values in the array by some period and fill the resulting empty values.

Available on crate feature cum_agg only.

Get an array with the cumulative sum computed at every element

Available on crate feature cum_agg only.

Get an array with the cumulative product computed at every element

Available on crate feature cum_agg only.

Get an array with the cumulative min computed at every element

Available on crate feature cum_agg only.

Get an array with the cumulative max computed at every element

Available on crate feature product only.

Get the product aggregation of an expression

Fill missing value with next non-null.

Fill missing value with previous non-null.
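The two fill directions can be sketched over a slice of `Option` values, with `None` standing in for a missing value. `forward_fill` and `backward_fill` are hypothetical helper names; leaving leading (or trailing) gaps unfilled when no neighbour exists is an assumption of this sketch.

```rust
/// Replace each missing value with the previous non-missing one.
/// Leading gaps stay `None` because nothing precedes them.
fn forward_fill<T: Clone>(values: &[Option<T>]) -> Vec<Option<T>> {
    let mut last: Option<T> = None;
    values
        .iter()
        .map(|v| {
            if v.is_some() {
                last = v.clone();
            }
            last.clone()
        })
        .collect()
}

/// Replace each missing value with the next non-missing one.
/// Implemented as a forward fill over the reversed input.
fn backward_fill<T: Clone>(values: &[Option<T>]) -> Vec<Option<T>> {
    let reversed: Vec<Option<T>> = values.iter().rev().cloned().collect();
    let mut filled = forward_fill(&reversed);
    filled.reverse();
    filled
}
```

For `[None, Some(1), None, None, Some(4), None]`, the forward fill gives `[None, Some(1), Some(1), Some(1), Some(4), Some(4)]` and the backward fill gives `[Some(1), Some(1), Some(4), Some(4), Some(4), None]`.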

Available on crate feature round_series only.

Round underlying floating point array to given decimal numbers.

Available on crate feature round_series only.

Floor underlying floating point array to the highest integers smaller than or equal to the float value.

Available on crate feature round_series only.

Ceil underlying floating point array to the lowest integers greater than or equal to the float value.

Available on crate feature round_series only.

Clip underlying values to a set boundary.

Available on crate feature round_series only.

Clip underlying values to a set boundary.

Available on crate feature round_series only.

Clip underlying values to a set boundary.

Available on crate feature abs only.

Convert all values to their absolute/positive value.

Apply window function over a subgroup. This is similar to a groupby + aggregation + self join. Or similar to window functions in Postgres.

Example
#[macro_use] extern crate polars_core;
use polars_core::prelude::*;
use polars_lazy::prelude::*;

fn example() -> PolarsResult<()> {
    let df = df! {
            "groups" => &[1, 1, 2, 2, 1, 2, 3, 3, 1],
            "values" => &[1, 2, 3, 4, 5, 6, 7, 8, 8]
        }?;

    let out = df
     .lazy()
     .select(&[
         col("groups"),
         sum("values").over([col("groups")]),
     ])
     .collect()?;
    dbg!(&out);
    Ok(())
}

Outputs:

╭────────┬────────╮
│ groups ┆ values │
│ ---    ┆ ---    │
│ i32    ┆ i32    │
╞════════╪════════╡
│ 1      ┆ 16     │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 1      ┆ 16     │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 2      ┆ 13     │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 2      ┆ 13     │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ ...    ┆ ...    │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 1      ┆ 16     │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 2      ┆ 13     │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 3      ┆ 15     │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 3      ┆ 15     │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 1      ┆ 16     │
╰────────┴────────╯
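The "groupby + aggregation + self join" reading of a window function can be checked by hand. The std-only sketch below is not how the library executes it; `sum_over` is a hypothetical helper that aggregates per group and then broadcasts each total back to every row of that group, using the same data as the example above.

```rust
use std::collections::HashMap;

/// Sum `values` per group, then give every row the total of its own group.
fn sum_over(groups: &[i32], values: &[i32]) -> Vec<i32> {
    // Step 1 (groupby + aggregation): one total per distinct group key.
    let mut totals: HashMap<i32, i32> = HashMap::new();
    for (g, v) in groups.iter().zip(values) {
        *totals.entry(*g).or_insert(0) += *v;
    }
    // Step 2 (self join): look each row's key up again, so the output
    // has as many rows as the input.
    groups.iter().map(|g| totals[g]).collect()
}
```

Calling `sum_over(&[1, 1, 2, 2, 1, 2, 3, 3, 1], &[1, 2, 3, 4, 5, 6, 7, 8, 8])` gives `[16, 16, 13, 13, 16, 13, 15, 15, 16]`: group 1 sums to 16, group 2 to 13, and group 3 to 15.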

Replace the null values by a value.

Replace the floating point NaN values by a value.

Count the values of the Series, or get the counts of the group-by operation.

Standard deviation of the values of the Series

Variance of the values of the Series

Get a mask of duplicated values

Get a mask of unique values

and operation

or operation

Raise expression to the power exponent

Compute the sine of the given expression

Compute the cosine of the given expression

Compute the tangent of the given expression

Compute the inverse sine of the given expression

Compute the inverse cosine of the given expression

Compute the inverse tangent of the given expression

Compute the hyperbolic sine of the given expression

Compute the hyperbolic cosine of the given expression

Compute the hyperbolic tangent of the given expression

Compute the inverse hyperbolic sine of the given expression

Compute the inverse hyperbolic cosine of the given expression

Compute the inverse hyperbolic tangent of the given expression

Compute the sign of the given expression

Filter a single column. Should be used in aggregation context. If you want to filter on a DataFrame level, use LazyFrame::filter.

Available on crate feature is_in only.

Check if the values of the left expression are in the lists of the right expr.

Sort this column by the ordering of another column. Can also be used in a groupby context to sort the groups.

Available on crate feature repeat_by only.

Repeat the column n times, where n is determined by the values in by. This yields an Expr of dtype List
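A small model of the described behaviour: each value is repeated as many times as the matching entry in `by`, producing one list per row. `repeat_by` here is a hypothetical free function over slices, with nested `Vec`s standing in for the List dtype.

```rust
/// Repeat `values[i]` exactly `by[i]` times, yielding one list per row.
fn repeat_by<T: Clone>(values: &[T], by: &[usize]) -> Vec<Vec<T>> {
    values
        .iter()
        .zip(by)
        .map(|(v, &n)| vec![v.clone(); n])
        .collect()
}
```

For example, `repeat_by(&["a", "b", "c"], &[1, 0, 2])` gives `[["a"], [], ["c", "c"]]`.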

Available on crate feature is_first only.

Get a mask of the first unique value.

Available on crate feature dot_product only.
Available on crate feature mode only.

Compute the mode(s) of this column. This is the most occurring value.

Keep the original root name

use polars_core::prelude::*;
use polars_lazy::prelude::*;

fn example(df: LazyFrame) -> LazyFrame {
    df.select([
        // even though the alias yields a different column name,
        // `keep_name` will make sure that the original column name is used
        col("*").alias("foo").keep_name(),
    ])
}

Define an alias by mapping a function over the original root column name.

Add a suffix to the root column name.

Add a prefix to the root column name.

Exclude a column from a wildcard/regex selection.

You may also use regexes in the exclude as long as they start with ^ and end with $.

Example
use polars_core::prelude::*;
use polars_lazy::prelude::*;

// Select all columns except foo.
fn example(df: DataFrame) -> LazyFrame {
    df.lazy()
        .select(&[
            col("*").exclude(&["foo"])
        ])
}
Available on crate feature interpolate only.
Available on crate feature rolling_window only.

Apply a rolling min. See ChunkedArray::rolling_min.

Available on crate feature rolling_window only.

Apply a rolling max. See ChunkedArray::rolling_max.

Available on crate feature rolling_window only.

Apply a rolling mean. See ChunkedArray::rolling_mean.

Available on crate feature rolling_window only.

Apply a rolling sum. See ChunkedArray::rolling_sum.

Available on crate feature rolling_window only.

Apply a rolling median. See ChunkedArray::rolling_median.

Available on crate feature rolling_window only.

Apply a rolling quantile. See ChunkedArray::rolling_quantile.

Available on crate feature rolling_window only.

Apply a rolling variance

Available on crate feature rolling_window only.

Apply a rolling std-dev

Available on crate features rolling_window and moment only.

Apply a rolling skew

Available on crate feature rolling_window only.

Apply a custom function over a rolling/moving window of the array. This involves a fair amount of dynamic dispatch, so prefer rolling_min, max, mean, sum over this.

Available on crate feature rolling_window only.

Apply a custom function over a rolling/moving window of the array. Prefer this over rolling_apply in case of floating point numbers as this is faster. This involves a fair amount of dynamic dispatch, so prefer rolling_min, max, mean, sum over this.
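To show what applying a custom function over a moving window amounts to, here is a std-only sketch. `rolling_apply` below is a hypothetical free function, not the method's real signature, and emitting `None` until the window is full is an assumption of this sketch.

```rust
/// Call `f` on every full trailing window of `window_size` values.
/// Positions that do not yet have a full window produce `None`.
fn rolling_apply<F>(values: &[f64], window_size: usize, f: F) -> Vec<Option<f64>>
where
    F: Fn(&[f64]) -> f64,
{
    (0..values.len())
        .map(|i| {
            if window_size == 0 || i + 1 < window_size {
                None
            } else {
                // The window ends at index `i` and spans `window_size` items.
                Some(f(&values[i + 1 - window_size..=i]))
            }
        })
        .collect()
}
```

For example, a rolling sum with a window of 2 over `[1.0, 2.0, 3.0, 4.0]` gives `[None, Some(3.0), Some(5.0), Some(7.0)]`. Because the user function is invoked once per window, a dedicated kernel for a fixed aggregation can avoid that per-window call.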

Available on crate feature rank only.
Available on crate feature diff only.
Available on crate feature pct_change only.
Available on crate feature moment only.

Compute the sample skewness of a data set.

For normally distributed data, the skewness should be about zero. For uni-modal continuous distributions, a skewness value greater than zero means that there is more weight in the right tail of the distribution. The function skewtest can be used to determine if the skewness value is close enough to zero, statistically speaking.

see: https://github.com/scipy/scipy/blob/47bb6febaa10658c72962b9615d5d5aa2513fa3a/scipy/stats/stats.py#L1024
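As a worked illustration, the sketch below computes skewness as the third central moment divided by the second central moment raised to the power 3/2. This is the biased (population) estimator; using it here is an assumption of the sketch, as are the hypothetical function name `skew` and the choice to return `None` for empty or constant input.

```rust
/// Biased sample skewness: m3 / m2^(3/2), where m2 and m3 are the second
/// and third central moments. Returns `None` if the input is empty or has
/// zero variance, since the ratio is undefined in those cases.
fn skew(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let n = values.len() as f64;
    let mean = values.iter().sum::<f64>() / n;
    let m2 = values.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
    let m3 = values.iter().map(|v| (v - mean).powi(3)).sum::<f64>() / n;
    if m2 == 0.0 {
        return None;
    }
    Some(m3 / m2.powf(1.5))
}
```

Symmetric data such as `[1.0, 2.0, 3.0, 4.0, 5.0]` gives a skewness of zero, while `[1.0, 1.0, 1.0, 10.0]`, which has a long right tail, gives a positive value.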

Available on crate feature moment only.

Get the maximal value that can be held by this dtype.

Get the minimal value that can be held by this dtype.

Cumulatively count values from 0 to len.

Check if any boolean value is true

Shrink numeric columns to the minimal required datatype needed to fit the extrema of this Series. This can be used to reduce memory pressure.
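The idea of fitting the extrema can be sketched for signed integers only. `IntType` and `shrink_int_type` are hypothetical names, and the rule for choosing a type (the smallest signed width that holds both the minimum and the maximum) is an assumption of this sketch rather than a description of the library's exact rules.

```rust
/// Hypothetical stand-in for a small set of signed integer dtypes.
#[derive(Debug, Clone, Copy, PartialEq)]
enum IntType {
    I8,
    I16,
    I32,
    I64,
}

/// Pick the narrowest signed integer type that can hold both the smallest
/// and the largest value. An empty input trivially fits the narrowest type.
fn shrink_int_type(values: &[i64]) -> IntType {
    let min = values.iter().copied().min().unwrap_or(0);
    let max = values.iter().copied().max().unwrap_or(0);
    let fits = |lo: i64, hi: i64| lo <= min && max <= hi;
    if fits(i64::from(i8::MIN), i64::from(i8::MAX)) {
        IntType::I8
    } else if fits(i64::from(i16::MIN), i64::from(i16::MAX)) {
        IntType::I16
    } else if fits(i64::from(i32::MIN), i64::from(i32::MAX)) {
        IntType::I32
    } else {
        IntType::I64
    }
}
```

For example, `[1, 100, -5]` fits in `I8`, while `[0, 200]` needs `I16` because 200 exceeds the 8-bit signed maximum of 127.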

Check if all boolean values are true

Available on crate feature dtype-struct only.

Count all unique values and create a struct mapping value to count. Note that it is better to turn multithreading off in the aggregation context.

Available on crate feature unique_counts only.

Returns a count of the unique values in the order of appearance. This method differs from Expr::value_counts in that it does not return the values, only the counts, and it might be faster.
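"Counts in order of appearance" can be modelled as follows. The hypothetical helper `unique_counts` works on string slices and returns only the counts, with one slot per distinct value, ordered by when that value was first seen.

```rust
use std::collections::HashMap;

/// Count occurrences of each distinct value, reporting the counts in the
/// order in which the distinct values first appear. The values themselves
/// are not returned.
fn unique_counts(values: &[&str]) -> Vec<u32> {
    // Maps each distinct value to its position in `counts`.
    let mut slot: HashMap<&str, usize> = HashMap::new();
    let mut counts: Vec<u32> = Vec::new();
    for v in values {
        let idx = *slot.entry(*v).or_insert_with(|| {
            counts.push(0);
            counts.len() - 1
        });
        counts[idx] += 1;
    }
    counts
}
```

For `["a", "b", "a", "c", "a", "b"]` the result is `[3, 2, 1]`: three for `a`, two for `b`, and one for `c`.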

Available on crate feature log only.

Compute the logarithm to a given base

Available on crate feature log only.

Calculate the exponential of all elements in the input array

Available on crate feature log only.

Compute the entropy as -sum(pk * log(pk)), where pk are discrete probabilities.
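The formula can be written out directly. In this sketch the hypothetical function `entropy` assumes the natural logarithm, assumes the probabilities are already normalized to sum to 1, and skips zero probabilities (treating 0 * log(0) as 0); none of these choices is confirmed by the text above.

```rust
/// Shannon entropy -sum(pk * ln(pk)) of a discrete distribution.
/// Zero probabilities contribute nothing to the sum.
fn entropy(pk: &[f64]) -> f64 {
    -pk.iter()
        .filter(|&&p| p > 0.0)
        .map(|&p| p * p.ln())
        .sum::<f64>()
}
```

A fair coin, `[0.5, 0.5]`, has entropy ln(2), and a certain outcome, `[1.0]`, has entropy zero.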

Get the null count of the column/group

Set this Series as sorted so that downstream code can use fast paths for sorted arrays.

Warning

This can lead to incorrect results if this Series is not sorted!! Use with care!

Compute the hash of every element

Expr::mutate().apply(fn())

Trait Implementations

The resulting type after applying the + operator.
Performs the + operation.
Converts this type into a shared reference of the (usually inferred) input type.
Returns a copy of the value.
Performs copy-assignment from source.
Formats the value using the given formatter.
Returns the “default value” for a type.
Deserialize this value from the given Serde deserializer.
Formats the value using the given formatter.
The resulting type after applying the / operator.
Performs the / operation.
Converts to this type from the input type.
Feeds this value into the given Hasher.
Feeds a slice of this type into the given Hasher.
The type of the elements being iterated over.
Which kind of iterator are we turning this into?
Creates an iterator from a value.
The resulting type after applying the * operator.
Performs the * operation.
This method tests for self and other values to be equal, and is used by ==.
This method tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
The resulting type after applying the % operator.
Performs the % operation.
Serialize this value into the given Serde serializer.
The resulting type after applying the - operator.
Performs the - operation.
