Domain-specific language for the Lazy API.
This DSL revolves around the `Expr` type, which represents an abstract operation on a `DataFrame`, such as mapping over a column, filtering, `group_by`, or aggregation.
In general, functions on `LazyFrame`s consume the `LazyFrame` and produce a new `LazyFrame` representing the result of applying the function and the passed expressions to the consumed `LazyFrame`.
At runtime, when `LazyFrame::collect` is called, the expressions that comprise the `LazyFrame`’s logical plan are materialized on the actual underlying `Series`.
For instance, `let expr = col("x").pow(lit(2)).alias("x2");` would produce an expression representing the abstract operation of squaring the column `"x"` and naming the resulting column `"x2"`. To apply this operation to a `LazyFrame`, you’d use `let lazy_df = lazy_df.with_column(expr);`.
(Of course, a column named `"x"` must either exist in the original `DataFrame` or be produced by one of the preceding operations on the `LazyFrame`.)
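To make the build-then-collect flow concrete, here is a minimal std-only sketch. It does not use polars: `ToyExpr` and `ToyLazyFrame` are hypothetical stand-ins invented for illustration, and the real `Expr` and `LazyFrame` types are far richer. The sketch only shows that building an expression touches no data, and that evaluation happens when `collect` is called.

```rust
use std::collections::HashMap;

// Toy stand-in for an expression tree. This is NOT the real polars `Expr`.
#[derive(Debug, Clone)]
enum ToyExpr {
    Col(String),
    Lit(f64),
    Pow(Box<ToyExpr>, Box<ToyExpr>),
    Alias(Box<ToyExpr>, String),
}

// Free functions that build an expression from scratch.
fn col(name: &str) -> ToyExpr {
    ToyExpr::Col(name.to_string())
}

fn lit(value: f64) -> ToyExpr {
    ToyExpr::Lit(value)
}

impl ToyExpr {
    // Methods consume `self` and return a new, larger expression.
    fn pow(self, exponent: ToyExpr) -> ToyExpr {
        ToyExpr::Pow(Box::new(self), Box::new(exponent))
    }

    fn alias(self, name: &str) -> ToyExpr {
        ToyExpr::Alias(Box::new(self), name.to_string())
    }

    // Name of the column this expression writes to (a simplification).
    fn output_name(&self) -> String {
        match self {
            ToyExpr::Col(name) | ToyExpr::Alias(_, name) => name.clone(),
            ToyExpr::Lit(_) => "literal".to_string(),
            ToyExpr::Pow(base, _) => base.output_name(),
        }
    }

    // Evaluation only happens here, against real data.
    fn eval(&self, data: &HashMap<String, Vec<f64>>, len: usize) -> Result<Vec<f64>, String> {
        match self {
            ToyExpr::Col(name) => data
                .get(name)
                .cloned()
                .ok_or_else(|| format!("column not found: {}", name)),
            ToyExpr::Lit(value) => Ok(vec![*value; len]),
            ToyExpr::Pow(base, exponent) => {
                let base = base.eval(data, len)?;
                let exponent = exponent.eval(data, len)?;
                Ok(base.iter().zip(exponent.iter()).map(|(b, e)| b.powf(*e)).collect())
            }
            ToyExpr::Alias(inner, _) => inner.eval(data, len),
        }
    }
}

// Toy lazy frame: holds data plus a list of pending expressions (the "plan").
struct ToyLazyFrame {
    data: HashMap<String, Vec<f64>>,
    plan: Vec<ToyExpr>,
}

impl ToyLazyFrame {
    fn new(data: HashMap<String, Vec<f64>>) -> Self {
        ToyLazyFrame { data, plan: Vec::new() }
    }

    // Consumes the frame and returns a new one with the expression recorded.
    fn with_column(mut self, expr: ToyExpr) -> Self {
        self.plan.push(expr);
        self
    }

    // Runs every recorded expression in order; errors surface only here.
    fn collect(self) -> Result<HashMap<String, Vec<f64>>, String> {
        let ToyLazyFrame { mut data, plan } = self;
        let len = data.values().next().map_or(0, |column| column.len());
        for expr in &plan {
            let values = expr.eval(&data, len)?;
            data.insert(expr.output_name(), values);
        }
        Ok(data)
    }
}

// Squares column "x" and returns the new column "x2".
fn demo() -> Result<Vec<f64>, String> {
    let mut data = HashMap::new();
    data.insert("x".to_string(), vec![1.0, 2.0, 3.0]);
    let expr = col("x").pow(lit(2.0)).alias("x2");
    let lazy_df = ToyLazyFrame::new(data).with_column(expr);
    let result = lazy_df.collect()?;
    Ok(result["x2"].clone())
}

fn main() {
    println!("{:?}", demo());
}
```

In this toy model, referencing a column that does not exist is accepted by `with_column` and is reported only by `collect`, which mirrors the caveat in the parenthetical above.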
This module exports many free functions that produce an `Expr` from scratch; `col` and `lit` are two examples.
Expressions also have several methods, such as `pow` and `alias`, that consume them and produce a new expression.
Several expressions are only available when the necessary feature is enabled.
Examples of features that unlock specialized expressions include `strings`, `temporal`, and `dtype-categorical`.
These specialized expressions provide implementations of functions that you’d otherwise have to implement by hand.
Because of how abstract and flexible the `Expr` type is, care must be taken to ensure you only attempt to perform sensible operations with it.
For instance, as mentioned above, you have to make sure any columns you reference already exist in the `LazyFrame`.
Furthermore, there is nothing stopping you from calling, for example, `any` with an expression that will yield an `f64` column (instead of `bool`), or from writing `col("string") - col("f64")`, which would attempt to subtract an `f64` `Series` from a `string` `Series`.
These kinds of invalid operations will only yield an error at runtime, when `collect` is called on the `LazyFrame`.
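The deferred-error behaviour can be illustrated with another std-only sketch. As before, `ToyExpr`, `ToyColumn`, `eval`, and `sample_frame` are hypothetical names made up for this example and are not part of polars. The point is only that constructing an ill-typed expression succeeds, and the mismatch is detected when the expression is evaluated against actual columns.

```rust
use std::collections::HashMap;

// Toy column with a runtime type tag; not a polars `Series`.
#[derive(Debug, Clone, PartialEq)]
enum ToyColumn {
    Str(Vec<String>),
    F64(Vec<f64>),
}

// Toy expression tree; it carries no type information at build time.
#[derive(Debug, Clone)]
enum ToyExpr {
    Col(String),
    Sub(Box<ToyExpr>, Box<ToyExpr>),
}

fn col(name: &str) -> ToyExpr {
    ToyExpr::Col(name.to_string())
}

// `a - b` only records the operation; nothing is checked yet.
impl std::ops::Sub for ToyExpr {
    type Output = ToyExpr;
    fn sub(self, rhs: ToyExpr) -> ToyExpr {
        ToyExpr::Sub(Box::new(self), Box::new(rhs))
    }
}

// Evaluation is where missing columns and type mismatches are detected.
fn eval(expr: &ToyExpr, frame: &HashMap<String, ToyColumn>) -> Result<ToyColumn, String> {
    match expr {
        ToyExpr::Col(name) => frame
            .get(name)
            .cloned()
            .ok_or_else(|| format!("column not found: {}", name)),
        ToyExpr::Sub(left, right) => match (eval(left, frame)?, eval(right, frame)?) {
            (ToyColumn::F64(a), ToyColumn::F64(b)) => Ok(ToyColumn::F64(
                a.iter().zip(b.iter()).map(|(x, y)| x - y).collect(),
            )),
            _ => Err("cannot subtract: both operands must be f64 columns".to_string()),
        },
    }
}

// Small frame with one string column and one f64 column.
fn sample_frame() -> HashMap<String, ToyColumn> {
    let mut frame = HashMap::new();
    frame.insert(
        "string".to_string(),
        ToyColumn::Str(vec!["a".to_string(), "b".to_string()]),
    );
    frame.insert("f64".to_string(), ToyColumn::F64(vec![1.5, 2.5]));
    frame
}

fn main() {
    let frame = sample_frame();

    // Building the ill-typed expression compiles and runs without complaint.
    let bad = col("string") - col("f64");
    // The error appears only when the expression is evaluated.
    println!("{:?}", eval(&bad, &frame));

    let good = col("f64") - col("f64");
    println!("{:?}", eval(&good, &frame));
}
```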
Re-exports
pub use functions::*;
Modules
Structs
- Specialized expressions for `Series` of `DataType::Array`.
- Specialized expressions for Categorical dtypes.
- Utility struct for the `when-then-otherwise` expression.
- Utility struct for the `when-then-otherwise` expression.
- Arguments used by `datetime` in order to produce an `Expr` of Datetime.
- Specialized expressions for modifying the name of existing expressions.
- Specialized expressions for `Series` of `DataType::List`.
- Specialized expressions for Categorical dtypes.
- Wrapper type that has special equality properties depending on the inner type specialization.
- Specialized expressions for Struct dtypes.
- Utility struct for the `when-then-otherwise` expression.
- Represents a user-defined function.
- Utility struct for the `when-then-otherwise` expression.
Enums
- Expressions that can be used in various contexts.
Traits
- A wrapper trait for any binary closure `Fn(Column, Column) -> PolarsResult<Column>`.
- A wrapper trait for any closure `Fn(Vec<Series>) -> PolarsResult<Series>`.
- `ExprEvalExtension` (feature `cumulative_eval` or `list_eval`)
- `IntoListNameSpace` (feature `list_eval`)
- `ListNameSpaceExtension` (feature `list_eval`)
Functions
- Selects all columns. Shorthand for `col("*")`.
- Create a new column with the bitwise-and of the elements in each row.
- Create a new column with the bitwise-or of the elements in each row.
- Like `map_binary`, but used in a group_by-aggregation context.
- Apply a function/closure over the groups of multiple columns. This should only be used in a group_by aggregation.
- Generate a range of integers.
- `arg_sort_by` (feature `range`): Find the indexes that would sort these series in order of appearance.
- `arg_where` (feature `arg_where`): Get the indices where `condition` evaluates `true`.
- Take several expressions and collect them into a `StructChunked`.
- Find the mean of all the values in the column named `name`. Alias for `mean`.
- Compute `op(l, r)` (or equivalently `l op r`). `l` and `r` must have types compatible with the Operator.
- Casts the column given by `Expr` to a different type.
- Folds the expressions from left to right keeping the first non-null values.
- Create a Column Expression based on a column name.
- Select multiple columns by name.
- Horizontally concatenate columns into a single array-type column.
- Concatenate list entries.
- `concat_str` (features `concat_str` and `strings`): Horizontally concatenate string columns in linear time.
- Compute the covariance between two columns.
- `cum_fold_exprs` (feature `dtype-struct`): Accumulate over multiple columns horizontally / row wise.
- `cum_reduce_exprs` (feature `dtype-struct`): Accumulate over multiple columns horizontally / row wise.
- `date_ranges` (feature `temporal`): Create a column of date ranges from a `start` and `stop` expression.
- Construct a column of `Datetime` from the provided `DatetimeArgs`.
- `datetime_range` (feature `dtype-datetime`): Create a datetime range from a `start` and `stop` expression.
- `datetime_ranges` (feature `dtype-datetime`): Create a column of datetime ranges from a `start` and `stop` expression.
- Select multiple columns by dtype.
- Select multiple columns by dtype.
- Construct a column of `Duration` from the provided `DurationArgs`.
- First column in a DataFrame.
- Accumulate over multiple columns horizontally / row wise.
- `format_str` (features `concat_str` and `strings`): Format the results of an array of expressions using a format string.
- Select multiple columns by index.
- Generate a range of integers.
- Generate a range of integers for each row of the input columns.
- A column which is `false` wherever `expr` is null, `true` elsewhere.
- A column which is `true` wherever `expr` is null, `false` elsewhere.
- Last column in a DataFrame.
- Return the number of rows in the context.
- Create a Literal Expression from `L`. A literal expression behaves like a column that contains a single distinct value.
- Apply a function/closure over multiple columns once the logical plan gets executed.
- Apply a function/closure over multiple columns once the logical plan gets executed.
- Find the maximum of all the values in the column named `name`. Shorthand for `col(name).max()`.
- Create a new column with the maximum value per row.
- Find the mean of all the values in the column named `name`. Shorthand for `col(name).mean()`.
- Compute the mean of all values horizontally across columns.
- Find the median of all the values in the column named `name`. Shorthand for `col(name).median()`.
- Find the minimum of all the values in the column named `name`. Shorthand for `col(name).min()`.
- Create a new column with the minimum value per row.
- Negates a boolean column.
- Nth column in a DataFrame.
- Compute the Pearson correlation between two columns.
- Find a specific quantile of all the values in the column named `name`.
- Analogous to `Iterator::reduce`.
- Create a column of length `n` containing `n` copies of the literal `value`.
- `rolling_corr` (features `rolling_window` and `cov`)
- `rolling_cov` (features `rolling_window` and `cov`)
- `spearman_rank_corr` (features `rank` and `propagate_nans`): Compute the Spearman rank correlation between two columns. Missing data will be excluded from the computation.
- Sum all the values in the column named `name`. Shorthand for `col(name).sum()`.
- Sum all values horizontally across columns.
- `time_ranges` (feature `dtype-time`): Create a column of time ranges from a `start` and `stop` expression.
- Start a `when-then-otherwise` expression.
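Several entries in the list above operate horizontally, that is, across columns within each row. As a rough illustration of the entry described as "Folds the expressions from left to right keeping the first non-null values", the sketch below applies that rule to plain vectors, with `None` standing in for null. The function name `coalesce_rows` is hypothetical and the code does not call polars; it also assumes all columns have the same length.

```rust
// For each row, return the first non-null value when scanning the columns
// from left to right; the row stays null if every column is null there.
// Assumption: all columns have the same length as the first one.
fn coalesce_rows(columns: &[Vec<Option<i64>>]) -> Vec<Option<i64>> {
    let len = columns.first().map_or(0, |column| column.len());
    (0..len)
        .map(|row| columns.iter().find_map(|column| column[row]))
        .collect()
}

fn main() {
    let a = vec![None, Some(2), None];
    let b = vec![Some(10), Some(20), None];
    // Row 0 falls back to `b`, row 1 keeps `a`, row 2 stays null.
    println!("{:?}", coalesce_rows(&[a, b]));
}
```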
Type Aliases
- `FieldsNameMapper` (feature `dtype-struct`)