Domain-specific language for the Lazy API.
This DSL revolves around the Expr type, which represents an abstract operation on a DataFrame, such as mapping over a column, filtering, group_by, or aggregation.
In general, functions on LazyFrames consume the LazyFrame and produce a new LazyFrame representing the result of applying the function and passed expressions to the consumed LazyFrame.
At runtime, when LazyFrame::collect is called, the expressions that comprise the LazyFrame’s logical plan are materialized on the actual underlying Series.
For instance, let expr = col("x").pow(lit(2)).alias("x2"); would produce an expression representing the abstract operation of squaring the column "x" and naming the resulting column "x2". To apply this operation to a LazyFrame, you’d use: let lazy_df = lazy_df.with_column(expr);
(Of course, a column named "x" must either exist in the original DataFrame or be produced by one of the preceding operations on the LazyFrame.)
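The consume-and-return pattern and the deferred materialization described above can be illustrated with a toy model. This is a minimal sketch using only the standard library: the Expr enum, the LazyFrame struct, and the squared_example function below are hypothetical stand-ins written for this illustration, not the Polars types or their implementation. The sketch only mirrors the idea that with_column records an expression and that nothing is computed until collect runs.

```rust
use std::collections::HashMap;

// Toy expression tree (hypothetical; NOT the Polars `Expr` type).
#[derive(Debug, Clone)]
enum Expr {
    Col(String),
    Lit(f64),
    Pow(Box<Expr>, Box<Expr>),
    Alias(Box<Expr>, String),
}

// Free functions that build an expression from scratch.
fn col(name: &str) -> Expr {
    Expr::Col(name.to_string())
}
fn lit(value: f64) -> Expr {
    Expr::Lit(value)
}

impl Expr {
    // Methods consume `self` and return a new, larger expression.
    fn pow(self, exponent: Expr) -> Expr {
        Expr::Pow(Box::new(self), Box::new(exponent))
    }
    fn alias(self, name: &str) -> Expr {
        Expr::Alias(Box::new(self), name.to_string())
    }

    // Name of the column this expression would produce.
    fn output_name(&self) -> String {
        match self {
            Expr::Col(name) | Expr::Alias(_, name) => name.clone(),
            Expr::Lit(_) => "literal".to_string(),
            Expr::Pow(base, _) => base.output_name(),
        }
    }

    // Only here is any data touched.
    fn eval(&self, cols: &HashMap<String, Vec<f64>>, len: usize) -> Result<Vec<f64>, String> {
        match self {
            Expr::Col(name) => cols
                .get(name)
                .cloned()
                .ok_or_else(|| format!("column not found: {}", name)),
            Expr::Lit(value) => Ok(vec![*value; len]),
            Expr::Pow(base, exponent) => {
                let (b, e) = (base.eval(cols, len)?, exponent.eval(cols, len)?);
                Ok(b.iter().zip(&e).map(|(x, y)| x.powf(*y)).collect())
            }
            Expr::Alias(inner, _) => inner.eval(cols, len),
        }
    }
}

// Toy lazy frame (hypothetical): the data plus a list of pending expressions.
struct LazyFrame {
    columns: HashMap<String, Vec<f64>>,
    len: usize,
    plan: Vec<Expr>,
}

impl LazyFrame {
    // Consumes the frame and returns a new one with the expression recorded.
    fn with_column(mut self, expr: Expr) -> LazyFrame {
        self.plan.push(expr);
        self
    }

    // Materializes the recorded expressions, in order, on the stored columns.
    fn collect(mut self) -> Result<HashMap<String, Vec<f64>>, String> {
        for expr in &self.plan {
            let values = expr.eval(&self.columns, self.len)?;
            self.columns.insert(expr.output_name(), values);
        }
        Ok(self.columns)
    }
}

// Builds the squaring expression from the text and applies it to a toy frame.
fn squared_example() -> Result<HashMap<String, Vec<f64>>, String> {
    let mut columns = HashMap::new();
    columns.insert("x".to_string(), vec![1.0, 2.0, 3.0]);
    let lazy_df = LazyFrame { columns, len: 3, plan: Vec::new() };

    let expr = col("x").pow(lit(2.0)).alias("x2");
    let lazy_df = lazy_df.with_column(expr); // nothing is computed yet
    lazy_df.collect() // the squares are computed here
}
```

In this toy model, calling squared_example() returns a map that holds both the original "x" column and a new "x2" column with the squared values. Referencing a column that does not exist only fails inside collect, which matches the parenthetical note above.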
This module exports many free functions that produce an Expr from scratch; col and lit are two examples.
Expressions also have several methods, such as pow and alias, that consume them and produce a new expression.
Several expressions are only available when the necessary feature is enabled.
Examples of features that unlock specialized expressions include strings, temporal, and dtype-categorical.
These specialized expressions provide implementations of functions that you’d otherwise have to implement by hand.
Because of how abstract and flexible the Expr type is, care must be taken to ensure you only attempt to perform sensible operations with it.
For instance, as mentioned above, you have to make sure any columns you reference already exist in the LazyFrame.
Furthermore, there is nothing stopping you from calling, for example, any with an expression that will yield an f64 column (instead of bool), or from writing col("string") - col("f64"), which would attempt to subtract an f64 Series from a string Series.
These kinds of invalid operations will only yield an error at runtime, when collect is called on the LazyFrame.
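To make the "error only at runtime" point concrete, here is a small standard-library-only sketch. The Column and Expr enums and the evaluate and sample_frame functions are hypothetical names invented for this illustration and are not part of Polars. The sketch assumes a simplified world with just two dtypes, and shows that building an ill-typed expression always succeeds while evaluating it against data returns an error.

```rust
use std::collections::HashMap;

// Toy column with two possible dtypes (hypothetical stand-in for a Series).
#[derive(Debug, Clone, PartialEq)]
enum Column {
    F64(Vec<f64>),
    Str(Vec<String>),
}

impl Column {
    fn dtype(&self) -> &'static str {
        match self {
            Column::F64(_) => "f64",
            Column::Str(_) => "string",
        }
    }
}

// Toy expression tree. Building one never looks at any data or schema.
#[derive(Debug, Clone)]
enum Expr {
    Col(String),
    Sub(Box<Expr>, Box<Expr>),
}

fn col(name: &str) -> Expr {
    Expr::Col(name.to_string())
}

// `a - b` on expressions just records the operation; it cannot fail.
impl std::ops::Sub for Expr {
    type Output = Expr;
    fn sub(self, rhs: Expr) -> Expr {
        Expr::Sub(Box::new(self), Box::new(rhs))
    }
}

// Evaluation is the first point at which missing columns or mismatched
// dtypes can be detected, so it returns a Result.
fn evaluate(expr: &Expr, frame: &HashMap<String, Column>) -> Result<Column, String> {
    match expr {
        Expr::Col(name) => frame
            .get(name)
            .cloned()
            .ok_or_else(|| format!("column not found: {}", name)),
        Expr::Sub(lhs, rhs) => match (evaluate(lhs, frame)?, evaluate(rhs, frame)?) {
            (Column::F64(a), Column::F64(b)) => {
                Ok(Column::F64(a.iter().zip(&b).map(|(x, y)| x - y).collect()))
            }
            (l, r) => Err(format!("cannot subtract {} from {}", r.dtype(), l.dtype())),
        },
    }
}

// A two-column frame with one string column and one f64 column.
fn sample_frame() -> HashMap<String, Column> {
    let mut frame = HashMap::new();
    frame.insert(
        "string".to_string(),
        Column::Str(vec!["a".to_string(), "b".to_string()]),
    );
    frame.insert("f64".to_string(), Column::F64(vec![1.5, 2.5]));
    frame
}
```

With this model, col("string") - col("f64") compiles and constructs a value without complaint, and only evaluate(&expr, &sample_frame()) reports the dtype mismatch. The design choice being illustrated is that an expression is a description of work rather than the work itself, so validation that needs the data can only happen once the data is available.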
Re-exports
pub use functions::*;
Modules
Structs
- ArrayNameSpace - Specialized expressions for Series of DataType::Array.
- CategoricalNameSpace - Specialized expressions for Categorical dtypes.
- ChainedThen - Utility struct for the when-then-otherwise expression.
- ChainedWhen - Utility struct for the when-then-otherwise expression.
- DatetimeArgs - Arguments used by datetime in order to produce an Expr of Datetime.
- DurationArgs - Arguments used by duration in order to produce an Expr of Duration.
- ExprNameNameSpace - Specialized expressions for modifying the name of existing expressions.
- FieldsMapper
- JoinOptions
- ListNameSpace - Specialized expressions for Series of DataType::List.
- MetaNameSpace - Specialized expressions for Categorical dtypes.
- RollingCovOptions
- SpecialEq - Wrapper type that has special equality properties depending on the inner type specialization.
- StrptimeOptions
- StructNameSpace - Specialized expressions for Struct dtypes.
- Then - Utility struct for the when-then-otherwise expression.
- UnpivotArgsDSL
- UserDefinedFunction - Represents a user-defined function.
- When - Utility struct for the when-then-otherwise expression.
Enums
- AggExpr
- BooleanFunction
- CategoricalFunction
- Excluded
- Expr - Expressions that can be used in various contexts.
- FunctionExpr
- JoinTypeOptionsIR
- LazySerde
- NestedType
- Operator
- PowFunction
- Selector
- StringFunction
- StructFunction
- TemporalFunction
- WindowMapping
- WindowType
Traits
- BinaryUdfOutputField
- ColumnBinaryUdf - A wrapper trait for any binary closure Fn(Column, Column) -> PolarsResult<Column>.
- ColumnsUdf - A wrapper trait for any closure Fn(Vec<Series>) -> PolarsResult<Series>.
- ExprEvalExtension (features: cumulative_eval or list_eval)
- FunctionOutputField
- IntoListNameSpace (feature: list_eval)
- ListNameSpaceExtension (feature: list_eval)
- RenameAliasFn
- UdfSchema
Functions
- all - Selects all columns. Shorthand for col("*").
- all_horizontal - Create a new column with the bitwise-and of the elements in each row.
- any_horizontal - Create a new column with the bitwise-or of the elements in each row.
- apply_binary - Like map_binary, but used in a group_by-aggregation context.
- apply_multiple - Apply a function/closure over the groups of multiple columns. This should only be used in a group_by aggregation.
- arange - Generate a range of integers.
- arg_sort_by (feature: range) - Find the indexes that would sort these series in order of appearance.
- arg_where (feature: arg_where) - Get the indices where condition evaluates true.
- as_struct - Take several expressions and collect them into a StructChunked.
- avg - Find the mean of all the values in the column named name. Alias for mean.
- binary_expr - Compute op(l, r) (or equivalently l op r). l and r must have types compatible with the Operator.
- cast - Casts the column given by Expr to a different type.
- coalesce - Folds the expressions from left to right keeping the first non-null values.
- col - Create a Column Expression based on a column name.
- cols - Select multiple columns by name.
- concat_arr - Horizontally concatenate columns into a single array-type column.
- concat_expr
- concat_list - Concat lists entries.
- concat_str (features: concat_str and strings) - Horizontally concat string columns in linear time.
- cov - Compute the covariance between two columns.
- cum_fold_exprs (feature: dtype-struct) - Accumulate over multiple columns horizontally / row wise.
- cum_reduce_exprs (feature: dtype-struct) - Accumulate over multiple columns horizontally / row wise.
- date_ranges (feature: temporal) - Create a column of date ranges from a start and stop expression.
- datetime - Construct a column of Datetime from the provided DatetimeArgs.
- datetime_range (feature: dtype-datetime) - Create a datetime range from a start and stop expression.
- datetime_ranges (feature: dtype-datetime) - Create a column of datetime ranges from a start and stop expression.
- dtype_col - Select multiple columns by dtype.
- dtype_cols - Select multiple columns by dtype.
- duration - Construct a column of Duration from the provided DurationArgs.
- first - First column in a DataFrame.
- fold_exprs - Accumulate over multiple columns horizontally / row wise.
- format_str (features: concat_str and strings) - Format the results of an array of expressions using a format string.
- index_cols - Select multiple columns by index.
- int_range - Generate a range of integers.
- int_ranges - Generate a range of integers for each row of the input columns.
- is_not_null - A column which is false wherever expr is null, true elsewhere.
- is_null - A column which is true wherever expr is null, false elsewhere.
- last - Last column in a DataFrame.
- len - Return the number of rows in the context.
- linear_space - Generate a series of equally-spaced points.
- lit - Create a Literal Expression from L. A literal expression behaves like a column that contains a single distinct value.
- map_binary - Apply a closure on the two columns that are evaluated from Expr a and Expr b.
- map_list_multiple - Apply a function/closure over multiple columns once the logical plan gets executed.
- map_multiple - Apply a function/closure over multiple columns once the logical plan gets executed.
- max - Find the maximum of all the values in the column named name. Shorthand for col(name).max().
- max_horizontal - Create a new column with the maximum value per row.
- mean - Find the mean of all the values in the column named name. Shorthand for col(name).mean().
- mean_horizontal - Compute the mean of all values horizontally across columns.
- median - Find the median of all the values in the column named name. Shorthand for col(name).median().
- min - Find the minimum of all the values in the column named name. Shorthand for col(name).min().
- min_horizontal - Create a new column with the minimum value per row.
- not - Negates a boolean column.
- nth - Nth column in a DataFrame.
- pearson_corr - Compute the Pearson correlation between two columns.
- quantile - Find a specific quantile of all the values in the column named name.
- reduce_exprs - Analogous to Iterator::reduce.
- repeat - Create a column of length n containing n copies of the literal value.
- rolling_corr (features: rolling_window and cov)
- rolling_cov (features: rolling_window and cov)
- spearman_rank_corr (features: rank and propagate_nans) - Compute the Spearman rank correlation between two columns. Missing data will be excluded from the computation.
- sum - Sum all the values in the column named name. Shorthand for col(name).sum().
- sum_horizontal - Sum all values horizontally across columns.
- ternary_expr
- time_ranges (feature: dtype-time) - Create a column of time ranges from a start and stop expression.
- when - Start a when-then-otherwise expression.
Type Aliases
- FieldsNameMapper (feature: dtype-struct)
- GetOutput
- OpaqueColumnUdf