Module prelude

pub use crate::conversion::*;
_csv_read_internal _internal aggregations arity array binary buffer byte_source cat chunkedarray Traits and utilities for temporal data. cloud Interface with cloud storage through the object_store crate. compression concat_arr cov datatypes Data types supported by Polars. datetime default_arrays dt expr file fill_null fixed_size_list float_sorted_arg_max full function_expr gather interpolate interpolate_by mode nan_propagating_aggregate null peaks pivot Module containing implementation of the pivot operation. prelude python_udf row_encode schema_inference search_sorted sort strings udf utf8 zip df polars_bail polars_ensure polars_err polars_warn AnonymousScanArgs AnonymousScanOptions Arc A thread-safe reference-counting pointer. ‘Arc’ stands for ‘Atomically
Reference Counted’. ArrayNameSpace Specialized expressions for Series of DataType::Array. ArrowField Represents Arrow’s metadata of a “column”. AsOfOptions BatchedCsvReader BatchedParquetReader BinaryOffsetType BinaryType BooleanChunkedBuilder BooleanType Bounds BoundsIter BrotliLevel A valid Brotli compression level. CatIter CategoricalChunked CategoricalChunkedBuilder CategoricalNameSpace Specialized expressions for Categorical dtypes. CategoricalType ChainedThen Utility struct for the when-then-otherwise expression. ChainedWhen Utility struct for the when-then-otherwise
expression. ChunkId ChunkedArray ChunkedArray CompatLevel CrossJoinOptions CsvParseOptions CsvReadOptions CsvReader Create a new DataFrame by reading a csv file. CsvWriter Write a DataFrame to csv. CsvWriterOptions Options for writing CSV files. DataFrame A contiguous growable collection of Series
that have the same length. DateType DatetimeArgs Arguments used by datetime in order to produce an Expr of Datetime. DatetimeType DecimalType Dimension Duration DurationArgs Arguments used by duration in order to produce an Expr of Duration.
DurationType DynamicGroupOptions EWMOptions ExprNameNameSpace Specialized expressions for modifying the name of existing expressions. FalseT Field Characterizes the name and the DataType
of a column. FieldsMapper FileMetadata Metadata for a Parquet file. FixedSizeListType Float32Type Float64Type GlobalRevMapMerger GroupBy Returned by a group_by operation on a DataFrame. This struct supports
several aggregations. GroupPositions GroupsIdx Indexes of the groups; the first index is stored separately, which makes sorting fast. GroupsTypeIter GroupsTypeParIter GzipLevel A valid Gzip compression level. InProcessQuery Int8Type Int16Type Int32Type Int64Type Int128Type IpcReadOptions IpcReader Read Arrow’s IPC format into a DataFrame. IpcReaderAsync An Arrow IPC reader implemented on top of PolarsObjectStore. IpcScanOptions IpcStreamReader Read Arrow’s Stream IPC format into a DataFrame. IpcStreamWriter Write a DataFrame to Arrow’s Streaming IPC format. IpcStreamWriterOption IpcWriter Write a DataFrame to Arrow’s IPC format. IpcWriterOptions JoinArgs JoinBuilder JoinOptions JsonLineReader JsonReader Reads JSON in one of the formats in JsonFormat
into a DataFrame. JsonWriter Writes a DataFrame to JSON. JsonWriterOptions LazyCsvReader LazyFrame Lazy abstraction over an eager DataFrame
. LazyGroupBy Utility struct for lazy group_by operation. LazyJsonLineReader ListBinaryChunkedBuilder ListBooleanChunkedBuilder ListNameSpace Specialized expressions for Series
of DataType::List. ListPrimitiveChunkedBuilder ListStringChunkedBuilder ListType Logical Maps a logical type to a chunked array implementation of the physical type.
This saves a lot of compiler bloat and allows us to reuse functionality. MetaNameSpace Specialized expressions for Categorical dtypes. NameGenerator NoNull Just a wrapper structure which is useful for certain impl specializations. Null The literal Null NullableIdxSize ObjectType OptFlags Allowed optimizations. OwnedBatchedCsvReader OwnedObject ParquetAsyncReader A Parquet reader on top of the async object_store API. Only the batch reader is implemented since
parquet files on cloud storage tend to be big and slow to access. ParquetOptions ParquetReader Read Apache parquet format into a DataFrame. ParquetWriteOptions ParquetWriter Write a DataFrame to Parquet format. PlSmallStr String type that inlines small strings. PrimitiveChunkedBuilder RankOptions RollingCovOptions RollingGroupOptions RollingOptionsDynamicWindow RollingOptionsFixedWindow RollingQuantileParams RollingVarParams Scalar ScanArgsAnonymous ScanArgsIpc ScanArgsParquet SerializeOptions Options to serialize logical types to CSV. Series Series SortMultipleOptions Sort options for multi-series sorting. SortOptions Options for single series sorting. SpecialEq Wrapper type that has special equality properties
depending on the inner type specialization SplitNChars StatisticsOptions The statistics to write StringCacheHolder Enable the global string cache as long as the object is alive (RAII ). StringType StrptimeOptions StructArray A StructArray
is a nested Array with an optional validity representing multiple Arrays with the same number of rows. StructNameSpace Specialized expressions for Struct dtypes. StructType Then Utility struct for the when-then-otherwise
expression. TimeType TrueT UInt8Type UInt16Type UInt32Type UInt64Type UnionArgs UnpivotArgsDSL UnpivotArgsIR Arguments for LazyFrame::unpivot
function UserDefinedFunction Represents a user-defined function When Utility struct for the when-then-otherwise
expression. Window Represents a window in time ZstdLevel A valid Zstandard compression level. AggExpr Ambiguous AnyValue ArrowDataType The set of supported logical types in this crate. ArrowTimeUnit The time units defined in Arrow. AsofStrategy BitwiseFunction BooleanFunction CategoricalFunction CategoricalOrdering ClosedInterval ClosedWindow Column A column within a DataFrame
. CommentPrefix CsvEncoding DataType DslPlan Excluded Expr Expressions that can be used in various contexts. FillNullStrategy FunctionExpr GroupByMethod GroupsIndicator GroupsType IndexOrder InterpolationMethod IpcCompression Compression codec JoinCoalesce JoinType JoinTypeOptions JoinTypeOptionsIR JoinValidation JsonFormat The format to use to write the DataFrame to JSON: Json
(a JSON array) or JsonLines
(each row output on a separate line). Label LazySerde ListToStructArgs ListToStructWidthStrategy LiteralValue MaintainOrderJoin NestedType NonExistent NullStrategy NullValues Operator ParallelStrategy ParquetCompression The compression strategy to use for writing Parquet files. ParquetStatistics Parquet statistics for a nesting level PolarsError PowFunction QuantileMethod QuoteStyle Quote style indicating when to insert quotes around a field. RankMethod ReshapeDimension A dimension in a reshape. RevMapping Roll RollingFnParams SearchSortedSide Selector SetOperation StartBy StringFunction StructFunction TemporalFunction TimeUnit UnicodeForm UniqueKeepStrategy UnknownKind WindowMapping WindowType IDX_DTYPE NULL URL_ENCODE_CHAR_SET BOOLEAN_RE EXTENSION_NAME FLOAT_RE FLOAT_RE_DECIMAL INTEGER_RE POLARS_TEMP_DIR_BASE_PATH AnonymousScan ArgAgg Argmin/ Argmax ArithmeticChunked ArrayCollectIterExt ArrayFromIter ArrayFromIterDtype AsBinary AsList AsRefDataType AsString AsofJoin AsofJoinBy BinaryNameSpaceImpl BinaryUdfOutputField CategoricalMergeOperation ChunkAgg Aggregation operations. ChunkAggSeries Aggregations that return Series
of unit length. Those can be used in broadcasting operations. ChunkAnyValue ChunkApply Fastest way to do elementwise operations on a ChunkedArray<T>
when the operation is cheaper than
branching due to null checking. ChunkApplyKernel Apply kernels on the arrow array chunks in a ChunkedArray. ChunkApproxNUnique ChunkBitwiseReduce Bitwise Reduction Operations. ChunkBytes ChunkCast Cast ChunkedArray<T>
to ChunkedArray<N>
ChunkCompareEq Compare Series and ChunkedArrays and get a boolean mask that can be used to filter rows. ChunkCompareIneq Compare Series and ChunkedArrays using inequality operators (<, >=, etc.) and get a boolean
mask that can be used to filter rows. ChunkExpandAtIndex Create a new ChunkedArray filled with values at that index. ChunkExplode Explode/flatten a List or String Series ChunkFillNullValue Replace None values with a value ChunkFilter Filter values by a boolean mask. ChunkFull Fill a ChunkedArray with one value. ChunkFullNull ChunkQuantile Quantile and median aggregation. ChunkReverse Reverse a ChunkedArray<T>
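The comparison and filter traits listed above (ChunkCompareIneq, ChunkFilter) describe a two-step pattern: build a boolean mask, then keep the rows where the mask is true. The following is a minimal plain-Rust sketch of that idea using only the standard library. It does not call the Polars API; the helper names `gt_mask` and `filter_by_mask` are hypothetical, and the treatment of nulls (a null input gives a null mask entry, and a null mask entry drops the row) is an assumption about the intended semantics.

```rust
/// Hypothetical helper: compare each (possibly null) value with a threshold.
/// A null input yields a null mask entry (assumed null semantics).
fn gt_mask(values: &[Option<i32>], threshold: i32) -> Vec<Option<bool>> {
    values.iter().map(|v| v.map(|x| x > threshold)).collect()
}

/// Hypothetical helper: keep only the rows whose mask entry is Some(true).
/// Rows with a false or null mask entry are dropped.
fn filter_by_mask(values: &[Option<i32>], mask: &[Option<bool>]) -> Vec<Option<i32>> {
    assert_eq!(values.len(), mask.len(), "mask must match the value length");
    values
        .iter()
        .zip(mask)
        .filter(|(_, m)| **m == Some(true))
        .map(|(v, _)| *v)
        .collect()
}

fn main() {
    let values = [Some(1), None, Some(5), Some(3)];
    let mask = gt_mask(&values, 2);
    let kept = filter_by_mask(&values, &mask);
    println!("{:?}", kept);
}
```

Keeping the mask as a separate value means the same mask can be reused to filter several columns of equal length.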
ChunkRollApply This differs from ChunkWindowCustom and ChunkWindow by not using a fold aggregator, but reusing a Series wrapper and calling Series aggregators. This is likely a bit slower than ChunkWindow. ChunkSet Create a ChunkedArray
with new values by index or by boolean mask. ChunkShift ChunkShiftFill Shift the values of a ChunkedArray by a number of periods. ChunkSort Sort operations on ChunkedArray. ChunkTake ChunkTakeUnchecked ChunkUnique Get unique values in a ChunkedArray
ChunkVar Variance and standard deviation aggregation. ChunkZip Combine two ChunkedArray
based on some predicate. ChunkedBuilder ChunkedCollectInferIterExt ChunkedCollectIterExt ChunkedSet ColumnBinaryUdf A wrapper trait for any binary closure Fn(Column, Column) -> PolarsResult<Column>
ColumnsUdf A wrapper trait for any closure Fn(Vec<Series>) -> PolarsResult<Series>
CrossJoin CrossJoinFilter DataFrameJoinOps DataFrameOps DateMethods DatetimeMethods DurationMethods ExprEvalExtension FromData FromDataBinary FromDataUtf8 FunctionOutputField GetAnyValue IndexToUsize InitHashMaps InitHashMaps2 IntoColumn Convert Self
into a Column
IntoGroupsType Used to create the tuples for a group_by operation. IntoLazy IntoListNameSpace IntoMetadata IntoScalar IntoSeries Used to convert a ChunkedArray, &dyn SeriesTrait and Series into a Series. IntoVec Convenience for x.into_iter().map(Into::into).collect() using an into_vec() function. IsFirstDistinct Mask the first unique values as true
IsLastDistinct Mask the last unique values as true
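The IsFirstDistinct and IsLastDistinct entries above both produce a boolean mask over the input. As a rough illustration of what "first" and "last" mean here, the sketch below computes both masks over a plain slice with a `HashSet`. It is a standard-library approximation under the assumption that each distinct value is marked true exactly once; it is not the Polars implementation, and the function names are hypothetical.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical helper: true at the first occurrence of each distinct value.
fn is_first_distinct<T: Eq + Hash>(values: &[T]) -> Vec<bool> {
    let mut seen = HashSet::new();
    // HashSet::insert returns true only when the value was not present yet.
    values.iter().map(|v| seen.insert(v)).collect()
}

/// Hypothetical helper: true at the last occurrence of each distinct value.
fn is_last_distinct<T: Eq + Hash>(values: &[T]) -> Vec<bool> {
    let mut seen = HashSet::new();
    // Walk backwards so the first value seen is the last occurrence,
    // then restore the original order of the mask.
    let mut mask: Vec<bool> = values.iter().rev().map(|v| seen.insert(v)).collect();
    mask.reverse();
    mask
}

fn main() {
    let values = ["a", "b", "a", "c", "b"];
    println!("{:?}", is_first_distinct(&values));
    println!("{:?}", is_last_distinct(&values));
}
```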
JoinDispatch LazyFileListReader Reads LazyFrame from a filesystem or a cloud storage.
Supports glob patterns. LhsNumOps ListBuilderTrait ListFromIter ListNameSpaceExtension ListNameSpaceImpl Literal LogSeries LogicalType MetaDataExt MinMaxHorizontal MomentSeries NamedFrom NamedFromOwned NewChunkedArray NumOpsDispatch NumericNative PolarsDataType Safety PolarsFloatType PolarsIntegerType PolarsIterator A PolarsIterator
is an iterator over a ChunkedArray which contains polars types. A PolarsIterator must implement ExactSizeIterator and DoubleEndedIterator. PolarsMonthEnd PolarsMonthStart PolarsNumericType PolarsObject Values need to implement this so that they can be stored into a Series and DataFrame. PolarsRound PolarsTemporalGroupby PolarsTruncate PolarsUpsample QuantileAggSeries Reinterpret RenameAliasFn RollingSeries RoundSeries SchemaExt SchemaExtPl SchemaNamesAndDtypes SerReader SerWriter SeriesJoin SeriesMethods SeriesOpsTime SeriesRank SeriesSealed SeriesTrait SlicedArray Utility trait to slice concrete arrow arrays whilst keeping their concrete type. E.g. don’t return Box<dyn Array>. StaticArray StringMethods StringNameSpaceImpl SumMeanHorizontal TakeChunked Gather by ChunkId
TakeChunkedHorPar TemporalMethods TimeMethods ToDummies ToStruct UdfSchema UnpivotDF Utf8JsonPathImpl VarAggSeries VecHash _coalesce_full_join _default_struct_name_gen _join_suffix_name _merge_sorted_dfs _set_check_length ⚠ Meant for internal use. In very rare conditions this can be turned off. abs Convert numerical values to their absolute value. add_business_days Add a given number of business days. all Selects all columns. Shorthand for col("*")
. all_horizontal Create a new column with the bitwise-and of the elements in each row. any_horizontal Create a new column with the bitwise-or of the elements in each row. apply_binary Like map_binary
, but used in a group_by-aggregation context. apply_multiple Apply a function/closure over the groups of multiple columns. This should only be used in a group_by aggregation. apply_projection arange Generate a range of integers. arg_sort_by Find the indexes that would sort these series in order of appearance. arg_where Get the indices where condition
evaluates true. as_struct Take several expressions and collect them into a StructChunked. avg Find the mean of all the values in the column named name. Alias for mean. base_utc_offset binary_expr Compute op(l, r) (or equivalently l op r). l and r must have types compatible with the Operator. call_categorical_merge_operation cast Casts the column given by Expr
to a different type. clip Set values outside the given boundaries to the boundary value. clip_max Set values above the given maximum to the maximum value. clip_min Set values below the given minimum to the minimum value. coalesce Folds the expressions from left to right keeping the first non-null values. coalesce_columns col Create a Column Expression based on a column name. collect_all Collect all LazyFrame
computations. cols Select multiple columns by name. columns_to_projection compute_labels concat Concat multiple LazyFrame
s vertically. concat_arr Horizontally concatenate columns into a single array-type column. concat_expr concat_lf_diagonal Concat LazyFrames diagonally. Calls concat internally. concat_lf_horizontal Concat LazyFrames horizontally. concat_list Concat list entries. concat_str Horizontally concat string columns in linear time. contains_any convert_inner_type Cast null arrays to inner type and ensure that all offsets remain correct. convert_to_unsigned_index count_ones count_rows Read the number of rows without parsing columns; useful for count(*) queries. count_rows_from_slice Read the number of rows without parsing columns; useful for count(*) queries. count_zeros cov Compute the covariance between two columns. create_enum_dtype create_sorting_map cum_count cum_fold_exprs Accumulate over multiple columns horizontally / row wise. cum_max Get an array with the cumulative max computed at every element. cum_min Get an array with the cumulative min computed at every element. cum_prod Get an array with the cumulative product computed at every element. cum_reduce_exprs Accumulate over multiple columns horizontally / row wise. cum_sum Get an array with the cumulative sum computed at every element. cut date_ranges Create a column of date ranges from a start
and stop expression. datetime Construct a column of Datetime from the provided DatetimeArgs. datetime_range Create a datetime range from a start and stop expression. datetime_ranges Create a column of datetime ranges from a start and stop
expression. datetime_to_timestamp_ms datetime_to_timestamp_ns datetime_to_timestamp_us decode_json_response Utility for decoding JSON that adds the response value to the error message if decoding fails.
This makes it much easier to debug errors from parsing network responses. default_join_ids deserialize Deserializes the statistics in the column chunks from a single row_group
into Statistics associated from field’s name. diff dst_offset dtype_col Select multiple columns by dtype. dtype_cols Select multiple columns by dtype. duration Construct a column of Duration from the provided DurationArgs
ensure_duration_matches_dtype ensure_is_constant_duration ensure_matching_schema escape_regex escape_regex_str ewm_mean ewm_mean_by ewm_std ewm_var expand_paths Recursively traverses directories and expands globs if glob
is true. expand_paths_hive Recursively traverses directories and expands globs if glob is true. Returns the expanded paths and the index at which to start parsing hive partitions from the path. expanded_from_single_directory Returns true if expanded_paths
were expanded from a single directory extract_json extract_many find_many first First column in a DataFrame. floor_div_series fma_columns fms_columns fmt_group_by_column fold_exprs Accumulate over multiple columns horizontally / row wise. format_str Format the results of an array of expressions using a format string fsm_columns get_encodings get_glob_start_idx Get the index of the first occurrence of a glob symbol. get_reader_bytes get_strftime_format group_by_values Different from group_by_windows
, where we define window buckets and search for the values that fit each pre-defined bucket. group_by_windows Window boundaries are created based on the given Window
, which is defined by: hist_series hor_str_concat Horizontally concatenate all strings. impl_duration impl_offset_by impl_replace_time_zone impl_replace_time_zone_fast If ambiguous
is length-1 and not equal to “null”, we can take a slightly faster path. in_nanoseconds_window index_cols Select multiple columns by index. index_of Find the index of a given value (the first and only entry in value_series
) within the series. indexes_to_usizes infer_file_schema Infer the schema of a CSV file by reading through the first n rows of the file, with max_read_rows controlling the maximum number of rows to read. infer_schema Infers an ArrowSchema from parquet’s FileMetadata
. int_range Generate a range of integers. int_ranges Generate a range of integers for each row of the input columns. interpolate interpolate_by is_between is_cloud_url Check if the path is a cloud url. is_duplicated is_first_distinct is_in is_last_distinct is_not_null A column which is false
wherever expr is null, true elsewhere. is_null A column which is true wherever expr is null, false
elsewhere. is_positive_idx_uncertain May give false negatives because it ignores the null values. is_positive_idx_uncertain_col May give false negatives because it ignores the null values. is_unique last Last column in a DataFrame. leading_ones leading_zeros len Return the number of rows in the context. linear_space Generate a series of equally-spaced points. list_count_matches list_set_operation lit Create a Literal Expression from L
. A literal expression behaves like a column that contains a single distinct
value. make_categoricals_compatible make_list_categoricals_compatible map_binary Apply a closure on the two columns that are evaluated from Expr
a and Expr b. map_list_multiple Apply a function/closure over multiple columns once the logical plan gets executed. map_multiple Apply a function/closure over multiple columns once the logical plan gets executed. materialize_empty_df materialize_projection max Find the maximum of all the values in the column named name. Shorthand for col(name).max(). mean Find the mean of all the values in the column named name. Shorthand for col(name).mean(). median Find the median of all the values in the column named name. Shorthand for col(name).median(). merge_dtypes min Find the minimum of all the values in the column named name. Shorthand for col(name).min()
. negate negate_bitwise new_int_range new_linear_space_f32 new_linear_space_f64 normalize normalize_with not Negates a boolean column. nth Nth column in a DataFrame. overwrite_schema pct_change pearson_corr Compute the pearson correlation between two columns. prepare_cloud_plan Prepare the given DslPlan
for execution on Polars Cloud. private_left_join_multiple_keys qcut quantile Find a specific quantile of all the values in the column named name
. reduce_exprs Analogous to Iterator::reduce. reinterpret remove_bom repeat Create a column of length n containing n copies of the literal value
. repeat_by replace Replace values by different values of the same data type. replace_all replace_date Replace specific time component of a DateChunked
with a specified value. replace_datetime Replace specific time component of a DatetimeChunked
with a specified value. replace_or_default Replace all values by different values. replace_strict Replace all values by different values. replace_time_zone resolve_homedir Replaces a “~” in the Path with the home directory. rle Get the lengths of runs of identical values. rle_id Similar to rle
, but maps values to run IDs. rolling_corr rolling_cov search_sorted select_json Returns a string of the most specific value given the compiled JSON path expression.
This avoids creating a list to represent individual elements so that they can be
selected directly. spearman_rank_corr Compute the spearman rank correlation between two columns.
Missing data will be excluded from the computation. split_helper split_to_struct str_join strip_chars strip_chars_end strip_chars_start strip_prefix strip_suffix sum Sum all the values in the column named name
. Shorthand for col(name).sum(). ternary_expr time_ranges Create a column of time ranges from a start and stop
expression. top_k top_k_by trailing_ones trailing_zeros try_set_sorted_flag unique_counts Returns a count of the unique values in the order of appearance. when Start a when-then-otherwise
expression. write_partitioned_dataset Write a partitioned parquet dataset. This functionality is unstable. AllowedOptimizations AllowedOptimizations ArrayChunked ArrayRef ArrowSchema An ordered sequence of Field
s BinaryChunked BinaryChunkedBuilder BinaryOffsetChunked BooleanChunked BorrowIdxItem ChunkJoinOptIds DateChunked DatetimeChunked DecimalChunked DurationChunked FieldRef FieldsNameMapper FileMetadataRef FillNullLimit Float32Chunked Float64Chunked GetOutput GroupsSlice Every group is indicated by an array where the IdxArr IdxCa IdxItem IdxSize IdxType InnerJoinIds Int8Chunked Int16Chunked Int32Chunked Int64Chunked Int128Chunked LargeBinaryArray LargeListArray LargeStringArray LeftJoinIds ListChunked ObjectChunked OpaqueColumnUdf PlHashMap PlHashSet PlIdHashMap This hashmap uses an IdHasher PlIndexMap PlIndexSet PlRandomState PolarsResult QuantileInterpolOptions Deprecated RowGroupIterColumns Schema SchemaRef StringChunked StringChunkedBuilder StructChunked TimeChunked TimeZone UInt8Chunked UInt16Chunked UInt32Chunked UInt64Chunked
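
The function list above includes rle ("Get the lengths of runs of identical values") and rle_id ("Similar to rle, but maps values to run IDs"). To make the difference concrete, here is a small standard-library sketch of both ideas over a plain slice. It is an illustration only, not the Polars implementation: the output shapes (a vector of `(length, value)` pairs, and zero-based `u32` run IDs) are assumptions chosen for readability.

```rust
/// Sketch of run-length encoding: one (run length, value) pair per run.
/// The tuple output shape is an assumption made for this example.
fn rle<T: PartialEq + Clone>(values: &[T]) -> Vec<(usize, T)> {
    let mut runs: Vec<(usize, T)> = Vec::new();
    for v in values {
        if let Some((len, last)) = runs.last_mut() {
            if *last == *v {
                // Same value as the current run: extend it.
                *len += 1;
                continue;
            }
        }
        // First element, or the value changed: start a new run.
        runs.push((1, v.clone()));
    }
    runs
}

/// Sketch of run IDs: every element gets the index of the run it belongs to.
/// Zero-based IDs are an assumption made for this example.
fn rle_id<T: PartialEq>(values: &[T]) -> Vec<u32> {
    let mut ids = Vec::with_capacity(values.len());
    let mut current = 0u32;
    for (i, v) in values.iter().enumerate() {
        if i > 0 && *v != values[i - 1] {
            current += 1;
        }
        ids.push(current);
    }
    ids
}

fn main() {
    let values = [1, 1, 2, 2, 2, 1];
    println!("{:?}", rle(&values));
    println!("{:?}", rle_id(&values));
}
```

The run-length form is shorter than the input, while the run-ID form keeps the input length, so only the latter can serve directly as a per-row grouping key.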