Enum polars_arrow::datatypes::ArrowDataType
pub enum ArrowDataType {
Null,
Boolean,
Int8,
Int16,
Int32,
Int64,
UInt8,
UInt16,
UInt32,
UInt64,
Float16,
Float32,
Float64,
Timestamp(TimeUnit, Option<String>),
Date32,
Date64,
Time32(TimeUnit),
Time64(TimeUnit),
Duration(TimeUnit),
Interval(IntervalUnit),
Binary,
FixedSizeBinary(usize),
LargeBinary,
Utf8,
LargeUtf8,
List(Box<Field>),
FixedSizeList(Box<Field>, usize),
LargeList(Box<Field>),
Struct(Vec<Field>),
Union(Vec<Field>, Option<Vec<i32>>, UnionMode),
Map(Box<Field>, bool),
Dictionary(IntegerType, Box<ArrowDataType>, bool),
Decimal(usize, usize),
Decimal256(usize, usize),
Extension(String, Box<ArrowDataType>, Option<String>),
BinaryView,
Utf8View,
Unknown,
}
The set of supported logical types in this crate.
Each variant uniquely identifies a logical type, which defines specific semantics for the data (e.g. how it should be represented).
Each variant has a corresponding PhysicalType, obtained via ArrowDataType::to_physical_type, which declares the in-memory representation of data.
ArrowDataType::Extension is special in that it augments an ArrowDataType with metadata to support custom types. Use to_logical_type to desugar such a type and return its corresponding logical type.
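The desugaring of Extension types described above can be sketched with a small stand-in enum. This is a minimal illustration, not the crate's implementation: LogicalType and its three variants are hypothetical, and only the shape of the Extension variant (name, inner type, optional metadata) is taken from this page.

```rust
// Hypothetical, simplified stand-in for ArrowDataType; only three variants are modeled.
#[derive(Debug, Clone, PartialEq)]
enum LogicalType {
    Int32,
    Utf8,
    // (name, inner type, optional metadata), mirroring the shape of the Extension variant.
    Extension(String, Box<LogicalType>, Option<String>),
}

impl LogicalType {
    // Peels Extension wrappers until a non-Extension type is reached,
    // so the result is never an Extension.
    fn to_logical_type(&self) -> &LogicalType {
        let mut current = self;
        while let LogicalType::Extension(_, inner, _) = current {
            current = &**inner;
        }
        current
    }
}
```

With this sketch, an Extension wrapping another Extension wrapping Int32 resolves to Int32, while a plain Utf8 resolves to itself.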
Variants§
Null
Null type
Boolean
true and false.
Int8
An i8
Int16
An i16
Int32
An i32
Int64
An i64
UInt8
A u8
UInt16
A u16
UInt32
A u32
UInt64
A u64
Float16
A 16-bit float
Float32
An f32
Float64
An f64
Timestamp(TimeUnit, Option<String>)
An i64 representing a timestamp measured in TimeUnit with an optional timezone.
Time is measured from the Unix epoch (00:00:00.000 on 1 January 1970), excluding leap seconds, as a 64-bit signed integer count of the given TimeUnit.
The time zone is a string indicating the name of a time zone, one of:
- As used in the Olson time zone database (the “tz database” or “tzdata”), such as “America/New_York”
- An absolute time zone offset of the form +XX:XX or -XX:XX, such as +07:30
When the timezone is not specified, the timestamp is considered to have no timezone and is represented as is.
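The second timezone form (an absolute offset such as +07:30) can be turned into a number of seconds with a small parser. The sketch below is illustrative only: parse_fixed_offset is a hypothetical helper, not part of this crate, and rejecting minute values of 60 or more is an assumption rather than something this page specifies.

```rust
// Parses an absolute offset of the form +XX:XX or -XX:XX into seconds east of UTC.
// Returns None for anything else, including tz database names such as "America/New_York".
fn parse_fixed_offset(tz: &str) -> Option<i32> {
    let bytes = tz.as_bytes();
    if bytes.len() != 6 || bytes[3] != b':' {
        return None;
    }
    let sign = match bytes[0] {
        b'+' => 1,
        b'-' => -1,
        _ => return None,
    };
    // Converts one ASCII digit to its numeric value.
    let digit = |b: u8| {
        if b.is_ascii_digit() {
            Some((b - b'0') as i32)
        } else {
            None
        }
    };
    let hours = digit(bytes[1])? * 10 + digit(bytes[2])?;
    let minutes = digit(bytes[4])? * 10 + digit(bytes[5])?;
    // Assumption: minutes outside 0..=59 are treated as invalid.
    if minutes >= 60 {
        return None;
    }
    Some(sign * (hours * 3_600 + minutes * 60))
}
```

A caller would typically try this parser first and fall back to a tz database lookup when it returns None.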
Date32
An i32 representing the elapsed time since UNIX epoch (1970-01-01) in days.
Date64
An i64 representing the elapsed time since UNIX epoch (1970-01-01) in milliseconds. Values are evenly divisible by 86400000.
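The relationship between the two date encodings can be sketched as follows. These helpers are hypothetical and use only the standard library; the calendar conversion uses the well-known days-to-civil-date arithmetic for the proleptic Gregorian calendar, which is a choice made for this example and not something this page prescribes.

```rust
const MILLIS_PER_DAY: i64 = 86_400_000;

// A Date32 value (days) expressed as a Date64 value (milliseconds).
fn date32_to_date64(days: i32) -> i64 {
    days as i64 * MILLIS_PER_DAY
}

// The reverse direction; None if the value is not a whole number of days
// or does not fit in an i32.
fn date64_to_date32(millis: i64) -> Option<i32> {
    if millis % MILLIS_PER_DAY != 0 {
        return None;
    }
    let days = millis / MILLIS_PER_DAY;
    if days < i32::MIN as i64 || days > i32::MAX as i64 {
        return None;
    }
    Some(days as i32)
}

// Converts days since 1970-01-01 to (year, month, day).
fn date32_to_ymd(days: i32) -> (i64, u32, u32) {
    let z = days as i64 + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // year of era
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month as u32, day as u32)
}
```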
Time32(TimeUnit)
A 32-bit time representing the elapsed time since midnight in the unit of TimeUnit.
Only TimeUnit::Second and TimeUnit::Millisecond are supported on this variant.
Time64(TimeUnit)
A 64-bit time representing the elapsed time since midnight in the unit of TimeUnit.
Only TimeUnit::Microsecond and TimeUnit::Nanosecond are supported on this variant.
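The unit restrictions on Time32 and Time64 can be expressed as simple checks. The following is a minimal sketch under stated assumptions: Unit is a hypothetical stand-in for the crate's TimeUnit, and the helper functions do not exist in this crate.

```rust
// Hypothetical stand-in for the crate's TimeUnit.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Unit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

// Mirrors the documented restriction on Time32.
fn valid_for_time32(unit: Unit) -> bool {
    matches!(unit, Unit::Second | Unit::Millisecond)
}

// Mirrors the documented restriction on Time64.
fn valid_for_time64(unit: Unit) -> bool {
    matches!(unit, Unit::Microsecond | Unit::Nanosecond)
}

// Converts a time-since-midnight value to (hour, minute, second),
// dropping any sub-second part. Returns None outside a single day.
fn to_hms(value: i64, unit: Unit) -> Option<(u32, u32, u32)> {
    let per_second: i64 = match unit {
        Unit::Second => 1,
        Unit::Millisecond => 1_000,
        Unit::Microsecond => 1_000_000,
        Unit::Nanosecond => 1_000_000_000,
    };
    let secs = value.div_euclid(per_second);
    if value < 0 || secs >= 86_400 {
        return None;
    }
    Some(((secs / 3_600) as u32, (secs % 3_600 / 60) as u32, (secs % 60) as u32))
}
```

The split follows from the value range: a day holds 86,400,000 milliseconds, which fits in an i32, whereas a day in microseconds (86,400,000,000) already exceeds i32::MAX.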
Duration(TimeUnit)
Measure of elapsed time. This elapsed time is a physical duration (i.e. 1s as defined in S.I.).
Interval(IntervalUnit)
A “calendar” interval modeling elapsed time that takes into account calendar shifts. For example, an interval of 1 day may represent more than 24 hours.
Binary
Opaque binary data of variable length whose offsets are represented as i32.
FixedSizeBinary(usize)
Opaque binary data of fixed size. Enum parameter specifies the number of bytes per value.
LargeBinary
Opaque binary data of variable length whose offsets are represented as i64.
Utf8
A variable-length UTF-8 encoded string whose offsets are represented as i32.
LargeUtf8
A variable-length UTF-8 encoded string whose offsets are represented as i64.
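The role of the offsets in the variable-length variants can be illustrated with a toy container. This is a sketch only: Utf8Column is a hypothetical type, it ignores null handling, and it is not how the crate's arrays are implemented.

```rust
// Hypothetical minimal container: concatenated UTF-8 bytes plus i32 offsets.
struct Utf8Column {
    offsets: Vec<i32>, // one more entry than there are values; the first is 0
    values: Vec<u8>,   // all strings stored back to back
}

impl Utf8Column {
    fn from_strs(items: &[&str]) -> Self {
        let mut offsets = vec![0i32];
        let mut values = Vec::new();
        for s in items {
            values.extend_from_slice(s.as_bytes());
            // i32 offsets cap the total byte length; the Large variants use i64 offsets instead.
            assert!(values.len() <= i32::MAX as usize, "total length exceeds i32 offsets");
            offsets.push(values.len() as i32);
        }
        Utf8Column { offsets, values }
    }

    fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    // Value i spans values[offsets[i]..offsets[i + 1]].
    fn get(&self, i: usize) -> Option<&str> {
        let start = *self.offsets.get(i)? as usize;
        let end = *self.offsets.get(i + 1)? as usize;
        std::str::from_utf8(&self.values[start..end]).ok()
    }
}
```

For the input ["ab", "", "cde"] the offsets are [0, 2, 2, 5], so an empty string costs only one extra offset and no value bytes.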
List(Box<Field>)
A list of some logical data type whose offsets are represented as i32.
FixedSizeList(Box<Field>, usize)
A list of some logical data type with a fixed number of elements.
LargeList(Box<Field>)
A list of some logical data type whose offsets are represented as i64.
Struct(Vec<Field>)
A nested ArrowDataType with a given number of Fields.
Union(Vec<Field>, Option<Vec<i32>>, UnionMode)
A nested datatype that can represent slots of differing types. The third argument represents the mode.
Map(Box<Field>, bool)
A nested type that is represented as List<entries: Struct<key: K, value: V>>.
In this layout, the keys and values are each respectively contiguous. We do not constrain the key and value types, so the application is responsible for ensuring that the keys are hashable and unique. Whether the keys are sorted may be set in the metadata for this field.
In a field with Map type, the field has a child Struct field, which then has two children: the first is the key type and the second is the value type. The names of the child fields may be respectively “entries”, “key”, and “value”, but this is not enforced.
Map
  - child[0] entries: Struct
    - child[0] key: K
    - child[1] value: V
Neither the “entries” field nor the “key” field may be nullable.
The metadata is structured so that Arrow systems without special handling for Map can make Map an alias for List. The “layout” attribute for the Map field must have the same contents as a List.
Arguments:
- Field: the child entries field
- ordered: whether the keys are sorted
Dictionary(IntegerType, Box<ArrowDataType>, bool)
A dictionary encoded array (key_type, value_type), where each array element is an index of key_type into an associated dictionary of value_type.
Dictionary arrays are used to store columns of value_type that contain many repeated values using less memory, but with a higher CPU overhead for some operations.
This type is mostly used to represent low cardinality string arrays or a limited set of primitive types as integers.
The bool value indicates the Dictionary is sorted if set to true.
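The idea behind dictionary encoding can be shown with plain vectors. The functions below are hypothetical and only illustrate the technique; the choice of u32 keys is an assumption made for this example, since the actual key type is given by IntegerType.

```rust
use std::collections::HashMap;

// Encodes values as (keys, dictionary): each distinct value is stored once,
// and every element becomes an index into the dictionary.
fn dictionary_encode(values: &[&str]) -> (Vec<u32>, Vec<String>) {
    let mut dictionary: Vec<String> = Vec::new();
    let mut index_of: HashMap<&str, u32> = HashMap::new();
    let mut keys = Vec::with_capacity(values.len());
    for &v in values {
        // First occurrence: append to the dictionary and remember its index.
        let key = *index_of.entry(v).or_insert_with(|| {
            dictionary.push(v.to_string());
            (dictionary.len() - 1) as u32
        });
        keys.push(key);
    }
    (keys, dictionary)
}

// Decoding is a lookup per element, which is the extra CPU cost mentioned above.
fn dictionary_decode<'a>(keys: &[u32], dictionary: &'a [String]) -> Vec<&'a str> {
    keys.iter().map(|&k| dictionary[k as usize].as_str()).collect()
}
```

For a column such as ["a", "b", "a", "a", "b"], the keys are [0, 1, 0, 0, 1] and the dictionary holds just "a" and "b".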
Decimal(usize, usize)
Decimal value with precision and scale. Precision is the number of digits in the number and scale is the number of decimal places. The number 999.99 has a precision of 5 and a scale of 2.
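Precision and scale can be illustrated by storing a decimal as an unscaled integer. This sketch assumes an i128 as the unscaled value, which is a choice made for the example; format_decimal and fits_precision are hypothetical helpers, not crate functions.

```rust
// Formats an unscaled integer as a decimal string with `scale` fractional digits.
fn format_decimal(unscaled: i128, scale: usize) -> String {
    let sign = if unscaled < 0 { "-" } else { "" };
    let digits = unscaled.unsigned_abs().to_string();
    if scale == 0 {
        return format!("{}{}", sign, digits);
    }
    // Left-pad with zeros so at least one digit precedes the decimal point.
    let padded = format!("{:0>width$}", digits, width = scale + 1);
    let (int_part, frac_part) = padded.split_at(padded.len() - scale);
    format!("{}{}.{}", sign, int_part, frac_part)
}

// Checks that the unscaled value has at most `precision` decimal digits.
fn fits_precision(unscaled: i128, precision: usize) -> bool {
    unscaled == 0 || unscaled.unsigned_abs().to_string().len() <= precision
}
```

Under these assumptions the number 999.99 is the unscaled value 99999 with scale 2, and it fits a precision of 5 but not a precision of 4.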
Decimal256(usize, usize)
Decimal backed by 256 bits.
Extension(String, Box<ArrowDataType>, Option<String>)
Extension type.
- name
- physical type
- metadata
BinaryView
A binary type that inlines small values and can intern bytes.
Utf8View
A string type that inlines small values and can intern strings.
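The inlining behaviour of the view types can be sketched with an enum. This is an illustration under assumptions: the 12-byte inline limit and the prefix, buffer index and offset fields follow the view layout of the Arrow columnar format, while View, make_view and read_view are hypothetical names and the real layout is a packed 16-byte value, not a Rust enum.

```rust
// Assumption: values of up to 12 bytes are stored inline, as in the Arrow view layout.
const MAX_INLINE_LEN: usize = 12;

#[derive(Debug, PartialEq)]
enum View {
    // Short value: the bytes live inside the view itself, zero padded.
    Inline { len: u32, data: [u8; 12] },
    // Long value: the view points into a separate data buffer.
    Reference { len: u32, prefix: [u8; 4], buffer_index: u32, offset: u32 },
}

// Builds a view, appending long values to `buffer` (identified by `buffer_index`).
fn make_view(value: &[u8], buffer: &mut Vec<u8>, buffer_index: u32) -> View {
    let len = value.len() as u32;
    if value.len() <= MAX_INLINE_LEN {
        let mut data = [0u8; 12];
        data[..value.len()].copy_from_slice(value);
        View::Inline { len, data }
    } else {
        let offset = buffer.len() as u32;
        buffer.extend_from_slice(value);
        let mut prefix = [0u8; 4];
        prefix.copy_from_slice(&value[..4]);
        View::Reference { len, prefix, buffer_index, offset }
    }
}

// Resolves a view back to its bytes.
fn read_view<'a>(view: &'a View, buffers: &'a [Vec<u8>]) -> &'a [u8] {
    match view {
        View::Inline { len, data } => &data[..*len as usize],
        View::Reference { len, buffer_index, offset, .. } => {
            let start = *offset as usize;
            &buffers[*buffer_index as usize][start..start + *len as usize]
        }
    }
}
```

Because a Reference view only stores a position, several views can point at the same bytes in a buffer, which is one way to read the "intern" wording above.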
Unknown
A type unknown to Arrow.
Implementations§
impl ArrowDataType
pub fn to_physical_type(&self) -> PhysicalType
Returns the PhysicalType of this ArrowDataType.
pub fn underlying_physical_type(&self) -> ArrowDataType
pub fn to_logical_type(&self) -> &ArrowDataType
Returns &self for all but ArrowDataType::Extension. For ArrowDataType::Extension, (recursively) returns the inner ArrowDataType.
Never returns the variant ArrowDataType::Extension.
pub fn inner_dtype(&self) -> Option<&ArrowDataType>
pub fn is_view(&self) -> bool
Trait Implementations§
impl Clone for ArrowDataType
fn clone(&self) -> ArrowDataType
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for ArrowDataType
impl Default for ArrowDataType
fn default() -> ArrowDataType
impl<'de> Deserialize<'de> for ArrowDataType
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,
impl From<ArrowDataType> for DataType
fn from(value: ArrowDataType) -> Self
impl<T: NativeType> From<ArrowDataType> for MutablePrimitiveArray<T>
fn from(data_type: ArrowDataType) -> Self
impl From<DataType> for ArrowDataType
impl From<IntegerType> for ArrowDataType
fn from(item: IntegerType) -> Self
impl From<PrimitiveType> for ArrowDataType
fn from(item: PrimitiveType) -> Self
impl Hash for ArrowDataType
impl PartialEq for ArrowDataType
fn eq(&self, other: &ArrowDataType) -> bool
Tests for self and other values to be equal, and is used by ==.
impl Serialize for ArrowDataType
impl Eq for ArrowDataType
impl StructuralPartialEq for ArrowDataType
Auto Trait Implementations§
impl Freeze for ArrowDataType
impl RefUnwindSafe for ArrowDataType
impl Send for ArrowDataType
impl Sync for ArrowDataType
impl Unpin for ArrowDataType
impl UnwindSafe for ArrowDataType
Blanket Implementations§
impl<T> BorrowMut<T> for T
where T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where T: Clone,
default unsafe fn clone_to_uninit(&self, dst: *mut T)
This is a nightly-only experimental API (clone_to_uninit).
impl<Q, K> Equivalent<K> for Q
impl<Q, K> Equivalent<K> for Q
fn equivalent(&self, key: &K) -> bool
Compare self to key and return true if they are equal.
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.