Struct arrow_array::array::UnionArray

pub struct UnionArray { /* private fields */ }

An array of values of varying types

Each slot in a UnionArray can have a value chosen from a number of types. Each of the possible types is named, like the fields of a StructArray. A UnionArray can have two possible memory layouts, “dense” or “sparse”. For more information, please see the specification.
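
As a rough illustration of the two layouts, the following plain-Rust sketch models only the type ids and offsets buffers. UnionLayout and resolve are hypothetical names used for this illustration; they are not part of arrow_array, and the real array stores its buffers differently.

```rust
/// Toy model of a union's per-slot buffers (not the arrow_array implementation).
/// `offsets` is `Some` for the dense layout and `None` for the sparse layout.
pub struct UnionLayout {
    pub type_ids: Vec<i8>,
    pub offsets: Option<Vec<i32>>,
}

impl UnionLayout {
    /// Returns `(type_id, position in the selected child)` for `slot`.
    /// Panics if `slot` is out of bounds.
    pub fn resolve(&self, slot: usize) -> (i8, usize) {
        let type_id = self.type_ids[slot];
        let position = match &self.offsets {
            // Dense: each child stores only its own values, so a separate
            // offsets buffer records where the value lives in that child.
            Some(offsets) => offsets[slot] as usize,
            // Sparse: every child is as long as the union itself, so the
            // slot index is also the position in the child.
            None => slot,
        };
        (type_id, position)
    }
}
```

For the union [1, 3.2, 34] with type ids [0, 1, 0], the dense layout with offsets [0, 0, 1] resolves slot 2 to position 1 of child 0, while the sparse layout resolves it to position 2.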

UnionBuilder can be used to create UnionArrays of primitive types. UnionArrays of nested types are also supported, but not via UnionBuilder; see the tests for examples.

§Examples

§Create a dense UnionArray [1, 3.2, 34]

use arrow_buffer::ScalarBuffer;
use arrow_schema::*;
use std::sync::Arc;
use arrow_array::{Array, Int32Array, Float64Array, UnionArray};

let int_array = Int32Array::from(vec![1, 34]);
let float_array = Float64Array::from(vec![3.2]);
let type_ids = [0, 1, 0].into_iter().collect::<ScalarBuffer<i8>>();
let offsets = [0, 0, 1].into_iter().collect::<ScalarBuffer<i32>>();

let union_fields = [
    (0, Arc::new(Field::new("A", DataType::Int32, false))),
    (1, Arc::new(Field::new("B", DataType::Float64, false))),
].into_iter().collect::<UnionFields>();

let children = vec![
    Arc::new(int_array) as Arc<dyn Array>,
    Arc::new(float_array),
];

let array = UnionArray::try_new(
    union_fields,
    type_ids,
    Some(offsets),
    children,
).unwrap();

let value = array.value(0).as_any().downcast_ref::<Int32Array>().unwrap().value(0);
assert_eq!(1, value);

let value = array.value(1).as_any().downcast_ref::<Float64Array>().unwrap().value(0);
assert!((3.2 - value).abs() < f64::EPSILON);

let value = array.value(2).as_any().downcast_ref::<Int32Array>().unwrap().value(0);
assert_eq!(34, value);

§Create a sparse UnionArray [1, 3.2, 34]

use arrow_buffer::ScalarBuffer;
use arrow_schema::*;
use std::sync::Arc;
use arrow_array::{Array, Int32Array, Float64Array, UnionArray};

let int_array = Int32Array::from(vec![Some(1), None, Some(34)]);
let float_array = Float64Array::from(vec![None, Some(3.2), None]);
let type_ids = [0_i8, 1, 0].into_iter().collect::<ScalarBuffer<i8>>();

let union_fields = [
    (0, Arc::new(Field::new("A", DataType::Int32, false))),
    (1, Arc::new(Field::new("B", DataType::Float64, false))),
].into_iter().collect::<UnionFields>();

let children = vec![
    Arc::new(int_array) as Arc<dyn Array>,
    Arc::new(float_array),
];

let array = UnionArray::try_new(
    union_fields,
    type_ids,
    None,
    children,
).unwrap();

let value = array.value(0).as_any().downcast_ref::<Int32Array>().unwrap().value(0);
assert_eq!(1, value);

let value = array.value(1).as_any().downcast_ref::<Float64Array>().unwrap().value(0);
assert!((3.2 - value).abs() < f64::EPSILON);

let value = array.value(2).as_any().downcast_ref::<Int32Array>().unwrap().value(0);
assert_eq!(34, value);

Implementations§

impl UnionArray

pub unsafe fn new_unchecked( fields: UnionFields, type_ids: ScalarBuffer<i8>, offsets: Option<ScalarBuffer<i32>>, children: Vec<ArrayRef>, ) -> Self

Creates a new UnionArray.

Accepts type ids, child arrays and optionally offsets (for dense unions) to create a new UnionArray. This method makes no attempt to validate the data provided by the caller and assumes that each of the components is correct and consistent with the others. See try_new for an alternative that validates the data provided.

§Safety

The type_ids values should be non-negative and must match one of the type ids of the fields provided in fields. These values are used to index into the children arrays.

The offsets are provided in the case of a dense union; sparse unions should use None. If provided, the offsets values should be non-negative and must be less than the length of the corresponding array.

In both cases above we use signed integer types to maintain compatibility with other Arrow implementations.
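
The requirements above can be sketched as a check over plain slices. This is a minimal illustration under stated assumptions, not the validation that try_new performs: check_union_parts is a hypothetical helper, field_type_ids[k] is assumed to be the type id of child k, and child_lens[k] its length.

```rust
/// Checks the safety requirements on plain slices.
/// Hypothetical helper for illustration; not part of arrow_array.
pub fn check_union_parts(
    field_type_ids: &[i8],
    child_lens: &[usize],
    type_ids: &[i8],
    offsets: Option<&[i32]>,
) -> Result<(), String> {
    if field_type_ids.len() != child_lens.len() {
        return Err("one child array is required per field".to_string());
    }
    if let Some(offsets) = offsets {
        // A dense union stores one offset per slot.
        if offsets.len() != type_ids.len() {
            return Err("offsets and type_ids must have the same length".to_string());
        }
    }
    for (slot, &type_id) in type_ids.iter().enumerate() {
        // Every type id must be non-negative and name one of the fields.
        if type_id < 0 {
            return Err(format!("slot {}: negative type id {}", slot, type_id));
        }
        let child = field_type_ids
            .iter()
            .position(|&id| id == type_id)
            .ok_or_else(|| format!("slot {}: unknown type id {}", slot, type_id))?;
        // Dense only: the offset must be in bounds for the child it points into.
        if let Some(offsets) = offsets {
            let offset = offsets[slot];
            if offset < 0 || offset as usize >= child_lens[child] {
                return Err(format!("slot {}: offset {} out of bounds", slot, offset));
            }
        }
    }
    Ok(())
}
```

With fields [0, 1] and children of lengths 2 and 1, type ids [0, 1, 0] with offsets [0, 0, 1] pass the check, while offsets [0, 1, 1] fail because child 1 has no position 1.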

pub fn try_new( fields: UnionFields, type_ids: ScalarBuffer<i8>, offsets: Option<ScalarBuffer<i32>>, children: Vec<ArrayRef>, ) -> Result<Self, ArrowError>

Attempts to create a new UnionArray, validating the inputs provided.

The order of the child arrays must match the order of the fields.

pub fn child(&self, type_id: i8) -> &ArrayRef

Accesses the child array for type_id.

§Panics

Panics if the type_id provided is not present in the array’s Union DataType.

pub fn type_id(&self, index: usize) -> i8

Returns the type_id for the array slot at index.

§Panics

Panics if index is greater than or equal to the length of the array.

pub fn type_ids(&self) -> &ScalarBuffer<i8>

Returns the type_ids buffer for this array

pub fn offsets(&self) -> Option<&ScalarBuffer<i32>>

Returns the offsets buffer if this is a dense array

pub fn value_offset(&self, index: usize) -> usize

Returns the offset into the underlying values array for the array slot at index.

§Panics

Panics if index is greater than or equal to the length of the array.

pub fn value(&self, i: usize) -> ArrayRef

Returns the array’s value at index i.

§Panics

Panics if index i is out of bounds

pub fn type_names(&self) -> Vec<&str>

Returns the names of the types in the union.

pub fn slice(&self, offset: usize, length: usize) -> Self

Returns a zero-copy slice of this array with the indicated offset and length.
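
One way a zero-copy slice can work for the dense layout is sketched below on plain Rust slices. slice_dense is a hypothetical helper, and the idea that the children stay untouched is an assumption made for this illustration rather than a description of the arrow_array internals.

```rust
/// Narrows a toy dense union to `length` slots starting at `offset`.
/// Hypothetical helper: it borrows sub-slices of the per-slot buffers and
/// copies nothing. Panics if `offset + length` exceeds the buffer length.
pub fn slice_dense<'a>(
    type_ids: &'a [i8],
    offsets: &'a [i32],
    offset: usize,
    length: usize,
) -> (&'a [i8], &'a [i32]) {
    let end = offset + length;
    // The stored offsets still point at absolute positions in the children,
    // so the child arrays can be shared with the original unchanged.
    (&type_ids[offset..end], &offsets[offset..end])
}
```

Slicing the union [1, 3.2, 34] (type ids [0, 1, 0], offsets [0, 0, 1]) at offset 1 with length 2 yields type ids [1, 0] and offsets [0, 1], which still select 3.2 and 34 from the original children.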

pub fn into_parts( self, ) -> (UnionFields, ScalarBuffer<i8>, Option<ScalarBuffer<i32>>, Vec<ArrayRef>)

Deconstruct this array into its constituent parts

§Example

use arrow_array::builder::UnionBuilder;
use arrow_array::types::Int32Type;
use arrow_array::UnionArray;

let mut builder = UnionBuilder::new_dense();
builder.append::<Int32Type>("a", 1).unwrap();
let union_array = builder.build().unwrap();

// Deconstruct into parts
let (union_fields, type_ids, offsets, children) = union_array.into_parts();

// Reconstruct from parts
let union_array = UnionArray::try_new(
    union_fields,
    type_ids,
    offsets,
    children,
).unwrap();

Trait Implementations§

impl Array for UnionArray

fn is_null(&self, _index: usize) -> bool

Always returns false for union arrays, as there is no validity buffer. To check validity correctly you must check the underlying child array.

fn is_valid(&self, _index: usize) -> bool

Always returns true for union arrays, as there is no validity buffer. To check validity correctly you must check the underlying child array.

fn null_count(&self) -> usize

Always returns 0 for union arrays, as there is no validity buffer. To get the null count correctly you must check the underlying child arrays.
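
What checking the children can look like is sketched below on a toy sparse union. This is an illustration only: logical_null_count is a hypothetical helper, the children are modelled as Vec<Option<f64>> rather than Arrow arrays, and the type id is assumed to double as the child index.

```rust
/// Counts the slots whose selected child value is null in a toy sparse union.
/// Hypothetical helper for illustration; not part of arrow_array.
pub fn logical_null_count(type_ids: &[i8], children: &[Vec<Option<f64>>]) -> usize {
    type_ids
        .iter()
        .enumerate()
        // Only the child selected by the slot's type id decides validity;
        // placeholder nulls in the other children are ignored.
        .filter(|&(slot, &type_id)| children[type_id as usize][slot].is_none())
        .count()
}
```

For the sparse union [1, 3.2, 34] the count is 0 even though both children contain placeholder nulls, because none of those nulls sits in a slot that selects that child.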

fn as_any(&self) -> &dyn Any

Returns the array as Any so that it can be downcast to a specific implementation.

fn to_data(&self) -> ArrayData

Returns the underlying data of this array

fn into_data(self) -> ArrayData

Returns the underlying data of this array

fn data_type(&self) -> &DataType

Returns a reference to the DataType of this array.

fn slice(&self, offset: usize, length: usize) -> ArrayRef

Returns a zero-copy slice of this array with the indicated offset and length.

fn len(&self) -> usize

Returns the length (i.e., number of elements) of this array.

fn is_empty(&self) -> bool

Returns whether this array is empty.

fn offset(&self) -> usize

Returns the offset into the underlying data used by this array(-slice). Note that the underlying data can be shared by many arrays. This defaults to 0.

fn nulls(&self) -> Option<&NullBuffer>

Returns the null buffer of this array if any.

fn get_buffer_memory_size(&self) -> usize

Returns the total number of bytes of memory pointed to by this array. The buffers store bytes in the Arrow memory format, and include the data as well as the validity map. Note that this does not always correspond to the exact memory usage of an array, since multiple arrays can share the same buffers or slices thereof.

fn get_array_memory_size(&self) -> usize

Returns the total number of bytes of memory occupied physically by this array. This value will always be greater than the value returned by get_buffer_memory_size() and includes the overhead of the data structures that contain the pointers to the various buffers.

fn logical_nulls(&self) -> Option<NullBuffer>

Returns a potentially computed NullBuffer that represents the logical null values of this array, if any.

fn is_nullable(&self) -> bool

Returns false if the array is guaranteed to not contain any logical nulls.

impl Clone for UnionArray

fn clone(&self) -> UnionArray

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for UnionArray

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl From<ArrayData> for UnionArray

fn from(data: ArrayData) -> Self

Converts to this type from the input type.

impl From<UnionArray> for ArrayData

fn from(array: UnionArray) -> Self

Converts to this type from the input type.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

default unsafe fn clone_to_uninit(&self, dst: *mut T)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst.

impl<T> Datum for T
where T: Array,

fn get(&self) -> (&dyn Array, bool)

Returns the value for this Datum and a boolean indicating if the value is scalar

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.