Struct arrow_array::array::StructArray
pub struct StructArray { /* private fields */ }
An array of structs
Each child (called field) is represented by a separate array.
Comparison with RecordBatch
Both RecordBatch and StructArray represent a collection of columns / arrays with the same length.
However, there are several key differences:
StructArray can be nested within another Array, including itself
RecordBatch can contain top-level metadata on its associated Schema
StructArray can contain top-level nulls, i.e. null
RecordBatch can only represent nulls in its child columns, i.e. {"field": null}
StructArray is therefore a more general data container than RecordBatch, and as such code that needs to handle both will typically share an implementation in terms of StructArray and convert to/from RecordBatch as necessary.
From implementations are provided to facilitate this conversion. However, converting from a StructArray containing top-level nulls to a RecordBatch will panic, as there is no way to preserve them.
Example: Create an array from a vector of fields
use std::sync::Arc;
use arrow_array::{Array, ArrayRef, BooleanArray, Int32Array, StructArray};
use arrow_schema::{DataType, Field};
let boolean = Arc::new(BooleanArray::from(vec![false, false, true, true]));
let int = Arc::new(Int32Array::from(vec![42, 28, 19, 31]));
let struct_array = StructArray::from(vec![
    (
        Arc::new(Field::new("b", DataType::Boolean, false)),
        boolean.clone() as ArrayRef,
    ),
    (
        Arc::new(Field::new("c", DataType::Int32, false)),
        int.clone() as ArrayRef,
    ),
]);
assert_eq!(struct_array.column(0).as_ref(), boolean.as_ref());
assert_eq!(struct_array.column(1).as_ref(), int.as_ref());
assert_eq!(4, struct_array.len());
assert_eq!(0, struct_array.null_count());
assert_eq!(0, struct_array.offset());
Implementations
impl StructArray
pub fn new( fields: Fields, arrays: Vec<ArrayRef>, nulls: Option<NullBuffer> ) -> Self
Create a new StructArray from the provided parts, panicking on failure
Panics
Panics if Self::try_new returns an error
pub fn try_new( fields: Fields, arrays: Vec<ArrayRef>, nulls: Option<NullBuffer> ) -> Result<Self, ArrowError>
Create a new StructArray from the provided parts, returning an error on failure
Errors
Errors if
fields.len() != arrays.len()
fields[i].data_type() != arrays[i].data_type()
arrays[i].len() != arrays[j].len()
arrays[i].len() != nulls.len()
!fields[i].is_nullable() && !nulls.contains(arrays[i].nulls())
pub fn new_null(fields: Fields, len: usize) -> Self
Create a new StructArray of length len where all values are null
pub unsafe fn new_unchecked( fields: Fields, arrays: Vec<ArrayRef>, nulls: Option<NullBuffer> ) -> Self
Create a new StructArray from the provided parts without validation
Safety
Safe if Self::new would not panic with the given arguments
pub fn into_parts(self) -> (Fields, Vec<ArrayRef>, Option<NullBuffer>)
Deconstruct this array into its constituent parts
pub fn num_columns(&self) -> usize
Return the number of fields in this struct array
pub fn columns_ref(&self) -> Vec<ArrayRef>
Deprecated: Use columns().to_vec()
Returns child array refs of the struct array
pub fn column_names(&self) -> Vec<&str>
Return field names in this struct array
pub fn fields(&self) -> &Fields
Returns the Fields of this StructArray
pub fn column_by_name(&self, column_name: &str) -> Option<&ArrayRef>
Returns the child array whose field name equals column_name
Note: A schema can currently have duplicate field names, in which case the first field will always be selected. This issue will be addressed in ARROW-11178.
Trait Implementations
impl Array for StructArray
fn slice(&self, offset: usize, length: usize) -> ArrayRef
fn offset(&self) -> usize
Returns the offset into the underlying data used by this array; defaults to 0.
fn nulls(&self) -> Option<&NullBuffer>
fn get_buffer_memory_size(&self) -> usize
fn get_array_memory_size(&self) -> usize
Returns the memory occupied by this array. The value is greater than that returned by get_buffer_memory_size() and includes the overhead of the data structures that contain the pointers to the various buffers.
fn is_null(&self, index: usize) -> bool
Returns whether the element at index is null. When using this function on a slice, the index is relative to the slice.
fn is_valid(&self, index: usize) -> bool
Returns whether the element at index is not null. When using this function on a slice, the index is relative to the slice.
fn null_count(&self) -> usize
impl Clone for StructArray
fn clone(&self) -> StructArray
1.0.0 · fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.