Struct arrow_array::array::GenericByteArray
pub struct GenericByteArray<T: ByteArrayType> { /* private fields */ }
An array of variable-length byte arrays.
See StringArray and LargeStringArray for storing UTF-8 encoded string data.
See BinaryArray and LargeBinaryArray for storing arbitrary bytes.
Example: From a Vec
use arrow_array::GenericByteArray;
use arrow_array::types::Utf8Type;
let arr: GenericByteArray<Utf8Type> = vec!["hello", "world", ""].into();
assert_eq!(arr.value_data(), b"helloworld");
assert_eq!(arr.value_offsets(), &[0, 5, 10, 10]);
let values: Vec<_> = arr.iter().collect();
assert_eq!(values, &[Some("hello"), Some("world"), Some("")]);
Example: From an optional Vec
use arrow_array::GenericByteArray;
use arrow_array::types::Utf8Type;
let arr: GenericByteArray<Utf8Type> = vec![Some("hello"), Some("world"), Some(""), None].into();
assert_eq!(arr.value_data(), b"helloworld");
assert_eq!(arr.value_offsets(), &[0, 5, 10, 10, 10]);
let values: Vec<_> = arr.iter().collect();
assert_eq!(values, &[Some("hello"), Some("world"), Some(""), None]);
Example: From an iterator of option
use arrow_array::GenericByteArray;
use arrow_array::types::Utf8Type;
let arr: GenericByteArray<Utf8Type> = (0..5).map(|x| (x % 2 == 0).then(|| x.to_string())).collect();
let values: Vec<_> = arr.iter().collect();
assert_eq!(values, &[Some("0"), None, Some("2"), None, Some("4")]);
Example: Using Builder
use arrow_array::builder::GenericByteBuilder;
use arrow_array::types::Utf8Type;
let mut builder = GenericByteBuilder::<Utf8Type>::new();
builder.append_value("hello");
builder.append_null();
builder.append_value("world");
let array = builder.finish();
let values: Vec<_> = array.iter().collect();
assert_eq!(values, &[Some("hello"), None, Some("world")]);
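The buffers inspected in the examples above follow a simple layout: the values are concatenated into one byte buffer, and one offset is recorded per element boundary. The sketch below models that layout with the standard library only. `build_layout` is a hypothetical helper for illustration, not part of arrow_array, and it assumes 32-bit offsets and a plain `Vec<bool>` in place of a null buffer.

```rust
/// Std-only model of the variable-length byte layout (not the arrow implementation).
/// Returns (concatenated value bytes, offsets, validity flags).
fn build_layout(items: &[Option<&str>]) -> (Vec<u8>, Vec<i32>, Vec<bool>) {
    let mut data = Vec::new();
    // There is always one more offset than there are elements.
    let mut offsets = vec![0i32];
    let mut validity = Vec::with_capacity(items.len());
    for item in items {
        if let Some(s) = item {
            data.extend_from_slice(s.as_bytes());
        }
        // A null contributes no bytes, so its offset repeats the previous one.
        offsets.push(data.len() as i32);
        validity.push(item.is_some());
    }
    (data, offsets, validity)
}
```

For `[Some("hello"), Some("world"), Some(""), None]` this model produces the data `b"helloworld"` and the offsets `[0, 5, 10, 10, 10]`, the same values asserted in the optional Vec example.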
Implementations
impl<OffsetSize: OffsetSizeTrait> GenericByteArray<GenericBinaryType<OffsetSize>>
pub const fn get_data_type() -> DataType
👎Deprecated: please use Self::DATA_TYPE instead
Get the data type of the array.
pub fn from_vec(v: Vec<&[u8]>) -> Self
Creates a GenericBinaryArray from a vector of byte slices
See also Self::from_iter_values
pub fn from_opt_vec(v: Vec<Option<&[u8]>>) -> Self
Creates a GenericBinaryArray from a vector of optional byte slices, where None represents a null
pub fn from_iter_values<Ptr, I>(iter: I) -> Self
where
    Ptr: AsRef<[u8]>,
    I: IntoIterator<Item = Ptr>,
Creates a GenericBinaryArray based on an iterator of values without nulls
pub fn take_iter<'a>(
    &'a self,
    indexes: impl Iterator<Item = Option<usize>> + 'a
) -> impl Iterator<Item = Option<&[u8]>> + 'a
Returns an iterator that yields array.value(i) for each element i of indexes
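As a rough illustration of this index-to-value mapping, the following std-only sketch applies an iterator of optional indexes to a plain slice of byte slices. `take_values` is a hypothetical stand-in for the array method, and it assumes that a None index is passed through as None, which is what the Option item types in the signature suggest.

```rust
/// Std-only model of a "take": looks up `values[i]` for every `Some(i)`
/// and passes `None` through unchanged. Panics if an index is out of bounds.
fn take_values<'a>(
    values: &'a [&'a [u8]],
    indexes: impl Iterator<Item = Option<usize>> + 'a,
) -> impl Iterator<Item = Option<&'a [u8]>> + 'a {
    indexes.map(move |opt_index| opt_index.map(|i| values[i]))
}
```

The returned iterator is lazy, so no lookup happens until it is consumed, for example with `collect()`.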
pub unsafe fn take_iter_unchecked<'a>(
    &'a self,
    indexes: impl Iterator<Item = Option<usize>> + 'a
) -> impl Iterator<Item = Option<&[u8]>> + 'a
Returns an iterator that yields array.value(i) for each element i of indexes, without bounds checking
Safety
The caller must ensure that every index in the iterator is less than array.len()
impl<T: ByteArrayType> GenericByteArray<T>
pub fn new(
    offsets: OffsetBuffer<T::Offset>,
    values: Buffer,
    nulls: Option<NullBuffer>
) -> Self
Create a new GenericByteArray from the provided parts, panicking on failure
Panics
Panics if GenericByteArray::try_new returns an error
pub fn try_new(
    offsets: OffsetBuffer<T::Offset>,
    values: Buffer,
    nulls: Option<NullBuffer>
) -> Result<Self, ArrowError>
Create a new GenericByteArray from the provided parts, returning an error on failure
Errors
- offsets.len() - 1 != nulls.len()
- Any consecutive pair of offsets does not denote a valid slice of values
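The two error conditions can be pictured with a std-only model. This is a minimal sketch under stated assumptions, not the library's validation code: `validate_parts` is a hypothetical function, offsets are plain `i32` values, and the null buffer is modelled as one `bool` per element. Type-specific checks, such as UTF-8 validity for string arrays, are left out.

```rust
/// Std-only model of the documented error conditions (not the arrow implementation).
/// `nulls` holds one validity flag per element (`true` = valid).
fn validate_parts(offsets: &[i32], values: &[u8], nulls: Option<&[bool]>) -> Result<(), String> {
    if offsets.is_empty() {
        return Err("offsets must contain at least one entry".to_string());
    }
    // An array of `len` elements is described by `len + 1` offsets.
    let len = offsets.len() - 1;
    if let Some(flags) = nulls {
        if flags.len() != len {
            return Err(format!("expected {} validity flags, got {}", len, flags.len()));
        }
    }
    for pair in offsets.windows(2) {
        let (start, end) = (pair[0], pair[1]);
        // A pair denotes a valid slice only if 0 <= start <= end <= values.len().
        if start < 0 || end < start || end as usize > values.len() {
            return Err(format!("offsets {}..{} are not a valid slice of values", start, end));
        }
    }
    Ok(())
}
```

With the data `b"helloworld"`, the offsets `[0, 5, 10, 10]` pass this check, while `[0, 5, 11]` fails because the last pair reaches past the end of the values.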
pub fn new_unchecked(
    offsets: OffsetBuffer<T::Offset>,
    values: Buffer,
    nulls: Option<NullBuffer>
) -> Self
Create a new GenericByteArray from the provided parts, without validation
Safety
Safe if Self::try_new would not error
pub fn new_null(len: usize) -> Self
Create a new GenericByteArray of length len where all values are null
pub fn into_parts(self) -> (OffsetBuffer<T::Offset>, Buffer, Option<NullBuffer>)
Deconstruct this array into its constituent parts
pub fn value_length(&self, i: usize) -> T::Offset
Returns the length of the value at index i
pub fn offsets(&self) -> &OffsetBuffer<T::Offset>
Returns a reference to the offsets of this array
Unlike Self::value_offsets, this returns the OffsetBuffer, allowing for zero-copy cloning
pub fn values(&self) -> &Buffer
Returns the values of this array
Unlike Self::value_data, this returns the Buffer, allowing for zero-copy cloning
pub fn value_data(&self) -> &[u8]
Returns the raw value data
pub fn value_offsets(&self) -> &[T::Offset]
Returns the offset values in the offsets buffer
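The raw value data and the offsets together locate every element: element i occupies the bytes between offset i and offset i + 1. The following std-only sketch shows that lookup on plain slices. `value_at` and `value_length_at` are hypothetical helpers that assume `i32` offsets; they return None instead of panicking when the index is out of range.

```rust
/// Std-only model: the bytes of element `i`, or `None` if `i` is out of bounds.
fn value_at<'a>(offsets: &[i32], data: &'a [u8], i: usize) -> Option<&'a [u8]> {
    let start = *offsets.get(i)? as usize;
    let end = *offsets.get(i + 1)? as usize;
    data.get(start..end)
}

/// Std-only model: the byte length of element `i`, computed from two offsets only.
fn value_length_at(offsets: &[i32], i: usize) -> Option<i32> {
    Some(*offsets.get(i + 1)? - *offsets.get(i)?)
}
```

Because the length is just the difference of two adjacent offsets, it can be obtained without touching the value bytes.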
pub unsafe fn value_unchecked(&self, i: usize) -> &T::Native
Returns the element at index i
Safety
The caller is responsible for ensuring that the index is within the bounds of the array
pub fn slice(&self, offset: usize, length: usize) -> Self
Returns a zero-copy slice of this array with the indicated offset and length.
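One way to picture a zero-copy slice of this layout is that only the window of offsets changes while the value bytes stay shared. The std-only sketch below works under that assumption, which describes the general technique rather than anything this page states about the implementation. `slice_offsets` and `sliced_value` are hypothetical helpers.

```rust
/// Std-only model: a slice of `length` elements starting at `offset`
/// needs `length + 1` consecutive offsets. Panics if the range is out of bounds.
fn slice_offsets(offsets: &[i32], offset: usize, length: usize) -> &[i32] {
    &offsets[offset..offset + length + 1]
}

/// Reads element `i` of the slice from the original, unmodified value bytes.
fn sliced_value<'a>(sliced: &[i32], data: &'a [u8], i: usize) -> &'a [u8] {
    &data[sliced[i] as usize..sliced[i + 1] as usize]
}
```

In this model no value bytes are copied or moved: the sliced offsets still point into the same data buffer, so index 0 of the slice resolves to the element that was at index `offset` in the original.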
pub fn into_builder(self) -> Result<GenericByteBuilder<T>, Self>
Returns a GenericByteBuilder for this byte array, allowing its values to be mutated, if the underlying offset and data buffers are not shared by others. Otherwise the original array is returned as the Err value.
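The "succeeds only when unshared" contract resembles the standard library's `Arc::try_unwrap`. The sketch below is an analogy only: `into_mutable` is a hypothetical function over an `Arc<Vec<u8>>`, and nothing here is taken from the arrow buffer types.

```rust
use std::sync::Arc;

/// Analogy: yields the owned, mutable buffer only if no other handle shares it;
/// otherwise hands the original handle back unchanged in the `Err` variant.
fn into_mutable(buffer: Arc<Vec<u8>>) -> Result<Vec<u8>, Arc<Vec<u8>>> {
    Arc::try_unwrap(buffer)
}
```

Returning the input in the error case means the caller loses nothing when the conversion is refused, and can fall back to copying the data.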
impl<OffsetSize: OffsetSizeTrait> GenericByteArray<GenericStringType<OffsetSize>>
pub const fn get_data_type() -> DataType
👎Deprecated: please use Self::DATA_TYPE instead
Get the data type of the array.
pub fn num_chars(&self, i: usize) -> usize
Returns the number of Unicode scalar values in the string at index i.
Performance
This function has O(n) time complexity, where n is the string length. If you can ensure that all chars in the string are in the range U+0000 to U+007F, please use the function value_length instead, which has O(1) time complexity.
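The difference between the two counts can be seen on plain `&str` values with the standard library alone. The helper names below are hypothetical and only mirror the distinction drawn above: counting scalar values requires decoding the string, while the byte length is stored.

```rust
/// Counts Unicode scalar values by decoding the string: O(n) in its length.
fn num_chars(s: &str) -> usize {
    s.chars().count()
}

/// Reads the stored byte length: O(1).
fn byte_length(s: &str) -> usize {
    s.len()
}
```

For pure ASCII input such as `"hello"` both functions return 5. For `"h\u{e9}llo"` the character count is still 5 but the byte length is 6, because U+00E9 takes two bytes in UTF-8.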
pub fn from_iter_values<Ptr, I>(iter: I) -> Self
where
    Ptr: AsRef<str>,
    I: IntoIterator<Item = Ptr>,
Creates a GenericStringArray based on an iterator of values without nulls
pub fn take_iter<'a>(
    &'a self,
    indexes: impl Iterator<Item = Option<usize>> + 'a
) -> impl Iterator<Item = Option<&str>> + 'a
Returns an iterator that yields array.value(i) for each element i of indexes
pub unsafe fn take_iter_unchecked<'a>(
    &'a self,
    indexes: impl Iterator<Item = Option<usize>> + 'a
) -> impl Iterator<Item = Option<&str>> + 'a
Returns an iterator that yields array.value(i) for each element i of indexes, without bounds checking
Safety
The caller must ensure that every index in the iterator is less than array.len()
pub fn try_from_binary(
    v: GenericBinaryArray<OffsetSize>
) -> Result<Self, ArrowError>
Fallibly creates a GenericStringArray from a GenericBinaryArray, returning an error if the GenericBinaryArray contains invalid UTF-8 data
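The fallible conversion can be modelled with the standard library's own UTF-8 check. This is a sketch under stated assumptions rather than the library's code: `strings_from_binary` is a hypothetical function, and plain vectors of optional byte vectors stand in for the binary and string arrays.

```rust
use std::string::FromUtf8Error;

/// Std-only model: converts every non-null byte value to a `String`,
/// failing on the first value that is not valid UTF-8. Nulls stay null.
fn strings_from_binary(
    items: Vec<Option<Vec<u8>>>,
) -> Result<Vec<Option<String>>, FromUtf8Error> {
    items
        .into_iter()
        .map(|item| item.map(String::from_utf8).transpose())
        .collect()
}
```

Collecting into a `Result` stops at the first invalid value, so a single bad element rejects the whole conversion, in line with the all-or-nothing signature above.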
Trait Implementations
impl<T: ByteArrayType> Array for GenericByteArray<T>
fn slice(&self, offset: usize, length: usize) -> ArrayRef
fn offset(&self) -> usize
Returns the offset into the underlying data used by this array. Defaults to 0.
fn nulls(&self) -> Option<&NullBuffer>
fn get_buffer_memory_size(&self) -> usize
fn get_array_memory_size(&self) -> usize
Returns the total memory occupied by this array. This is greater than the value returned by get_buffer_memory_size() and includes the overhead of the data structures that contain the pointers to the various buffers.
fn is_null(&self, index: usize) -> bool
Returns whether the element at index is null.
When using this function on a slice, the index is relative to the slice.
fn is_valid(&self, index: usize) -> bool
Returns whether the element at index is not null.
When using this function on a slice, the index is relative to the slice.