datafusion_physical_expr

Struct GroupsAccumulatorAdapter

pub struct GroupsAccumulatorAdapter { /* private fields */ }

An adapter that implements GroupsAccumulator for any Accumulator.

While Accumulators are simpler to implement and can support more general calculations (like retractable window functions), they are not as fast as a specialized GroupsAccumulator. This adapter bridges the gap so that the group by operator only needs to operate in terms of GroupsAccumulator.

Internally, this adapter creates a new Accumulator for each group, which stores the state for that group. This requires an allocation for each Accumulator and for the internal indices, in addition to whatever internal allocations the Accumulator itself requires.

For example, consider a MinAccumulator that computes the minimum string value with a ScalarValue::Utf8. It requires at least two allocations per group (one for the MinAccumulator and one for the ScalarValue::Utf8).

                      ┌─────────────────────────────────┐
                      │MinAccumulator {                 │
               ┌─────▶│ min: ScalarValue::Utf8("A")     │───────┐
               │      │}                                │       │
               │      └─────────────────────────────────┘       └───────▶   "A"
   ┌─────┐     │      ┌─────────────────────────────────┐
   │  0  │─────┘      │MinAccumulator {                 │
   ├─────┤     ┌─────▶│ min: ScalarValue::Utf8("Z")     │───────────────▶   "Z"
   │  1  │─────┘      │}                                │
   └─────┘            └─────────────────────────────────┘                   ...
     ...                 ...
   ┌─────┐            ┌────────────────────────────────┐
   │ N-2 │            │MinAccumulator {                │
   ├─────┤            │  min: ScalarValue::Utf8("A")   │────────────────▶   "A"
   │ N-1 │─────┐      │}                               │
   └─────┘     │      └────────────────────────────────┘
               │      ┌────────────────────────────────┐        ┌───────▶   "Q"
               │      │MinAccumulator {                │        │
               └─────▶│  min: ScalarValue::Utf8("Q")   │────────┘
                      │}                               │
                      └────────────────────────────────┘


 Logical group         Current Min/Max value for that group stored
    number             as a ScalarValue which points to an
                       individually allocated String
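
The layout in the diagram can be approximated with a small, standard-library-only sketch. Everything here is hypothetical and simplified for illustration: `SimpleAccumulator`, `MinStringAccumulator`, and `PerGroupAdapter` are stand-ins and not DataFusion types, and the real adapter works on Arrow arrays rather than string slices. The point is only that each group index maps to its own boxed accumulator, which in turn owns a separately allocated `String`.

```rust
// Hypothetical, simplified stand-in for DataFusion's `Accumulator` trait.
trait SimpleAccumulator {
    fn update(&mut self, value: &str);
    fn evaluate(&self) -> Option<String>;
}

// Tracks the minimum string seen so far; the `String` is its own heap allocation.
#[derive(Default)]
struct MinStringAccumulator {
    min: Option<String>,
}

impl SimpleAccumulator for MinStringAccumulator {
    fn update(&mut self, value: &str) {
        // Replace the stored minimum if there is none yet or `value` is smaller.
        let is_new_min = self.min.as_deref().map_or(true, |current| value < current);
        if is_new_min {
            self.min = Some(value.to_string());
        }
    }

    fn evaluate(&self) -> Option<String> {
        self.min.clone()
    }
}

// Simplified adapter: slot `i` of `states` holds the boxed accumulator for group `i`.
struct PerGroupAdapter {
    factory: Box<dyn Fn() -> Box<dyn SimpleAccumulator> + Send>,
    states: Vec<Box<dyn SimpleAccumulator>>,
}

impl PerGroupAdapter {
    fn new<F>(factory: F) -> Self
    where
        F: Fn() -> Box<dyn SimpleAccumulator> + Send + 'static,
    {
        Self { factory: Box::new(factory), states: Vec::new() }
    }

    // `group_indices[i]` is the group that `values[i]` belongs to.
    fn update_batch(&mut self, values: &[&str], group_indices: &[usize], total_num_groups: usize) {
        // Allocate one boxed accumulator for every group that has no slot yet.
        while self.states.len() < total_num_groups {
            self.states.push((self.factory)());
        }
        for (value, &group) in values.iter().zip(group_indices) {
            self.states[group].update(value);
        }
    }

    fn evaluate(&self) -> Vec<Option<String>> {
        self.states.iter().map(|state| state.evaluate()).collect()
    }
}

// Usage: two groups, three rows; group 0 sees "B" and "A", group 1 sees "Z".
fn min_per_group_example() -> Vec<Option<String>> {
    let mut adapter = PerGroupAdapter::new(|| {
        Box::new(MinStringAccumulator::default()) as Box<dyn SimpleAccumulator>
    });
    adapter.update_batch(&["B", "Z", "A"], &[0, 1, 0], 2);
    adapter.evaluate()
}
```

In this sketch every group costs at least one `Box` and one `String`, which matches the two allocations per group described above.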

Optimizations

The adapter minimizes the number of calls to Accumulator::update_batch by first collecting the input rows for each group into a contiguous array using compute::take.
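
A rough sketch of this batching idea, using plain slices instead of Arrow arrays, is shown here. The `take` function is a hand-written stand-in for Arrow's `compute::take` kernel, and `SumAccumulator`, `rows_per_group`, and `update_groups` are hypothetical names chosen for illustration. The actual adapter may order and slice rows differently; the sketch only shows why gathering rows first leads to one `update_batch` call per group rather than one per row.

```rust
// Gather the rows at `indices` into a new contiguous vector; a stand-in for
// Arrow's `compute::take` kernel applied to a plain slice.
fn take(values: &[i64], indices: &[usize]) -> Vec<i64> {
    indices.iter().map(|&i| values[i]).collect()
}

// For each group, collect the positions of the input rows that belong to it.
fn rows_per_group(group_indices: &[usize], total_num_groups: usize) -> Vec<Vec<usize>> {
    let mut rows = vec![Vec::new(); total_num_groups];
    for (row, &group) in group_indices.iter().enumerate() {
        rows[group].push(row);
    }
    rows
}

// Hypothetical accumulator that also counts how often `update_batch` is called.
#[derive(Default, Clone)]
struct SumAccumulator {
    sum: i64,
    calls: usize,
}

impl SumAccumulator {
    fn update_batch(&mut self, values: &[i64]) {
        self.calls += 1;
        self.sum += values.iter().sum::<i64>();
    }
}

// Update every group with a single call, passing all of its rows at once.
fn update_groups(accs: &mut [SumAccumulator], values: &[i64], group_indices: &[usize]) {
    let grouped = rows_per_group(group_indices, accs.len());
    for (group, rows) in grouped.iter().enumerate() {
        if rows.is_empty() {
            continue; // no rows for this group in the current batch
        }
        let contiguous = take(values, rows);
        accs[group].update_batch(&contiguous);
    }
}

// Usage: four rows interleaved across two groups.
fn batching_example() -> Vec<(i64, usize)> {
    let mut accs = vec![SumAccumulator::default(); 2];
    update_groups(&mut accs, &[10, 20, 30, 40], &[1, 0, 1, 0]);
    accs.iter().map(|acc| (acc.sum, acc.calls)).collect()
}
```

Although the rows of the two groups are interleaved in the input, each accumulator in this sketch is updated exactly once per batch.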

Implementations

impl GroupsAccumulatorAdapter

pub fn new<F>(factory: F) -> GroupsAccumulatorAdapter
where F: Fn() -> Result<Box<dyn Accumulator>, DataFusionError> + Send + 'static,

Creates a new adapter that builds a new Accumulator for each group using the specified factory function.

Trait Implementations

impl GroupsAccumulator for GroupsAccumulatorAdapter

fn update_batch(&mut self, values: &[Arc<dyn Array>], group_indices: &[usize], opt_filter: Option<&BooleanArray>, total_num_groups: usize) -> Result<(), DataFusionError>

Updates the accumulator’s state from its arguments, encoded as a vector of ArrayRefs.

fn evaluate(&mut self, emit_to: EmitTo) -> Result<Arc<dyn Array>, DataFusionError>

Returns the final aggregate value for each group as a single array, resetting the internal state.

fn state(&mut self, emit_to: EmitTo) -> Result<Vec<Arc<dyn Array>>, DataFusionError>

Returns the intermediate aggregate state for this accumulator, used for multi-phase grouping, resetting its internal state.

fn merge_batch(&mut self, values: &[Arc<dyn Array>], group_indices: &[usize], opt_filter: Option<&BooleanArray>, total_num_groups: usize) -> Result<(), DataFusionError>

Merges intermediate state (the output from Self::state) into this accumulator’s current state.
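
How state and merge_batch cooperate in multi-phase grouping can be sketched for a single group with a plain struct. `AvgAccumulator` and its `(sum, count)` state are hypothetical and are not DataFusion's implementation; the real methods operate on arrays covering many groups and take an `EmitTo` argument, which this sketch omits by always emitting and resetting everything.

```rust
// Hypothetical average accumulator whose intermediate state is (sum, count).
#[derive(Default)]
struct AvgAccumulator {
    sum: f64,
    count: u64,
}

impl AvgAccumulator {
    // Phase 1: consume raw input values.
    fn update_batch(&mut self, values: &[f64]) {
        self.sum += values.iter().sum::<f64>();
        self.count += values.len() as u64;
    }

    // Emit the intermediate state and reset the accumulator.
    fn state(&mut self) -> (f64, u64) {
        let out = (self.sum, self.count);
        *self = Self::default();
        out
    }

    // Phase 2: fold intermediate states produced by `state` into this accumulator.
    fn merge_batch(&mut self, states: &[(f64, u64)]) {
        for &(sum, count) in states {
            self.sum += sum;
            self.count += count;
        }
    }

    // Produce the final value and reset; `None` if no rows were seen.
    fn evaluate(&mut self) -> Option<f64> {
        let (sum, count) = self.state();
        if count == 0 {
            None
        } else {
            Some(sum / count as f64)
        }
    }
}

// Usage: two partial accumulators feed one final accumulator.
fn two_phase_example() -> Option<f64> {
    let mut partial_a = AvgAccumulator::default();
    let mut partial_b = AvgAccumulator::default();
    partial_a.update_batch(&[1.0, 2.0]);
    partial_b.update_batch(&[3.0, 6.0]);

    let states = [partial_a.state(), partial_b.state()];

    let mut final_acc = AvgAccumulator::default();
    final_acc.merge_batch(&states);
    final_acc.evaluate()
}
```

Merging `(sum, count)` pairs rather than partial averages keeps the final result exact regardless of how rows were split across the partial accumulators.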

fn size(&self) -> usize

Amount of memory used to store the state of this accumulator, in bytes.

fn convert_to_state(&self, values: &[Arc<dyn Array>], opt_filter: Option<&BooleanArray>) -> Result<Vec<Arc<dyn Array>>, DataFusionError>

Converts an input batch directly to the intermediate aggregate state.

fn supports_convert_to_state(&self) -> bool

Returns true if Self::convert_to_state is implemented to support intermediate aggregate state conversion.

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.