Struct datafusion_physical_expr_common::aggregate::groups_accumulator::accumulate::NullState
pub struct NullState { /* private fields */ }
Track the accumulator null state per row: if any values for that group were null and if any values have been seen at all for that group.
This is part of the inner loop for many GroupsAccumulators, and thus the performance is critical and so there are multiple specialized implementations, invoked depending on the specific combinations of the input.
Typically there are 4 potential combinations of inputs that must be special cased for performance:
- With / Without filter
- With / Without nulls in the input
If the input has nulls, then the accumulator must potentially handle each input null value specially (e.g. for SUM to mark the corresponding sum as null).
If there are filters present, NullState tracks if it has seen any value for that group (as some values may be filtered out). Without a filter, the accumulator is only passed groups that had at least one value to accumulate so they do not need to track if they have seen values for a particular group.
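The four input shapes can be illustrated with a minimal, std-only sketch. It is not the DataFusion implementation: `sum_by_group` is a hypothetical helper, `Option<i64>` stands in for a nullable Arrow value, and `Option<bool>` stands in for a nullable filter entry. Each `match` arm is one of the specialized loops described above.

```rust
/// Hypothetical sketch: per-group SUM with one loop per input shape.
/// `None` models an Arrow null. Groups that never saw a value yield `None`.
fn sum_by_group(
    group_indices: &[usize],
    values: &[Option<i64>],
    opt_filter: Option<&[Option<bool>]>,
    total_num_groups: usize,
) -> Vec<Option<i64>> {
    let mut sums = vec![0i64; total_num_groups];
    let mut seen = vec![false; total_num_groups];
    let has_nulls = values.iter().any(Option::is_none);
    let mut add = |g: usize, v: i64| {
        sums[g] += v;
        seen[g] = true;
    };

    match (has_nulls, opt_filter) {
        // No nulls, no filter: every row contributes.
        (false, None) => {
            for (&g, v) in group_indices.iter().zip(values) {
                add(g, v.expect("no nulls in this branch"));
            }
        }
        // Nulls, no filter: skip null values.
        (true, None) => {
            for (&g, v) in group_indices.iter().zip(values) {
                if let Some(v) = v {
                    add(g, *v);
                }
            }
        }
        // No nulls, filter: keep only rows whose filter entry is Some(true).
        (false, Some(filter)) => {
            for ((&g, v), keep) in group_indices.iter().zip(values).zip(filter) {
                if *keep == Some(true) {
                    add(g, v.expect("no nulls in this branch"));
                }
            }
        }
        // Nulls and filter: both checks are needed for every row.
        (true, Some(filter)) => {
            for ((&g, v), keep) in group_indices.iter().zip(values).zip(filter) {
                if *keep == Some(true) {
                    if let Some(v) = v {
                        add(g, *v);
                    }
                }
            }
        }
    }

    // A group whose `seen` flag is still false has a null sum.
    sums.into_iter()
        .zip(seen)
        .map(|(sum, was_seen)| was_seen.then_some(sum))
        .collect()
}
```

In this sketch the cheapest arm does no per-row null or filter check at all, which is the motivation for special casing the combinations.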
Implementations§
impl NullState
pub fn new() -> Self
pub fn size(&self) -> usize
Returns the size of all buffers allocated by this null state, not including self.
pub fn accumulate<T, F>(
    &mut self,
    group_indices: &[usize],
    values: &PrimitiveArray<T>,
    opt_filter: Option<&BooleanArray>,
    total_num_groups: usize,
    value_fn: F,
)
Invokes value_fn(group_index, value) for each non null, non filtered value of values, while tracking which groups have seen null inputs and which groups have seen any inputs if necessary.
§Arguments:
- values: the input arguments to the accumulator
- group_indices: the group to which each row in values belongs (aka group_index)
- opt_filter: if present, only rows for which the filter is Some(true) are included
- value_fn: function invoked for (group_index, value) where value is non null
§Example
┌─────────┐ ┌─────────┐ ┌ ─ ─ ─ ─ ┐
│ ┌─────┐ │ │ ┌─────┐ │ ┌─────┐
│ │ 2 │ │ │ │ 200 │ │ │ │ t │ │
│ ├─────┤ │ │ ├─────┤ │ ├─────┤
│ │ 2 │ │ │ │ 100 │ │ │ │ f │ │
│ ├─────┤ │ │ ├─────┤ │ ├─────┤
│ │ 0 │ │ │ │ 200 │ │ │ │ t │ │
│ ├─────┤ │ │ ├─────┤ │ ├─────┤
│ │ 1 │ │ │ │ 200 │ │ │ │NULL │ │
│ ├─────┤ │ │ ├─────┤ │ ├─────┤
│ │ 0 │ │ │ │ 300 │ │ │ │ t │ │
│ └─────┘ │ │ └─────┘ │ └─────┘
└─────────┘ └─────────┘ └ ─ ─ ─ ─ ┘
group_indices values opt_filter
In the example above, value_fn is invoked for each (group_index, value) pair where opt_filter[i] is true and values[i] is non null:
value_fn(2, 200)
value_fn(0, 200)
value_fn(0, 300)
It also sets self.seen_values[group_index] to true for all rows that had a non null value.
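The example above can be reproduced with a small std-only model of the contract. This is a sketch under stated assumptions, not the library's code: `accumulate_sketch` and `run_example` are hypothetical names, slices of `Option` replace the Arrow arrays, and the seen flags are passed in as a plain `Vec<bool>` instead of living in `NullState`.

```rust
/// Hypothetical model of the `accumulate` contract; `None` models a null.
fn accumulate_sketch<F>(
    group_indices: &[usize],
    values: &[Option<i64>],
    opt_filter: Option<&[Option<bool>]>,
    total_num_groups: usize,
    seen_values: &mut Vec<bool>,
    mut value_fn: F,
) where
    F: FnMut(usize, i64),
{
    assert_eq!(group_indices.len(), values.len());
    // Make room for newly created groups; they start as "not seen".
    if seen_values.len() < total_num_groups {
        seen_values.resize(total_num_groups, false);
    }
    for (i, (&group_index, value)) in group_indices.iter().zip(values).enumerate() {
        // A row passes if there is no filter or its filter entry is Some(true).
        let passes = match opt_filter {
            None => true,
            Some(filter) => filter[i] == Some(true),
        };
        if passes {
            if let Some(v) = value {
                seen_values[group_index] = true;
                value_fn(group_index, *v);
            }
        }
    }
}

/// Runs the inputs from the diagram and returns the recorded calls and flags.
fn run_example() -> (Vec<(usize, i64)>, Vec<bool>) {
    let group_indices: [usize; 5] = [2, 2, 0, 1, 0];
    let values: [Option<i64>; 5] = [Some(200), Some(100), Some(200), Some(200), Some(300)];
    let filter: [Option<bool>; 5] = [Some(true), Some(false), Some(true), None, Some(true)];
    let mut seen = Vec::new();
    let mut calls = Vec::new();
    accumulate_sketch(
        &group_indices,
        &values,
        Some(&filter[..]),
        3,
        &mut seen,
        |group_index, value| calls.push((group_index, value)),
    );
    (calls, seen)
}
```

With these inputs the sketch records the three calls listed above and leaves the flag for group 1 unset, because its only row has a NULL filter entry.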
pub fn accumulate_boolean<F>(
    &mut self,
    group_indices: &[usize],
    values: &BooleanArray,
    opt_filter: Option<&BooleanArray>,
    total_num_groups: usize,
    value_fn: F,
)
Invokes value_fn(group_index, value) for each non null, non filtered value in values, while tracking which groups have seen null inputs and which groups have seen any inputs, for BooleanArrays.
Since BooleanArray is not a PrimitiveArray it must be handled specially.
See Self::accumulate, which handles PrimitiveArrays, for more details on other arguments.
pub fn build(&mut self, emit_to: EmitTo) -> NullBuffer
Creates a NullBuffer representing which group_indices should have null values (because they never saw any values) for the emit_to rows.
Resets the internal state appropriately.
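A simplified model of this behavior follows. It is a sketch, not the actual implementation. It assumes `emit_to` either emits every group or only the first `n`, and that "resets appropriately" means keeping the flags for the groups that were not emitted. `EmitToSketch` and `NullStateSketch` are hypothetical stand-ins, and a `Vec<bool>` (true = valid) replaces the NullBuffer.

```rust
/// Hypothetical stand-in for `EmitTo`: emit all groups, or only the first `n`.
enum EmitToSketch {
    All,
    First(usize),
}

/// Hypothetical stand-in for `NullState`: one "seen" flag per group.
struct NullStateSketch {
    seen_values: Vec<bool>,
}

impl NullStateSketch {
    /// Returns validity flags for the emitted groups (false = null, because the
    /// group never saw a value) and retains flags only for the remaining groups.
    /// Panics if `n` exceeds the number of tracked groups.
    fn build(&mut self, emit_to: EmitToSketch) -> Vec<bool> {
        match emit_to {
            // Emit everything and leave the state empty.
            EmitToSketch::All => std::mem::take(&mut self.seen_values),
            // Emit the first `n` flags; the rest shift down to index 0.
            EmitToSketch::First(n) => {
                let remaining = self.seen_values.split_off(n);
                std::mem::replace(&mut self.seen_values, remaining)
            }
        }
    }
}
```

After a partial emit in this model, the group that was at index `n` is at index 0, so later group indices must be interpreted relative to the remaining groups.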