Struct datafusion::functions_aggregate::regr::RegrAccumulator

pub struct RegrAccumulator { /* private fields */ }

RegrAccumulator is used to compute linear regression aggregate functions by maintaining statistics needed to compute them in an online fashion.

This struct uses Welford’s online algorithm for calculating variance and covariance: https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm

Given the statistics, the following aggregate functions can be calculated:

regr_slope(y, x): Slope of the linear regression line, calculated as: cov_pop(x, y) / var_pop(x). It represents the expected change in Y for a one-unit change in X.
regr_intercept(y, x): Intercept of the linear regression line, calculated as: mean_y - (regr_slope(y, x) * mean_x). It represents the expected value of Y when X is 0.
regr_count(y, x): Number of input rows in which both x and y are non-null.
regr_r2(y, x): R-squared value (coefficient of determination), calculated as: (cov_pop(x, y) ^ 2) / (var_pop(x) * var_pop(y)). It provides a measure of how well the model’s predictions match the observed data.
regr_avgx(y, x): Average of the independent variable X, calculated as: mean_x.
regr_avgy(y, x): Average of the dependent variable Y, calculated as: mean_y.
regr_sxx(y, x): Sum of squared deviations of the independent variable X from its mean, calculated as: m2_x.
regr_syy(y, x): Sum of squared deviations of the dependent variable Y from its mean, calculated as: m2_y.
regr_sxy(y, x): Sum of products of the paired deviations of X and Y from their means, calculated as: algo_const.

The population statistics used in the formulas above are derived from the state maintained in this struct as follows:

cov_pop(x, y): algo_const / count.
var_pop(x): m2_x / count.
var_pop(y): m2_y / count.
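
As a rough illustration of the formulas above, the sketch below maintains the same running statistics with a Welford-style update and derives regr_slope, regr_intercept and regr_r2 from them. The RegrStats type and its methods are hypothetical and are not DataFusion's implementation; only the field names are borrowed from the description above, and returning None for degenerate inputs is an assumption.

```rust
/// Hypothetical running statistics for paired (x, y) observations.
/// Field names mirror the description above; this is not DataFusion's type.
#[derive(Debug, Default, Clone, Copy)]
struct RegrStats {
    count: u64,
    mean_x: f64,
    mean_y: f64,
    m2_x: f64,       // sum of squared deviations of x from mean_x
    m2_y: f64,       // sum of squared deviations of y from mean_y
    algo_const: f64, // sum of products of the x and y deviations
}

impl RegrStats {
    /// Welford-style update with one (x, y) pair where both values are non-null.
    fn update(&mut self, x: f64, y: f64) {
        self.count += 1;
        let n = self.count as f64;
        // Deviations from the means *before* this observation.
        let dx = x - self.mean_x;
        let dy = y - self.mean_y;
        self.mean_x += dx / n;
        self.mean_y += dy / n;
        // Each increment is (old deviation) * (new deviation).
        self.m2_x += dx * (x - self.mean_x);
        self.m2_y += dy * (y - self.mean_y);
        self.algo_const += dx * (y - self.mean_y);
    }

    /// cov_pop(x, y) / var_pop(x); the `count` divisors cancel.
    fn slope(&self) -> Option<f64> {
        if self.count < 2 || self.m2_x == 0.0 {
            return None; // assumption: undefined slope is reported as None
        }
        Some(self.algo_const / self.m2_x)
    }

    /// mean_y - slope * mean_x
    fn intercept(&self) -> Option<f64> {
        self.slope().map(|s| self.mean_y - s * self.mean_x)
    }

    /// cov_pop(x, y)^2 / (var_pop(x) * var_pop(y)); the `count` divisors cancel.
    fn r2(&self) -> Option<f64> {
        if self.count < 2 || self.m2_x == 0.0 || self.m2_y == 0.0 {
            return None; // assumption: degenerate cases are reported as None
        }
        Some(self.algo_const * self.algo_const / (self.m2_x * self.m2_y))
    }
}
```

Feeding the points (1, 3), (2, 5), (3, 7), which lie on y = 2x + 1, through update leaves m2_x = 2, m2_y = 8 and algo_const = 4, so the slope is 4 / 2 = 2, the intercept is 5 - 2 * 2 = 1, and R-squared is 16 / 16 = 1.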

Implementations

impl RegrAccumulator

pub fn try_new(regr_type: &RegrType) -> Result<RegrAccumulator, DataFusionError>

Creates a new RegrAccumulator

Trait Implementations

impl Accumulator for RegrAccumulator

fn state(&mut self) -> Result<Vec<ScalarValue>, DataFusionError>

Returns the intermediate state of the accumulator, consuming the intermediate state.

fn update_batch(&mut self, values: &[Arc<dyn Array>]) -> Result<(), DataFusionError>

Updates the accumulator’s state from its input.

fn supports_retract_batch(&self) -> bool

Returns whether the accumulator supports incrementally updating its value by removing values.

fn retract_batch(&mut self, values: &[Arc<dyn Array>]) -> Result<(), DataFusionError>

Retracts (removes) an update (caused by the given inputs) to the accumulator’s state.
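
To make the idea of retraction concrete, the sketch below reverses a single Welford-style update, which is one way a sliding-window aggregate can drop a row that has left the window. WindowStats and retract are hypothetical names, and the arithmetic is a generic inverse of the online update, not a transcription of DataFusion's retract_batch.

```rust
/// Hypothetical running statistics; field names follow the struct description.
#[derive(Debug, Default, Clone, Copy)]
struct WindowStats {
    count: u64,
    mean_x: f64,
    mean_y: f64,
    m2_x: f64,
    m2_y: f64,
    algo_const: f64,
}

impl WindowStats {
    /// Removes one previously added (x, y) pair by inverting the online update.
    fn retract(&mut self, x: f64, y: f64) {
        if self.count <= 1 {
            // Removing the last pair leaves an empty state.
            *self = WindowStats::default();
            return;
        }
        let n = self.count as f64;
        let n_new = n - 1.0;
        // Means of the remaining pairs.
        let mean_x_new = (n * self.mean_x - x) / n_new;
        let mean_y_new = (n * self.mean_y - y) / n_new;
        // The forward update added (deviation from the mean without the pair)
        // times (deviation from the mean with the pair); subtract the same terms.
        self.m2_x -= (x - mean_x_new) * (x - self.mean_x);
        self.m2_y -= (y - mean_y_new) * (y - self.mean_y);
        self.algo_const -= (x - mean_x_new) * (y - self.mean_y);
        self.count -= 1;
        self.mean_x = mean_x_new;
        self.mean_y = mean_y_new;
    }
}
```

Repeated retraction can accumulate floating-point error, so a long-running window built this way may drift slightly from a from-scratch computation.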

fn merge_batch(&mut self, states: &[Arc<dyn Array>]) -> Result<(), DataFusionError>

Updates the accumulator’s state from an Array containing one or more intermediate values.
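
Merging intermediate states amounts to combining two sets of running statistics computed over disjoint partitions. The sketch below uses the standard pairwise combination formulas for means, sums of squared deviations and co-moments; PartialState and merge are hypothetical names, and the layout of the state that DataFusion actually serializes is not assumed here.

```rust
/// Hypothetical partial state for one partition of the input.
#[derive(Debug, Default, Clone, Copy)]
struct PartialState {
    count: u64,
    mean_x: f64,
    mean_y: f64,
    m2_x: f64,
    m2_y: f64,
    algo_const: f64,
}

/// Combines the statistics of two disjoint partitions into one state.
fn merge(a: PartialState, b: PartialState) -> PartialState {
    if a.count == 0 {
        return b;
    }
    if b.count == 0 {
        return a;
    }
    let (na, nb) = (a.count as f64, b.count as f64);
    let n = na + nb;
    // Differences between the partition means.
    let dx = b.mean_x - a.mean_x;
    let dy = b.mean_y - a.mean_y;
    // Weight of the cross-partition correction term.
    let w = na * nb / n;
    PartialState {
        count: a.count + b.count,
        mean_x: a.mean_x + dx * nb / n,
        mean_y: a.mean_y + dy * nb / n,
        m2_x: a.m2_x + b.m2_x + dx * dx * w,
        m2_y: a.m2_y + b.m2_y + dy * dy * w,
        algo_const: a.algo_const + b.algo_const + dx * dy * w,
    }
}
```

For example, merging the state for {(1, 3), (2, 5)} with the state for {(3, 7)} gives count = 3, mean_x = 2, mean_y = 5, m2_x = 2, m2_y = 8 and algo_const = 4, the same values a single pass over all three points produces.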

fn evaluate(&mut self) -> Result<ScalarValue, DataFusionError>

Returns the final aggregate value, consuming the internal state.

fn size(&self) -> usize

Returns the allocated size required for this accumulator, in bytes, including Self.

impl Debug for RegrAccumulator

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter.

Auto Trait Implementations

impl Freeze for RegrAccumulator

impl RefUnwindSafe for RegrAccumulator

impl Send for RegrAccumulator

impl Sync for RegrAccumulator

impl Unpin for RegrAccumulator

impl UnwindSafe for RegrAccumulator

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Same for T

type Output = T

Should always be Self

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,