Struct datafusion::functions_aggregate::regr::RegrAccumulator
pub struct RegrAccumulator { /* private fields */ }
RegrAccumulator is used to compute linear regression aggregate functions by maintaining the statistics needed to compute them in an online fashion.
This struct uses Welford’s online algorithm for calculating variance and covariance: https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm
Given the statistics, the following aggregate functions can be calculated:
- regr_slope(y, x): Slope of the linear regression line, calculated as cov_pop(x, y) / var_pop(x). It represents the expected change in Y for a one-unit change in X.
- regr_intercept(y, x): Intercept of the linear regression line, calculated as mean_y - (regr_slope(y, x) * mean_x). It represents the expected value of Y when X is 0.
- regr_count(y, x): Count of the input rows in which both x and y are non-null.
- regr_r2(y, x): R-squared value (coefficient of determination), calculated as (cov_pop(x, y) ^ 2) / (var_pop(x) * var_pop(y)). It measures how well the model’s predictions match the observed data.
- regr_avgx(y, x): Average of the independent variable X, calculated as mean_x.
- regr_avgy(y, x): Average of the dependent variable Y, calculated as mean_y.
- regr_sxx(y, x): Sum of squares of the independent variable X (squared deviations from mean_x), calculated as m2_x.
- regr_syy(y, x): Sum of squares of the dependent variable Y (squared deviations from mean_y), calculated as m2_y.
- regr_sxy(y, x): Sum of products of paired values (deviations of X and Y from their means), calculated as algo_const.
Here’s how the statistics maintained in this struct are calculated:
- cov_pop(x, y): algo_const / count.
- var_pop(x): m2_x / count.
- var_pop(y): m2_y / count.
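The formulas above can be sketched in plain Rust. This is a minimal illustration, not DataFusion's implementation: the RegrStats type, its methods, and the choice to return None when X has no variance are all assumptions made for the example. Only the field names (count, mean_x, mean_y, m2_x, m2_y, algo_const) come from the description above.

```rust
/// Hypothetical stand-in for the running statistics described above.
#[derive(Debug, Default, Clone, Copy)]
struct RegrStats {
    count: u64,
    mean_x: f64,
    mean_y: f64,
    m2_x: f64,       // sum of squared deviations of x from mean_x
    m2_y: f64,       // sum of squared deviations of y from mean_y
    algo_const: f64, // sum of products of paired deviations
}

impl RegrStats {
    /// Welford-style update with one non-null (y, x) pair.
    fn update(&mut self, y: f64, x: f64) {
        self.count += 1;
        let n = self.count as f64;
        // Deviations from the means *before* this row is included.
        let dx = x - self.mean_x;
        let dy = y - self.mean_y;
        self.mean_x += dx / n;
        self.mean_y += dy / n;
        // Each term multiplies an old deviation by a deviation from the new mean.
        self.m2_x += dx * (x - self.mean_x);
        self.m2_y += dy * (y - self.mean_y);
        self.algo_const += dx * (y - self.mean_y);
    }

    /// cov_pop(x, y) / var_pop(x); the shared `count` divisor cancels.
    /// Returning None for empty input or constant x is an assumption.
    fn slope(&self) -> Option<f64> {
        if self.count == 0 || self.m2_x == 0.0 {
            None
        } else {
            Some(self.algo_const / self.m2_x)
        }
    }

    /// mean_y - slope * mean_x.
    fn intercept(&self) -> Option<f64> {
        self.slope().map(|s| self.mean_y - s * self.mean_x)
    }

    /// cov_pop(x, y)^2 / (var_pop(x) * var_pop(y)); the divisors cancel again.
    /// Returning None when either variable is constant is an assumption.
    fn r2(&self) -> Option<f64> {
        if self.count == 0 || self.m2_x == 0.0 || self.m2_y == 0.0 {
            None
        } else {
            Some(self.algo_const * self.algo_const / (self.m2_x * self.m2_y))
        }
    }
}
```

Feeding the four points of the line y = 2x + 1 for x = 1, 2, 3, 4 through update leaves m2_x = 5, m2_y = 20 and algo_const = 10, so the slope is 10 / 5 = 2 and the intercept is 6 - 2 * 2.5 = 1.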
Implementations
impl RegrAccumulator
pub fn try_new(regr_type: &RegrType) -> Result<RegrAccumulator, DataFusionError>
Creates a new RegrAccumulator
Trait Implementations
impl Accumulator for RegrAccumulator
fn state(&mut self) -> Result<Vec<ScalarValue>, DataFusionError>
fn update_batch(&mut self, values: &[Arc<dyn Array>]) -> Result<(), DataFusionError>
fn supports_retract_batch(&self) -> bool
fn retract_batch(&mut self, values: &[Arc<dyn Array>]) -> Result<(), DataFusionError>
fn merge_batch(&mut self, states: &[Arc<dyn Array>]) -> Result<(), DataFusionError>
Updates the accumulator’s state from an Array containing one or more intermediate values.
fn evaluate(&mut self) -> Result<ScalarValue, DataFusionError>
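Merging intermediate states and retracting rows can both be done on the running statistics alone, without revisiting the input. The sketch below shows the standard pairwise combination and the reverse of a Welford update on a hypothetical RegrState type. The type and the merge and retract functions are assumptions for illustration; they do not use DataFusion's Array or ScalarValue types and are not taken from its source.

```rust
/// Hypothetical intermediate state, mirroring the statistics named above.
#[derive(Debug, Default, Clone, Copy)]
struct RegrState {
    count: u64,
    mean_x: f64,
    mean_y: f64,
    m2_x: f64,
    m2_y: f64,
    algo_const: f64,
}

/// Combines two partial states as if all rows had been seen by one accumulator.
fn merge(a: RegrState, b: RegrState) -> RegrState {
    if a.count == 0 {
        return b;
    }
    if b.count == 0 {
        return a;
    }
    let (na, nb) = (a.count as f64, b.count as f64);
    let n = na + nb;
    let dx = b.mean_x - a.mean_x;
    let dy = b.mean_y - a.mean_y;
    // Weight applied to the cross terms between the two partitions.
    let w = na * nb / n;
    RegrState {
        count: a.count + b.count,
        mean_x: a.mean_x + dx * nb / n,
        mean_y: a.mean_y + dy * nb / n,
        m2_x: a.m2_x + b.m2_x + dx * dx * w,
        m2_y: a.m2_y + b.m2_y + dy * dy * w,
        algo_const: a.algo_const + b.algo_const + dx * dy * w,
    }
}

/// Removes one previously added (y, x) pair by reversing the online update.
fn retract(s: RegrState, y: f64, x: f64) -> RegrState {
    if s.count <= 1 {
        // Nothing remains after removing the last row.
        return RegrState::default();
    }
    let n_old = s.count as f64;
    let n_new = n_old - 1.0;
    let mean_x = (n_old * s.mean_x - x) / n_new;
    let mean_y = (n_old * s.mean_y - y) / n_new;
    RegrState {
        count: s.count - 1,
        mean_x,
        mean_y,
        // Subtract exactly the terms the forward update would have added.
        m2_x: s.m2_x - (x - mean_x) * (x - s.mean_x),
        m2_y: s.m2_y - (y - mean_y) * (y - s.mean_y),
        algo_const: s.algo_const - (x - mean_x) * (y - s.mean_y),
    }
}
```

Repeated retraction subtracts nearly equal floating-point values, so a long-running sliding window may accumulate rounding error that a fresh recomputation would not.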
Auto Trait Implementations
impl Freeze for RegrAccumulator
impl RefUnwindSafe for RegrAccumulator
impl Send for RegrAccumulator
impl Sync for RegrAccumulator
impl Unpin for RegrAccumulator
impl UnwindSafe for RegrAccumulator
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.