Struct datafusion_physical_plan::joins::utils::JoinHashMap
source · pub struct JoinHashMap { /* private fields */ }
Maps a u64 hash value, computed from the build side [“on” values], to a list of build-side row indices that share this key’s value.
By allocating a HashMap with capacity for at least the number of rows at the build side, we make sure that we don’t have to re-hash the hashmap, which would need access to the key (the hash in this case) value.
E.g. 1 -> [3, 6, 8] indicates that the column values map to rows 3, 6 and 8 for hash value 1.
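The pre-allocation point can be illustrated with a minimal sketch. It assumes std::collections::HashMap as a stand-in for the table used internally, and build_presized is a hypothetical helper, not part of this crate. A map created with enough capacity for every build-side row keeps the same capacity while the rows are inserted, so no re-hash takes place.

```rust
use std::collections::HashMap;

/// Hypothetical helper: builds a hash -> row-index map that is pre-sized
/// for all rows, and reports the capacity before and after the inserts.
fn build_presized(hashes: &[u64]) -> (HashMap<u64, u64>, usize, usize) {
    // Reserve room for at least one entry per build-side row.
    let mut map = HashMap::with_capacity(hashes.len());
    let before = map.capacity();
    for (row, &hash) in hashes.iter().enumerate() {
        // Only the latest row per hash is kept here; chaining is out of scope.
        map.insert(hash, row as u64);
    }
    let after = map.capacity();
    (map, before, after)
}
```

Calling build_presized(&[10, 20, 10, 10]) should return equal before and after capacities, since the map never has to grow.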
As the key is a hash value, we need to check for possible hash collisions in the probe stage.
During this stage it might be the case that a row is contained in the same hashmap value, but the values don’t match. Those are checked in the equal_rows_arr method.
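The collision check described above can be sketched as follows. This is only an illustration: filter_true_matches is a hypothetical function and does not reflect the signature of equal_rows_arr, and string slices stand in for the real column values. Candidate pairs that share a hash are kept only when the actual key values are equal.

```rust
/// Hypothetical helper: keeps only the candidate pairs whose actual key
/// values are equal. `candidates` holds (build_row, probe_row) pairs that
/// were found to share a hash value.
fn filter_true_matches(
    build_keys: &[&str],
    probe_keys: &[&str],
    candidates: &[(usize, usize)],
) -> Vec<(usize, usize)> {
    candidates
        .iter()
        .copied()
        // A shared hash is not enough: compare the underlying values.
        .filter(|&(b, p)| build_keys[b] == probe_keys[p])
        .collect()
}
```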
The indices (values) are stored in a separate chained list, held in the Vec<u64>.
The first value of a chain (row index + 1) is stored in the hashmap, whereas the next value is stored in the array at the position of the current row index (the stored value minus 1).
The chain can be followed until the value “0” has been reached, meaning the end of the list. Also see chapter 5.3 of Balancing vectorized query execution with bandwidth-optimized storage.
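Following a chain can be sketched as below. The function name chain_rows is hypothetical, and a plain slice stands in for the Vec<u64>. Starting from the value stored in the hashmap, each step subtracts 1 to obtain the row index and then reads the next link from the array, stopping at 0.

```rust
/// Hypothetical helper: follows a chain starting at `head` (a row index + 1,
/// as stored in the hashmap) through `next` until the terminator 0 is
/// reached, and returns the plain row indices in visiting order.
fn chain_rows(head: u64, next: &[u64]) -> Vec<u64> {
    let mut rows = Vec::new();
    let mut current = head;
    while current != 0 {
        // Stored values are offset by 1 so that 0 can mark the end.
        let row = current - 1;
        rows.push(row);
        current = next[row as usize];
    }
    rows
}
```

With head 5 and a next array of [0, 0, 0, 2, 4], this visits rows 4, 3 and 1.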
§Example
See the example below:
Insert (10,1) <-- insert hash value 10 with row index 1
map:
----------
| 10 | 2 |
----------
next:
---------------------
| 0 | 0 | 0 | 0 | 0 |
---------------------
Insert (20,2)
map:
----------
| 10 | 2 |
| 20 | 3 |
----------
next:
---------------------
| 0 | 0 | 0 | 0 | 0 |
---------------------
Insert (10,3) <-- collision! row index 3 has a hash value of 10 as well
map:
----------
| 10 | 4 |
| 20 | 3 |
----------
next:
---------------------
| 0 | 0 | 0 | 2 | 0 | <--- hash value 10 maps to 4,2 (which means indices values 3,1)
---------------------
Insert (10,4) <-- another collision! row index 4 ALSO has a hash value of 10
map:
----------
| 10 | 5 |
| 20 | 3 |
----------
next:
---------------------
| 0 | 0 | 0 | 2 | 4 | <--- hash value 10 maps to 5,4,2 (which means indices values 4,3,1)
---------------------
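The insert steps of this example can be replayed with a small sketch. ChainedMap is a hypothetical, simplified type: it uses std::collections::HashMap in place of the internal table and is not the actual implementation. Each insert stores row + 1 as the new head for the hash and moves the previous head into the next array at the position of the new row.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the structure described above.
struct ChainedMap {
    /// hash value -> (most recently inserted row index + 1)
    map: HashMap<u64, u64>,
    /// next[row] -> (previously inserted row index + 1), 0 = end of chain
    next: Vec<u64>,
}

impl ChainedMap {
    /// Creates an empty structure sized for `num_rows` build-side rows.
    fn with_rows(num_rows: usize) -> Self {
        Self {
            map: HashMap::with_capacity(num_rows),
            next: vec![0; num_rows],
        }
    }

    /// Inserts `row` under `hash`, chaining it in front of earlier rows.
    fn insert(&mut self, hash: u64, row: usize) {
        // The new row becomes the head; the old head (if any) is linked
        // from the new row's slot in `next`.
        if let Some(prev_head) = self.map.insert(hash, row as u64 + 1) {
            self.next[row] = prev_head;
        }
    }
}
```

Inserting (10,1), (20,2), (10,3) and (10,4) into a structure created with with_rows(5) should leave map holding 10 -> 5 and 20 -> 3, and next holding [0, 0, 0, 2, 4], matching the final state shown above.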
Trait Implementations§
impl Debug for JoinHashMap
impl JoinHashMapType for JoinHashMap
Implementation of JoinHashMapType for JoinHashMap.
fn get_mut(&mut self) -> (&mut RawTable<(u64, u64)>, &mut Self::NextType)
Get mutable references to the hash map and the next array.
fn extend_zero(&mut self, _: usize)
fn update_from_iter<'a>(
    &mut self,
    iter: impl Iterator<Item = (usize, &'a u64)>,
    deleted_offset: usize,
)
fn get_matched_indices<'a>(
    &self,
    iter: impl Iterator<Item = (usize, &'a u64)>,
    deleted_offset: Option<usize>,
) -> (UInt32BufferBuilder, UInt64BufferBuilder)
fn get_matched_indices_with_limit_offset(
    &self,
    hash_values: &[u64],
    deleted_offset: Option<usize>,
    limit: usize,
    offset: (usize, Option<u64>),
) -> (UInt32BufferBuilder, UInt64BufferBuilder, Option<(usize, Option<u64>)>)
Returns the matched indices along with the offset for the next call (None if limit has not been reached).
Auto Trait Implementations§
impl Freeze for JoinHashMap
impl RefUnwindSafe for JoinHashMap
impl Send for JoinHashMap
impl Sync for JoinHashMap
impl Unpin for JoinHashMap
impl UnwindSafe for JoinHashMap
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.