Function datafusion_common::utils::memory::estimate_memory_size
pub fn estimate_memory_size<T>(
    num_elements: usize,
    fixed_size: usize,
) -> Result<usize>
Estimates the memory size required for a hash table prior to allocation.
§Parameters
- num_elements: The number of elements expected in the hash table.
- fixed_size: A fixed overhead size associated with the collection (e.g., HashSet or HashTable).
- T: The type of elements stored in the hash table.
§Details
This function calculates the estimated memory size by considering:
- An overestimation of buckets to keep approximately 1/8 of them empty.
- The total memory size is computed as:
  - The size of each entry (T) multiplied by the estimated number of buckets.
  - One byte overhead for each bucket.
  - The fixed size overhead of the collection.
- If the estimation overflows, we return a DataFusionError.
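The calculation described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's actual source: the function name sketch_estimate is hypothetical, it returns None where the real function returns a DataFusionError, and the 8/7 scaling and the rounding up to a power of two are assumptions about how the "1/8 empty" overestimation is applied.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for `estimate_memory_size`; returns `None` on overflow
/// where the real function would return a `DataFusionError`.
fn sketch_estimate<T>(num_elements: usize, fixed_size: usize) -> Option<usize> {
    // Assumption: scale the element count by 8/7 so that roughly 1/8 of the
    // buckets stay empty.
    let overestimate = num_elements.checked_mul(8)?;
    // Assumption: the bucket count is rounded up to the next power of two.
    let buckets = (overestimate / 7).checked_next_power_of_two()?;

    // size of each entry * buckets + one byte per bucket + fixed overhead,
    // with every step checked for overflow.
    size_of::<T>()
        .checked_mul(buckets)?
        .checked_add(buckets)?
        .checked_add(fixed_size)
}
```

Under these assumptions, 100 entries of type (u64, u64) with a fixed overhead of 48 bytes yield 128 buckets, so the estimate is 16 * 128 + 128 + 48 = 2224 bytes.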
§Examples
§From within a struct
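use datafusion_common::utils::memory::estimate_memory_size;
use datafusion_common::Result;

struct MyStruct<T> {
    values: Vec<T>,
    other_data: usize,
}

impl<T> MyStruct<T> {
    // Estimates the size of a hash table holding one entry per element of `values`.
    fn size(&self) -> Result<usize> {
        let num_elements = self.values.len();
        // Fixed overhead: the struct itself plus the `Vec` header.
        let fixed_size = std::mem::size_of_val(self) +
            std::mem::size_of_val(&self.values);
        estimate_memory_size::<T>(num_elements, fixed_size)
    }
}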
§With a simple collection
use std::collections::HashMap;
use datafusion_common::utils::memory::estimate_memory_size;

let num_rows = 100;
let fixed_size = std::mem::size_of::<HashMap<u64, u64>>();
let estimated_hashtable_size =
    estimate_memory_size::<(u64, u64)>(num_rows, fixed_size)
        .expect("Size estimation failed");