pub fn slow_u128_divrem(n: u128, d: u64, d_ctlz: u32) -> (u128, u64)
Optimized fallback division/remainder algorithm for u128.
This fallback exists because the code Rust generates for u128 division and remainder is very inefficient: it calls __udivmodti4 twice internally, rather than a single time.
This is still a fair bit slower than the optimized algorithms described in the above paper, but it is a suitable fallback when the faster algorithm cannot be used.
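As a rough illustration of computing the quotient and remainder in one pass, here is a minimal sketch. It is not this crate's implementation: the name divrem_sketch is hypothetical, it uses plain shift-subtract long division, and it omits the d_ctlz parameter (which, judging by its name, is assumed to be the leading-zero count of d).

```rust
/// Hypothetical sketch: divide a u128 by a u64, returning (quotient, remainder)
/// from a single pass. Not the crate's actual algorithm.
pub fn divrem_sketch(n: u128, d: u64) -> (u128, u64) {
    assert!(d != 0, "division by zero");

    // Fast path: if the high 64 bits are zero, native u64 division suffices.
    let high = (n >> 64) as u64;
    if high == 0 {
        let low = n as u64;
        return ((low / d) as u128, low % d);
    }

    // Slow path: binary long division, one bit of `n` per iteration.
    let d = d as u128;
    let mut q: u128 = 0;
    let mut r: u128 = 0;
    // Leading zero bits of `n` contribute nothing, so skip them.
    let bits = 128 - n.leading_zeros();
    for i in (0..bits).rev() {
        // Bring down the next bit of the dividend. `r < d < 2^64` holds
        // before the shift, so this cannot overflow a u128.
        r = (r << 1) | ((n >> i) & 1);
        q <<= 1;
        if r >= d {
            r -= d;
            q |= 1;
        }
    }
    // The remainder is always strictly less than `d`, so it fits in a u64.
    (q, r as u64)
}
```

The result can be checked against the native operators, for example by comparing divrem_sketch(n, d) with (n / d as u128, (n % d as u128) as u64). The sketch favors clarity over speed; a tuned version would process more than one bit per step.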