Struct Fid

Source

pub struct Fid { /* private fields */ }

Expand description

FID (Fully Indexable Dictionary).

This class can handle bit sequence of virtually arbitrary length.

In fact, N (FID’s length) is designed to be limited to: N <= 2^64.
It should be enough for almost all usecases since a binary data of length of 2^64 consumes 2^21 = 2,097,152 TB (terabyte), which is hard to handle by state-of-the-art computer architecture.

§Implementation detail

Index<u64>’s implementation is trivial.

select() just uses binary search of rank() results.

rank()’s implementation is standard but non-trivial. So here explains implementation of rank().

§rank()’s implementation

Say you have the following bit vector.

00001000 01000001 00000100 11000000 00100000 00000101 10100000 00010000 001 ; (N=67)

Answer rank(48) in O(1) time-complexity and o(N) space-complexity.

Naively, you can count the number of ‘1’ from left to right. You will find rank(48) == 10 but it took O(N) time-complexity.

To reduce time-complexity to O(1), you can use memonization technique.
Of course, you can memonize results of rank(i) for every i ([0, N-1]).

Bit vector;   0  0  0  0  1  0  0  0  0  1  0  0  0  0  0  1  0  0  0  0  0  1  0  0  1  1  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  1  0  1  [1]  0  1  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  1 ; (N=67)
Memo rank(i); 0  0  0  0  1  1  1  1  1  2  2  2  2  2  2  3  3  3  3  3  3  4  4  4  5  6  6  6  6  6  6  6  6  6  7  7  7  7  7  7  7  7  7  7  7  8  8  9  10  10 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 13

From this memo, you can answer rank(48) == 10 in constant time, although space-complexity for this memo is O(N) > o(N).

To reduce space-complexity using memonization, we divide the bit vector into Chunk and Block.

Bit vector; 00001000 01000001 00000100 11000000 00100000 00000101 [1]0100000 00010000 001  ; (N=67)
Chunk;     |                  7                    |                13                  |  ; (size = (log N)^2 = 36)
Block;     |0 |1 |1  |2 |2 |3  |3 |4 |6  |6 |6  |7 |0 |0  |0 |2 |4    |4 |4  |5 |5 |5  |6| ; (size = (log N) / 2 = 3)

A Chunk has size of (log N)^2. Its value is rank(index of the last bit of the chunk).
A Block has size of (log N) / 2. A chunk has many blocks. Block’s value is the number of ’1’s in [index of the first bit of the chunk the block belongs to, index of the last bit of the block] (note that the value is reset to 0 at the first bit of a chunk).

Now you want to answer rank(48). 48-th bit is in the 2nd chunk, and in the 5th block in the chunk.
So the rank(48) is at least:

7 (value of 1st chunk) + 2 (value of 4th block in the 2nd chunk)

Then, focus on 3 bits in 5th block in the 2nd chunk; [1]01.
As you can see, only 1 ‘1’ is included up to 48-th bit (101 has 2 ’1’s but 2nd ‘1’ is 50-th bit, irrelevant to rank(48)).

Therefore, the rank(48) is calculated as:

7 (value of 1st chunk) + 2 (value of 4th block in the 2nd chunk) + 1 (’1’s in 5th block up to 48-th bit)

OK. That’s all… Wait!
rank() must be in O(1) time-complexity.

7 (value of 1st chunk): O(1) if you store chunk value in array structure.
2 (value of 4th block in the 2nd chunk): Same as above.
1 (’1’s in 5th block up to 48-th bit): O(length of block) = O(log N) !

Counting ’1’s in a block must also be O(1), while using o(N) space.
We use Table for this purpose.

Block content	Number of ’1’s in block
`000`	0
`001`	1
`010`	1
`011`	2
`100`	1
`101`	2
`110`	2
`111`	3

This table is constructed in build(). So we can find the number of ’1’s in block in O(1) time.
Note that this table has O(log N) = o(N) length.

In summary:

rank() = (value of left chunk) + (value of left block) + (value of table keyed by inner block bits).

Struct FidCopy item path

§Implementation detail

§rank()’s implementation

Implementations§

impl Fid

pub fn rank(&self, i: u64) -> u64

§Panics

§Implementation detail

pub fn rank0(&self, i: u64) -> u64

§Panics

pub fn select(&self, num: u64) -> Option<u64>

§Panics

§Implementation detail

pub fn select0(&self, num: u64) -> Option<u64>

§Panics

pub fn len(&self) -> u64

pub fn is_empty(&self) -> bool

impl<'iter> Fid

pub fn iter(&'iter self) -> FidIter<'iter> ⓘ

§Examples

Trait Implementations§

impl Clone for Fid

fn clone(&self) -> Fid

fn clone_from(&mut self, source: &Self)

impl Debug for Fid

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl From<&[bool]> for Fid

fn from(bits: &[bool]) -> Self

§Examples

§Panics

impl From<&str> for Fid

fn from(s: &str) -> Self

§Examples

§Panics

impl Index<u64> for Fid

fn index(&self, index: u64) -> &Self::Output

§Panics

type Output = bool

Auto Trait Implementations§

impl Freeze for Fid

impl RefUnwindSafe for Fid

impl Send for Fid

impl Sync for Fid

impl Unpin for Fid

impl UnwindSafe for Fid

Blanket Implementations§

impl<T> Any for Twhere T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for Twhere T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for Twhere T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for Twhere T: Clone,

unsafe fn clone_to_uninit(&self, dst: *mut u8)

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for Twhere U: From<T>,

fn into(self) -> U

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>where F: FnOnce(&Self) -> bool,

impl<T> Pointable for T

const ALIGN: usize

type Init = T

unsafe fn init(init: <T as Pointable>::Init) -> usize

unsafe fn deref<'a>(ptr: usize) -> &'a T

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

unsafe fn drop(ptr: usize)

impl<T> ToOwned for Twhere T: Clone,

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T, U> TryFrom<U> for Twhere U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for Twhere U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Struct Fid

impl<T> Any for T
where T: 'static + ?Sized,

impl<T> Borrow<T> for T
where T: ?Sized,

impl<T> BorrowMut<T> for T
where T: ?Sized,

impl<T> CloneToUninit for T
where T: Clone,

impl<T, U> Into<U> for T
where U: From<T>,

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

impl<T> ToOwned for T
where T: Clone,

impl<T, U> TryFrom<U> for T
where U: Into<T>,

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,