Struct tantivy_fst::map::OpBuilder

pub struct OpBuilder<'m>(/* private fields */);

A builder for collecting map streams on which to perform set operations on the keys of maps.

Set operations include intersection, union, difference and symmetric difference. The result of each set operation is itself a stream that emits each key paired with a sequence of that key's occurrences in the participating streams. This information allows one to perform set operations on maps and customize how conflicting output values are handled.

All set operations work efficiently on an arbitrary number of streams with memory proportional to the number of streams.

The algorithmic complexity of all set operations is O(n1 + n2 + n3 + ...) where n1, n2, n3, ... correspond to the number of elements in each stream.
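
To see why memory can stay proportional to the number of streams, consider the following sketch of a union over sorted inputs. It is a simplified model in plain Rust and not the crate's implementation: `Occurrence` and `union_sorted` are hypothetical names, and sorted slices stand in for map streams. Only one cursor per input is kept, and each input element is consumed exactly once.

```rust
/// Hypothetical record of one occurrence of a key: the index of the input
/// it came from and the value stored under the key in that input.
#[derive(Debug, Clone, PartialEq)]
struct Occurrence {
    index: usize,
    value: u64,
}

/// Union of lexicographically sorted key-value slices.
/// The only working state is one cursor per input.
fn union_sorted(streams: &[&[(&str, u64)]]) -> Vec<(String, Vec<Occurrence>)> {
    let mut cursors = vec![0usize; streams.len()];
    let mut out = Vec::new();
    loop {
        // The smallest key among the current heads is the next key to emit.
        let min_key = streams
            .iter()
            .zip(&cursors)
            .filter_map(|(s, &c)| s.get(c).map(|(k, _)| *k))
            .min();
        let key = match min_key {
            Some(k) => k,
            None => break, // every input is exhausted
        };
        // Gather every input whose head equals that key and advance it.
        let mut occurrences = Vec::new();
        for (index, stream) in streams.iter().enumerate() {
            if let Some(&(k, value)) = stream.get(cursors[index]) {
                if k == key {
                    occurrences.push(Occurrence { index, value });
                    cursors[index] += 1;
                }
            }
        }
        out.push((key.to_string(), occurrences));
    }
    out
}
```

The sketch scans all heads for each emitted key, which is linear in the total input size when the number of inputs is treated as a constant.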

The 'm lifetime parameter refers to the lifetime of the underlying maps.

Implementations§

impl<'m> OpBuilder<'m>

pub fn new() -> Self

Create a new set operation builder.

pub fn add<I, S>(self, streamable: I) -> Self
where
    I: for<'a> IntoStreamer<'a, Into = S, Item = (&'a [u8], u64)>,
    S: 'm + for<'a> Streamer<'a, Item = (&'a [u8], u64)>,

Add a stream to this set operation.

This is useful for a chaining style pattern, e.g., builder.add(stream1).add(stream2).union().

The stream must emit a lexicographically ordered sequence of key-value pairs.

pub fn push<I, S>(&mut self, streamable: I)
where
    I: for<'a> IntoStreamer<'a, Into = S, Item = (&'a [u8], u64)>,
    S: 'm + for<'a> Streamer<'a, Item = (&'a [u8], u64)>,

Add a stream to this set operation in place, without consuming the builder.

The stream must emit a lexicographically ordered sequence of key-value pairs.


pub fn union(self) -> Union<'m>

Performs a union operation on all streams that have been added.

Note that this returns a stream of (&[u8], &[IndexedValue]). The first element of the tuple is the byte string key. The second element is a list of all occurrences of that key in the participating streams. Each IndexedValue holds the value associated with the key in one stream, together with the index of that stream. Indexes are assigned in the order in which streams are added to this operation, starting at 0.

Example
use tantivy_fst::{IntoStreamer, Streamer, Map};
use tantivy_fst::map::IndexedValue;

let map1 = Map::from_iter(vec![
    ("a", 1), ("b", 2), ("c", 3),
]).unwrap();
let map2 = Map::from_iter(vec![
    ("a", 11), ("y", 12), ("z", 13),
]).unwrap();

let mut union = map1.op().add(&map2).union();

let mut kvs = vec![];
while let Some((k, vs)) = union.next() {
    kvs.push((k.to_vec(), vs.to_vec()));
}
assert_eq!(kvs, vec![
    (b"a".to_vec(), vec![
        IndexedValue { index: 0, value: 1 },
        IndexedValue { index: 1, value: 11 },
    ]),
    (b"b".to_vec(), vec![IndexedValue { index: 0, value: 2 }]),
    (b"c".to_vec(), vec![IndexedValue { index: 0, value: 3 }]),
    (b"y".to_vec(), vec![IndexedValue { index: 1, value: 12 }]),
    (b"z".to_vec(), vec![IndexedValue { index: 1, value: 13 }]),
]);
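
Because every occurrence of a key is reported, the caller decides how conflicting values are merged. The sketch below shows two possible policies. `Occurrence` is a hypothetical local stand-in with the same two fields as IndexedValue, and the function names are illustrative only; neither policy is provided by the crate.

```rust
/// Hypothetical stand-in mirroring the `index` and `value` fields of
/// `IndexedValue`, so the sketch compiles without the crate.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Occurrence {
    index: usize,
    value: u64,
}

/// Policy 1: add up the values from every stream that contains the key.
fn resolve_sum(occurrences: &[Occurrence]) -> u64 {
    occurrences.iter().map(|o| o.value).sum()
}

/// Policy 2: keep the value from the stream that was added last,
/// that is, the occurrence with the highest stream index.
fn resolve_last_wins(occurrences: &[Occurrence]) -> Option<u64> {
    occurrences.iter().max_by_key(|o| o.index).map(|o| o.value)
}
```

Applied to the occurrences of key "a" in the example above (values 1 and 11), the first policy yields 12 and the second yields 11.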

pub fn chain(self) -> Chain<'m>

Chains all streams that have been added in the order they have been added.

Note that this returns a stream of (&[u8], Output). The first element of the tuple is the byte string key. The second element is the value associated with that key in the stream it came from. Keys are neither merged nor reordered, so this is useful when the streams are already sorted and do not overlap.

Example
use tantivy_fst::{Streamer, Map};
use tantivy_fst::raw::Output;

let map1 = Map::from_iter(vec![
    ("a", 1), ("b", 2),
]).unwrap();
let map2 = Map::from_iter(vec![
    ("a", 11), ("y", 12),
]).unwrap();

let mut chain = map1.op().add(&map2).chain();

let mut kvs = vec![];
while let Some((k, v)) = chain.next() {
    kvs.push((k.to_vec(), v));
}
assert_eq!(kvs, vec![
    (b"a".to_vec(), Output::new(1)),
    (b"b".to_vec(), Output::new(2)),
    (b"a".to_vec(), Output::new(11)),
    (b"y".to_vec(), Output::new(12)),
]);
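
In the example above the key "a" is emitted twice, because chaining only concatenates. If a globally sorted result is required, the inputs can be checked beforehand. The following is a minimal sketch in plain Rust, assuming sorted slices as stand-ins for map streams; `chain_is_globally_sorted` is a hypothetical helper and not part of the crate.

```rust
/// Returns true if concatenating the inputs in order yields strictly
/// increasing keys, i.e. each input is sorted and consecutive inputs
/// do not overlap.
fn chain_is_globally_sorted(streams: &[&[(&str, u64)]]) -> bool {
    let mut previous: Option<&str> = None;
    for stream in streams {
        for (key, _) in stream.iter() {
            if let Some(prev) = previous {
                // A key that is not greater than its predecessor breaks
                // the global order (duplicate or out of order).
                if prev >= *key {
                    return false;
                }
            }
            previous = Some(*key);
        }
    }
    true
}
```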

pub fn intersection(self) -> Intersection<'m>

Performs an intersection operation on all streams that have been added.

Note that this returns a stream of (&[u8], &[IndexedValue]). The first element of the tuple is the byte string key. The second element is a list of all occurrences of that key in the participating streams. Each IndexedValue holds the value associated with the key in one stream, together with the index of that stream. Indexes are assigned in the order in which streams are added to this operation, starting at 0.

Example
use tantivy_fst::{IntoStreamer, Streamer, Map};
use tantivy_fst::map::IndexedValue;

let map1 = Map::from_iter(vec![
    ("a", 1), ("b", 2), ("c", 3),
]).unwrap();
let map2 = Map::from_iter(vec![
    ("a", 11), ("y", 12), ("z", 13),
]).unwrap();

let mut intersection = map1.op().add(&map2).intersection();

let mut kvs = vec![];
while let Some((k, vs)) = intersection.next() {
    kvs.push((k.to_vec(), vs.to_vec()));
}
assert_eq!(kvs, vec![
    (b"a".to_vec(), vec![
        IndexedValue { index: 0, value: 1 },
        IndexedValue { index: 1, value: 11 },
    ]),
]);

pub fn difference(self) -> Difference<'m>

Performs a difference operation with respect to the first stream added. That is, this returns a stream of all elements in the first stream that don’t exist in any other stream that has been added.

Note that this returns a stream of (&[u8], &[IndexedValue]). The first element of the tuple is the byte string key. The second element is a list of all occurrences of that key in the participating streams. Each IndexedValue holds the value associated with the key in one stream, together with the index of that stream. Indexes are assigned in the order in which streams are added to this operation, starting at 0.

Example
use tantivy_fst::{Streamer, Map};
use tantivy_fst::map::IndexedValue;

let map1 = Map::from_iter(vec![
    ("a", 1), ("b", 2), ("c", 3),
]).unwrap();
let map2 = Map::from_iter(vec![
    ("a", 11), ("y", 12), ("z", 13),
]).unwrap();

let mut difference = map1.op().add(&map2).difference();

let mut kvs = vec![];
while let Some((k, vs)) = difference.next() {
    kvs.push((k.to_vec(), vs.to_vec()));
}
assert_eq!(kvs, vec![
    (b"b".to_vec(), vec![IndexedValue { index: 0, value: 2 }]),
    (b"c".to_vec(), vec![IndexedValue { index: 0, value: 3 }]),
]);
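
With more than two streams, a key survives only if it is absent from every stream after the first. The sketch below models that rule in plain Rust over sorted slices. It is an illustration under that assumption, not the crate's streaming implementation, and `difference_model` is a hypothetical name.

```rust
use std::collections::BTreeSet;

/// Entries of `first` whose key appears in none of `others`.
/// Unlike the streaming operation, this model collects all excluded
/// keys up front for the sake of brevity.
fn difference_model(first: &[(&str, u64)], others: &[&[(&str, u64)]]) -> Vec<(String, u64)> {
    // Every key that occurs in any stream other than the first.
    let excluded: BTreeSet<&str> = others
        .iter()
        .flat_map(|s| s.iter().map(|(k, _)| *k))
        .collect();
    first
        .iter()
        .filter(|(k, _)| !excluded.contains(*k))
        .map(|&(k, v)| (k.to_string(), v))
        .collect()
}
```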

pub fn symmetric_difference(self) -> SymmetricDifference<'m>

Performs a symmetric difference operation on all of the streams that have been added.

When there are only two streams, then the keys returned correspond to keys that are in either stream but not in both streams.

More generally, for any number of streams, keys that occur in an odd number of streams are returned.

Note that this returns a stream of (&[u8], &[IndexedValue]). The first element of the tuple is the byte string key. The second element is a list of all occurrences of that key in the participating streams. Each IndexedValue holds the value associated with the key in one stream, together with the index of that stream. Indexes are assigned in the order in which streams are added to this operation, starting at 0.

Example
use tantivy_fst::{IntoStreamer, Streamer, Map};
use tantivy_fst::map::IndexedValue;

let map1 = Map::from_iter(vec![
    ("a", 1), ("b", 2), ("c", 3),
]).unwrap();
let map2 = Map::from_iter(vec![
    ("a", 11), ("y", 12), ("z", 13),
]).unwrap();

let mut sym_difference = map1.op().add(&map2).symmetric_difference();

let mut kvs = vec![];
while let Some((k, vs)) = sym_difference.next() {
    kvs.push((k.to_vec(), vs.to_vec()));
}
assert_eq!(kvs, vec![
    (b"b".to_vec(), vec![IndexedValue { index: 0, value: 2 }]),
    (b"c".to_vec(), vec![IndexedValue { index: 0, value: 3 }]),
    (b"y".to_vec(), vec![IndexedValue { index: 1, value: 12 }]),
    (b"z".to_vec(), vec![IndexedValue { index: 1, value: 13 }]),
]);
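
The odd-occurrence rule for more than two streams can be surprising: a key present in all of three streams is returned. The following sketch models the rule in plain Rust by counting occurrences. It assumes sorted slices with unique keys as stand-ins for map streams, and `symmetric_difference_keys` is a hypothetical name rather than a crate function.

```rust
use std::collections::BTreeMap;

/// Keys that occur in an odd number of the inputs, in sorted order.
fn symmetric_difference_keys(streams: &[&[(&str, u64)]]) -> Vec<String> {
    // Count in how many inputs each key appears.
    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    for stream in streams {
        for (key, _) in stream.iter() {
            *counts.entry(*key).or_insert(0) += 1;
        }
    }
    // Keep the keys with an odd count.
    counts
        .into_iter()
        .filter(|(_, n)| n % 2 == 1)
        .map(|(k, _)| k.to_string())
        .collect()
}
```

For inputs with keys {a, b, c}, {a, y, z} and {a, b}, the key "a" occurs three times and is kept, while "b" occurs twice and is dropped.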

Trait Implementations§

impl<'f, I, S> Extend<I> for OpBuilder<'f>
where
    I: for<'a> IntoStreamer<'a, Into = S, Item = (&'a [u8], u64)>,
    S: 'f + for<'a> Streamer<'a, Item = (&'a [u8], u64)>,

fn extend<T>(&mut self, it: T)
where
    T: IntoIterator<Item = I>,

Extends a collection with the contents of an iterator.

fn extend_one(&mut self, item: A)

🔬This is a nightly-only experimental API. (extend_one)
Extends a collection with exactly one element.

fn extend_reserve(&mut self, additional: usize)

🔬This is a nightly-only experimental API. (extend_one)
Reserves capacity in a collection for the given number of additional elements.

impl<'f, I, S> FromIterator<I> for OpBuilder<'f>
where
    I: for<'a> IntoStreamer<'a, Into = S, Item = (&'a [u8], u64)>,
    S: 'f + for<'a> Streamer<'a, Item = (&'a [u8], u64)>,

fn from_iter<T>(it: T) -> Self
where
    T: IntoIterator<Item = I>,

Creates a value from an iterator.

Auto Trait Implementations§

impl<'m> !RefUnwindSafe for OpBuilder<'m>

impl<'m> !Send for OpBuilder<'m>

impl<'m> !Sync for OpBuilder<'m>

impl<'m> Unpin for OpBuilder<'m>

impl<'m> !UnwindSafe for OpBuilder<'m>

Blanket Implementations§

impl<T> Any for T
where
    T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where
    T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where
    T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where
    U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where
    U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where
    U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.