Struct tantivy_fst::MapBuilder
A builder for creating a map.
This is not your average everyday builder. It has two important qualities that set it apart from what you might expect:
1. All keys must be added in lexicographic order. Adding a key out of order results in an error, as does adding a duplicate key. That is, once a key is associated with a value, that association can never be modified or deleted.
2. The representation of a map is streamed to any io::Write as it is built. For an in-memory representation, this can be a Vec<u8>.
Point (2) is especially important because it means that a map can be constructed without storing the entire map in memory. Namely, since it works with any io::Write, it can be streamed directly to a file.
With that said, the builder does use memory, but memory usage is bounded to a constant size. The amount of memory used trades off against the compression ratio. Currently, the implementation hard-codes this trade-off, which can result in about 5-20MB of heap usage during construction. (N.B. Guaranteeing a maximal compression ratio requires memory proportional to the size of the map, which defeats some of the benefit of streaming it to disk. In practice, a small bounded amount of memory comes close to the maximal compression ratio.)
The algorithmic complexity of map construction is O(n), where n is the number of elements added to the map.
Example: build in memory
This shows how to use the builder to construct a map in memory. Note that Map::from_iter provides a convenience function that achieves this same goal without needing to explicitly use MapBuilder.
    use tantivy_fst::{IntoStreamer, Streamer, Map, MapBuilder};

    let mut build = MapBuilder::memory();
    build.insert("bruce", 1).unwrap();
    build.insert("clarence", 2).unwrap();
    build.insert("stevie", 3).unwrap();

    // You could also call `finish()` here, but since we're building the map in
    // memory, there would be no way to get the `Vec<u8>` back.
    let bytes = build.into_inner().unwrap();

    // At this point, the map has been constructed, but here's how to read it.
    let map = Map::from_bytes(bytes).unwrap();
    let mut stream = map.into_stream();
    let mut kvs = vec![];
    while let Some((k, v)) = stream.next() {
        kvs.push((k.to_vec(), v));
    }
    assert_eq!(kvs, vec![
        (b"bruce".to_vec(), 1),
        (b"clarence".to_vec(), 2),
        (b"stevie".to_vec(), 3),
    ]);
Methods
impl MapBuilder<Vec<u8>>
impl<W: Write> MapBuilder<W>
pub fn new(wtr: W) -> Result<MapBuilder<W>>
Create a builder that builds a map by writing it to wtr in a streaming fashion.
pub fn insert<K: AsRef<[u8]>>(&mut self, key: K, val: u64) -> Result<()>
Insert a new key-value pair into the map.
Keys must be convertible to byte strings. Values must be a u64, which is a restriction of the current implementation of finite state transducers. (Values may one day be expanded to other types.)
If a key is inserted that is less than or equal to any previous key added, then an error is returned. Similarly, if there was a problem writing to the underlying writer, an error is returned.
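Because out-of-order and duplicate keys are rejected, unsorted input has to be put in order before it is inserted. The following is a minimal sketch of one way to do that with the standard library only. The helper name `prepare_sorted` is hypothetical and not part of tantivy_fst, and keeping the first value seen for a duplicated key is an arbitrary choice made for this illustration.

```rust
/// Hypothetical helper (not part of tantivy_fst): sorts pairs by the raw
/// bytes of their keys and drops duplicate keys, keeping the first value seen.
fn prepare_sorted(mut pairs: Vec<(String, u64)>) -> Vec<(String, u64)> {
    // `sort_by` is stable, so among equal keys the earliest pair stays first.
    pairs.sort_by(|a, b| a.0.as_bytes().cmp(b.0.as_bytes()));
    // `dedup_by` removes an element when its key equals the key of the
    // element retained just before it.
    pairs.dedup_by(|next, kept| next.0 == kept.0);
    pairs
}
```

The returned pairs are strictly increasing by key bytes, which is the order that insert requires.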
pub fn extend_iter<K, I>(&mut self, iter: I) -> Result<()> where
K: AsRef<[u8]>,
I: IntoIterator<Item = (K, u64)>,
Calls insert on each item in the iterator.
If an error occurred while adding an element, processing is stopped and the error is returned.
If a key is inserted that is less than or equal to any previous key added, then an error is returned. Similarly, if there was a problem writing to the underlying writer, an error is returned.
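The ordering rule and the stop-at-first-error behavior described above can be modeled without the crate. The sketch below is a hypothetical stand-in named `OrderModel`. It mirrors only the documented contract (keys must be strictly increasing, and processing stops at the first error), builds no map, and uses a `String` error type as a placeholder for the crate's own error type.

```rust
/// Hypothetical stand-in that models only the documented key-ordering rule.
/// It remembers the last accepted key and does not build a map.
struct OrderModel {
    last: Option<Vec<u8>>,
    accepted: usize,
}

impl OrderModel {
    fn new() -> Self {
        OrderModel { last: None, accepted: 0 }
    }

    /// Accepts `key` only if it is strictly greater than the previous key.
    fn insert<K: AsRef<[u8]>>(&mut self, key: K, _val: u64) -> Result<(), String> {
        let key = key.as_ref();
        if let Some(prev) = &self.last {
            // Byte slices compare lexicographically.
            if key <= prev.as_slice() {
                return Err(format!("key {:?} is out of order or duplicated", key));
            }
        }
        self.last = Some(key.to_vec());
        self.accepted += 1;
        Ok(())
    }

    /// Inserts each pair in turn and stops at the first error.
    fn extend_iter<K, I>(&mut self, iter: I) -> Result<(), String>
    where
        K: AsRef<[u8]>,
        I: IntoIterator<Item = (K, u64)>,
    {
        for (key, val) in iter {
            self.insert(key, val)?;
        }
        Ok(())
    }
}
```

In this model, feeding the keys "c", "a", "d" after "b" accepts "c", fails on "a", and never looks at "d".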
pub fn extend_stream<'f, I, S>(&mut self, stream: I) -> Result<()> where
I: for<'a> IntoStreamer<'a, Into = S, Item = (&'a [u8], u64)>,
S: 'f + for<'a> Streamer<'a, Item = (&'a [u8], u64)>,
Calls insert on each item in the stream.
Note that unlike extend_iter, this is not generic on the items in the stream.
If a key is inserted that is less than or equal to any previous key added, then an error is returned. Similarly, if there was a problem writing to the underlying writer, an error is returned.
pub fn finish(self) -> Result<()>
Finishes the construction of the map and flushes the underlying writer. After completion, the data written to W may be read using one of Map's constructor methods.
pub fn into_inner(self) -> Result<W>
Just like finish, except it returns the underlying writer after flushing it.
pub fn get_ref(&self) -> &W
Gets a reference to the underlying writer.
pub fn bytes_written(&self) -> u64
Returns the number of bytes written to the underlying writer.
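One common way to track such a count is to wrap the writer and tally the bytes it accepts. The sketch below shows that idea using only the standard library. `CountingWriter` is a hypothetical type written for this illustration, and whether the builder tracks its count in exactly this way is an assumption.

```rust
use std::io::{self, Write};

/// Hypothetical wrapper, for illustration only: forwards writes to `inner`
/// and counts the bytes that `inner` accepted.
struct CountingWriter<W> {
    inner: W,
    count: u64,
}

impl<W: Write> CountingWriter<W> {
    fn new(inner: W) -> Self {
        CountingWriter { inner, count: 0 }
    }

    /// Number of bytes accepted by the wrapped writer so far.
    fn bytes_written(&self) -> u64 {
        self.count
    }

    /// Shared reference to the wrapped writer.
    fn get_ref(&self) -> &W {
        &self.inner
    }
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Count only what the inner writer reports as written.
        let n = self.inner.write(buf)?;
        self.count += n as u64;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}
```

Counting inside `write` rather than in the caller keeps the tally correct even when the inner writer accepts only part of a buffer.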
Auto Trait Implementations
impl<W> RefUnwindSafe for MapBuilder<W> where
W: RefUnwindSafe,
impl<W> Send for MapBuilder<W> where
W: Send,
impl<W> Sync for MapBuilder<W> where
W: Sync,
impl<W> Unpin for MapBuilder<W> where
W: Unpin,
impl<W> UnwindSafe for MapBuilder<W> where
W: UnwindSafe,
Blanket Implementations
impl<T> Any for T where
T: 'static + ?Sized,
impl<T> Borrow<T> for T where
T: ?Sized,
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> From<T> for T
impl<T, U> Into<U> for T where
U: From<T>,
impl<T, U> TryFrom<U> for T where
U: Into<T>,
type Error = Infallible
The type returned in the event of a conversion error.
fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>
impl<T, U> TryInto<U> for T where
U: TryFrom<T>,