tantivy_fst/
stream.rs

/// Streamer describes a "streaming iterator."
///
/// It provides a mechanism for writing code that is generic over streams
/// produced by this crate.
///
/// Note that this is strictly less useful than `Iterator` because the item
/// associated type is bound to a specific lifetime. However, this does permit
/// us to write *some* generic code over streams that produce values tied
/// to the lifetime of the stream.
///
/// Some form of stream abstraction is inherently required for this crate
/// because elements in a finite state transducer are produced *by iterating*
/// over the structure. The alternative would be to create a new allocation
/// for each element iterated over, which would be prohibitively expensive.
///
/// # Usage & motivation
///
/// Streams are hard to use because they don't fit into Rust's current type
/// system very well. They are so hard to use that this author loathes having a
/// publicly defined trait for it. Nevertheless, they do just barely provide
/// a means for composing multiple stream abstractions with different concrete
/// types. For example, one might want to take the union of a range query
/// stream with a stream that has been filtered by a regex. These streams have
/// different concrete types. A `Streamer` trait allows us to write code that
/// is generic over these concrete types. (All of the set operations are
/// implemented this way.)
///
/// A problem with streams is that the trait is itself parameterized by a
/// lifetime. In practice, this makes them very unergonomic because specifying
/// a `Streamer` bound generally requires a higher-ranked trait bound. This is
/// necessary because the lifetime can't actually be named in the enclosing
/// function; instead, the lifetime is local to iteration itself. Therefore,
/// one must assert that the bound is valid for *any particular* lifetime.
/// This is the essence of higher-ranked trait bounds.
///
/// Because of this, you might expect to see lots of bounds that look like
/// this:
///
/// ```ignore
/// fn takes_stream<T, S>(s: S)
///     where S: for<'a> Streamer<'a, Item=T>
/// {
/// }
/// ```
///
/// There are *three* different problems with this declaration:
///
/// 1. `S` is not bound by any particular lifetime itself, and most streams
///    probably contain a reference to an underlying finite state transducer.
/// 2. It is often convenient to separate the notion of "stream" from
///    "stream constructor." This mirrors the split found in the standard
///    library between `Iterator` and `IntoIterator`, respectively.
/// 3. The `Item=T` is invalid because `Streamer`'s associated type is
///    parameterized by a lifetime and there is no way to parameterize an
///    arbitrary type constructor. (In this context, `T` is the type
///    constructor, because it will invariably require a lifetime to become
///    a concrete type.)
///
/// With that said, we must revise our possibly-workable bounds to a giant
/// scary monster:
///
/// ```ignore
/// fn takes_stream<'f, I, S>(s: I)
///     where I: for<'a> IntoStreamer<'a, Into=S, Item=(&'a [u8], Output)>,
///           S: 'f + for<'a> Streamer<'a, Item=(&'a [u8], Output)>
/// {
/// }
/// ```
///
/// We addressed the above points correspondingly:
///
/// 1. `S` is now bound by `'f`, which corresponds to the lifetime (possibly
///    `'static`) of the underlying stream.
/// 2. The `I` type parameter has been added to refer to a type that knows how
///    to build a stream. Notice that the bounds for `I` and `S` do not share
///    a lifetime parameter. This is because each higher-ranked trait bound
///    specifies that it works for *any* particular lifetime.
/// 3. `T` has been replaced with specific concrete types. Note that these
///    concrete types are duplicated. With iterators, we could use
///    `Item=S::Item` in the bound for `I`, but one cannot access an associated
///    type through a higher-ranked trait bound. Therefore, we must duplicate
///    the item type.
///
/// As you can see, streams offer little flexibility, poor ergonomics, and a
/// lot of hard-to-read trait bounds. The situation is lamentable, but
/// nevertheless, without them, we would not be able to compose streams by
/// leveraging the type system.
///
/// A redeeming quality is that these *same exact* trait bounds (modulo some
/// tweaks in the `Item` associated type) appear in many places in this crate
/// without much variation. Therefore, once you grok it, it's mostly easy to
/// pattern match it with "oh I need a stream." My hope is that clear
/// documentation and examples make these complex bounds easier to bear.
///
/// Stretching this abstraction further with Rust's current type system is not
/// advised.
pub trait Streamer<'a> {
    /// The type of the item emitted by this stream.
    type Item: 'a;

    /// Emits the next element in this stream, or `None` to indicate the stream
    /// has been exhausted.
    ///
    /// It is not specified what a stream does after `None` is emitted. In most
    /// cases, `None` should be emitted on every subsequent call.
    fn next(&'a mut self) -> Option<Self::Item>;
}
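
A minimal sketch of implementing and draining a `Streamer`, assuming a hypothetical `KeyStream` type and `collect_keys` helper that are not part of this crate. The trait is repeated so the block compiles on its own. The stream overwrites one internal buffer on every call, which is why each item can only live as long as the borrow taken by `next`, and why the consumer has to copy a key if it wants to keep it.

```rust
// Repeated from the definition above so this sketch is self-contained.
pub trait Streamer<'a> {
    type Item: 'a;
    fn next(&'a mut self) -> Option<Self::Item>;
}

/// Hypothetical stream (not part of this crate): yields the keys "k0",
/// "k1", ... paired with their index, reusing one buffer for every key.
pub struct KeyStream {
    buf: Vec<u8>,
    cursor: u64,
    limit: u64,
}

impl KeyStream {
    pub fn new(limit: u64) -> KeyStream {
        KeyStream { buf: Vec::new(), cursor: 0, limit }
    }
}

impl<'a> Streamer<'a> for KeyStream {
    // The key borrows from `self.buf`, so it is tied to the lifetime `'a`
    // of the `&'a mut self` borrow taken by `next`.
    type Item = (&'a [u8], u64);

    fn next(&'a mut self) -> Option<Self::Item> {
        if self.cursor >= self.limit {
            return None;
        }
        let index = self.cursor;
        self.cursor += 1;
        // Overwrite the shared buffer instead of allocating a new key.
        self.buf.clear();
        self.buf.extend_from_slice(format!("k{}", index).as_bytes());
        Some((self.buf.as_slice(), index))
    }
}

/// Hypothetical helper: drains a `KeyStream`, copying each key because the
/// borrowed slice is only valid until the next call to `next`.
pub fn collect_keys(limit: u64) -> Vec<(Vec<u8>, u64)> {
    let mut stream = KeyStream::new(limit);
    let mut out = Vec::new();
    // A stream cannot be used in a `for` loop; `while let` is the usual idiom.
    while let Some((key, value)) = stream.next() {
        out.push((key.to_vec(), value));
    }
    out
}
```

With a concrete stream type like this one, no higher-ranked bound is needed at the call site. The compiler picks a fresh lifetime for each call to `next`.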

/// IntoStreamer describes types that can be converted to streams.
///
/// This is analogous to the `IntoIterator` trait for `Iterator` in
/// `std::iter`.
pub trait IntoStreamer<'a> {
    /// The type of the item emitted by the stream.
    type Item: 'a;
    /// The type of the stream to be constructed.
    type Into: Streamer<'a, Item = Self::Item>;

    /// Construct a stream from `Self`.
    fn into_stream(self) -> Self::Into;
}

impl<'a, S: Streamer<'a>> IntoStreamer<'a> for S {
    type Item = S::Item;
    type Into = S;

    fn into_stream(self) -> S {
        self
    }
}
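
The "giant scary monster" bounds described in the `Streamer` documentation can be sketched end to end as follows. This is an illustration under stated assumptions, not code from this crate: `PairStream`, `total_key_len` and `summarize_pairs` are hypothetical names, and `u64` stands in for the crate's `Output` type. Both traits and the blanket impl are repeated so the block compiles on its own.

```rust
// Repeated from the definitions above so this sketch is self-contained.
pub trait Streamer<'a> {
    type Item: 'a;
    fn next(&'a mut self) -> Option<Self::Item>;
}

pub trait IntoStreamer<'a> {
    type Item: 'a;
    type Into: Streamer<'a, Item = Self::Item>;
    fn into_stream(self) -> Self::Into;
}

impl<'a, S: Streamer<'a>> IntoStreamer<'a> for S {
    type Item = S::Item;
    type Into = S;
    fn into_stream(self) -> S {
        self
    }
}

/// Hypothetical stream over borrowed data. The `'s` lifetime plays the role
/// of the "underlying finite state transducer" that most streams refer to.
pub struct PairStream<'s> {
    pairs: &'s [(Vec<u8>, u64)],
    pos: usize,
}

impl<'a, 's> Streamer<'a> for PairStream<'s> {
    type Item = (&'a [u8], u64);

    fn next(&'a mut self) -> Option<Self::Item> {
        let (key, value) = self.pairs.get(self.pos)?;
        self.pos += 1;
        Some((key.as_slice(), *value))
    }
}

/// Hypothetical generic consumer using the documented bounds. The item
/// type is spelled out twice because it cannot be named through the
/// higher-ranked bound on `S`.
pub fn total_key_len<'f, I, S>(streamable: I) -> (usize, u64)
where
    I: for<'a> IntoStreamer<'a, Into = S, Item = (&'a [u8], u64)>,
    S: 'f + for<'a> Streamer<'a, Item = (&'a [u8], u64)>,
{
    let mut stream = streamable.into_stream();
    let mut key_bytes = 0;
    let mut sum = 0;
    while let Some((key, value)) = stream.next() {
        key_bytes += key.len();
        sum += value;
    }
    (key_bytes, sum)
}

/// Hypothetical caller: a stream is its own stream constructor through the
/// blanket `IntoStreamer` impl, so it can be passed directly.
pub fn summarize_pairs(pairs: &[(Vec<u8>, u64)]) -> (usize, u64) {
    total_key_len(PairStream { pairs, pos: 0 })
}
```

Note that the `Streamer` impl for `PairStream` deliberately places no explicit bound between `'a` and `'s`. Adding one would stop the impl from holding for every `'a`, and the `for<'a>` bounds in `total_key_len` would then not be satisfied.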