[−][src]Trait tantivy_fst::Automaton
Automaton describes types that behave as a finite automaton.
All implementors of this trait are represented by byte based automata. Stated differently, all transitions in the automata correspond to a single byte in the input.
This implementation choice is important for a couple reasons:
- The set of possible transitions in each node is small, which may make efficient memory usage easier.
- The finite state transducers in this crate are all byte based, so any automata used on them must also be byte based.
In practice, this does present somewhat of a problem, for example, if
you're storing UTF-8 encoded strings in a finite state transducer. Consider
using a Levenshtein
automaton, which accepts a query string and an edit
distance. The edit distance should apply to some notion of character,
which could be represented by at least 1-4 bytes in a UTF-8 encoding (for
some definition of "character"). Therefore, the automaton must have UTF-8
decoding built into it. This can be tricky to implement, so you may find
the utf8-ranges
crate useful.
Associated Types
type State
The type of the state used in the automaton.
Required methods
fn start(&self) -> Self::State
Returns a single start state for this automaton.
This method should always return the same value for each implementation.
fn is_match(&self, state: &Self::State) -> bool
Returns true if and only if state
is a match state.
fn accept(&self, state: &Self::State, byte: u8) -> Self::State
Return the next state given state
and an input.
Provided methods
fn can_match(&self, _state: &Self::State) -> bool
Returns true if and only if state
can lead to a match in zero or more
steps.
If this returns false
, then no sequence of inputs from this state
should ever produce a match. If this does not follow, then those match
states may never be reached. In other words, behavior may be incorrect.
If this returns true
even when no match is possible, then behavior
will be correct, but callers may be forced to do additional work.
fn will_always_match(&self, _state: &Self::State) -> bool
Returns true if and only if state
matches and must match no matter
what steps are taken.
If this returns true
, then every sequence of inputs from this state
produces a match. If this does not follow, then those match states may
never be reached. In other words, behavior may be incorrect.
If this returns false
even when every sequence of inputs will lead to
a match, then behavior will be correct, but callers may be forced to do
additional work.
fn starts_with(self) -> StartsWith<Self> where
Self: Sized,
Self: Sized,
Returns an automaton that matches the strings that start with something this automaton matches.
fn union<Rhs: Automaton>(self, rhs: Rhs) -> Union<Self, Rhs> where
Self: Sized,
Self: Sized,
Returns an automaton that matches the strings matched by either this or the other automaton.
fn intersection<Rhs: Automaton>(self, rhs: Rhs) -> Intersection<Self, Rhs> where
Self: Sized,
Self: Sized,
Returns an automaton that matches the strings matched by both this and the other automaton.
fn complement(self) -> Complement<Self> where
Self: Sized,
Self: Sized,
Returns an automaton that matches the strings not matched by this automaton.
Implementations on Foreign Types
impl<'a, T: Automaton> Automaton for &'a T
[src]
type State = T::State
fn start(&self) -> Self::State
[src]
fn is_match(&self, state: &Self::State) -> bool
[src]
fn can_match(&self, state: &Self::State) -> bool
[src]
fn will_always_match(&self, state: &Self::State) -> bool
[src]
fn accept(&self, state: &Self::State, byte: u8) -> Self::State
[src]
Implementors
impl Automaton for AlwaysMatch
[src]
type State = ()
fn start(&self)
[src]
fn is_match(&self, _: &()) -> bool
[src]
fn can_match(&self, _: &()) -> bool
[src]
fn will_always_match(&self, _: &()) -> bool
[src]
fn accept(&self, _: &(), _: u8)
[src]
impl Automaton for Regex
[src]
type State = Option<usize>
fn start(&self) -> Option<usize>
[src]
fn is_match(&self, state: &Option<usize>) -> bool
[src]
fn can_match(&self, state: &Option<usize>) -> bool
[src]
fn accept(&self, state: &Option<usize>, byte: u8) -> Option<usize>
[src]
impl<'a> Automaton for Subsequence<'a>
[src]
type State = usize
fn start(&self) -> usize
[src]
fn is_match(&self, state: &usize) -> bool
[src]
fn can_match(&self, _: &usize) -> bool
[src]
fn will_always_match(&self, state: &usize) -> bool
[src]
fn accept(&self, state: &usize, byte: u8) -> usize
[src]
impl<A: Automaton> Automaton for Complement<A>
[src]
type State = ComplementState<A>
fn start(&self) -> ComplementState<A>
[src]
fn is_match(&self, state: &ComplementState<A>) -> bool
[src]
fn can_match(&self, state: &ComplementState<A>) -> bool
[src]
fn will_always_match(&self, state: &ComplementState<A>) -> bool
[src]
fn accept(&self, state: &ComplementState<A>, byte: u8) -> ComplementState<A>
[src]
impl<A: Automaton> Automaton for StartsWith<A>
[src]
type State = StartsWithState<A>
fn start(&self) -> StartsWithState<A>
[src]
fn is_match(&self, state: &StartsWithState<A>) -> bool
[src]
fn can_match(&self, state: &StartsWithState<A>) -> bool
[src]
fn will_always_match(&self, state: &StartsWithState<A>) -> bool
[src]
fn accept(&self, state: &StartsWithState<A>, byte: u8) -> StartsWithState<A>
[src]
impl<A: Automaton, B: Automaton> Automaton for Intersection<A, B>
[src]
type State = IntersectionState<A, B>
fn start(&self) -> IntersectionState<A, B>
[src]
fn is_match(&self, state: &IntersectionState<A, B>) -> bool
[src]
fn can_match(&self, state: &IntersectionState<A, B>) -> bool
[src]
fn will_always_match(&self, state: &IntersectionState<A, B>) -> bool
[src]
fn accept(
&self,
state: &IntersectionState<A, B>,
byte: u8
) -> IntersectionState<A, B>
[src]
&self,
state: &IntersectionState<A, B>,
byte: u8
) -> IntersectionState<A, B>