pub struct DFA<T> { /* private fields */ }
Expand description
A sparse deterministic finite automaton (DFA) with variable sized states.
In contrast to a dense::DFA, a sparse DFA uses a more space efficient representation for its transitions. Consequently, sparse DFAs may use much less memory than dense DFAs, but this comes at a price. In particular, reading the more space efficient transitions takes more work, and consequently, searching using a sparse DFA is typically slower than a dense DFA.
A sparse DFA can be built using the default configuration via the
DFA::new
constructor. Otherwise, one can configure various aspects
of a dense DFA via dense::Builder
,
and then convert a dense DFA to a sparse DFA using
dense::DFA::to_sparse
.
In general, a sparse DFA supports all the same search operations as a dense DFA.
Making the choice between a dense and sparse DFA depends on your specific work load. If you can sacrifice a bit of search time performance, then a sparse DFA might be the best choice. In particular, while sparse DFAs are probably always slower than dense DFAs, you may find that they are easily fast enough for your purposes!
Type parameters
A DFA
has one type parameter, T
, which is used to represent the parts
of a sparse DFA. T
is typically a Vec<u8>
or a &[u8]
.
The Automaton
trait
This type implements the Automaton
trait, which means it can be used
for searching. For example:
use regex_automata::{
dfa::{Automaton, sparse::DFA},
HalfMatch,
};
let dfa = DFA::new("foo[0-9]+")?;
let expected = HalfMatch::must(0, 8);
assert_eq!(Some(expected), dfa.find_leftmost_fwd(b"foo12345")?);
Implementations
sourceimpl DFA<Vec<u8>>
impl DFA<Vec<u8>>
sourcepub fn new(pattern: &str) -> Result<DFA<Vec<u8>>, Error>
pub fn new(pattern: &str) -> Result<DFA<Vec<u8>>, Error>
Parse the given regular expression using a default configuration and return the corresponding sparse DFA.
If you want a non-default configuration, then use
the dense::Builder
to set your own configuration, and then call
dense::DFA::to_sparse
to create
a sparse DFA.
Example
use regex_automata::{
dfa::{Automaton, sparse},
HalfMatch,
};
let dfa = sparse::DFA::new("foo[0-9]+bar")?;
let expected = HalfMatch::must(0, 11);
assert_eq!(Some(expected), dfa.find_leftmost_fwd(b"foo12345bar")?);
sourcepub fn new_many<P: AsRef<str>>(patterns: &[P]) -> Result<DFA<Vec<u8>>, Error>
pub fn new_many<P: AsRef<str>>(patterns: &[P]) -> Result<DFA<Vec<u8>>, Error>
Parse the given regular expressions using a default configuration and return the corresponding multi-DFA.
If you want a non-default configuration, then use
the dense::Builder
to set your own configuration, and then call
dense::DFA::to_sparse
to create
a sparse DFA.
Example
use regex_automata::{
dfa::{Automaton, sparse},
HalfMatch,
};
let dfa = sparse::DFA::new_many(&["[0-9]+", "[a-z]+"])?;
let expected = HalfMatch::must(1, 3);
assert_eq!(Some(expected), dfa.find_leftmost_fwd(b"foo12345bar")?);
sourceimpl DFA<Vec<u8>>
impl DFA<Vec<u8>>
sourcepub fn always_match() -> Result<DFA<Vec<u8>>, Error>
pub fn always_match() -> Result<DFA<Vec<u8>>, Error>
Create a new DFA that matches every input.
Example
use regex_automata::{
dfa::{Automaton, sparse},
HalfMatch,
};
let dfa = sparse::DFA::always_match()?;
let expected = HalfMatch::must(0, 0);
assert_eq!(Some(expected), dfa.find_leftmost_fwd(b"")?);
assert_eq!(Some(expected), dfa.find_leftmost_fwd(b"foo")?);
sourcepub fn never_match() -> Result<DFA<Vec<u8>>, Error>
pub fn never_match() -> Result<DFA<Vec<u8>>, Error>
Create a new sparse DFA that never matches any input.
Example
use regex_automata::dfa::{Automaton, sparse};
let dfa = sparse::DFA::never_match()?;
assert_eq!(None, dfa.find_leftmost_fwd(b"")?);
assert_eq!(None, dfa.find_leftmost_fwd(b"foo")?);
sourceimpl<T: AsRef<[u8]>> DFA<T>
impl<T: AsRef<[u8]>> DFA<T>
sourcepub fn as_ref<'a>(&'a self) -> DFA<&'a [u8]>
pub fn as_ref<'a>(&'a self) -> DFA<&'a [u8]>
Cheaply return a borrowed version of this sparse DFA. Specifically, the
DFA returned always uses &[u8]
for its transitions.
sourcepub fn to_owned(&self) -> DFA<Vec<u8>>
pub fn to_owned(&self) -> DFA<Vec<u8>>
Return an owned version of this sparse DFA. Specifically, the DFA
returned always uses Vec<u8>
for its transitions.
Effectively, this returns a sparse DFA whose transitions live on the heap.
sourcepub fn memory_usage(&self) -> usize
pub fn memory_usage(&self) -> usize
Returns the memory usage, in bytes, of this DFA.
The memory usage is computed based on the number of bytes used to represent this DFA.
This does not include the stack size used up by this DFA. To
compute that, use std::mem::size_of::<sparse::DFA>()
.
sourcepub fn has_starts_for_each_pattern(&self) -> bool
pub fn has_starts_for_each_pattern(&self) -> bool
Returns true only if this DFA has starting states for each pattern.
When a DFA has starting states for each pattern, then a search with the
DFA can be configured to only look for anchored matches of a specific
pattern. Specifically, APIs like Automaton::find_earliest_fwd_at
can accept a non-None pattern_id
if and only if this method returns
true. Otherwise, calling find_earliest_fwd_at
will panic.
Note that if the DFA is empty, this always returns false.
sourceimpl<T: AsRef<[u8]>> DFA<T>
impl<T: AsRef<[u8]>> DFA<T>
Routines for converting a sparse DFA to other representations, such as raw bytes suitable for persistent storage.
sourcepub fn to_bytes_little_endian(&self) -> Vec<u8>ⓘNotable traits for Vec<u8, A>impl<A> Write for Vec<u8, A> where
A: Allocator,
pub fn to_bytes_little_endian(&self) -> Vec<u8>ⓘNotable traits for Vec<u8, A>impl<A> Write for Vec<u8, A> where
A: Allocator,
A: Allocator,
Serialize this DFA as raw bytes to a Vec<u8>
in little endian
format.
The written bytes are guaranteed to be deserialized correctly and
without errors in a semver compatible release of this crate by a
DFA
’s deserialization APIs (assuming all other criteria for the
deserialization APIs has been satisfied):
Note that unlike a dense::DFA
’s
serialization methods, this does not add any initial padding to the
returned bytes. Padding isn’t required for sparse DFAs since they have
no alignment requirements.
Example
This example shows how to serialize and deserialize a DFA:
use regex_automata::{
dfa::{Automaton, sparse::DFA},
HalfMatch,
};
// Compile our original DFA.
let original_dfa = DFA::new("foo[0-9]+")?;
// N.B. We use native endianness here to make the example work, but
// using to_bytes_little_endian would work on a little endian target.
let buf = original_dfa.to_bytes_native_endian();
// Even if buf has initial padding, DFA::from_bytes will automatically
// ignore it.
let dfa: DFA<&[u8]> = DFA::from_bytes(&buf)?.0;
let expected = HalfMatch::must(0, 8);
assert_eq!(Some(expected), dfa.find_leftmost_fwd(b"foo12345")?);
sourcepub fn to_bytes_big_endian(&self) -> Vec<u8>ⓘNotable traits for Vec<u8, A>impl<A> Write for Vec<u8, A> where
A: Allocator,
pub fn to_bytes_big_endian(&self) -> Vec<u8>ⓘNotable traits for Vec<u8, A>impl<A> Write for Vec<u8, A> where
A: Allocator,
A: Allocator,
Serialize this DFA as raw bytes to a Vec<u8>
in big endian
format.
The written bytes are guaranteed to be deserialized correctly and
without errors in a semver compatible release of this crate by a
DFA
’s deserialization APIs (assuming all other criteria for the
deserialization APIs has been satisfied):
Note that unlike a dense::DFA
’s
serialization methods, this does not add any initial padding to the
returned bytes. Padding isn’t required for sparse DFAs since they have
no alignment requirements.
Example
This example shows how to serialize and deserialize a DFA:
use regex_automata::{
dfa::{Automaton, sparse::DFA},
HalfMatch,
};
// Compile our original DFA.
let original_dfa = DFA::new("foo[0-9]+")?;
// N.B. We use native endianness here to make the example work, but
// using to_bytes_big_endian would work on a big endian target.
let buf = original_dfa.to_bytes_native_endian();
// Even if buf has initial padding, DFA::from_bytes will automatically
// ignore it.
let dfa: DFA<&[u8]> = DFA::from_bytes(&buf)?.0;
let expected = HalfMatch::must(0, 8);
assert_eq!(Some(expected), dfa.find_leftmost_fwd(b"foo12345")?);
sourcepub fn to_bytes_native_endian(&self) -> Vec<u8>ⓘNotable traits for Vec<u8, A>impl<A> Write for Vec<u8, A> where
A: Allocator,
pub fn to_bytes_native_endian(&self) -> Vec<u8>ⓘNotable traits for Vec<u8, A>impl<A> Write for Vec<u8, A> where
A: Allocator,
A: Allocator,
Serialize this DFA as raw bytes to a Vec<u8>
in native endian
format.
The written bytes are guaranteed to be deserialized correctly and
without errors in a semver compatible release of this crate by a
DFA
’s deserialization APIs (assuming all other criteria for the
deserialization APIs has been satisfied):
Note that unlike a dense::DFA
’s
serialization methods, this does not add any initial padding to the
returned bytes. Padding isn’t required for sparse DFAs since they have
no alignment requirements.
Generally speaking, native endian format should only be used when you know that the target you’re compiling the DFA for matches the endianness of the target on which you’re compiling DFA. For example, if serialization and deserialization happen in the same process or on the same machine. Otherwise, when serializing a DFA for use in a portable environment, you’ll almost certainly want to serialize both a little endian and a big endian version and then load the correct one based on the target’s configuration.
Example
This example shows how to serialize and deserialize a DFA:
use regex_automata::{
dfa::{Automaton, sparse::DFA},
HalfMatch,
};
// Compile our original DFA.
let original_dfa = DFA::new("foo[0-9]+")?;
let buf = original_dfa.to_bytes_native_endian();
// Even if buf has initial padding, DFA::from_bytes will automatically
// ignore it.
let dfa: DFA<&[u8]> = DFA::from_bytes(&buf)?.0;
let expected = HalfMatch::must(0, 8);
assert_eq!(Some(expected), dfa.find_leftmost_fwd(b"foo12345")?);
sourcepub fn write_to_little_endian(
&self,
dst: &mut [u8]
) -> Result<usize, SerializeError>
pub fn write_to_little_endian(
&self,
dst: &mut [u8]
) -> Result<usize, SerializeError>
Serialize this DFA as raw bytes to the given slice, in little endian
format. Upon success, the total number of bytes written to dst
is
returned.
The written bytes are guaranteed to be deserialized correctly and
without errors in a semver compatible release of this crate by a
DFA
’s deserialization APIs (assuming all other criteria for the
deserialization APIs has been satisfied):
Errors
This returns an error if the given destination slice is not big enough
to contain the full serialized DFA. If an error occurs, then nothing
is written to dst
.
Example
This example shows how to serialize and deserialize a DFA without dynamic memory allocation.
use regex_automata::{
dfa::{Automaton, sparse::DFA},
HalfMatch,
};
// Compile our original DFA.
let original_dfa = DFA::new("foo[0-9]+")?;
// Create a 4KB buffer on the stack to store our serialized DFA.
let mut buf = [0u8; 4 * (1<<10)];
// N.B. We use native endianness here to make the example work, but
// using write_to_little_endian would work on a little endian target.
let written = original_dfa.write_to_native_endian(&mut buf)?;
let dfa: DFA<&[u8]> = DFA::from_bytes(&buf[..written])?.0;
let expected = HalfMatch::must(0, 8);
assert_eq!(Some(expected), dfa.find_leftmost_fwd(b"foo12345")?);
sourcepub fn write_to_big_endian(
&self,
dst: &mut [u8]
) -> Result<usize, SerializeError>
pub fn write_to_big_endian(
&self,
dst: &mut [u8]
) -> Result<usize, SerializeError>
Serialize this DFA as raw bytes to the given slice, in big endian
format. Upon success, the total number of bytes written to dst
is
returned.
The written bytes are guaranteed to be deserialized correctly and
without errors in a semver compatible release of this crate by a
DFA
’s deserialization APIs (assuming all other criteria for the
deserialization APIs has been satisfied):
Errors
This returns an error if the given destination slice is not big enough
to contain the full serialized DFA. If an error occurs, then nothing
is written to dst
.
Example
This example shows how to serialize and deserialize a DFA without dynamic memory allocation.
use regex_automata::{
dfa::{Automaton, sparse::DFA},
HalfMatch,
};
// Compile our original DFA.
let original_dfa = DFA::new("foo[0-9]+")?;
// Create a 4KB buffer on the stack to store our serialized DFA.
let mut buf = [0u8; 4 * (1<<10)];
// N.B. We use native endianness here to make the example work, but
// using write_to_big_endian would work on a big endian target.
let written = original_dfa.write_to_native_endian(&mut buf)?;
let dfa: DFA<&[u8]> = DFA::from_bytes(&buf[..written])?.0;
let expected = HalfMatch::must(0, 8);
assert_eq!(Some(expected), dfa.find_leftmost_fwd(b"foo12345")?);
sourcepub fn write_to_native_endian(
&self,
dst: &mut [u8]
) -> Result<usize, SerializeError>
pub fn write_to_native_endian(
&self,
dst: &mut [u8]
) -> Result<usize, SerializeError>
Serialize this DFA as raw bytes to the given slice, in native endian
format. Upon success, the total number of bytes written to dst
is
returned.
The written bytes are guaranteed to be deserialized correctly and
without errors in a semver compatible release of this crate by a
DFA
’s deserialization APIs (assuming all other criteria for the
deserialization APIs has been satisfied):
Generally speaking, native endian format should only be used when you know that the target you’re compiling the DFA for matches the endianness of the target on which you’re compiling DFA. For example, if serialization and deserialization happen in the same process or on the same machine. Otherwise, when serializing a DFA for use in a portable environment, you’ll almost certainly want to serialize both a little endian and a big endian version and then load the correct one based on the target’s configuration.
Errors
This returns an error if the given destination slice is not big enough
to contain the full serialized DFA. If an error occurs, then nothing
is written to dst
.
Example
This example shows how to serialize and deserialize a DFA without dynamic memory allocation.
use regex_automata::{
dfa::{Automaton, sparse::DFA},
HalfMatch,
};
// Compile our original DFA.
let original_dfa = DFA::new("foo[0-9]+")?;
// Create a 4KB buffer on the stack to store our serialized DFA.
let mut buf = [0u8; 4 * (1<<10)];
let written = original_dfa.write_to_native_endian(&mut buf)?;
let dfa: DFA<&[u8]> = DFA::from_bytes(&buf[..written])?.0;
let expected = HalfMatch::must(0, 8);
assert_eq!(Some(expected), dfa.find_leftmost_fwd(b"foo12345")?);
sourcepub fn write_to_len(&self) -> usize
pub fn write_to_len(&self) -> usize
Return the total number of bytes required to serialize this DFA.
This is useful for determining the size of the buffer required to pass to one of the serialization routines:
Passing a buffer smaller than the size returned by this method will result in a serialization error.
Example
This example shows how to dynamically allocate enough room to serialize a sparse DFA.
use regex_automata::{
dfa::{Automaton, sparse::DFA},
HalfMatch,
};
// Compile our original DFA.
let original_dfa = DFA::new("foo[0-9]+")?;
let mut buf = vec![0; original_dfa.write_to_len()];
let written = original_dfa.write_to_native_endian(&mut buf)?;
let dfa: DFA<&[u8]> = DFA::from_bytes(&buf[..written])?.0;
let expected = HalfMatch::must(0, 8);
assert_eq!(Some(expected), dfa.find_leftmost_fwd(b"foo12345")?);
sourceimpl<'a> DFA<&'a [u8]>
impl<'a> DFA<&'a [u8]>
sourcepub fn from_bytes(
slice: &'a [u8]
) -> Result<(DFA<&'a [u8]>, usize), DeserializeError>
pub fn from_bytes(
slice: &'a [u8]
) -> Result<(DFA<&'a [u8]>, usize), DeserializeError>
Safely deserialize a sparse DFA with a specific state identifier representation. Upon success, this returns both the deserialized DFA and the number of bytes read from the given slice. Namely, the contents of the slice beyond the DFA are not read.
Deserializing a DFA using this routine will never allocate heap memory.
For safety purposes, the DFA’s transitions will be verified such that
every transition points to a valid state. If this verification is too
costly, then a DFA::from_bytes_unchecked
API is provided, which
will always execute in constant time.
The bytes given must be generated by one of the serialization APIs
of a DFA
using a semver compatible release of this crate. Those
include:
DFA::to_bytes_little_endian
DFA::to_bytes_big_endian
DFA::to_bytes_native_endian
DFA::write_to_little_endian
DFA::write_to_big_endian
DFA::write_to_native_endian
The to_bytes
methods allocate and return a Vec<u8>
for you. The
write_to
methods do not allocate and write to an existing slice
(which may be on the stack). Since deserialization always uses the
native endianness of the target platform, the serialization API you use
should match the endianness of the target platform. (It’s often a good
idea to generate serialized DFAs for both forms of endianness and then
load the correct one based on endianness.)
Errors
Generally speaking, it’s easier to state the conditions in which an error is not returned. All of the following must be true:
- The bytes given must be produced by one of the serialization APIs on this DFA, as mentioned above.
- The endianness of the target platform matches the endianness used to serialized the provided DFA.
If any of the above are not true, then an error will be returned.
Note that unlike deserializing a
dense::DFA
, deserializing a sparse DFA has
no alignment requirements. That is, an alignment of 1
is valid.
Panics
This routine will never panic for any input.
Example
This example shows how to serialize a DFA to raw bytes, deserialize it and then use it for searching.
use regex_automata::{
dfa::{Automaton, sparse::DFA},
HalfMatch,
};
let initial = DFA::new("foo[0-9]+")?;
let bytes = initial.to_bytes_native_endian();
let dfa: DFA<&[u8]> = DFA::from_bytes(&bytes)?.0;
let expected = HalfMatch::must(0, 8);
assert_eq!(Some(expected), dfa.find_leftmost_fwd(b"foo12345")?);
Example: loading a DFA from static memory
One use case this library supports is the ability to serialize a
DFA to disk and then use include_bytes!
to store it in a compiled
Rust program. Those bytes can then be cheaply deserialized into a
DFA
structure at runtime and used for searching without having to
re-compile the DFA (which can be quite costly).
We can show this in two parts. The first part is serializing the DFA to a file:
use regex_automata::dfa::{Automaton, sparse::DFA};
let dfa = DFA::new("foo[0-9]+")?;
// Write a big endian serialized version of this DFA to a file.
let bytes = dfa.to_bytes_big_endian();
std::fs::write("foo.bigendian.dfa", &bytes)?;
// Do it again, but this time for little endian.
let bytes = dfa.to_bytes_little_endian();
std::fs::write("foo.littleendian.dfa", &bytes)?;
And now the second part is embedding the DFA into the compiled program and deserializing it at runtime on first use. We use conditional compilation to choose the correct endianness. As mentioned above, we do not need to employ any special tricks to ensure a proper alignment, since a sparse DFA has no alignment requirements.
use regex_automata::{
dfa::{Automaton, sparse},
HalfMatch,
};
type DFA = sparse::DFA<&'static [u8]>;
fn get_foo() -> &'static DFA {
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::sync::Once;
#[cfg(target_endian = "big")]
static BYTES: &[u8] = include_bytes!("foo.bigendian.dfa");
#[cfg(target_endian = "little")]
static BYTES: &[u8] = include_bytes!("foo.littleendian.dfa");
struct Lazy(Cell<MaybeUninit<DFA>>);
// SAFETY: This is safe because DFA impls Sync.
unsafe impl Sync for Lazy {}
static INIT: Once = Once::new();
static DFA: Lazy = Lazy(Cell::new(MaybeUninit::uninit()));
INIT.call_once(|| {
let (dfa, _) = DFA::from_bytes(BYTES)
.expect("serialized DFA should be valid");
// SAFETY: This is guaranteed to only execute once, and all
// we do with the pointer is write the DFA to it.
unsafe {
(*DFA.0.as_ptr()).as_mut_ptr().write(dfa);
}
});
// SAFETY: DFA is guaranteed to by initialized via INIT and is
// stored in static memory.
unsafe {
let dfa = (*DFA.0.as_ptr()).as_ptr();
std::mem::transmute::<*const DFA, &'static DFA>(dfa)
}
}
let dfa = get_foo();
let expected = HalfMatch::must(0, 8);
assert_eq!(Ok(Some(expected)), dfa.find_leftmost_fwd(b"foo12345"));
Alternatively, consider using
lazy_static
or
once_cell
,
which will guarantee safety for you.
sourcepub unsafe fn from_bytes_unchecked(
slice: &'a [u8]
) -> Result<(DFA<&'a [u8]>, usize), DeserializeError>
pub unsafe fn from_bytes_unchecked(
slice: &'a [u8]
) -> Result<(DFA<&'a [u8]>, usize), DeserializeError>
Deserialize a DFA with a specific state identifier representation in constant time by omitting the verification of the validity of the sparse transitions.
This is just like DFA::from_bytes
, except it can potentially return
a DFA that exhibits undefined behavior if its transitions contains
invalid state identifiers.
This routine is useful if you need to deserialize a DFA cheaply and
cannot afford the transition validation performed by from_bytes
.
Safety
This routine is unsafe because it permits callers to provide
arbitrary transitions with possibly incorrect state identifiers. While
the various serialization routines will never return an incorrect
DFA, there is no guarantee that the bytes provided here
are correct. While from_bytes_unchecked
will still do several forms
of basic validation, this routine does not check that the transitions
themselves are correct. Given an incorrect transition table, it is
possible for the search routines to access out-of-bounds memory because
of explicit bounds check elision.
Example
use regex_automata::{
dfa::{Automaton, sparse::DFA},
HalfMatch,
};
let initial = DFA::new("foo[0-9]+")?;
let bytes = initial.to_bytes_native_endian();
// SAFETY: This is guaranteed to be safe since the bytes given come
// directly from a compatible serialization routine.
let dfa: DFA<&[u8]> = unsafe { DFA::from_bytes_unchecked(&bytes)?.0 };
let expected = HalfMatch::must(0, 8);
assert_eq!(Some(expected), dfa.find_leftmost_fwd(b"foo12345")?);
Trait Implementations
sourceimpl<T: AsRef<[u8]>> Automaton for DFA<T>
impl<T: AsRef<[u8]>> Automaton for DFA<T>
sourcefn is_special_state(&self, id: StateID) -> bool
fn is_special_state(&self, id: StateID) -> bool
Returns true if and only if the given identifier corresponds to a “special” state. A special state is one or more of the following: a dead state, a quit state, a match state, a start state or an accelerated state. Read more
sourcefn is_dead_state(&self, id: StateID) -> bool
fn is_dead_state(&self, id: StateID) -> bool
Returns true if and only if the given identifier corresponds to a dead state. When a DFA enters a dead state, it is impossible to leave. That is, every transition on a dead state by definition leads back to the same dead state. Read more
sourcefn is_quit_state(&self, id: StateID) -> bool
fn is_quit_state(&self, id: StateID) -> bool
Returns true if and only if the given identifier corresponds to a quit state. A quit state is like a dead state (it has no transitions other than to itself), except it indicates that the DFA failed to complete the search. When this occurs, callers can neither accept or reject that a match occurred. Read more
sourcefn is_match_state(&self, id: StateID) -> bool
fn is_match_state(&self, id: StateID) -> bool
Returns true if and only if the given identifier corresponds to a match state. A match state is also referred to as a “final” state and indicates that a match has been found. Read more
sourcefn is_start_state(&self, id: StateID) -> bool
fn is_start_state(&self, id: StateID) -> bool
Returns true if and only if the given identifier corresponds to a start state. A start state is a state in which a DFA begins a search. All searches begin in a start state. Moreover, since all matches are delayed by one byte, a start state can never be a match state. Read more
sourcefn is_accel_state(&self, id: StateID) -> bool
fn is_accel_state(&self, id: StateID) -> bool
Returns true if and only if the given identifier corresponds to an accelerated state. Read more
sourcefn next_state(&self, current: StateID, input: u8) -> StateID
fn next_state(&self, current: StateID, input: u8) -> StateID
Transitions from the current state to the next state, given the next byte of input. Read more
sourceunsafe fn next_state_unchecked(&self, current: StateID, input: u8) -> StateID
unsafe fn next_state_unchecked(&self, current: StateID, input: u8) -> StateID
Transitions from the current state to the next state, given the next byte of input. Read more
sourcefn next_eoi_state(&self, current: StateID) -> StateID
fn next_eoi_state(&self, current: StateID) -> StateID
Transitions from the current state to the next state for the special EOI symbol. Read more
sourcefn pattern_count(&self) -> usize
fn pattern_count(&self) -> usize
Returns the total number of patterns compiled into this DFA. Read more
sourcefn match_count(&self, id: StateID) -> usize
fn match_count(&self, id: StateID) -> usize
Returns the total number of patterns that match in this state. Read more
sourcefn match_pattern(&self, id: StateID, match_index: usize) -> PatternID
fn match_pattern(&self, id: StateID, match_index: usize) -> PatternID
Returns the pattern ID corresponding to the given match index in the given state. Read more
sourcefn start_state_forward(
&self,
pattern_id: Option<PatternID>,
bytes: &[u8],
start: usize,
end: usize
) -> StateID
fn start_state_forward(
&self,
pattern_id: Option<PatternID>,
bytes: &[u8],
start: usize,
end: usize
) -> StateID
Return the ID of the start state for this DFA when executing a forward search. Read more
sourcefn start_state_reverse(
&self,
pattern_id: Option<PatternID>,
bytes: &[u8],
start: usize,
end: usize
) -> StateID
fn start_state_reverse(
&self,
pattern_id: Option<PatternID>,
bytes: &[u8],
start: usize,
end: usize
) -> StateID
Return the ID of the start state for this DFA when executing a reverse search. Read more
sourcefn accelerator(&self, id: StateID) -> &[u8]ⓘNotable traits for &'_ mut [u8]impl<'_> Write for &'_ mut [u8]impl<'_> Read for &'_ [u8]
fn accelerator(&self, id: StateID) -> &[u8]ⓘNotable traits for &'_ mut [u8]impl<'_> Write for &'_ mut [u8]impl<'_> Read for &'_ [u8]
Return a slice of bytes to accelerate for the given state, if possible. Read more
sourcefn find_earliest_fwd(
&self,
bytes: &[u8]
) -> Result<Option<HalfMatch>, MatchError>
fn find_earliest_fwd(
&self,
bytes: &[u8]
) -> Result<Option<HalfMatch>, MatchError>
Executes a forward search and returns the end position of the first
match that is found as early as possible. If no match exists, then
None
is returned. Read more
sourcefn find_earliest_rev(
&self,
bytes: &[u8]
) -> Result<Option<HalfMatch>, MatchError>
fn find_earliest_rev(
&self,
bytes: &[u8]
) -> Result<Option<HalfMatch>, MatchError>
Executes a reverse search and returns the start position of the first
match that is found as early as possible. If no match exists, then
None
is returned. Read more
sourcefn find_leftmost_fwd(
&self,
bytes: &[u8]
) -> Result<Option<HalfMatch>, MatchError>
fn find_leftmost_fwd(
&self,
bytes: &[u8]
) -> Result<Option<HalfMatch>, MatchError>
Executes a forward search and returns the end position of the leftmost
match that is found. If no match exists, then None
is returned. Read more
sourcefn find_leftmost_rev(
&self,
bytes: &[u8]
) -> Result<Option<HalfMatch>, MatchError>
fn find_leftmost_rev(
&self,
bytes: &[u8]
) -> Result<Option<HalfMatch>, MatchError>
Executes a reverse search and returns the start of the position of the
leftmost match that is found. If no match exists, then None
is
returned. Read more
sourcefn find_overlapping_fwd(
&self,
bytes: &[u8],
state: &mut OverlappingState
) -> Result<Option<HalfMatch>, MatchError>
fn find_overlapping_fwd(
&self,
bytes: &[u8],
state: &mut OverlappingState
) -> Result<Option<HalfMatch>, MatchError>
Executes an overlapping forward search and returns the end position of
matches as they are found. If no match exists, then None
is returned. Read more
sourcefn find_earliest_fwd_at(
&self,
pre: Option<&mut Scanner<'_>>,
pattern_id: Option<PatternID>,
bytes: &[u8],
start: usize,
end: usize
) -> Result<Option<HalfMatch>, MatchError>
fn find_earliest_fwd_at(
&self,
pre: Option<&mut Scanner<'_>>,
pattern_id: Option<PatternID>,
bytes: &[u8],
start: usize,
end: usize
) -> Result<Option<HalfMatch>, MatchError>
Executes a forward search and returns the end position of the first
match that is found as early as possible. If no match exists, then
None
is returned. Read more
sourcefn find_earliest_rev_at(
&self,
pattern_id: Option<PatternID>,
bytes: &[u8],
start: usize,
end: usize
) -> Result<Option<HalfMatch>, MatchError>
fn find_earliest_rev_at(
&self,
pattern_id: Option<PatternID>,
bytes: &[u8],
start: usize,
end: usize
) -> Result<Option<HalfMatch>, MatchError>
Executes a reverse search and returns the start position of the first
match that is found as early as possible. If no match exists, then
None
is returned. Read more
sourcefn find_leftmost_fwd_at(
&self,
pre: Option<&mut Scanner<'_>>,
pattern_id: Option<PatternID>,
bytes: &[u8],
start: usize,
end: usize
) -> Result<Option<HalfMatch>, MatchError>
fn find_leftmost_fwd_at(
&self,
pre: Option<&mut Scanner<'_>>,
pattern_id: Option<PatternID>,
bytes: &[u8],
start: usize,
end: usize
) -> Result<Option<HalfMatch>, MatchError>
Executes a forward search and returns the end position of the leftmost
match that is found. If no match exists, then None
is returned. Read more
sourcefn find_leftmost_rev_at(
&self,
pattern_id: Option<PatternID>,
bytes: &[u8],
start: usize,
end: usize
) -> Result<Option<HalfMatch>, MatchError>
fn find_leftmost_rev_at(
&self,
pattern_id: Option<PatternID>,
bytes: &[u8],
start: usize,
end: usize
) -> Result<Option<HalfMatch>, MatchError>
Executes a reverse search and returns the start of the position of the
leftmost match that is found. If no match exists, then None
is
returned. Read more
sourcefn find_overlapping_fwd_at(
&self,
pre: Option<&mut Scanner<'_>>,
pattern_id: Option<PatternID>,
bytes: &[u8],
start: usize,
end: usize,
state: &mut OverlappingState
) -> Result<Option<HalfMatch>, MatchError>
fn find_overlapping_fwd_at(
&self,
pre: Option<&mut Scanner<'_>>,
pattern_id: Option<PatternID>,
bytes: &[u8],
start: usize,
end: usize,
state: &mut OverlappingState
) -> Result<Option<HalfMatch>, MatchError>
Executes an overlapping forward search and returns the end position of
matches as they are found. If no match exists, then None
is returned. Read more
Auto Trait Implementations
impl<T> RefUnwindSafe for DFA<T> where
T: RefUnwindSafe,
impl<T> Send for DFA<T> where
T: Send,
impl<T> Sync for DFA<T> where
T: Sync,
impl<T> Unpin for DFA<T> where
T: Unpin,
impl<T> UnwindSafe for DFA<T> where
T: UnwindSafe,
Blanket Implementations
sourceimpl<T> BorrowMut<T> for T where
T: ?Sized,
impl<T> BorrowMut<T> for T where
T: ?Sized,
const: unstable · sourcefn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value. Read more
sourceimpl<T> ToOwned for T where
T: Clone,
impl<T> ToOwned for T where
T: Clone,
type Owned = T
type Owned = T
The resulting type after obtaining ownership.
sourcefn clone_into(&self, target: &mut T)
fn clone_into(&self, target: &mut T)
toowned_clone_into
)Uses borrowed data to replace owned data, usually by cloning. Read more