pub struct Config { /* private fields */ }
The configuration used for building a lazy DFA.
As a convenience, DFA::config is an alias for Config::new. The advantage of the former is that it often lets you avoid importing the Config type directly.
A lazy DFA configuration is a simple data object that is typically used with Builder::configure.
The default configuration guarantees that a search will never return a MatchError for any haystack or pattern. Setting a quit byte with Config::quit, enabling heuristic support for Unicode word boundaries with Config::unicode_word_boundary, or setting a minimum cache clear count with Config::minimum_cache_clear_count can in turn cause a search to return an error. See the corresponding configuration options for more details on when those error conditions arise.
Implementations
impl Config
pub fn anchored(self, yes: bool) -> Config
Set whether matching must be anchored at the beginning of the input.
When enabled, a match must begin at the start of a search. When disabled (the default), the lazy DFA will act as if the pattern started with a (?s:.)*?, which enables a match to appear anywhere.
Note that if you want to run both anchored and unanchored searches without building multiple automatons, you can enable the Config::starts_for_each_pattern configuration instead. This will permit unanchored any-pattern searches and pattern-specific anchored searches. See the documentation for that configuration for an example.
By default this is disabled.
WARNING: this is subtly different than using a ^ at the start of your regex. A ^ forces a regex to match exclusively at the start of input, regardless of where you begin your search. In contrast, enabling this option will allow your regex to match anywhere in your input, but the match must start at the beginning of a search. (Most of the higher level convenience search routines make “start of input” and “start of search” equivalent, but some routines allow treating these as orthogonal.)
For example, consider the haystack aba and the following searches:
- The regex ^a is compiled with anchored=false and searches aba starting at position 2. Since ^ requires the match to start at the beginning of the input and 2 > 0, no match is found.
- The regex a is compiled with anchored=true and searches aba starting at position 2. This reports a match at [2, 3] since the match starts where the search started. Since there is no ^, there is no requirement for the match to start at the beginning of the input.
- The regex a is compiled with anchored=true and searches aba starting at position 1. Since b corresponds to position 1 and since the regex is anchored, it finds no match.
- The regex a is compiled with anchored=false and searches aba starting at position 1. Since the regex is neither anchored nor starts with ^, the regex is compiled with an implicit (?s:.)*? prefix that permits it to match anywhere. Thus, it reports a match at [2, 3].
Example
This demonstrates the differences between an anchored search and a pattern that begins with ^ (as described in the above warning message).
use regex_automata::{hybrid::dfa::DFA, HalfMatch};
let haystack = "aba".as_bytes();
let dfa = DFA::builder()
    .configure(DFA::config().anchored(false)) // default
    .build(r"^a")?;
let mut cache = dfa.create_cache();
let got = dfa.find_leftmost_fwd_at(
    &mut cache, None, None, haystack, 2, 3,
)?;
// No match is found because 2 is not the beginning of the haystack,
// which is what ^ requires.
let expected = None;
assert_eq!(expected, got);
let dfa = DFA::builder()
    .configure(DFA::config().anchored(true))
    .build(r"a")?;
let mut cache = dfa.create_cache();
let got = dfa.find_leftmost_fwd_at(
    &mut cache, None, None, haystack, 2, 3,
)?;
// An anchored search can still match anywhere in the haystack, it just
// must begin at the start of the search which is '2' in this case.
let expected = Some(HalfMatch::must(0, 3));
assert_eq!(expected, got);
let dfa = DFA::builder()
    .configure(DFA::config().anchored(true))
    .build(r"a")?;
let mut cache = dfa.create_cache();
let got = dfa.find_leftmost_fwd_at(
    &mut cache, None, None, haystack, 1, 3,
)?;
// No match is found since we start searching at offset 1 which
// corresponds to 'b'. Since there is no '(?s:.)*?' prefix, no match
// is found.
let expected = None;
assert_eq!(expected, got);
let dfa = DFA::builder()
    .configure(DFA::config().anchored(false))
    .build(r"a")?;
let mut cache = dfa.create_cache();
let got = dfa.find_leftmost_fwd_at(
    &mut cache, None, None, haystack, 1, 3,
)?;
// Since anchored=false, an implicit '(?s:.)*?' prefix was added to the
// pattern. Even though the search starts at 'b', the 'match anything'
// prefix allows the search to match 'a'.
let expected = Some(HalfMatch::must(0, 3));
assert_eq!(expected, got);
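The difference between ^ and an anchored search can also be modeled without the crate. The following is a minimal sketch under stated assumptions, not how the lazy DFA is implemented: ToyPattern and find_at are hypothetical names, the "regex" is only a literal byte string with an optional leading ^, and the return value is the end offset of the match (comparable to the offset of a HalfMatch).

```rust
/// Hypothetical toy pattern: a literal byte string, optionally preceded by `^`.
pub struct ToyPattern<'a> {
    pub caret: bool,
    pub literal: &'a [u8],
}

/// Returns the end offset of the leftmost match in `haystack[start..end]`.
///
/// Assumes `start <= end <= haystack.len()`.
pub fn find_at(
    pattern: &ToyPattern<'_>,
    anchored: bool,
    haystack: &[u8],
    start: usize,
    end: usize,
) -> Option<usize> {
    // An anchored search may only begin a match at `start`. An unanchored
    // search may begin a match anywhere in the span, which is the effect
    // the implicit `(?s:.)*?` prefix has in the real engine.
    let last_begin = if anchored { start } else { end };
    for begin in start..=last_begin {
        // `^` refers to the haystack, not to the search span: in this toy
        // model it only ever matches at offset 0.
        if pattern.caret && begin != 0 {
            continue;
        }
        let stop = begin + pattern.literal.len();
        if stop <= end && &haystack[begin..stop] == pattern.literal {
            return Some(stop);
        }
    }
    None
}
```

With the haystack aba, the four searches described in the warning above give None, Some(3), None and Some(3) respectively in this model.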
pub fn match_kind(self, kind: MatchKind) -> Config
Set the desired match semantics.
The default is MatchKind::LeftmostFirst, which corresponds to the match semantics of Perl-like regex engines. That is, when multiple patterns would match at the same leftmost position, the pattern that appears first in the concrete syntax is chosen.
Currently, the only other kind of match semantics supported is MatchKind::All. This corresponds to classical DFA construction where all possible matches are added to the lazy DFA.
Typically, All is used when one wants to execute an overlapping search and LeftmostFirst otherwise. In particular, it rarely makes sense to use All with the various “leftmost” find routines, since the leftmost routines depend on the LeftmostFirst automata construction strategy. Specifically, LeftmostFirst adds dead states to the lazy DFA as a way to terminate the search and report a match. LeftmostFirst also supports non-greedy matches using this strategy, whereas All does not.
Example: overlapping search
This example shows the typical use of MatchKind::All, which is to report overlapping matches.
use regex_automata::{
    hybrid::{dfa::DFA, OverlappingState},
    HalfMatch, MatchKind,
};
let dfa = DFA::builder()
    .configure(DFA::config().match_kind(MatchKind::All))
    .build_many(&[r"\w+$", r"\S+$"])?;
let mut cache = dfa.create_cache();
let haystack = "@foo".as_bytes();
let mut state = OverlappingState::start();
let expected = Some(HalfMatch::must(1, 4));
let got = dfa.find_overlapping_fwd(&mut cache, haystack, &mut state)?;
assert_eq!(expected, got);
// The first pattern also matches at the same position, so re-running
// the search will yield another match. Notice also that the first
// pattern is returned after the second. This is because the second
// pattern begins its match before the first, is therefore an earlier
// match and is thus reported first.
let expected = Some(HalfMatch::must(0, 4));
let got = dfa.find_overlapping_fwd(&mut cache, haystack, &mut state)?;
assert_eq!(expected, got);
Example: reverse automaton to find start of match
Another example for using MatchKind::All is for constructing a reverse automaton to find the start of a match. All semantics are used for this in order to find the longest possible match, which corresponds to the leftmost starting position.
Note that if you need the starting position then hybrid::regex::Regex will handle this for you, so it’s usually not necessary to do this yourself.
use regex_automata::{hybrid::dfa::DFA, HalfMatch, MatchKind};
let haystack = "123foobar456".as_bytes();
let pattern = r"[a-z]+";
let dfa_fwd = DFA::new(pattern)?;
let dfa_rev = DFA::builder()
    .configure(DFA::config()
        .anchored(true)
        .match_kind(MatchKind::All)
    )
    .build(pattern)?;
let mut cache_fwd = dfa_fwd.create_cache();
let mut cache_rev = dfa_rev.create_cache();
let expected_fwd = HalfMatch::must(0, 9);
let expected_rev = HalfMatch::must(0, 3);
let got_fwd = dfa_fwd.find_leftmost_fwd(
    &mut cache_fwd, haystack,
)?.unwrap();
// Here we don't specify the pattern to search for since there's only
// one pattern and we're doing a leftmost search. But if this were an
// overlapping search, you'd need to specify the pattern that matched
// in the forward direction. (Otherwise, you might wind up finding the
// starting position of a match of some other pattern.) That in turn
// requires building the reverse automaton with starts_for_each_pattern
// enabled. Indeed, this is what Regex does internally.
let got_rev = dfa_rev.find_leftmost_rev_at(
    &mut cache_rev, None, haystack, 0, got_fwd.offset(),
)?.unwrap();
assert_eq!(expected_fwd, got_fwd);
assert_eq!(expected_rev, got_rev);
pub fn starts_for_each_pattern(self, yes: bool) -> Config
Whether to compile a separate start state for each pattern in the lazy DFA.
When enabled, a separate anchored start state is added for each pattern in the lazy DFA. When this start state is used, then the DFA will only search for matches for the pattern specified, even if there are other patterns in the DFA.
The main downside of this option is that it can potentially increase the size of the DFA and/or increase the time it takes to build the DFA at search time. However, since this is configuration for a lazy DFA, these states aren’t actually built unless they’re used. Enabling this isn’t necessarily free, however, as it may result in higher cache usage.
There are a few reasons one might want to enable this (it’s disabled by default):
- When looking for the start of an overlapping match (using a reverse DFA), doing it correctly requires starting the reverse search using the starting state of the pattern that matched in the forward direction. Indeed, when building a Regex, it will automatically enable this option when building the reverse DFA internally.
- When you want to use a DFA with multiple patterns both to search for matches of any pattern and to search for anchored matches of one particular pattern while using the same DFA. (Otherwise, you would need to compile a new DFA for each pattern.)
- Since the start states added for each pattern are anchored, if you compile an unanchored DFA with one pattern while also enabling this option, then you can use the same DFA to perform anchored or unanchored searches. The latter you get with the standard search APIs. The former you get from the various _at search methods that allow you to specify a pattern ID to search for.
By default this is disabled.
Example
This example shows how to use this option to permit the same lazy DFA to run both anchored and unanchored searches for a single pattern.
use regex_automata::{hybrid::dfa::DFA, HalfMatch, PatternID};
let dfa = DFA::builder()
    .configure(DFA::config().starts_for_each_pattern(true))
    .build(r"foo[0-9]+")?;
let mut cache = dfa.create_cache();
let haystack = b"quux foo123";
// Here's a normal unanchored search. Notice that we use 'None' for the
// pattern ID. Since the DFA was built as an unanchored machine, it
// uses its default unanchored starting state.
let expected = HalfMatch::must(0, 11);
assert_eq!(Some(expected), dfa.find_leftmost_fwd_at(
    &mut cache, None, None, haystack, 0, haystack.len(),
)?);
// But now if we explicitly specify the pattern to search ('0' being
// the only pattern in the DFA), then it will use the starting state
// for that specific pattern which is always anchored. Since the
// pattern doesn't have a match at the beginning of the haystack, we
// find nothing.
assert_eq!(None, dfa.find_leftmost_fwd_at(
    &mut cache, None, Some(PatternID::must(0)), haystack, 0, haystack.len(),
)?);
// And finally, an anchored search is not the same as putting a '^' at
// the beginning of the pattern. An anchored search can only match at the
// beginning of the *search*, which we can change:
assert_eq!(Some(expected), dfa.find_leftmost_fwd_at(
    &mut cache, None, Some(PatternID::must(0)), haystack, 5, haystack.len(),
)?);
pub fn byte_classes(self, yes: bool) -> Config
Whether to attempt to shrink the size of the lazy DFA’s alphabet or not.
This option is enabled by default and should never be disabled unless one is debugging the lazy DFA.
When enabled, the lazy DFA will use a map from all possible bytes to their corresponding equivalence class. Each equivalence class represents a set of bytes that does not discriminate between a match and a non-match in the DFA. For example, the pattern [ab]+ has at least two equivalence classes: a set containing a and b and a set containing every byte except for a and b. a and b are in the same equivalence class because they never discriminate between a match and a non-match.
The advantage of this map is that the size of the transition table can be reduced drastically from #states * 256 * sizeof(LazyStateID) to #states * k * sizeof(LazyStateID) where k is the number of equivalence classes (rounded up to the nearest power of 2). As a result, total space usage can decrease substantially. Moreover, since a smaller alphabet is used, DFA compilation during search becomes faster as well since it will potentially be able to reuse a single transition for multiple bytes.
WARNING: This is only useful for debugging lazy DFAs. Disabling this does not yield any speed advantages. Namely, even when this is disabled, a byte class map is still used while searching. The only difference is that every byte will be forced into its own distinct equivalence class. This is useful for debugging the actual generated transitions because it lets one see the transitions defined on actual bytes instead of the equivalence classes.
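To make the size argument concrete, here is a minimal sketch of a byte class map and the resulting table size. It is an illustration only and does not reflect how the crate computes its classes: byte_classes and table_bytes are hypothetical helpers, the pattern is reduced to a list of byte sets, and a 4-byte state ID is an assumption made for the arithmetic.

```rust
/// Hypothetical helper: two bytes fall in the same class when they belong
/// to exactly the same byte sets of the pattern. Returns the byte-to-class
/// map and the number of classes `k`.
pub fn byte_classes(sets: &[&[u8]]) -> ([u8; 256], usize) {
    assert!(sets.len() <= 64, "this sketch tracks at most 64 sets");
    let mut signatures: Vec<u64> = Vec::new();
    let mut map = [0u8; 256];
    for byte in 0..=255u8 {
        // One bit per set that contains this byte.
        let mut signature = 0u64;
        for (i, set) in sets.iter().enumerate() {
            if set.contains(&byte) {
                signature |= 1u64 << i;
            }
        }
        // Reuse the class of an earlier byte with the same signature,
        // or allocate a new class.
        let class = match signatures.iter().position(|&s| s == signature) {
            Some(class) => class,
            None => {
                signatures.push(signature);
                signatures.len() - 1
            }
        };
        map[byte as usize] = class as u8;
    }
    (map, signatures.len())
}

/// Transition table size in bytes, with the alphabet length rounded up to
/// the nearest power of 2 as described above.
pub fn table_bytes(states: usize, alphabet_len: usize, id_size: usize) -> usize {
    states * alphabet_len.next_power_of_two() * id_size
}
```

For [ab]+ this sketch yields k = 2. With 10 states and the assumed 4-byte IDs, the table shrinks from 10 * 256 * 4 = 10240 bytes to 10 * 2 * 4 = 80 bytes.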
pub fn unicode_word_boundary(self, yes: bool) -> Config
Heuristically enable Unicode word boundaries.
When set, this will attempt to implement Unicode word boundaries as if they were ASCII word boundaries. This only works when the search input is ASCII only. If a non-ASCII byte is observed while searching, then a MatchError::Quit error is returned.
A possible alternative to enabling this option is to simply use an ASCII word boundary, e.g., via (?-u:\b). The main reason to use this option is if you absolutely need Unicode support. This option lets one use a fast search implementation (a DFA) for some potentially very common cases, while providing the option to fall back to some other regex engine to handle the general case when an error is returned.
If the pattern provided has no Unicode word boundary in it, then this option has no effect. (That is, quitting on a non-ASCII byte only occurs when this option is enabled and a Unicode word boundary is present in the pattern.)
This is almost equivalent to setting all non-ASCII bytes to be quit bytes. The only difference is that this will cause non-ASCII bytes to be quit bytes only when a Unicode word boundary is present in the pattern.
When enabling this option, callers must be prepared to handle a MatchError error during search. When using a Regex, this corresponds to using the try_ suite of methods. Alternatively, if callers can guarantee that their input is ASCII only, then a MatchError::Quit error will never be returned while searching.
This is disabled by default.
Example
This example shows how to heuristically enable Unicode word boundaries in a pattern. It also shows what happens when a search comes across a non-ASCII byte.
use regex_automata::{
    hybrid::dfa::DFA,
    HalfMatch, MatchError, MatchKind,
};
let dfa = DFA::builder()
    .configure(DFA::config().unicode_word_boundary(true))
    .build(r"\b[0-9]+\b")?;
let mut cache = dfa.create_cache();
// The match occurs before the search ever observes the snowman
// character, so no error occurs.
let haystack = "foo 123 ☃".as_bytes();
let expected = Some(HalfMatch::must(0, 7));
let got = dfa.find_leftmost_fwd(&mut cache, haystack)?;
assert_eq!(expected, got);
// Notice that this search fails, even though the snowman character
// occurs after the ending match offset. This is because search
// routines read one byte past the end of the search to account for
// look-around, and indeed, this is required here to determine whether
// the trailing \b matches.
let haystack = "foo 123☃".as_bytes();
let expected = MatchError::Quit { byte: 0xE2, offset: 7 };
let got = dfa.find_leftmost_fwd(&mut cache, haystack);
assert_eq!(Err(expected), got);
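The fallback strategy mentioned above (try the fast path, and hand off to a slower engine when the search quits) can be sketched with the standard library alone. Everything here is hypothetical and only models the control flow: Quit mirrors the shape of MatchError::Quit, fast_ascii stands in for the lazy DFA searching \b[0-9]+\b with ASCII word boundaries, and slow_unicode stands in for a fallback engine, approximating Unicode \w with char::is_alphanumeric.

```rust
/// Hypothetical error type mirroring the shape of `MatchError::Quit`.
#[derive(Debug, PartialEq, Eq)]
pub struct Quit {
    pub byte: u8,
    pub offset: usize,
}

fn is_ascii_word(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

// Approximation of Unicode `\w` using only the standard library.
fn is_unicode_word(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Fast path: end offset of the first `\b[0-9]+\b` match, treating word
/// boundaries as ASCII and quitting on the first non-ASCII byte observed.
pub fn fast_ascii(haystack: &[u8]) -> Result<Option<usize>, Quit> {
    let mut i = 0;
    while i < haystack.len() {
        let b = haystack[i];
        if !b.is_ascii() {
            return Err(Quit { byte: b, offset: i });
        }
        if b.is_ascii_digit() && (i == 0 || !is_ascii_word(haystack[i - 1])) {
            let mut j = i;
            while j < haystack.len() && haystack[j].is_ascii_digit() {
                j += 1;
            }
            // One byte past the run must be inspected for the trailing `\b`.
            match haystack.get(j) {
                None => return Ok(Some(j)),
                Some(&next) if !next.is_ascii() => {
                    return Err(Quit { byte: next, offset: j });
                }
                Some(&next) if !is_ascii_word(next) => return Ok(Some(j)),
                // A word byte follows, so the trailing `\b` fails here.
                Some(_) => {
                    i = j;
                    continue;
                }
            }
        }
        i += 1;
    }
    Ok(None)
}

/// Slow path: the same search, but with Unicode-aware word boundaries.
pub fn slow_unicode(haystack: &str) -> Option<usize> {
    let mut prev: Option<char> = None;
    let mut chars = haystack.char_indices().peekable();
    while let Some((start, c)) = chars.next() {
        if c.is_ascii_digit() && !prev.map_or(false, is_unicode_word) {
            let mut end = start + 1;
            while let Some(&(at, d)) = chars.peek() {
                if !d.is_ascii_digit() {
                    break;
                }
                end = at + 1;
                chars.next();
            }
            if !chars.peek().map_or(false, |&(_, next)| is_unicode_word(next)) {
                return Some(end);
            }
        }
        // If a digit run was consumed above, `c` (a digit) still stands in
        // for the last consumed character: both are word characters.
        prev = Some(c);
    }
    None
}

/// Try the fast path first and fall back only when it quits.
pub fn search(haystack: &str) -> Option<usize> {
    match fast_ascii(haystack.as_bytes()) {
        Ok(result) => result,
        Err(_quit) => slow_unicode(haystack),
    }
}
```

In this model, fast_ascii quits on "foo 123☃" at offset 7 with byte 0xE2, just like the example above, while search recovers by rerunning the haystack through the slower Unicode-aware path.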
pub fn quit(self, byte: u8, yes: bool) -> Config
Add a “quit” byte to the lazy DFA.
When a quit byte is seen during search time, then search will return a MatchError::Quit error indicating the offset at which the search stopped.
A quit byte will always overrule any other aspects of a regex. For example, if the x byte is added as a quit byte and the regex \w is used, then observing x will cause the search to quit immediately despite the fact that x is in the \w class.
This mechanism is primarily useful for heuristically enabling certain features like Unicode word boundaries in a DFA. Namely, if the input to search is ASCII, then a Unicode word boundary can be implemented via an ASCII word boundary with no change in semantics. Thus, a DFA can attempt to match a Unicode word boundary but give up as soon as it observes a non-ASCII byte. Indeed, if callers set all non-ASCII bytes to be quit bytes, then Unicode word boundaries will be permitted when building lazy DFAs. Of course, callers should enable Config::unicode_word_boundary if they want this behavior instead. (The advantage being that non-ASCII quit bytes will only be added if a Unicode word boundary is in the pattern.)
When enabling this option, callers must be prepared to handle a MatchError error during search. When using a Regex, this corresponds to using the try_ suite of methods.
By default, there are no quit bytes set.
Panics
This panics if heuristic Unicode word boundaries are enabled and any non-ASCII byte is removed from the set of quit bytes. Namely, enabling Unicode word boundaries requires setting every non-ASCII byte to a quit byte. So if the caller attempts to undo any of that, then this will panic.
Example
This example shows how to cause a search to terminate if it sees a \n byte. This could be useful if, for example, you wanted to prevent a user supplied pattern from matching across a line boundary.
use regex_automata::{hybrid::dfa::DFA, HalfMatch, MatchError};
let dfa = DFA::builder()
    .configure(DFA::config().quit(b'\n', true))
    .build(r"foo\p{any}+bar")?;
let mut cache = dfa.create_cache();
let haystack = "foo\nbar".as_bytes();
// Normally this would produce a match, since \p{any} contains '\n'.
// But since we instructed the automaton to enter a quit state if a
// '\n' is observed, this produces a match error instead.
let expected = MatchError::Quit { byte: 0x0A, offset: 3 };
let got = dfa.find_leftmost_fwd(&mut cache, haystack).unwrap_err();
assert_eq!(expected, got);
pub fn cache_capacity(self, bytes: usize) -> Config
Sets the maximum amount of heap memory, in bytes, to allocate to the cache for use during a lazy DFA search. If the lazy DFA would otherwise use more heap memory, then, depending on other configuration knobs, the search will either stop and return an error, or clear the cache and continue.
The default cache capacity is some “reasonable” number that will accommodate most regular expressions. You may find that if you need to build a large DFA then it may be necessary to increase the cache capacity.
Note that while building a lazy DFA will do a “minimum” check to ensure the capacity is big enough, this is more or less about correctness. If the cache is bigger than the minimum but still too small, then the lazy DFA could wind up spending a lot of time clearing the cache and recomputing transitions, thus negating the performance benefits of a lazy DFA. Thus, setting the cache capacity is mostly an experimental endeavor. For most common patterns, however, the default should be sufficient.
For more details on how the lazy DFA’s cache is used, see the documentation for Cache.
Example
This example shows what happens if the configured cache capacity is too small. In such cases, one can override the cache capacity to make it bigger. Alternatively, one might want to use less memory by setting a smaller cache capacity.
use regex_automata::{hybrid::dfa::DFA, HalfMatch, MatchError};
let pattern = r"\p{L}{1000}";
// The default cache capacity is likely too small to deal with regexes
// that are very large. Large repetitions of large Unicode character
// classes are a common way to make very large regexes.
let _ = DFA::new(pattern).unwrap_err();
// Bump up the capacity to something bigger.
let dfa = DFA::builder()
    .configure(DFA::config().cache_capacity(100 * (1<<20))) // 100 MB
    .build(pattern)?;
let mut cache = dfa.create_cache();
let haystack = "ͰͲͶͿΆΈΉΊΌΎΏΑΒΓΔΕΖΗΘΙ".repeat(50);
let expected = Some(HalfMatch::must(0, 2000));
let got = dfa.find_leftmost_fwd(&mut cache, haystack.as_bytes())?;
assert_eq!(expected, got);
pub fn skip_cache_capacity_check(self, yes: bool) -> Config
Configures construction of a lazy DFA to use the minimum cache capacity if the configured capacity is otherwise too small for the provided NFA.
This is useful if you never want lazy DFA construction to fail because of a capacity that is too small.
In general, this option is typically not a good idea. In particular, while a minimum cache capacity does permit the lazy DFA to function where it otherwise couldn’t, it’s plausible that it may not function well if it’s constantly running out of room. In that case, the speed advantages of the lazy DFA may be negated.
This is disabled by default.
Example
This example shows what happens if the configured cache capacity is too small. In such cases, one could override the capacity explicitly. An alternative, demonstrated here, lets us force construction to use the minimum cache capacity if the configured capacity is otherwise too small.
use regex_automata::{hybrid::dfa::DFA, HalfMatch, MatchError};
let pattern = r"\p{L}{1000}";
// The default cache capacity is likely too small to deal with regexes
// that are very large. Large repetitions of large Unicode character
// classes are a common way to make very large regexes.
let _ = DFA::new(pattern).unwrap_err();
// Configure construction such that it automatically selects the minimum
// cache capacity if it would otherwise be too small.
let dfa = DFA::builder()
    .configure(DFA::config().skip_cache_capacity_check(true))
    .build(pattern)?;
let mut cache = dfa.create_cache();
let haystack = "ͰͲͶͿΆΈΉΊΌΎΏΑΒΓΔΕΖΗΘΙ".repeat(50);
let expected = Some(HalfMatch::must(0, 2000));
let got = dfa.find_leftmost_fwd(&mut cache, haystack.as_bytes())?;
assert_eq!(expected, got);
pub fn minimum_cache_clear_count(self, min: Option<usize>) -> Config
Configure a lazy DFA search to quit after a certain number of cache clearings.
When a minimum is set, then a lazy DFA search will “give up” after the minimum number of cache clearings has occurred. This is typically useful in scenarios where callers want to detect whether the lazy DFA search is “efficient” or not. If the cache is cleared too many times, this is a good indicator that it is not efficient, and thus, the caller may wish to use some other regex engine.
Note that the number of times a cache is cleared is a property of the cache itself. Thus, if a cache is used in a subsequent search with a similarly configured lazy DFA, then it would cause the search to “give up” if the cache needed to be cleared. The cache clear count can only be reset to 0 via DFA::reset_cache (or Regex::reset_cache if you’re using the Regex API).
By default, no minimum is configured. Thus, a lazy DFA search will never give up due to cache clearings.
Example
This example uses a somewhat pathological configuration to demonstrate the possible behavior of cache clearing and how it might result in a search that returns an error.
It is important to note that the precise mechanics of how and when a cache gets cleared is an implementation detail. Thus, the asserts in the tests below with respect to the particular offsets at which a search gave up should be viewed strictly as a demonstration. They are not part of any API guarantees offered by this crate.
use regex_automata::{hybrid::dfa::DFA, MatchError};
// This is a carefully chosen regex. The idea is to pick one
// that requires some decent number of states (hence the bounded
// repetition). But we specifically choose to create a class with an
// ASCII letter and a non-ASCII letter so that we can check that no new
// states are created once the cache is full. Namely, if we fill up the
// cache on a haystack of 'a's, then in order to match one 'β', a new
// state will need to be created since a 'β' is encoded with multiple
// bytes. Since there's no room for this state, the search should quit
// at the very first position.
let pattern = r"[aβ]{100}";
let dfa = DFA::builder()
    .configure(
        // Configure it so that we have the minimum cache capacity
        // possible. And that if any clearings occur, the search quits.
        DFA::config()
            .skip_cache_capacity_check(true)
            .cache_capacity(0)
            .minimum_cache_clear_count(Some(0)),
    )
    .build(pattern)?;
let mut cache = dfa.create_cache();
let haystack = "a".repeat(101).into_bytes();
assert_eq!(
    dfa.find_leftmost_fwd(&mut cache, &haystack),
    Err(MatchError::GaveUp { offset: 25 }),
);
// Now that we know the cache is full, if we search a haystack that we
// know will require creating at least one new state, it should not
// be able to make any progress.
let haystack = "β".repeat(101).into_bytes();
assert_eq!(
    dfa.find_leftmost_fwd(&mut cache, &haystack),
    Err(MatchError::GaveUp { offset: 0 }),
);
// If we reset the cache, then we should be able to create more states
// and make more progress with searching for betas.
cache.reset(&dfa);
let haystack = "β".repeat(101).into_bytes();
assert_eq!(
    dfa.find_earliest_fwd(&mut cache, &haystack),
    Err(MatchError::GaveUp { offset: 26 }),
);
// ... switching back to ASCII still makes progress since it just needs
// to set transitions on existing states!
let haystack = "a".repeat(101).into_bytes();
assert_eq!(
    dfa.find_earliest_fwd(&mut cache, &haystack),
    Err(MatchError::GaveUp { offset: 13 }),
);
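The point that the clear count belongs to the cache, and therefore carries over between searches until the cache is reset, can be illustrated with a toy model. This is a sketch under stated assumptions and not the crate's implementation: ToyCache, GaveUp and scan are hypothetical, the "states" being cached are just distinct bytes, and the give-up rule (quit when a clearing is needed and the clear count has already reached the minimum) is an assumption chosen to agree with the Some(0) behavior in the example above.

```rust
use std::collections::HashSet;

/// Hypothetical error mirroring the shape of `MatchError::GaveUp`.
#[derive(Debug, PartialEq, Eq)]
pub struct GaveUp {
    pub offset: usize,
}

/// Toy stand-in for the lazy DFA cache: it remembers which bytes were seen.
pub struct ToyCache {
    seen: HashSet<u8>,
    capacity: usize,
    clear_count: usize,
}

impl ToyCache {
    pub fn new(capacity: usize) -> ToyCache {
        ToyCache { seen: HashSet::new(), capacity, clear_count: 0 }
    }

    /// Empties the cache and resets the clear count to 0.
    pub fn reset(&mut self) {
        self.seen.clear();
        self.clear_count = 0;
    }

    pub fn clear_count(&self) -> usize {
        self.clear_count
    }
}

/// Scans `haystack`, caching each distinct byte, and returns the number of
/// cache misses. When the cache is full it is cleared, unless the clear
/// count has already reached `min_clear_count`, in which case the scan
/// gives up at the current offset.
pub fn scan(
    cache: &mut ToyCache,
    min_clear_count: Option<usize>,
    haystack: &[u8],
) -> Result<usize, GaveUp> {
    let mut misses = 0;
    for (offset, &byte) in haystack.iter().enumerate() {
        if cache.seen.contains(&byte) {
            continue;
        }
        if cache.seen.len() >= cache.capacity {
            if let Some(min) = min_clear_count {
                if cache.clear_count >= min {
                    return Err(GaveUp { offset });
                }
            }
            cache.seen.clear();
            cache.clear_count += 1;
        }
        cache.seen.insert(byte);
        misses += 1;
    }
    Ok(misses)
}
```

Because clear_count lives in ToyCache, a scan that triggered one clearing makes a later scan with the same cache give up as soon as another clearing is needed. Only reset brings the count back to 0.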
pub fn get_anchored(&self) -> bool
Returns whether this configuration has enabled anchored searches.
pub fn get_match_kind(&self) -> MatchKind
Returns the match semantics set in this configuration.
pub fn get_starts_for_each_pattern(&self) -> bool
Returns whether this configuration has enabled anchored starting states for every pattern in the DFA.
pub fn get_byte_classes(&self) -> bool
Returns whether this configuration has enabled byte classes or not. This is typically a debugging oriented option, as disabling it confers no speed benefit.
pub fn get_unicode_word_boundary(&self) -> bool
Returns whether this configuration has enabled heuristic Unicode word boundary support. When enabled, it is possible for a search to return an error.
pub fn get_quit(&self, byte: u8) -> bool
Returns whether this configuration will instruct the DFA to enter a quit state whenever the given byte is seen during a search. When at least one byte has this enabled, it is possible for a search to return an error.
pub fn get_cache_capacity(&self) -> usize
Returns the cache capacity set on this configuration.
pub fn get_skip_cache_capacity_check(&self) -> bool
Returns whether the cache capacity check should be skipped.
pub fn get_minimum_cache_clear_count(&self) -> Option<usize>
Returns, if set, the minimum number of times the cache must be cleared before a lazy DFA search can give up. When no minimum is set, then a search will never quit and will always clear the cache whenever it fills up.
pub fn get_minimum_cache_capacity(&self, nfa: &NFA) -> Result<usize, BuildError>
Returns the minimum lazy DFA cache capacity required for the given NFA.
The cache capacity required for a particular NFA may change without notice. Callers should not rely on it being stable.
This is useful for informational purposes, but it can also be useful in other ways: for example, when one wants to check the minimum cache capacity oneself, or to set the capacity based on the minimum.
This may return an error if this configuration does not support all of the instructions used in the given NFA. For example, this happens if the NFA has a Unicode word boundary but this configuration does not enable heuristic support for Unicode word boundaries.
Trait Implementations
impl Copy for Config
Auto Trait Implementations
impl RefUnwindSafe for Config
impl Send for Config
impl Sync for Config
impl Unpin for Config
impl UnwindSafe for Config
Blanket Implementations
impl<T> BorrowMut<T> for T where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> ToOwned for T where
    T: Clone,
type Owned = T
The resulting type after obtaining ownership.
fn clone_into(&self, target: &mut T)
Uses borrowed data to replace owned data, usually by cloning.