pub struct Regex { /* private fields */ }

A regular expression that uses hybrid NFA/DFAs (also called “lazy DFAs”) for searching.

A regular expression is composed of two lazy DFAs, a “forward” DFA and a “reverse” DFA. The forward DFA is responsible for detecting the end of a match while the reverse DFA is responsible for detecting the start of a match. Thus, in order to find the bounds of any given match, a forward search must first be run followed by a reverse search. A match found by the forward DFA guarantees that the reverse DFA will also find a match.
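The two-pass idea can be illustrated without the library. The following is a minimal sketch, assuming a hard-coded pattern `[0-9]+` in place of real lazy DFAs. The names `forward_end`, `reverse_start` and `find` are hypothetical and are not part of this crate.

```rust
/// Forward pass (toy): scan left to right and return the END offset of
/// the leftmost run of ASCII digits, i.e. the leftmost match of `[0-9]+`.
fn forward_end(haystack: &[u8]) -> Option<usize> {
    let mut end = None;
    for (i, &b) in haystack.iter().enumerate() {
        if b.is_ascii_digit() {
            // Still inside a match; remember the furthest end seen so far.
            end = Some(i + 1);
        } else if end.is_some() {
            // The run of digits is over, so the leftmost match has ended.
            break;
        }
    }
    end
}

/// Reverse pass (toy): anchored at `end`, scan right to left and return
/// the START offset of the match that ends at `end`.
fn reverse_start(haystack: &[u8], end: usize) -> usize {
    let mut start = end;
    while start > 0 && haystack[start - 1].is_ascii_digit() {
        start -= 1;
    }
    start
}

/// Combine both passes to get the full (start, end) bounds of a match.
fn find(haystack: &[u8]) -> Option<(usize, usize)> {
    let end = forward_end(haystack)?;
    Some((reverse_start(haystack, end), end))
}
```

For the haystack `ab123cd`, the forward pass stops at offset 5 and the reverse pass walks back to offset 2, so `find` reports the bounds `(2, 5)`. Because the reverse pass only runs after the forward pass has found an end, it always has a match to find.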

A Regex can also have a prefilter set via the set_prefilter method. By default, no prefilter is enabled.

Earliest vs Leftmost vs Overlapping

The search routines exposed on a Regex reflect three different ways of searching:

  • “earliest” means to stop as soon as a match has been detected.
  • “leftmost” means to continue matching until the underlying automaton cannot advance. This reflects the “standard” searching you might be used to in other regex engines. For example, this permits non-greedy and greedy searching to work as you would expect.
  • “overlapping” means to find all possible matches, even if they overlap.

Generally speaking, when doing an overlapping search, you’ll want to build your regex lazy DFAs with MatchKind::All semantics. Using MatchKind::LeftmostFirst semantics with overlapping searches is likely to lead to odd behavior since LeftmostFirst specifically omits some matches that can never be reported due to its semantics.

The following example shows how these three types of search differ when looking for matches of [a-z]+ in the haystack abc.

use regex_automata::{hybrid::{dfa, regex}, MatchKind, MultiMatch};

let pattern = r"[a-z]+";
let haystack = "abc".as_bytes();

// With leftmost-first semantics, we test "earliest" and "leftmost".
let re = regex::Builder::new()
    .dfa(dfa::Config::new().match_kind(MatchKind::LeftmostFirst))
    .build(pattern)?;
let mut cache = re.create_cache();

// "earliest" searching isn't impacted by greediness
let mut it = re.find_earliest_iter(&mut cache, haystack);
assert_eq!(Some(MultiMatch::must(0, 0, 1)), it.next());
assert_eq!(Some(MultiMatch::must(0, 1, 2)), it.next());
assert_eq!(Some(MultiMatch::must(0, 2, 3)), it.next());
assert_eq!(None, it.next());

// "leftmost" searching supports greediness (and non-greediness)
let mut it = re.find_leftmost_iter(&mut cache, haystack);
assert_eq!(Some(MultiMatch::must(0, 0, 3)), it.next());
assert_eq!(None, it.next());

// For overlapping, we want "all" match kind semantics.
let re = regex::Builder::new()
    .dfa(dfa::Config::new().match_kind(MatchKind::All))
    .build(pattern)?;
let mut cache = re.create_cache();

// In the overlapping search, we find all three possible matches
// starting at the beginning of the haystack.
let mut it = re.find_overlapping_iter(&mut cache, haystack);
assert_eq!(Some(MultiMatch::must(0, 0, 1)), it.next());
assert_eq!(Some(MultiMatch::must(0, 0, 2)), it.next());
assert_eq!(Some(MultiMatch::must(0, 0, 3)), it.next());
assert_eq!(None, it.next());

Fallibility

In non-default configurations, the lazy DFAs generated in this module may return an error during a search. (Currently, this only happens if quit bytes are added, if Unicode word boundaries are heuristically enabled, or if the cache is configured to “give up” on a search once it has been cleared too many times. All of these are turned off by default, which means a search can never fail in the default configuration.) For convenience, the main search routines, like find_leftmost, will panic if an error occurs. However, if you need to use DFAs which may produce an error at search time, then there are fallible equivalents of all search routines. For example, the fallible analog of find_leftmost is try_find_leftmost. The routines prefixed with try_ return Result<Option<MultiMatch>, MatchError>, whereas the infallible routines simply return Option<MultiMatch>.

Example

This example shows how to cause a search to terminate if it sees a \n byte, and handle the error returned. This could be useful if, for example, you wanted to prevent a user supplied pattern from matching across a line boundary.

use regex_automata::{hybrid::{dfa, regex::Regex}, MatchError};

let re = Regex::builder()
    .dfa(dfa::Config::new().quit(b'\n', true))
    .build(r"foo\p{any}+bar")?;
let mut cache = re.create_cache();

let haystack = "foo\nbar".as_bytes();
// Normally this would produce a match, since \p{any} contains '\n'.
// But since we instructed the automaton to enter a quit state if a
// '\n' is observed, this produces a match error instead.
let expected = MatchError::Quit { byte: 0x0A, offset: 3 };
let got = re.try_find_leftmost(&mut cache, haystack).unwrap_err();
assert_eq!(expected, got);
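The relationship between the `try_` routines and their panicking counterparts can be sketched with the standard library alone. This is a toy model, not the crate's implementation: `ToyError`, `try_find_byte`, `find_byte` and `describe` are hypothetical names, and the "search" is a plain byte scan with a single quit byte.

```rust
/// Hypothetical error type, loosely modeled on a "quit" error.
#[derive(Debug, PartialEq)]
enum ToyError {
    Quit { byte: u8, offset: usize },
}

/// Fallible search: find the first `needle` byte, but fail if the
/// `quit` byte is observed before a match is found.
fn try_find_byte(haystack: &[u8], needle: u8, quit: u8) -> Result<Option<usize>, ToyError> {
    for (offset, &byte) in haystack.iter().enumerate() {
        if byte == quit {
            return Err(ToyError::Quit { byte, offset });
        }
        if byte == needle {
            return Ok(Some(offset));
        }
    }
    Ok(None)
}

/// Infallible wrapper: the same search, but it panics on error.
fn find_byte(haystack: &[u8], needle: u8, quit: u8) -> Option<usize> {
    try_find_byte(haystack, needle, quit).expect("search failed")
}

/// Callers of the fallible form distinguish three outcomes.
fn describe(haystack: &[u8]) -> String {
    match try_find_byte(haystack, b'b', b'\n') {
        Ok(Some(at)) => format!("match at {}", at),
        Ok(None) => String::from("no match"),
        Err(ToyError::Quit { byte, offset }) => {
            format!("gave up on byte {} at offset {}", byte, offset)
        }
    }
}
```

The point of the sketch is the shape of the return types: an error means the search could not complete, which is different from completing and finding nothing.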

Implementations

Convenience routines for regex and cache construction.

Parse the given regular expression using the default configuration and return the corresponding regex.

If you want a non-default configuration, then use the Builder to set your own configuration.

Example
use regex_automata::{MultiMatch, hybrid::regex::Regex};

let re = Regex::new("foo[0-9]+bar")?;
let mut cache = re.create_cache();
assert_eq!(
    Some(MultiMatch::must(0, 3, 14)),
    re.find_leftmost(&mut cache, b"zzzfoo12345barzzz"),
);

Like new, but parses multiple patterns into a single “regex set.” This similarly uses the default regex configuration.

Example
use regex_automata::{MultiMatch, hybrid::regex::Regex};

let re = Regex::new_many(&["[a-z]+", "[0-9]+"])?;
let mut cache = re.create_cache();

let mut it = re.find_leftmost_iter(
    &mut cache,
    b"abc 1 foo 4567 0 quux",
);
assert_eq!(Some(MultiMatch::must(0, 0, 3)), it.next());
assert_eq!(Some(MultiMatch::must(1, 4, 5)), it.next());
assert_eq!(Some(MultiMatch::must(0, 6, 9)), it.next());
assert_eq!(Some(MultiMatch::must(1, 10, 14)), it.next());
assert_eq!(Some(MultiMatch::must(1, 15, 16)), it.next());
assert_eq!(Some(MultiMatch::must(0, 17, 21)), it.next());
assert_eq!(None, it.next());

Return a default configuration for a Regex.

This is a convenience routine to avoid needing to import the Config type when customizing the construction of a regex.

Example

This example shows how to disable UTF-8 mode for Regex iteration. When UTF-8 mode is disabled, the position immediately following an empty match is where the next search begins, instead of the next position of a UTF-8 encoded codepoint.

use regex_automata::{hybrid::regex::Regex, MultiMatch};

let re = Regex::builder()
    .configure(Regex::config().utf8(false))
    .build(r"")?;
let mut cache = re.create_cache();

let haystack = "a☃z".as_bytes();
let mut it = re.find_leftmost_iter(&mut cache, haystack);
assert_eq!(Some(MultiMatch::must(0, 0, 0)), it.next());
assert_eq!(Some(MultiMatch::must(0, 1, 1)), it.next());
assert_eq!(Some(MultiMatch::must(0, 2, 2)), it.next());
assert_eq!(Some(MultiMatch::must(0, 3, 3)), it.next());
assert_eq!(Some(MultiMatch::must(0, 4, 4)), it.next());
assert_eq!(Some(MultiMatch::must(0, 5, 5)), it.next());
assert_eq!(None, it.next());
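To make the difference concrete, here is a small standard-library sketch of where an iterator over empty matches would restart in each mode. It is only a model of the stepping rule described above; `empty_match_offsets` is a hypothetical helper, not a function in this crate.

```rust
/// Returns the offsets at which an always-empty pattern would match.
///
/// With `utf8 == true`, the next search begins at the next UTF-8
/// codepoint boundary. With `utf8 == false`, it begins at the very
/// next byte, even if that byte is in the middle of a codepoint.
fn empty_match_offsets(haystack: &str, utf8: bool) -> Vec<usize> {
    let mut offsets = Vec::new();
    let mut at = 0;
    while at <= haystack.len() {
        offsets.push(at);
        // Always advance by at least one byte so iteration terminates.
        at += 1;
        if utf8 {
            // Skip ahead until we land on a codepoint boundary.
            while at < haystack.len() && !haystack.is_char_boundary(at) {
                at += 1;
            }
        }
    }
    offsets
}
```

For the haystack `a☃z` (five bytes, with the snowman occupying offsets 1 through 3), this model yields offsets 0, 1, 4 and 5 when `utf8` is true, and every offset from 0 through 5 when it is false.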

Return a builder for configuring the construction of a Regex.

This is a convenience routine to avoid needing to import the Builder type in common cases.

Example

This example shows how to use the builder to disable UTF-8 mode everywhere.

use regex_automata::{
    hybrid::regex::Regex,
    nfa::thompson,
    MultiMatch, SyntaxConfig,
};

let re = Regex::builder()
    .configure(Regex::config().utf8(false))
    .syntax(SyntaxConfig::new().utf8(false))
    .thompson(thompson::Config::new().utf8(false))
    .build(r"foo(?-u:[^b])ar.*")?;
let mut cache = re.create_cache();

let haystack = b"\xFEfoo\xFFarzz\xE2\x98\xFF\n";
let expected = Some(MultiMatch::must(0, 1, 9));
let got = re.find_leftmost(&mut cache, haystack);
assert_eq!(expected, got);

Create a new cache for this Regex.

The cache returned should only be used for searches for this Regex. If you want to reuse the cache for another Regex, then you must call Cache::reset with that Regex (or, equivalently, Regex::reset_cache).

Reset the given cache such that it can be used for searching with this Regex (and only this Regex).

A cache reset permits reusing memory already allocated in this cache with a different Regex.

Resetting a cache sets its “clear count” to 0. This is relevant if the Regex has been configured to “give up” after it has cleared the cache a certain number of times.
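The interaction between the clear count and a “give up” limit can be modeled roughly as follows. This is a sketch under stated assumptions: `ToyCache`, `GaveUp`, `add_state` and the `max_clears` parameter are hypothetical, and the real cache stores lazily built DFA states rather than plain integers.

```rust
/// Hypothetical error returned when the cache has been cleared too often.
#[derive(Debug, PartialEq)]
struct GaveUp;

/// Toy stand-in for a lazy DFA cache with a fixed capacity.
struct ToyCache {
    states: Vec<u32>,
    capacity: usize,
    clear_count: usize,
}

impl ToyCache {
    fn new(capacity: usize) -> ToyCache {
        ToyCache { states: Vec::with_capacity(capacity), capacity, clear_count: 0 }
    }

    /// Store a newly built state. If the cache is full, it is cleared
    /// first. When `max_clears` is set and the cache has already been
    /// cleared that many times, the search gives up instead.
    fn add_state(&mut self, state: u32, max_clears: Option<usize>) -> Result<(), GaveUp> {
        if self.states.len() == self.capacity {
            if let Some(max) = max_clears {
                if self.clear_count >= max {
                    return Err(GaveUp);
                }
            }
            self.states.clear();
            self.clear_count += 1;
        }
        self.states.push(state);
        Ok(())
    }

    /// Reuse the allocation for a different regex: drop all states and
    /// start counting clears from zero again.
    fn reset(&mut self) {
        self.states.clear();
        self.clear_count = 0;
    }
}
```

In this model a reset keeps the allocated memory but discards the states and the clear count, which is why a search that previously gave up may succeed after a reset.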

Example

This shows how to re-purpose a cache for use with a different Regex.

use regex_automata::{hybrid::regex::Regex, MultiMatch};

let re1 = Regex::new(r"\w")?;
let re2 = Regex::new(r"\W")?;

let mut cache = re1.create_cache();
assert_eq!(
    Some(MultiMatch::must(0, 0, 2)),
    re1.find_leftmost(&mut cache, "Δ".as_bytes()),
);

// Using 'cache' with re2 is not allowed. It may result in panics or
// incorrect results. In order to re-purpose the cache, we must reset
// it with the Regex we'd like to use it with.
//
// Similarly, after this reset, using the cache with 're1' is also not
// allowed.
re2.reset_cache(&mut cache);
assert_eq!(
    Some(MultiMatch::must(0, 0, 3)),
    re2.find_leftmost(&mut cache, "☃".as_bytes()),
);

Standard infallible search routines for finding and iterating over matches.

Returns true if and only if this regex matches the given haystack.

This routine may short circuit if it knows that scanning future input will never lead to a different result. In particular, if the underlying DFA enters a match state or a dead state, then this routine will return true or false, respectively, without inspecting any future input.
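The short-circuit behavior can be pictured with a hand-written automaton. The sketch below assumes a toy anchored pattern (haystacks that begin with `ab`) and reports how many bytes were inspected; `State`, `step` and `is_match_counting` are hypothetical names, not part of this crate.

```rust
/// States of a toy automaton that matches haystacks beginning with "ab".
#[derive(Clone, Copy)]
enum State {
    Start,
    SeenA,
    Match,
    Dead,
}

/// Transition function of the toy automaton.
fn step(state: State, byte: u8) -> State {
    match (state, byte) {
        (State::Start, b'a') => State::SeenA,
        (State::SeenA, b'b') => State::Match,
        (State::Match, _) => State::Match,
        // Any other transition can never lead to a match.
        _ => State::Dead,
    }
}

/// Returns whether the haystack matches, plus the number of bytes that
/// were inspected before the answer was known.
fn is_match_counting(haystack: &[u8]) -> (bool, usize) {
    let mut state = State::Start;
    for (i, &byte) in haystack.iter().enumerate() {
        state = step(state, byte);
        match state {
            // A match state: the answer is `true` no matter what follows.
            State::Match => return (true, i + 1),
            // A dead state: the answer is `false` no matter what follows.
            State::Dead => return (false, i + 1),
            _ => {}
        }
    }
    // Ran out of input without reaching a match state.
    (false, haystack.len())
}
```

On `abzzzz` the toy search answers `true` after two bytes, and on `xzzzz` it answers `false` after one byte, leaving the rest of the haystack unread in both cases.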

Panics

If the underlying lazy DFAs return an error, then this routine panics. This only occurs in non-default configurations where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

The fallible version of this routine is try_is_match.

Example
use regex_automata::hybrid::regex::Regex;

let re = Regex::new("foo[0-9]+bar")?;
let mut cache = re.create_cache();

assert_eq!(true, re.is_match(&mut cache, b"foo12345bar"));
assert_eq!(false, re.is_match(&mut cache, b"foobar"));

Returns the first position at which a match is found.

This routine stops scanning input in precisely the same circumstances as is_match. The key difference is that this routine returns the position at which it stopped scanning input if and only if a match was found. If no match is found, then None is returned.

Panics

If the underlying lazy DFAs return an error, then this routine panics. This only occurs in non-default configurations where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

The fallible version of this routine is try_find_earliest.

Example
use regex_automata::{MultiMatch, hybrid::regex::Regex};

// Normally, the leftmost first match would greedily consume as many
// decimal digits as it could. But a match is detected as soon as one
// digit is seen.
let re = Regex::new("foo[0-9]+")?;
let mut cache = re.create_cache();
assert_eq!(
    Some(MultiMatch::must(0, 0, 4)),
    re.find_earliest(&mut cache, b"foo12345"),
);

// Normally, the end of the leftmost first match here would be 3,
// but the "earliest" match semantics detect a match earlier.
let re = Regex::new("abc|a")?;
let mut cache = re.create_cache();
assert_eq!(
    Some(MultiMatch::must(0, 0, 1)),
    re.find_earliest(&mut cache, b"abc"),
);

Returns the start and end offset of the leftmost match. If no match exists, then None is returned.

Panics

If the underlying lazy DFAs return an error, then this routine panics. This only occurs in non-default configurations where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

The fallible version of this routine is try_find_leftmost.

Example
use regex_automata::{MultiMatch, hybrid::regex::Regex};

// Greediness is applied appropriately when compared to find_earliest.
let re = Regex::new("foo[0-9]+")?;
let mut cache = re.create_cache();
assert_eq!(
    Some(MultiMatch::must(0, 3, 11)),
    re.find_leftmost(&mut cache, b"zzzfoo12345zzz"),
);

// Even though a match is found after reading the first byte (`a`),
// the default leftmost-first match semantics demand that we find the
// earliest match that prefers earlier parts of the pattern over later
// parts.
let re = Regex::new("abc|a")?;
let mut cache = re.create_cache();
assert_eq!(
    Some(MultiMatch::must(0, 0, 3)),
    re.find_leftmost(&mut cache, b"abc"),
);

Search for the first overlapping match in haystack.

This routine is principally useful when searching for multiple patterns on inputs where multiple patterns may match the same regions of text. In particular, callers must preserve the automaton’s search state from prior calls so that the implementation knows where the last match occurred and which pattern was reported.
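Why the state must be threaded through repeated calls can be seen in a toy model. The sketch below assumes literal patterns and a made-up reporting order (by end offset, then by pattern index); it does not reproduce the order in which the real lazy DFA reports overlapping matches. `ToyOverlappingState` and `find_overlapping` here are hypothetical stand-ins, not this crate's types.

```rust
/// Toy search state: where the previous call stopped.
#[derive(Debug, Default)]
struct ToyOverlappingState {
    /// End offset currently being examined.
    end: usize,
    /// First pattern that has not yet been tried at `end`.
    next_pattern: usize,
}

/// Returns the next (pattern index, start, end) match, resuming from
/// `state`. Each call picks up exactly where the previous one left off.
fn find_overlapping(
    patterns: &[&str],
    haystack: &str,
    state: &mut ToyOverlappingState,
) -> Option<(usize, usize, usize)> {
    let bytes = haystack.as_bytes();
    while state.end <= bytes.len() {
        while state.next_pattern < patterns.len() {
            let id = state.next_pattern;
            // Record progress BEFORE returning so the next call resumes
            // with the following pattern at the same end offset.
            state.next_pattern += 1;
            let pat = patterns[id].as_bytes();
            if !pat.is_empty() && bytes[..state.end].ends_with(pat) {
                return Some((id, state.end - pat.len(), state.end));
            }
        }
        state.end += 1;
        state.next_pattern = 0;
    }
    None
}
```

If the caller created a fresh state for every call, the first match would be reported forever. Reusing one state lets successive calls report each match that ends at the same offset and then move on.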

Panics

If the underlying lazy DFAs return an error, then this routine panics. This only occurs in non-default configurations where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

The fallible version of this routine is try_find_overlapping.

Example

This example shows how to run an overlapping search with multiple regexes.

use regex_automata::{
    hybrid::{dfa::DFA, regex::Regex, OverlappingState},
    MatchKind,
    MultiMatch,
};

let re = Regex::builder()
    .dfa(DFA::config().match_kind(MatchKind::All))
    .build_many(&[r"\w+$", r"\S+$"])?;
let mut cache = re.create_cache();

let haystack = "@foo".as_bytes();
let mut state = OverlappingState::start();

let expected = Some(MultiMatch::must(1, 0, 4));
let got = re.find_overlapping(&mut cache, haystack, &mut state);
assert_eq!(expected, got);

// The first pattern also matches at the same position, so re-running
// the search will yield another match. Notice also that the first
// pattern is returned after the second. This is because the second
// pattern begins its match before the first, is therefore an earlier
// match and is thus reported first.
let expected = Some(MultiMatch::must(0, 1, 4));
let got = re.find_overlapping(&mut cache, haystack, &mut state);
assert_eq!(expected, got);

Returns an iterator over all non-overlapping “earliest” matches.

Match positions are reported as soon as a match is known to occur, even if the standard leftmost match would be longer.

Panics

If the underlying lazy DFAs return an error, then this routine panics. This only occurs in non-default configurations where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

The fallible version of this routine is try_find_earliest_iter.

Example

This example shows how to run an “earliest” iterator.

use regex_automata::{hybrid::regex::Regex, MultiMatch};

let re = Regex::new("[0-9]+")?;
let mut cache = re.create_cache();
let haystack = "123".as_bytes();

// Normally, a standard leftmost iterator would return a single
// match, but since "earliest" detects matches earlier, we get
// three matches.
let mut it = re.find_earliest_iter(&mut cache, haystack);
assert_eq!(Some(MultiMatch::must(0, 0, 1)), it.next());
assert_eq!(Some(MultiMatch::must(0, 1, 2)), it.next());
assert_eq!(Some(MultiMatch::must(0, 2, 3)), it.next());
assert_eq!(None, it.next());

Returns an iterator over all non-overlapping leftmost matches in the given bytes. If no match exists, then the iterator yields no elements.

This corresponds to the “standard” regex search iterator.

Panics

If the underlying lazy DFAs return an error, then this routine panics. This only occurs in non-default configurations where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

The fallible version of this routine is try_find_leftmost_iter.

Example
use regex_automata::{MultiMatch, hybrid::regex::Regex};

let re = Regex::new("foo[0-9]+")?;
let mut cache = re.create_cache();

let text = b"foo1 foo12 foo123";
let matches: Vec<MultiMatch> = re
    .find_leftmost_iter(&mut cache, text)
    .collect();
assert_eq!(matches, vec![
    MultiMatch::must(0, 0, 4),
    MultiMatch::must(0, 5, 10),
    MultiMatch::must(0, 11, 17),
]);

Returns an iterator over all overlapping matches in the given haystack.

This routine is principally useful when searching for multiple patterns on inputs where multiple patterns may match the same regions of text. The iterator takes care of handling the overlapping state that must be threaded through every search.

Panics

If the underlying lazy DFAs return an error, then this routine panics. This only occurs in non-default configurations where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

The fallible version of this routine is try_find_overlapping_iter.

Example

This example shows how to run an overlapping search with multiple regexes.

use regex_automata::{
    hybrid::{dfa::DFA, regex::Regex},
    MatchKind,
    MultiMatch,
};

let re = Regex::builder()
    .dfa(DFA::config().match_kind(MatchKind::All))
    .build_many(&[r"\w+$", r"\S+$"])?;
let mut cache = re.create_cache();
let haystack = "@foo".as_bytes();

let mut it = re.find_overlapping_iter(&mut cache, haystack);
assert_eq!(Some(MultiMatch::must(1, 0, 4)), it.next());
assert_eq!(Some(MultiMatch::must(0, 1, 4)), it.next());
assert_eq!(None, it.next());

Lower level infallible search routines that permit controlling where the search starts and ends in a particular sequence. This is useful for executing searches that need to take surrounding context into account. This is required for correctly implementing iteration because of look-around operators (^, $, \b).

Returns true if and only if this regex matches the given haystack.

This routine may short circuit if it knows that scanning future input will never lead to a different result. In particular, if the underlying DFA enters a match state or a dead state, then this routine will return true or false, respectively, without inspecting any future input.

Searching a substring of the haystack

Being an “at” search routine, this permits callers to search a substring of haystack by specifying a range in haystack. Why expose this as an API instead of just asking callers to use &haystack[start..end]? The reason is that regex matching often wants to take the surrounding context into account in order to handle look-around (^, $ and \b).
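The effect of losing surrounding context can be demonstrated with a small standard-library sketch. It assumes a toy search for the equivalent of `\bfoo` using ASCII word bytes only; `is_word_byte`, `is_word_boundary` and `find_word_foo_at` are hypothetical helpers, not functions in this crate.

```rust
/// ASCII-only notion of a word byte, for illustration.
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// A word boundary exists at `at` when exactly one of the neighboring
/// bytes is a word byte. Bytes outside the haystack count as non-word.
fn is_word_boundary(haystack: &[u8], at: usize) -> bool {
    let before = at > 0 && is_word_byte(haystack[at - 1]);
    let after = at < haystack.len() && is_word_byte(haystack[at]);
    before != after
}

/// Toy equivalent of searching for `\bfoo` within `start..end`, while
/// still consulting bytes outside that range for the boundary check.
fn find_word_foo_at(haystack: &[u8], start: usize, end: usize) -> Option<(usize, usize)> {
    let needle: &[u8] = b"foo";
    let mut at = start;
    while at + needle.len() <= end {
        if &haystack[at..at + needle.len()] == needle && is_word_boundary(haystack, at) {
            return Some((at, at + needle.len()));
        }
        at += 1;
    }
    None
}
```

With the haystack `xfoo foo`, searching the range `1..8` of the full haystack finds the match at `(5, 8)`, because the `x` at offset 0 shows that offset 1 is not a word boundary. Searching the sub-slice starting at offset 1 instead reports `(0, 3)` relative to that slice: the slice hides the `x`, so a boundary appears where none exists.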

Panics

If the underlying lazy DFAs return an error, then this routine panics. This only occurs in non-default configurations where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

The fallible version of this routine is try_is_match_at.

Returns the first position at which a match is found.

This routine stops scanning input in precisely the same circumstances as is_match. The key difference is that this routine returns the position at which it stopped scanning input if and only if a match was found. If no match is found, then None is returned.

Searching a substring of the haystack

Being an “at” search routine, this permits callers to search a substring of haystack by specifying a range in haystack. Why expose this as an API instead of just asking callers to use &haystack[start..end]? The reason is that regex matching often wants to take the surrounding context into account in order to handle look-around (^, $ and \b).

This is useful when implementing an iterator over matches within the same haystack, which cannot be done correctly by simply providing a subslice of haystack.

Panics

If the underlying lazy DFAs return an error, then this routine panics. This only occurs in non-default configurations where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

The fallible version of this routine is try_find_earliest_at.

Returns the same as find_leftmost, but starts the search at the given offset.

Searching a substring of the haystack

Being an “at” search routine, this permits callers to search a substring of haystack by specifying a range in haystack. Why expose this as an API instead of just asking callers to use &haystack[start..end]? The reason is that regex matching often wants to take the surrounding context into account in order to handle look-around (^, $ and \b).

This is useful when implementing an iterator over matches within the same haystack, which cannot be done correctly by simply providing a subslice of haystack.

Panics

If the underlying lazy DFAs return an error, then this routine panics. This only occurs in non-default configurations where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

The fallible version of this routine is try_find_leftmost_at.

Search for the first overlapping match within a given range of haystack.

This routine is principally useful when searching for multiple patterns on inputs where multiple patterns may match the same regions of text. In particular, callers must preserve the automaton’s search state from prior calls so that the implementation knows where the last match occurred and which pattern was reported.

Searching a substring of the haystack

Being an “at” search routine, this permits callers to search a substring of haystack by specifying a range in haystack. Why expose this as an API instead of just asking callers to use &haystack[start..end]? The reason is that regex matching often wants to take the surrounding context into account in order to handle look-around (^, $ and \b).

This is useful when implementing an iterator over matches within the same haystack, which cannot be done correctly by simply providing a subslice of haystack.

Panics

If the underlying lazy DFAs return an error, then this routine panics. This only occurs in non-default configurations where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

The fallible version of this routine is try_find_overlapping_at.

Fallible search routines. These may return an error when the underlying lazy DFAs have been configured in a way that permits them to fail during a search.

Errors during search only occur when the lazy DFA has been explicitly configured to do so, usually by specifying one or more “quit” bytes or by heuristically enabling Unicode word boundaries.

Errors will never be returned using the default configuration. So these fallible routines are only needed for particular configurations.

Returns true if and only if this regex matches the given haystack.

This routine may short circuit if it knows that scanning future input will never lead to a different result. In particular, if the underlying DFA enters a match state or a dead state, then this routine will return true or false, respectively, without inspecting any future input.

Errors

This routine only errors if the search could not complete. For DFA-based regexes, this only occurs in a non-default configuration where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

When a search cannot complete, callers cannot know whether a match exists or not.

The infallible (panics on error) version of this routine is is_match.

Returns the first position at which a match is found.

This routine stops scanning input in precisely the same circumstances as is_match. The key difference is that this routine returns the position at which it stopped scanning input if and only if a match was found. If no match is found, then None is returned.

Errors

This routine only errors if the search could not complete. For DFA-based regexes, this only occurs in a non-default configuration where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

When a search cannot complete, callers cannot know whether a match exists or not.

The infallible (panics on error) version of this routine is find_earliest.

Returns the start and end offset of the leftmost match. If no match exists, then None is returned.

Errors

This routine only errors if the search could not complete. For DFA-based regexes, this only occurs in a non-default configuration where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

When a search cannot complete, callers cannot know whether a match exists or not.

The infallible (panics on error) version of this routine is find_leftmost.

Search for the first overlapping match in haystack.

This routine is principally useful when searching for multiple patterns on inputs where multiple patterns may match the same regions of text. In particular, callers must preserve the automaton’s search state from prior calls so that the implementation knows where the last match occurred and which pattern was reported.

Errors

This routine only errors if the search could not complete. For DFA-based regexes, this only occurs in a non-default configuration where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

When a search cannot complete, callers cannot know whether a match exists or not.

The infallible (panics on error) version of this routine is find_overlapping.

Returns an iterator over all non-overlapping “earliest” matches.

Match positions are reported as soon as a match is known to occur, even if the standard leftmost match would be longer.

Errors

This iterator only yields errors if the search could not complete. For DFA-based regexes, this only occurs in a non-default configuration where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

When a search cannot complete, callers cannot know whether a match exists or not.

The infallible (panics on error) version of this routine is find_earliest_iter.

Returns an iterator over all non-overlapping leftmost matches in the given bytes. If no match exists, then the iterator yields no elements.

This corresponds to the “standard” regex search iterator.

Errors

This iterator only yields errors if the search could not complete. For DFA-based regexes, this only occurs in a non-default configuration where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

When a search cannot complete, callers cannot know whether a match exists or not.

The infallible (panics on error) version of this routine is find_leftmost_iter.

Returns an iterator over all overlapping matches in the given haystack.

This routine is principally useful when searching for multiple patterns on inputs where multiple patterns may match the same regions of text. The iterator takes care of handling the overlapping state that must be threaded through every search.

Errors

This iterator only yields errors if the search could not complete. For DFA-based regexes, this only occurs in a non-default configuration where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

When a search cannot complete, callers cannot know whether a match exists or not.

The infallible (panics on error) version of this routine is find_overlapping_iter.

Lower level fallible search routines that permit controlling where the search starts and ends in a particular sequence.

Returns true if and only if this regex matches the given haystack.

This routine may short circuit if it knows that scanning future input will never lead to a different result. In particular, if the underlying DFA enters a match state or a dead state, then this routine will return true or false, respectively, without inspecting any future input.

Searching a substring of the haystack

Being an “at” search routine, this permits callers to search a substring of haystack by specifying a range in haystack. Why expose this as an API instead of just asking callers to use &haystack[start..end]? The reason is that regex matching often wants to take the surrounding context into account in order to handle look-around (^, $ and \b).

Errors

This routine only errors if the search could not complete. For DFA-based regexes, this only occurs in a non-default configuration where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

When a search cannot complete, callers cannot know whether a match exists or not.

The infallible (panics on error) version of this routine is is_match_at.

Returns the first position at which a match is found.

This routine stops scanning input in precisely the same circumstances as is_match. The key difference is that this routine returns the position at which it stopped scanning input if and only if a match was found. If no match is found, then None is returned.

Searching a substring of the haystack

Being an “at” search routine, this permits callers to search a substring of haystack by specifying a range in haystack. Why expose this as an API instead of just asking callers to use &haystack[start..end]? The reason is that regex matching often wants to take the surrounding context into account in order to handle look-around (^, $ and \b).

This is useful when implementing an iterator over matches within the same haystack, which cannot be done correctly by simply providing a subslice of haystack.

Errors

This routine only errors if the search could not complete. For DFA-based regexes, this only occurs in a non-default configuration where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

When a search cannot complete, callers cannot know whether a match exists or not.

The infallible (panics on error) version of this routine is find_earliest_at.

Returns the start and end offset of the leftmost match. If no match exists, then None is returned.

Searching a substring of the haystack

Being an “at” search routine, this permits callers to search a substring of haystack by specifying a range in haystack. Why expose this as an API instead of just asking callers to use &haystack[start..end]? The reason is that regex matching often wants to take the surrounding context into account in order to handle look-around (^, $ and \b).

This is useful when implementing an iterator over matches within the same haystack, which cannot be done correctly by simply providing a subslice of haystack.

Errors

This routine only errors if the search could not complete. For DFA-based regexes, this only occurs in a non-default configuration where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

When a search cannot complete, callers cannot know whether a match exists or not.

The infallible (panics on error) version of this routine is find_leftmost_at.

Search for the first overlapping match within a given range of haystack.

This routine is principally useful when searching for multiple patterns on inputs where multiple patterns may match the same regions of text. In particular, callers must preserve the automaton’s search state from prior calls so that the implementation knows where the last match occurred and which pattern was reported.

Searching a substring of the haystack

Being an “at” search routine, this permits callers to search a substring of haystack by specifying a range in haystack. Why expose this as an API instead of just asking callers to use &haystack[start..end]? The reason is that regex matching often wants to take the surrounding context into account in order to handle look-around (^, $ and \b).

This is useful when implementing an iterator over matches within the same haystack, which cannot be done correctly by simply providing a subslice of haystack.

Errors

This routine only errors if the search could not complete. For DFA-based regexes, this only occurs in a non-default configuration where quit bytes are used, Unicode word boundaries are heuristically enabled or limits are set on the number of times the lazy DFA’s cache may be cleared.

When a search cannot complete, callers cannot know whether a match exists or not.

The infallible (panics on error) version of this routine is find_overlapping_at.

Non-search APIs for querying information about the regex and setting a prefilter.

Return the underlying lazy DFA responsible for forward matching.

This is useful for accessing the underlying lazy DFA and using it directly if the situation calls for it.

Return the underlying lazy DFA responsible for reverse matching.

This is useful for accessing the underlying lazy DFA and using it directly if the situation calls for it.

Returns the total number of patterns matched by this regex.

Example
use regex_automata::hybrid::regex::Regex;

let re = Regex::new_many(&[r"[a-z]+", r"[0-9]+", r"\w+"])?;
assert_eq!(3, re.pattern_count());

Convenience function for returning this regex’s prefilter as a trait object.

If this regex doesn’t have a prefilter, then None is returned.

Attach the given prefilter to this regex.

Trait Implementations

Formats the value using the given formatter.

Auto Trait Implementations

Blanket Implementations

Gets the TypeId of self.

Immutably borrows from an owned value.

Mutably borrows from an owned value.

Returns the argument unchanged.

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

The type returned in the event of a conversion error.

Performs the conversion.

The type returned in the event of a conversion error.

Performs the conversion.