pub struct Builder { /* private fields */ }
A builder for a regex based on a hybrid NFA/DFA.
This builder permits configuring options for the syntax of a pattern, the NFA construction, the lazy DFA construction and finally the regex searching itself. This builder is different from a general-purpose regex builder in that it permits fine-grained configuration of the construction process. The trade-off for this is complexity, and the possibility of setting a configuration that might not make sense. For example, there are three different UTF-8 modes:
SyntaxConfig::utf8 controls whether the pattern itself can contain sub-expressions that match invalid UTF-8.
nfa::thompson::Config::utf8 controls whether the implicit unanchored prefix added to the NFA can match through invalid UTF-8 or not.
Config::utf8 controls how the regex iterators themselves advance the starting position of the next search when a match with zero length is found.
Generally speaking, callers will want to either enable all of these or disable all of these.
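The third mode is the least obvious of the three, so here is a minimal standalone sketch of the idea. It does not use regex-automata at all: `next_start` and `empty_match_offsets` are hypothetical helpers, and the sketch assumes that in UTF-8 mode an iterator resumes at the next codepoint boundary after an empty match, while with UTF-8 mode disabled it resumes at the next byte.

```rust
/// Hypothetical helper (not part of regex-automata): given an empty match
/// at `at`, return the offset where the next search should begin.
fn next_start(haystack: &[u8], at: usize, utf8: bool) -> usize {
    let mut next = at + 1;
    if utf8 {
        // Skip UTF-8 continuation bytes (0b10xx_xxxx) so the next search
        // never begins in the middle of an encoded codepoint.
        while next < haystack.len() && (haystack[next] & 0xC0) == 0x80 {
            next += 1;
        }
    }
    next
}

/// Offsets at which a pattern that always matches the empty string would
/// be reported by an iterator that advances with `next_start`.
fn empty_match_offsets(haystack: &[u8], utf8: bool) -> Vec<usize> {
    let mut offsets = Vec::new();
    let mut at = 0;
    while at <= haystack.len() {
        offsets.push(at);
        at = next_start(haystack, at, utf8);
    }
    offsets
}

fn main() {
    // U+2603 SNOWMAN is encoded as three bytes: E2 98 83.
    let haystack = "☃".as_bytes();
    // UTF-8 mode: only codepoint boundaries are visited.
    assert_eq!(empty_match_offsets(haystack, true), vec![0, 3]);
    // Byte mode: every byte offset is visited.
    assert_eq!(empty_match_offsets(haystack, false), vec![0, 1, 2, 3]);
}
```

With UTF-8 mode disabled, an empty match can be reported between the bytes of a single codepoint. This is why that setting is usually paired with disabling UTF-8 in the syntax and in the NFA.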
Internally, building a regex requires building two hybrid NFA/DFAs, where one is responsible for finding the end of a match and the other is responsible for finding the start of a match. If you only need to detect whether something matched, or only the end of a match, then you should use a dfa::Builder to construct a single hybrid NFA/DFA, which is cheaper than building two of them.
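To make the division of labor concrete, the following toy sketch mimics the two passes with hand-written scanners for the fixed pattern `[0-9]+`. It is only an illustration under stated assumptions: `find_end_fwd`, `find_start_rev` and `find` are hypothetical names, no lazy DFA is involved, and the real crate's search routines are not shown here.

```rust
/// Forward pass (hypothetical): scans left to right and reports only the
/// offset where the leftmost match of `[0-9]+` ends. It does not record
/// where the match began.
fn find_end_fwd(haystack: &[u8]) -> Option<usize> {
    let mut in_match = false;
    for (i, &b) in haystack.iter().enumerate() {
        match (in_match, b.is_ascii_digit()) {
            (false, true) => in_match = true, // entered a matching state
            (true, false) => return Some(i),  // match ended just before `i`
            _ => {}
        }
    }
    if in_match { Some(haystack.len()) } else { None }
}

/// Reverse pass (hypothetical): anchored at `end`, scans right to left to
/// recover the offset where the match starts.
fn find_start_rev(haystack: &[u8], end: usize) -> usize {
    let mut start = end;
    while start > 0 && haystack[start - 1].is_ascii_digit() {
        start -= 1;
    }
    start
}

/// Full match: the reverse pass runs only once the forward pass succeeds.
fn find(haystack: &[u8]) -> Option<(usize, usize)> {
    let end = find_end_fwd(haystack)?;
    Some((find_start_rev(haystack, end), end))
}

fn main() {
    let haystack = b"foo12345bar";
    // The forward pass alone answers "did it match?" and "where does it end?".
    assert_eq!(find_end_fwd(haystack), Some(8));
    // The start offset costs a second pass.
    assert_eq!(find(haystack), Some((3, 8)));
    assert_eq!(find(b"no digits"), None);
}
```

If a caller only needs the result of the forward pass, the reverse scanner is never needed, which mirrors the advice above to build a single hybrid NFA/DFA in that case.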
Example
This example shows how to disable UTF-8 mode in the syntax, the NFA and the regex itself. This is generally what you want for matching on arbitrary bytes.
use regex_automata::{
    hybrid::regex::Regex, nfa::thompson, MultiMatch, SyntaxConfig
};
let re = Regex::builder()
    .configure(Regex::config().utf8(false))
    .syntax(SyntaxConfig::new().utf8(false))
    .thompson(thompson::Config::new().utf8(false))
    .build(r"foo(?-u:[^b])ar.*")?;
let mut cache = re.create_cache();
let haystack = b"\xFEfoo\xFFarzz\xE2\x98\xFF\n";
let expected = Some(MultiMatch::must(0, 1, 9));
let got = re.find_leftmost(&mut cache, haystack);
assert_eq!(expected, got);
// Notice that `(?-u:[^b])` matches invalid UTF-8,
// but the subsequent `.*` does not! Disabling UTF-8
// on the syntax permits this. Notice also that the
// search was unanchored and skipped over invalid UTF-8.
// Disabling UTF-8 on the Thompson NFA permits this.
//
// N.B. This example does not show the impact of
// disabling UTF-8 mode on Config, since that
// only impacts regexes that can produce matches of
// length 0.
assert_eq!(b"foo\xFFarzz", &haystack[got.unwrap().range()]);
Implementations
impl Builder
pub fn build(&self, pattern: &str) -> Result<Regex, BuildError>
Build a regex from the given pattern.
If there was a problem parsing or compiling the pattern, then an error is returned.
pub fn build_many<P: AsRef<str>>(
    &self,
    patterns: &[P]
) -> Result<Regex, BuildError>
Build a regex from the given patterns.
pub fn configure(&mut self, config: Config) -> &mut Builder
Apply the given regex configuration options to this builder.
pub fn syntax(&mut self, config: SyntaxConfig) -> &mut Builder
Set the syntax configuration for this builder using SyntaxConfig.
This permits setting things like case insensitivity, Unicode and multi-line mode.
pub fn thompson(&mut self, config: Config) -> &mut Builder
Set the Thompson NFA configuration for this builder using nfa::thompson::Config.
This permits setting things like whether additional time should be spent shrinking the size of the NFA.
pub fn dfa(&mut self, config: Config) -> &mut Builder
Set the lazy DFA compilation configuration for this builder using dfa::Config.
This permits setting things like whether Unicode word boundaries should be heuristically supported, or configuring how the cache behaves.
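The setters above all take `&mut self` and return `&mut Builder`, while `build` and `build_many` take `&self`. The standalone sketch below shows what that shape allows. `ToyBuilder`, `ToyConfig` and `ToyRegex` are hypothetical stand-ins rather than regex-automata types, and the toy `configure` simply replaces the stored configuration; how the real builder combines configurations is not covered here.

```rust
/// Hypothetical stand-in for a configuration value.
#[derive(Clone, Debug, Default, PartialEq)]
struct ToyConfig {
    utf8: bool,
}

/// Hypothetical stand-in for the object that `build` produces.
#[derive(Debug, PartialEq)]
struct ToyRegex {
    pattern: String,
    utf8: bool,
}

#[derive(Clone, Debug, Default)]
struct ToyBuilder {
    config: ToyConfig,
}

impl ToyBuilder {
    /// Setter: mutates the builder and returns `&mut Self` for chaining.
    fn configure(&mut self, config: ToyConfig) -> &mut ToyBuilder {
        self.config = config;
        self
    }

    /// Borrows the builder immutably, so it remains usable afterwards.
    fn build(&self, pattern: &str) -> Result<ToyRegex, String> {
        // Toy validity check only: parentheses must be balanced in count.
        if pattern.matches('(').count() != pattern.matches(')').count() {
            return Err(format!("unbalanced parentheses in {:?}", pattern));
        }
        Ok(ToyRegex { pattern: pattern.to_string(), utf8: self.config.utf8 })
    }
}

fn main() {
    // One-shot chain on a temporary builder.
    let re = ToyBuilder::default()
        .configure(ToyConfig { utf8: true })
        .build("foo")
        .unwrap();
    assert_eq!(re, ToyRegex { pattern: "foo".to_string(), utf8: true });

    // A named builder can be configured once and reused for many patterns.
    let mut builder = ToyBuilder::default();
    builder.configure(ToyConfig { utf8: false });
    let a = builder.build("a+").unwrap();
    let b = builder.build("b+").unwrap();
    assert_eq!((a.utf8, b.utf8), (false, false));
    assert!(builder.build("(a").is_err());
}
```

Because building only borrows the builder, a failed build leaves it intact, and the same settings can be applied to further patterns without repeating the configuration calls.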
Trait Implementations
Auto Trait Implementations
impl RefUnwindSafe for Builder
impl Send for Builder
impl Sync for Builder
impl Unpin for Builder
impl UnwindSafe for Builder
Blanket Implementations
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> ToOwned for T where
T: Clone,
type Owned = T
The resulting type after obtaining ownership.
fn clone_into(&self, target: &mut T)
This is a nightly-only experimental API. (toowned_clone_into)
Uses borrowed data to replace owned data, usually by cloning.