pub struct ALLOWED_MATCHER { /* private fields */ }
Determine if a script should be rendered in the browser by name.
Methods from Deref<Target = AhoCorasick>
pub fn is_match<'h, I>(&self, input: I) -> bool
Returns true if and only if this automaton matches the haystack at any position.
input may be any type that is cheaply convertible to an Input. This includes, but is not limited to, &str and &[u8].
Aside from convenience, when AhoCorasick was built with leftmost-first or leftmost-longest semantics, this might result in a search that visits less of the haystack than AhoCorasick::find would otherwise. (For standard semantics, matches are always immediately returned once they are seen, so there is no way for this to do less work in that case.)
Note that there is no corresponding fallible routine for this method. If you need a fallible version of this, then AhoCorasick::try_find can be used with Input::earliest enabled.
§Examples
Basic usage:
use aho_corasick::AhoCorasick;
let ac = AhoCorasick::new(&[
"foo", "bar", "quux", "baz",
]).unwrap();
assert!(ac.is_match("xxx bar xxx"));
assert!(!ac.is_match("xxx qux xxx"));
pub fn find<'h, I>(&self, input: I) -> Option<Match>
Returns the location of the first match according to the match semantics that this automaton was constructed with.
input may be any type that is cheaply convertible to an Input. This includes, but is not limited to, &str and &[u8].
This is the infallible version of AhoCorasick::try_find.
§Panics
This panics when AhoCorasick::try_find would return an error.
§Examples
Basic usage, with standard semantics:
use aho_corasick::{AhoCorasick, MatchKind};
let patterns = &["b", "abc", "abcd"];
let haystack = "abcd";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::Standard) // default, not necessary
.build(patterns)
.unwrap();
let mat = ac.find(haystack).expect("should have a match");
assert_eq!("b", &haystack[mat.start()..mat.end()]);
Now with leftmost-first semantics:
use aho_corasick::{AhoCorasick, MatchKind};
let patterns = &["b", "abc", "abcd"];
let haystack = "abcd";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.build(patterns)
.unwrap();
let mat = ac.find(haystack).expect("should have a match");
assert_eq!("abc", &haystack[mat.start()..mat.end()]);
And finally, leftmost-longest semantics:
use aho_corasick::{AhoCorasick, MatchKind};
let patterns = &["b", "abc", "abcd"];
let haystack = "abcd";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostLongest)
.build(patterns)
.unwrap();
let mat = ac.find(haystack).expect("should have a match");
assert_eq!("abcd", &haystack[mat.start()..mat.end()]);
§Example: configuring a search
Because this method accepts anything that can be turned into an Input, it’s possible to provide an Input directly in order to configure the search. In this example, we show how to use the earliest option to force the search to return as soon as it knows a match has occurred.
use aho_corasick::{AhoCorasick, Input, MatchKind};
let patterns = &["b", "abc", "abcd"];
let haystack = "abcd";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostLongest)
.build(patterns)
.unwrap();
let mat = ac.find(Input::new(haystack).earliest(true))
.expect("should have a match");
// The correct leftmost-longest match here is 'abcd', but since we
// told the search to quit as soon as it knows a match has occurred,
// we get a different match back.
assert_eq!("b", &haystack[mat.start()..mat.end()]);
pub fn find_overlapping<'h, I>(&self, input: I, state: &mut OverlappingState)
Returns the location of the first overlapping match in the given input with respect to the current state of the underlying searcher.
input may be any type that is cheaply convertible to an Input. This includes, but is not limited to, &str and &[u8].
Overlapping searches do not report matches in their return value. Instead, matches can be accessed via OverlappingState::get_match after a search call.
This is the infallible version of AhoCorasick::try_find_overlapping.
§Panics
This panics when AhoCorasick::try_find_overlapping would return an error. For example, this happens when the Aho-Corasick searcher doesn’t support overlapping searches. (Only searchers built with MatchKind::Standard semantics support overlapping searches.)
§Example
This shows how we can repeatedly call an overlapping search without ever needing to explicitly re-slice the haystack. Overlapping search works this way because searches depend on state saved during the previous search.
use aho_corasick::{
automaton::OverlappingState,
AhoCorasick, Input, Match,
};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::new(patterns).unwrap();
let mut state = OverlappingState::start();
ac.find_overlapping(haystack, &mut state);
assert_eq!(Some(Match::must(2, 0..3)), state.get_match());
ac.find_overlapping(haystack, &mut state);
assert_eq!(Some(Match::must(0, 0..6)), state.get_match());
ac.find_overlapping(haystack, &mut state);
assert_eq!(Some(Match::must(2, 11..14)), state.get_match());
ac.find_overlapping(haystack, &mut state);
assert_eq!(Some(Match::must(2, 22..25)), state.get_match());
ac.find_overlapping(haystack, &mut state);
assert_eq!(Some(Match::must(0, 22..28)), state.get_match());
ac.find_overlapping(haystack, &mut state);
assert_eq!(Some(Match::must(1, 22..31)), state.get_match());
// No more matches to be found.
ac.find_overlapping(haystack, &mut state);
assert_eq!(None, state.get_match());
pub fn find_iter<'a, 'h, I>(&'a self, input: I) -> FindIter<'a, 'h>
Returns an iterator of non-overlapping matches, using the match semantics that this automaton was constructed with.
input may be any type that is cheaply convertible to an Input. This includes, but is not limited to, &str and &[u8].
This is the infallible version of AhoCorasick::try_find_iter.
§Panics
This panics when AhoCorasick::try_find_iter would return an error.
§Examples
Basic usage, with standard semantics:
use aho_corasick::{AhoCorasick, MatchKind, PatternID};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::Standard) // default, not necessary
.build(patterns)
.unwrap();
let matches: Vec<PatternID> = ac
.find_iter(haystack)
.map(|mat| mat.pattern())
.collect();
assert_eq!(vec![
PatternID::must(2),
PatternID::must(2),
PatternID::must(2),
], matches);
Now with leftmost-first semantics:
use aho_corasick::{AhoCorasick, MatchKind, PatternID};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.build(patterns)
.unwrap();
let matches: Vec<PatternID> = ac
.find_iter(haystack)
.map(|mat| mat.pattern())
.collect();
assert_eq!(vec![
PatternID::must(0),
PatternID::must(2),
PatternID::must(0),
], matches);
And finally, leftmost-longest semantics:
use aho_corasick::{AhoCorasick, MatchKind, PatternID};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostLongest)
.build(patterns)
.unwrap();
let matches: Vec<PatternID> = ac
.find_iter(haystack)
.map(|mat| mat.pattern())
.collect();
assert_eq!(vec![
PatternID::must(0),
PatternID::must(2),
PatternID::must(1),
], matches);
pub fn find_overlapping_iter<'a, 'h, I>(&'a self, input: I) -> FindOverlappingIter<'a, 'h>
Returns an iterator of overlapping matches. Stated differently, this returns an iterator of all possible matches at every position.
input may be any type that is cheaply convertible to an Input. This includes, but is not limited to, &str and &[u8].
This is the infallible version of AhoCorasick::try_find_overlapping_iter.
§Panics
This panics when AhoCorasick::try_find_overlapping_iter would return an error. For example, this happens when the Aho-Corasick searcher is built with either leftmost-first or leftmost-longest match semantics. Stated differently, overlapping searches require one to build the searcher with MatchKind::Standard (it is the default).
§Example: basic usage
use aho_corasick::{AhoCorasick, PatternID};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::new(patterns).unwrap();
let matches: Vec<PatternID> = ac
.find_overlapping_iter(haystack)
.map(|mat| mat.pattern())
.collect();
assert_eq!(vec![
PatternID::must(2),
PatternID::must(0),
PatternID::must(2),
PatternID::must(2),
PatternID::must(0),
PatternID::must(1),
], matches);
pub fn replace_all<B>(&self, haystack: &str, replace_with: &[B]) -> String
Replace all matches with a corresponding value in the replace_with slice given. Matches correspond to the same matches as reported by AhoCorasick::find_iter.
Replacements are determined by the index of the matching pattern. For example, if the pattern with index 2 is found, then it is replaced by replace_with[2].
This is the infallible version of AhoCorasick::try_replace_all.
§Panics
This panics when AhoCorasick::try_replace_all would return an error.
This also panics when replace_with.len() does not equal AhoCorasick::patterns_len.
§Example: basic usage
use aho_corasick::{AhoCorasick, MatchKind};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.build(patterns)
.unwrap();
let result = ac.replace_all(haystack, &["x", "y", "z"]);
assert_eq!("x the z to the xage", result);
pub fn replace_all_bytes<B>(&self, haystack: &[u8], replace_with: &[B]) -> Vec<u8>
Replace all matches using raw bytes with a corresponding value in the replace_with slice given. Matches correspond to the same matches as reported by AhoCorasick::find_iter.
Replacements are determined by the index of the matching pattern. For example, if the pattern with index 2 is found, then it is replaced by replace_with[2].
This is the infallible version of AhoCorasick::try_replace_all_bytes.
§Panics
This panics when AhoCorasick::try_replace_all_bytes would return an error.
This also panics when replace_with.len() does not equal AhoCorasick::patterns_len.
§Example: basic usage
use aho_corasick::{AhoCorasick, MatchKind};
let patterns = &["append", "appendage", "app"];
let haystack = b"append the app to the appendage";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.build(patterns)
.unwrap();
let result = ac.replace_all_bytes(haystack, &["x", "y", "z"]);
assert_eq!(b"x the z to the xage".to_vec(), result);
pub fn replace_all_with<F>(&self, haystack: &str, dst: &mut String, replace_with: F)
Replace all matches using a closure called on each match. Matches correspond to the same matches as reported by AhoCorasick::find_iter.
The closure accepts three parameters: the match found, the text of the match and a string buffer with which to write the replaced text (if any). If the closure returns true, then it continues to the next match. If the closure returns false, then searching is stopped.
Note that any matches with boundaries that don’t fall on a valid UTF-8 boundary are silently skipped.
This is the infallible version of AhoCorasick::try_replace_all_with.
§Panics
This panics when AhoCorasick::try_replace_all_with would return an error.
§Examples
Basic usage:
use aho_corasick::{AhoCorasick, MatchKind};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.build(patterns)
.unwrap();
let mut result = String::new();
ac.replace_all_with(haystack, &mut result, |mat, _, dst| {
dst.push_str(&mat.pattern().as_usize().to_string());
true
});
assert_eq!("0 the 2 to the 0age", result);
Stopping the replacement by returning false (continued from the example above):
let mut result = String::new();
ac.replace_all_with(haystack, &mut result, |mat, _, dst| {
dst.push_str(&mat.pattern().as_usize().to_string());
mat.pattern() != PatternID::must(2)
});
assert_eq!("0 the 2 to the appendage", result);
pub fn replace_all_with_bytes<F>(&self, haystack: &[u8], dst: &mut Vec<u8>, replace_with: F)
Replace all matches using raw bytes with a closure called on each match. Matches correspond to the same matches as reported by AhoCorasick::find_iter.
The closure accepts three parameters: the match found, the text of the match and a byte buffer with which to write the replaced text (if any). If the closure returns true, then it continues to the next match. If the closure returns false, then searching is stopped.
This is the infallible version of AhoCorasick::try_replace_all_with_bytes.
§Panics
This panics when AhoCorasick::try_replace_all_with_bytes would return an error.
§Examples
Basic usage:
use aho_corasick::{AhoCorasick, MatchKind};
let patterns = &["append", "appendage", "app"];
let haystack = b"append the app to the appendage";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.build(patterns)
.unwrap();
let mut result = vec![];
ac.replace_all_with_bytes(haystack, &mut result, |mat, _, dst| {
dst.extend(mat.pattern().as_usize().to_string().bytes());
true
});
assert_eq!(b"0 the 2 to the 0age".to_vec(), result);
Stopping the replacement by returning false (continued from the example above):
let mut result = vec![];
ac.replace_all_with_bytes(haystack, &mut result, |mat, _, dst| {
dst.extend(mat.pattern().as_usize().to_string().bytes());
mat.pattern() != PatternID::must(2)
});
assert_eq!(b"0 the 2 to the appendage".to_vec(), result);
pub fn stream_find_iter<'a, R>(&'a self, rdr: R) -> StreamFindIter<'a, R>
where R: Read,
Returns an iterator of non-overlapping matches in the given stream. Matches correspond to the same matches as reported by AhoCorasick::find_iter.
The matches yielded by this iterator use absolute position offsets in the stream given, where the first byte has index 0. Matches are yielded until the stream is exhausted.
Each item yielded by the iterator is a Result<Match, std::io::Error>, where an error is yielded if there was a problem reading from the reader given.
When searching a stream, an internal buffer is used. Therefore, callers should avoid providing a buffered reader, if possible.
This is the infallible version of AhoCorasick::try_stream_find_iter. Note that both methods return iterators that produce Result values. The difference is that this routine panics if construction of the iterator failed. The Result values yielded by the iterator come from whether the given reader returns an error or not during the search.
§Memory usage
In general, searching streams will use a constant amount of memory for its internal buffer. The one requirement is that the internal buffer must be at least the size of the longest possible match. In most use cases, the default buffer size will be much larger than any individual match.
§Panics
This panics when AhoCorasick::try_stream_find_iter would return an error. For example, this happens when the Aho-Corasick searcher doesn’t support stream searches. (Only searchers built with MatchKind::Standard semantics support stream searches.)
§Example: basic usage
use aho_corasick::{AhoCorasick, PatternID};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::new(patterns).unwrap();
let mut matches = vec![];
for result in ac.stream_find_iter(haystack.as_bytes()) {
let mat = result?;
matches.push(mat.pattern());
}
assert_eq!(vec![
PatternID::must(2),
PatternID::must(2),
PatternID::must(2),
], matches);
pub fn try_find<'h, I>(&self, input: I) -> Result<Option<Match>, MatchError>
Returns the location of the first match according to the match semantics that this automaton was constructed with, and according to the given Input configuration.
This is the fallible version of AhoCorasick::find.
§Errors
This returns an error when this Aho-Corasick searcher does not support the given Input configuration.
For example, if the Aho-Corasick searcher only supports anchored searches or only supports unanchored searches, then providing an Input that requests an anchored (or unanchored) search when it isn’t supported would result in an error.
§Example: leftmost-first searching
Basic usage with leftmost-first semantics:
use aho_corasick::{AhoCorasick, MatchKind, Input};
let patterns = &["b", "abc", "abcd"];
let haystack = "foo abcd";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.build(patterns)
.unwrap();
let mat = ac.try_find(haystack)?.expect("should have a match");
assert_eq!("abc", &haystack[mat.span()]);
§Example: anchored leftmost-first searching
This shows how to anchor the search, so that even if the haystack contains a match somewhere, a match won’t be reported unless one can be found that starts at the beginning of the search:
use aho_corasick::{AhoCorasick, Anchored, Input, MatchKind, StartKind};
let patterns = &["b", "abc", "abcd"];
let haystack = "foo abcd";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.start_kind(StartKind::Anchored)
.build(patterns)
.unwrap();
let input = Input::new(haystack).anchored(Anchored::Yes);
assert_eq!(None, ac.try_find(input)?);
If the beginning of the search is changed to where a match begins, then it will be found:
use aho_corasick::{AhoCorasick, Anchored, Input, MatchKind, StartKind};
let patterns = &["b", "abc", "abcd"];
let haystack = "foo abcd";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.start_kind(StartKind::Anchored)
.build(patterns)
.unwrap();
let input = Input::new(haystack).range(4..).anchored(Anchored::Yes);
let mat = ac.try_find(input)?.expect("should have a match");
assert_eq!("abc", &haystack[mat.span()]);
§Example: earliest leftmost-first searching
This shows how to run an “earliest” search even when the Aho-Corasick searcher was compiled with leftmost-first match semantics. In this case, the search is stopped as soon as it is known that a match has occurred, even if it doesn’t correspond to the leftmost-first match.
use aho_corasick::{AhoCorasick, Input, MatchKind};
let patterns = &["b", "abc", "abcd"];
let haystack = "foo abcd";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.build(patterns)
.unwrap();
let input = Input::new(haystack).earliest(true);
let mat = ac.try_find(input)?.expect("should have a match");
assert_eq!("b", &haystack[mat.span()]);
pub fn try_find_overlapping<'h, I>(&self, input: I, state: &mut OverlappingState) -> Result<(), MatchError>
Returns the location of the first overlapping match in the given input with respect to the current state of the underlying searcher.
Overlapping searches do not report matches in their return value. Instead, matches can be accessed via OverlappingState::get_match after a search call.
This is the fallible version of AhoCorasick::find_overlapping.
§Errors
This returns an error when this Aho-Corasick searcher does not support the given Input configuration or if overlapping search is not supported.
One example is that only Aho-Corasick searchers built with MatchKind::Standard semantics support overlapping searches. Using any other match semantics will result in this returning an error.
§Example: basic usage
This shows how we can repeatedly call an overlapping search without ever needing to explicitly re-slice the haystack. Overlapping search works this way because searches depend on state saved during the previous search.
use aho_corasick::{
automaton::OverlappingState,
AhoCorasick, Input, Match,
};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::new(patterns).unwrap();
let mut state = OverlappingState::start();
ac.try_find_overlapping(haystack, &mut state)?;
assert_eq!(Some(Match::must(2, 0..3)), state.get_match());
ac.try_find_overlapping(haystack, &mut state)?;
assert_eq!(Some(Match::must(0, 0..6)), state.get_match());
ac.try_find_overlapping(haystack, &mut state)?;
assert_eq!(Some(Match::must(2, 11..14)), state.get_match());
ac.try_find_overlapping(haystack, &mut state)?;
assert_eq!(Some(Match::must(2, 22..25)), state.get_match());
ac.try_find_overlapping(haystack, &mut state)?;
assert_eq!(Some(Match::must(0, 22..28)), state.get_match());
ac.try_find_overlapping(haystack, &mut state)?;
assert_eq!(Some(Match::must(1, 22..31)), state.get_match());
// No more matches to be found.
ac.try_find_overlapping(haystack, &mut state)?;
assert_eq!(None, state.get_match());
§Example: implementing your own overlapping iteration
The previous example can be easily adapted to implement your own iteration by repeatedly calling try_find_overlapping until either an error occurs or no more matches are reported.
This is effectively equivalent to the iterator returned by AhoCorasick::try_find_overlapping_iter, with the only difference being that the iterator checks for errors before construction and absolves the caller of needing to check for errors on every search call. (Indeed, if the first try_find_overlapping call succeeds and the same Input is given to subsequent calls, then all subsequent calls are guaranteed to succeed.)
use aho_corasick::{
automaton::OverlappingState,
AhoCorasick, Input, Match,
};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::new(patterns).unwrap();
let mut state = OverlappingState::start();
let mut matches = vec![];
loop {
ac.try_find_overlapping(haystack, &mut state)?;
let mat = match state.get_match() {
None => break,
Some(mat) => mat,
};
matches.push(mat);
}
let expected = vec![
Match::must(2, 0..3),
Match::must(0, 0..6),
Match::must(2, 11..14),
Match::must(2, 22..25),
Match::must(0, 22..28),
Match::must(1, 22..31),
];
assert_eq!(expected, matches);
§Example: anchored iteration
The previous example can also be adapted to implement iteration over all anchored matches. In particular, AhoCorasick::try_find_overlapping_iter does not support this because it isn’t totally clear what the match semantics ought to be.
In this example, we will find all overlapping matches that start at the beginning of our search.
use aho_corasick::{
automaton::OverlappingState,
AhoCorasick, Anchored, Input, Match, StartKind,
};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::builder()
.start_kind(StartKind::Anchored)
.build(patterns)
.unwrap();
let input = Input::new(haystack).anchored(Anchored::Yes);
let mut state = OverlappingState::start();
let mut matches = vec![];
loop {
ac.try_find_overlapping(input.clone(), &mut state)?;
let mat = match state.get_match() {
None => break,
Some(mat) => mat,
};
matches.push(mat);
}
let expected = vec![
Match::must(2, 0..3),
Match::must(0, 0..6),
];
assert_eq!(expected, matches);
pub fn try_find_iter<'a, 'h, I>(&'a self, input: I) -> Result<FindIter<'a, 'h>, MatchError>
Returns an iterator of non-overlapping matches, using the match semantics that this automaton was constructed with.
This is the fallible version of AhoCorasick::find_iter.
Note that the error returned by this method occurs during construction of the iterator. The iterator itself yields Match values. That is, once the iterator is constructed, the iteration itself will never report an error.
§Errors
This returns an error when this Aho-Corasick searcher does not support the given Input configuration.
For example, if the Aho-Corasick searcher only supports anchored searches or only supports unanchored searches, then providing an Input that requests an anchored (or unanchored) search when it isn’t supported would result in an error.
§Example: leftmost-first searching
Basic usage with leftmost-first semantics:
use aho_corasick::{AhoCorasick, Input, MatchKind, PatternID};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.build(patterns)
.unwrap();
let matches: Vec<PatternID> = ac
.try_find_iter(Input::new(haystack))?
.map(|mat| mat.pattern())
.collect();
assert_eq!(vec![
PatternID::must(0),
PatternID::must(2),
PatternID::must(0),
], matches);
§Example: anchored leftmost-first searching
This shows how to anchor the search, such that all matches must begin at the starting location of the search. For an iterator, an anchored search implies that all matches are adjacent.
use aho_corasick::{
AhoCorasick, Anchored, Input, MatchKind, PatternID, StartKind,
};
let patterns = &["foo", "bar", "quux"];
let haystack = "fooquuxbar foo";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.start_kind(StartKind::Anchored)
.build(patterns)
.unwrap();
let matches: Vec<PatternID> = ac
.try_find_iter(Input::new(haystack).anchored(Anchored::Yes))?
.map(|mat| mat.pattern())
.collect();
assert_eq!(vec![
PatternID::must(0),
PatternID::must(2),
PatternID::must(1),
// The final 'foo' is not found because it is not adjacent to the
// 'bar' match. It needs to be adjacent because our search is
// anchored.
], matches);
pub fn try_find_overlapping_iter<'a, 'h, I>(&'a self, input: I) -> Result<FindOverlappingIter<'a, 'h>, MatchError>
Returns an iterator of overlapping matches.
This is the fallible version of AhoCorasick::find_overlapping_iter.
Note that the error returned by this method occurs during construction of the iterator. The iterator itself yields Match values. That is, once the iterator is constructed, the iteration itself will never report an error.
§Errors
This returns an error when this Aho-Corasick searcher does not support the given Input configuration or does not support overlapping searches.
One example is that only Aho-Corasick searchers built with MatchKind::Standard semantics support overlapping searches. Using any other match semantics will result in this returning an error.
§Example: basic usage
use aho_corasick::{AhoCorasick, Input, PatternID};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::new(patterns).unwrap();
let matches: Vec<PatternID> = ac
.try_find_overlapping_iter(Input::new(haystack))?
.map(|mat| mat.pattern())
.collect();
assert_eq!(vec![
PatternID::must(2),
PatternID::must(0),
PatternID::must(2),
PatternID::must(2),
PatternID::must(0),
PatternID::must(1),
], matches);
§Example: anchored overlapping search returns an error
It isn’t clear what the match semantics for anchored overlapping iterators ought to be, so currently an error is returned. Callers may use AhoCorasick::try_find_overlapping to implement their own semantics if desired.
use aho_corasick::{AhoCorasick, Anchored, Input, StartKind};
let patterns = &["append", "appendage", "app"];
let haystack = "appendappendage app";
let ac = AhoCorasick::builder()
.start_kind(StartKind::Anchored)
.build(patterns)
.unwrap();
let input = Input::new(haystack).anchored(Anchored::Yes);
assert!(ac.try_find_overlapping_iter(input).is_err());
pub fn try_replace_all<B>(&self, haystack: &str, replace_with: &[B]) -> Result<String, MatchError>
Replace all matches with a corresponding value in the replace_with slice given. Matches correspond to the same matches as reported by AhoCorasick::try_find_iter.
Replacements are determined by the index of the matching pattern. For example, if the pattern with index 2 is found, then it is replaced by replace_with[2].
§Panics
This panics when replace_with.len() does not equal AhoCorasick::patterns_len.
§Errors
This returns an error when this Aho-Corasick searcher does not support the default Input configuration. More specifically, this occurs only when the Aho-Corasick searcher does not support unanchored searches since this replacement routine always does an unanchored search.
§Example: basic usage
use aho_corasick::{AhoCorasick, MatchKind};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.build(patterns)
.unwrap();
let result = ac.try_replace_all(haystack, &["x", "y", "z"])?;
assert_eq!("x the z to the xage", result);
pub fn try_replace_all_bytes<B>(&self, haystack: &[u8], replace_with: &[B]) -> Result<Vec<u8>, MatchError>
Replace all matches using raw bytes with a corresponding value in the replace_with slice given. Matches correspond to the same matches as reported by AhoCorasick::try_find_iter.
Replacements are determined by the index of the matching pattern. For example, if the pattern with index 2 is found, then it is replaced by replace_with[2].
This is the fallible version of AhoCorasick::replace_all_bytes.
§Panics
This panics when replace_with.len() does not equal AhoCorasick::patterns_len.
§Errors
This returns an error when this Aho-Corasick searcher does not support the default Input configuration. More specifically, this occurs only when the Aho-Corasick searcher does not support unanchored searches since this replacement routine always does an unanchored search.
§Example: basic usage
use aho_corasick::{AhoCorasick, MatchKind};
let patterns = &["append", "appendage", "app"];
let haystack = b"append the app to the appendage";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.build(patterns)
.unwrap();
let result = ac.try_replace_all_bytes(haystack, &["x", "y", "z"])?;
assert_eq!(b"x the z to the xage".to_vec(), result);
pub fn try_replace_all_with<F>(&self, haystack: &str, dst: &mut String, replace_with: F) -> Result<(), MatchError>
Replace all matches using a closure called on each match. Matches correspond to the same matches as reported by AhoCorasick::try_find_iter.
The closure accepts three parameters: the match found, the text of the match and a string buffer with which to write the replaced text (if any). If the closure returns true, then it continues to the next match. If the closure returns false, then searching is stopped.
Note that any matches with boundaries that don’t fall on a valid UTF-8 boundary are silently skipped.
This is the fallible version of AhoCorasick::replace_all_with.
§Errors
This returns an error when this Aho-Corasick searcher does not support the default Input configuration. More specifically, this occurs only when the Aho-Corasick searcher does not support unanchored searches since this replacement routine always does an unanchored search.
§Examples
Basic usage:
use aho_corasick::{AhoCorasick, MatchKind};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.build(patterns)
.unwrap();
let mut result = String::new();
ac.try_replace_all_with(haystack, &mut result, |mat, _, dst| {
dst.push_str(&mat.pattern().as_usize().to_string());
true
})?;
assert_eq!("0 the 2 to the 0age", result);
Stopping the replacement by returning false (continued from the
example above):
use aho_corasick::PatternID;
let mut result = String::new();
ac.try_replace_all_with(haystack, &mut result, |mat, _, dst| {
dst.push_str(&mat.pattern().as_usize().to_string());
mat.pattern() != PatternID::must(2)
})?;
assert_eq!("0 the 2 to the appendage", result);
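The closure contract described above can be illustrated without the crate. The following is a minimal stdlib-only sketch, not the crate's implementation: `Span` and `replace_spans_with` are hypothetical names, and the match spans are assumed to be precomputed, sorted and non-overlapping. It shows that text between matches is copied through, that spans off a UTF-8 boundary are skipped, and that once the closure returns false the remainder of the haystack is copied unchanged.

```rust
/// Hypothetical stand-in for a match: a pattern index plus a byte span.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Span {
    pattern: usize,
    start: usize,
    end: usize,
}

/// Writes `haystack` to `dst`, calling `replace_with` for each span.
/// Assumes `spans` is sorted by position and non-overlapping.
fn replace_spans_with<F>(haystack: &str, spans: &[Span], dst: &mut String, mut replace_with: F)
where
    F: FnMut(&Span, &str, &mut String) -> bool,
{
    let mut last = 0;
    for span in spans {
        // Skip spans whose boundaries are not valid UTF-8 boundaries.
        if !haystack.is_char_boundary(span.start) || !haystack.is_char_boundary(span.end) {
            continue;
        }
        // Copy the unmatched text that precedes this span.
        dst.push_str(&haystack[last..span.start]);
        last = span.end;
        // The closure writes the replacement; `false` stops the search.
        if !replace_with(span, &haystack[span.start..span.end], dst) {
            break;
        }
    }
    // Whatever follows the last handled span is copied verbatim.
    dst.push_str(&haystack[last..]);
}
```

With the spans of append, app and append from the example haystack and a closure that returns false on pattern 2, this sketch produces the same string as the example above.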
pub fn try_replace_all_with_bytes<F>(
    &self,
    haystack: &[u8],
    dst: &mut Vec<u8>,
    replace_with: F,
) -> Result<(), MatchError>
Replace all matches using raw bytes with a closure called on each
match. Matches correspond to the same matches as reported by
AhoCorasick::try_find_iter.
The closure accepts three parameters: the match found, the text of
the match and a byte buffer with which to write the replaced text
(if any). If the closure returns true, then it continues to the next
match. If the closure returns false, then searching is stopped.
This is the fallible version of AhoCorasick::replace_all_with_bytes.
§Errors
This returns an error when this Aho-Corasick searcher does not support
the default Input configuration. More specifically, this occurs only
when the Aho-Corasick searcher does not support unanchored searches,
since this replacement routine always does an unanchored search.
§Examples
Basic usage:
use aho_corasick::{AhoCorasick, MatchKind};
let patterns = &["append", "appendage", "app"];
let haystack = b"append the app to the appendage";
let ac = AhoCorasick::builder()
.match_kind(MatchKind::LeftmostFirst)
.build(patterns)
.unwrap();
let mut result = vec![];
ac.try_replace_all_with_bytes(haystack, &mut result, |mat, _, dst| {
dst.extend(mat.pattern().as_usize().to_string().bytes());
true
})?;
assert_eq!(b"0 the 2 to the 0age".to_vec(), result);
Stopping the replacement by returning false (continued from the
example above):
use aho_corasick::PatternID;
let mut result = vec![];
ac.try_replace_all_with_bytes(haystack, &mut result, |mat, _, dst| {
dst.extend(mat.pattern().as_usize().to_string().bytes());
mat.pattern() != PatternID::must(2)
})?;
assert_eq!(b"0 the 2 to the appendage".to_vec(), result);
pub fn try_stream_find_iter<'a, R>(
    &'a self,
    rdr: R,
) -> Result<StreamFindIter<'a, R>, MatchError>
where
    R: Read,
Returns an iterator of non-overlapping matches in the given
stream. Matches correspond to the same matches as reported by
AhoCorasick::try_find_iter.
The matches yielded by this iterator use absolute position offsets in
the stream given, where the first byte has index 0. Matches are
yielded until the stream is exhausted.
Each item yielded by the iterator is a Result<Match, std::io::Error>,
where an error is yielded if there was a problem reading from the
reader given.
When searching a stream, an internal buffer is used. Therefore, callers should avoid providing a buffered reader, if possible.
This is the fallible version of AhoCorasick::stream_find_iter.
Note that both methods return iterators that produce Result values.
The difference is that this routine returns an error if construction
of the iterator failed. The Result values yielded by the iterator
come from whether the given reader returns an error or not during the
search.
§Memory usage
In general, searching streams will use a constant amount of memory for its internal buffer. The one requirement is that the internal buffer must be at least the size of the longest possible match. In most use cases, the default buffer size will be much larger than any individual match.
§Errors
This returns an error when this Aho-Corasick searcher does not support
the default Input configuration. More specifically, this occurs only
when the Aho-Corasick searcher does not support unanchored searches,
since this stream searching routine always does an unanchored search.
This also returns an error if the searcher does not support stream
searches. Only searchers built with MatchKind::Standard semantics
support stream searches.
§Example: basic usage
use aho_corasick::{AhoCorasick, PatternID};
let patterns = &["append", "appendage", "app"];
let haystack = "append the app to the appendage";
let ac = AhoCorasick::new(patterns).unwrap();
let mut matches = vec![];
for result in ac.try_stream_find_iter(haystack.as_bytes())? {
let mat = result?;
matches.push(mat.pattern());
}
assert_eq!(vec![
PatternID::must(2),
PatternID::must(2),
PatternID::must(2),
], matches);
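To see why a stream search needs a buffer at least as large as the longest possible match, and how absolute offsets are kept, here is a minimal stdlib-only sketch. It is not how the crate implements stream searching: `stream_find_all` is a hypothetical helper that looks for a single needle with a naive scan, and the `chunk` parameter is an assumption made for illustration. Between reads it retains only the unsearched tail, which is always shorter than the needle.

```rust
use std::io::{self, Read};

/// Returns the absolute start offsets of non-overlapping occurrences of
/// `needle` in `rdr`, reading at most `chunk` bytes at a time.
fn stream_find_all<R: Read>(mut rdr: R, needle: &[u8], chunk: usize) -> io::Result<Vec<usize>> {
    assert!(!needle.is_empty() && chunk > 0);
    let mut buf: Vec<u8> = Vec::new(); // unsearched tail plus newly read bytes
    let mut base = 0usize; // absolute stream offset of buf[0]
    let mut found = Vec::new();
    let mut tmp = vec![0u8; chunk];
    loop {
        let n = rdr.read(&mut tmp)?;
        if n == 0 {
            break; // end of stream
        }
        buf.extend_from_slice(&tmp[..n]);
        let mut i = 0;
        while i + needle.len() <= buf.len() {
            if &buf[i..i + needle.len()] == needle {
                found.push(base + i);
                i += needle.len(); // non-overlapping: resume after the match
            } else {
                i += 1;
            }
        }
        // Positions before `i` are fully searched. The bytes that remain
        // (fewer than needle.len()) may begin a match that is completed
        // by the next read, so they must be kept.
        buf.drain(..i);
        base += i;
    }
    Ok(found)
}
```

Calling it with a small `chunk` forces matches to straddle read boundaries, which is the case the retained tail exists to handle.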
pub fn try_stream_replace_all<R, W, B>(
    &self,
    rdr: R,
    wtr: W,
    replace_with: &[B],
) -> Result<(), Error>
Search for and replace all matches of this automaton in
the given reader, and write the replacements to the given
writer. Matches correspond to the same matches as reported by
AhoCorasick::try_find_iter.
Replacements are determined by the index of the matching pattern. For
example, if the pattern with index 2 is found, then it is replaced by
replace_with[2].
After all matches are replaced, the writer is not flushed.
If there was a problem reading from the given reader or writing to the
given writer, then the corresponding io::Error is returned and all
replacement is stopped.
When searching a stream, an internal buffer is used. Therefore, callers should avoid providing a buffered reader, if possible. However, callers may want to provide a buffered writer.
Note that there is currently no infallible version of this routine.
§Memory usage
In general, searching streams will use a constant amount of memory for its internal buffer. The one requirement is that the internal buffer must be at least the size of the longest possible match. In most use cases, the default buffer size will be much larger than any individual match.
§Panics
This panics when replace_with.len() does not equal AhoCorasick::patterns_len.
§Errors
This returns an error when this Aho-Corasick searcher does not support
the default Input configuration. More specifically, this occurs only
when the Aho-Corasick searcher does not support unanchored searches,
since this stream searching routine always does an unanchored search.
This also returns an error if the searcher does not support stream
searches. Only searchers built with MatchKind::Standard semantics
support stream searches.
§Example: basic usage
use aho_corasick::AhoCorasick;
let patterns = &["fox", "brown", "quick"];
let haystack = "The quick brown fox.";
let replace_with = &["sloth", "grey", "slow"];
let ac = AhoCorasick::new(patterns).unwrap();
let mut result = vec![];
ac.try_stream_replace_all(
haystack.as_bytes(),
&mut result,
replace_with,
)?;
assert_eq!(b"The slow grey sloth.".to_vec(), result);
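Because the writer is not flushed by this routine, a caller that passes a buffered writer has to flush it. The following stdlib-only sketch shows one way to arrange that on the caller's side. `with_buffered_writer` is a hypothetical helper, and the closure stands in for the call that writes the replacements (for example a call to the stream replacement routine with `&mut wtr` as the writer).

```rust
use std::io::{self, BufWriter, Write};

/// Wraps `sink` in a `BufWriter`, runs `write_replacements` against it,
/// flushes explicitly, and hands the underlying writer back.
fn with_buffered_writer<W, F>(sink: W, write_replacements: F) -> io::Result<W>
where
    W: Write,
    F: FnOnce(&mut BufWriter<W>) -> io::Result<()>,
{
    let mut wtr = BufWriter::new(sink);
    write_replacements(&mut wtr)?;
    // The replacement routine does not flush, so do it here and
    // surface any I/O error instead of losing it on drop.
    wtr.flush()?;
    wtr.into_inner().map_err(|e| e.into_error())
}
```

Flushing explicitly matters because a `BufWriter` that is merely dropped discards any error from its final write.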
pub fn try_stream_replace_all_with<R, W, F>(
    &self,
    rdr: R,
    wtr: W,
    replace_with: F,
) -> Result<(), Error>
Search the given reader and replace all matches of this automaton
using the given closure. The result is written to the given
writer. Matches correspond to the same matches as reported by
AhoCorasick::try_find_iter.
The closure accepts three parameters: the match found, the text of the match and the writer with which to write the replaced text (if any).
After all matches are replaced, the writer is not flushed.
If there was a problem reading from the given reader or writing to the
given writer, then the corresponding io::Error is returned and all
replacement is stopped.
When searching a stream, an internal buffer is used. Therefore, callers should avoid providing a buffered reader, if possible. However, callers may want to provide a buffered writer.
Note that there is currently no infallible version of this routine.
§Memory usage
In general, searching streams will use a constant amount of memory for its internal buffer. The one requirement is that the internal buffer must be at least the size of the longest possible match. In most use cases, the default buffer size will be much larger than any individual match.
§Errors
This returns an error when this Aho-Corasick searcher does not support
the default Input configuration. More specifically, this occurs only
when the Aho-Corasick searcher does not support unanchored searches,
since this stream searching routine always does an unanchored search.
This also returns an error if the searcher does not support stream
searches. Only searchers built with MatchKind::Standard semantics
support stream searches.
§Example: basic usage
use std::io::Write;
use aho_corasick::AhoCorasick;
let patterns = &["fox", "brown", "quick"];
let haystack = "The quick brown fox.";
let ac = AhoCorasick::new(patterns).unwrap();
let mut result = vec![];
ac.try_stream_replace_all_with(
haystack.as_bytes(),
&mut result,
|mat, _, wtr| {
wtr.write_all(mat.pattern().as_usize().to_string().as_bytes())
},
)?;
assert_eq!(b"The 2 1 0.".to_vec(), result);
pub fn kind(&self) -> AhoCorasickKind
Returns the kind of the Aho-Corasick automaton used by this searcher.
Knowing the Aho-Corasick kind is principally useful for diagnostic
purposes. In particular, if no specific kind was given to
AhoCorasickBuilder::kind, then one is automatically chosen and
this routine will report which one.
Note that the heuristics used for choosing the AhoCorasickKind
may be changed in a semver compatible release.
§Examples
use aho_corasick::{AhoCorasick, AhoCorasickKind};
let ac = AhoCorasick::new(&["foo", "bar", "quux", "baz"]).unwrap();
// The specific Aho-Corasick kind chosen is not guaranteed!
assert_eq!(AhoCorasickKind::DFA, ac.kind());
pub fn start_kind(&self) -> StartKind
Returns the type of starting search configuration supported by this Aho-Corasick automaton.
§Examples
use aho_corasick::{AhoCorasick, StartKind};
let ac = AhoCorasick::new(&["foo", "bar", "quux", "baz"]).unwrap();
assert_eq!(StartKind::Unanchored, ac.start_kind());
pub fn match_kind(&self) -> MatchKind
Returns the match kind used by this automaton.
The match kind is important because it determines what kinds of
matches are returned. Also, some operations (such as overlapping
search and stream searching) are only supported when using the
MatchKind::Standard match kind.
§Examples
use aho_corasick::{AhoCorasick, MatchKind};
let ac = AhoCorasick::new(&["foo", "bar", "quux", "baz"]).unwrap();
assert_eq!(MatchKind::Standard, ac.match_kind());
pub fn min_pattern_len(&self) -> usize
Returns the length of the shortest pattern matched by this automaton.
§Examples
Basic usage:
use aho_corasick::AhoCorasick;
let ac = AhoCorasick::new(&["foo", "bar", "quux", "baz"]).unwrap();
assert_eq!(3, ac.min_pattern_len());
Note that an AhoCorasick automaton has a minimum length of 0 if
and only if it can match the empty string:
use aho_corasick::AhoCorasick;
let ac = AhoCorasick::new(&["foo", "", "quux", "baz"]).unwrap();
assert_eq!(0, ac.min_pattern_len());
pub fn max_pattern_len(&self) -> usize
Returns the length of the longest pattern matched by this automaton.
§Examples
Basic usage:
use aho_corasick::AhoCorasick;
let ac = AhoCorasick::new(&["foo", "bar", "quux", "baz"]).unwrap();
assert_eq!(4, ac.max_pattern_len());
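For intuition, the two length accessors report the bounds you would get by scanning the pattern list yourself. The sketch below is stdlib-only and uses a hypothetical helper, `pattern_len_bounds`. It assumes lengths are counted in bytes and makes no claim about how the automaton stores these values.

```rust
/// Returns (shortest, longest) pattern length in bytes, or `None` when
/// the pattern list is empty.
fn pattern_len_bounds(patterns: &[&str]) -> Option<(usize, usize)> {
    let min = patterns.iter().map(|p| p.len()).min()?;
    let max = patterns.iter().map(|p| p.len()).max()?;
    Some((min, max))
}
```

An empty pattern in the list drives the lower bound to 0, which corresponds to the case where the automaton can match the empty string.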
pub fn patterns_len(&self) -> usize
Return the total number of patterns matched by this automaton.
This includes patterns that may never participate in a match. For
example, if MatchKind::LeftmostFirst match semantics are used, and
the patterns Sam and Samwise were used to build the automaton (in
that order), then Samwise can never participate in a match because
Sam will always take priority.
§Examples
Basic usage:
use aho_corasick::AhoCorasick;
let ac = AhoCorasick::new(&["foo", "bar", "baz"]).unwrap();
assert_eq!(3, ac.patterns_len());
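The Sam and Samwise case can be reproduced with a naive stdlib-only model of leftmost-first selection. This is a sketch for illustration and not the crate's algorithm: `leftmost_first` is a hypothetical function that, at the earliest matching position, picks the pattern listed first.

```rust
/// Returns (pattern index, start, end) of the leftmost match, preferring
/// the pattern that appears earliest in `patterns` at that position.
fn leftmost_first(patterns: &[&str], haystack: &str) -> Option<(usize, usize, usize)> {
    for start in 0..haystack.len() {
        // Only consider positions that begin a UTF-8 character.
        if !haystack.is_char_boundary(start) {
            continue;
        }
        for (id, pat) in patterns.iter().enumerate() {
            if haystack[start..].starts_with(pat) {
                return Some((id, start, start + pat.len()));
            }
        }
    }
    None
}
```

With the order Sam, Samwise the second pattern is never selected, because every position where Samwise matches is also a position where Sam matches. Both patterns still count toward the total number of patterns.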
pub fn memory_usage(&self) -> usize
Returns the approximate total amount of heap used by this automaton, in units of bytes.
§Examples
This example shows the difference in heap usage between a few configurations:
use aho_corasick::{AhoCorasick, AhoCorasickKind, MatchKind};
let ac = AhoCorasick::builder()
.kind(None) // default
.build(&["foobar", "bruce", "triskaidekaphobia", "springsteen"])
.unwrap();
assert_eq!(5_632, ac.memory_usage());
let ac = AhoCorasick::builder()
.kind(None) // default
.ascii_case_insensitive(true)
.build(&["foobar", "bruce", "triskaidekaphobia", "springsteen"])
.unwrap();
assert_eq!(11_136, ac.memory_usage());
let ac = AhoCorasick::builder()
.kind(Some(AhoCorasickKind::NoncontiguousNFA))
.ascii_case_insensitive(true)
.build(&["foobar", "bruce", "triskaidekaphobia", "springsteen"])
.unwrap();
assert_eq!(10_879, ac.memory_usage());
let ac = AhoCorasick::builder()
.kind(Some(AhoCorasickKind::ContiguousNFA))
.ascii_case_insensitive(true)
.build(&["foobar", "bruce", "triskaidekaphobia", "springsteen"])
.unwrap();
assert_eq!(2_584, ac.memory_usage());
let ac = AhoCorasick::builder()
.kind(Some(AhoCorasickKind::DFA))
.ascii_case_insensitive(true)
.build(&["foobar", "bruce", "triskaidekaphobia", "springsteen"])
.unwrap();
// While this shows the DFA being the biggest here by a small margin,
// don't let the difference fool you. With such a small number of
// patterns, the difference is small, but a bigger number of patterns
// will reveal that the rate of growth of the DFA is far bigger than
// the NFAs above. For a large number of patterns, it is easy for the
// DFA to take an order of magnitude more heap space (or more!).
assert_eq!(11_136, ac.memory_usage());
Trait Implementations§
impl Deref for ALLOWED_MATCHER
type Target = AhoCorasick
fn deref(&self) -> &AhoCorasick
impl LazyStatic for ALLOWED_MATCHER
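The Deref and LazyStatic implementations are what let a static like this be used as if it were its target type. The following stdlib-only sketch shows roughly the shape of such a lazily initialized static. Everything in it is hypothetical: `ALLOWED_NAMES`, its contents and `is_allowed` are invented for illustration, the target is a `Vec` rather than an AhoCorasick automaton, and `OnceLock` stands in for whatever the generating macro uses internally.

```rust
use std::ops::Deref;
use std::sync::OnceLock;

/// Zero-sized handle type. The type and the static share a name, which
/// works because types and values live in separate namespaces.
#[allow(non_camel_case_types)]
struct ALLOWED_NAMES {
    _private: (),
}

static ALLOWED_NAMES: ALLOWED_NAMES = ALLOWED_NAMES { _private: () };

impl Deref for ALLOWED_NAMES {
    type Target = Vec<&'static str>;

    fn deref(&self) -> &Self::Target {
        // Built on first access, then shared by every later access.
        static CELL: OnceLock<Vec<&'static str>> = OnceLock::new();
        CELL.get_or_init(|| vec!["analytics.js", "widget.js"])
    }
}

/// Methods of the target type are reachable through auto-deref.
fn is_allowed(name: &str) -> bool {
    ALLOWED_NAMES.iter().any(|n| name.contains(*n))
}
```

This is why the methods listed on this page are documented as coming from Deref<Target = AhoCorasick>: calls on the static resolve to the target through `deref`.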
Auto Trait Implementations§
impl Freeze for ALLOWED_MATCHER
impl RefUnwindSafe for ALLOWED_MATCHER
impl Send for ALLOWED_MATCHER
impl Sync for ALLOWED_MATCHER
impl Unpin for ALLOWED_MATCHER
impl UnwindSafe for ALLOWED_MATCHER
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.