regex_automata::util

Module primitives

Lower level primitive types that are useful in a variety of circumstances.

§Overview

This list covers the principal types in this module and briefly describes when you might want to use them.

  • PatternID - A type that represents the identifier of a regex pattern. This is probably the most widely used type in this module (which is why it’s also re-exported in the crate root).
  • StateID - A type that represents the identifier of a finite automaton state. This is used for both NFAs and DFAs, with the notable exception of the hybrid NFA/DFA. (The hybrid NFA/DFA uses a special purpose “lazy” state identifier.)
  • SmallIndex - The internal representation of both a PatternID and a StateID. Its purpose is to serve as a type that can index memory without being as big as a usize on 64-bit targets. The main idea behind this type is that there are many things in regex engines that will, in practice, never overflow a 32-bit integer. (For example, the number of patterns in a regex or the number of states in an NFA.) Thus, a SmallIndex can be used to index memory without peppering “as” casts everywhere. Moreover, it forces callers to handle errors in the case where, somehow, the value would otherwise overflow either a 32-bit integer or a usize (e.g., on 16-bit targets).
  • NonMaxUsize - Represents a usize that cannot be usize::MAX. As a result, Option<NonMaxUsize> has the same size in memory as a usize. This is useful, for example, when representing the offsets of submatches, since it reduces memory usage by a factor of 2. It is a legal optimization since Rust guarantees that slices never have a length that exceeds isize::MAX.
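
The two ideas behind SmallIndex and NonMaxUsize can be illustrated with a minimal, standard-library-only sketch. The names SketchIndex and SketchNonMax are hypothetical stand-ins invented for this example; they are not the types defined in this module, and their internals are an assumption about one way to get the described behavior, not a description of the crate's implementation.

```rust
use std::convert::TryFrom;
use std::num::NonZeroUsize;

/// Hypothetical stand-in for a "small index": a 32-bit value used to
/// index memory. Not the crate's SmallIndex.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SketchIndex(u32);

impl SketchIndex {
    /// Fallible constructor: values that do not fit in 32 bits are
    /// rejected, so the caller has to handle the overflow case.
    fn new(value: usize) -> Option<SketchIndex> {
        u32::try_from(value).ok().map(SketchIndex)
    }

    /// The single place where the widening cast lives. This sketch
    /// assumes usize is at least 32 bits wide.
    fn as_usize(self) -> usize {
        self.0 as usize
    }
}

/// Hypothetical stand-in for a usize that can never be usize::MAX.
/// It stores `value + 1` in a NonZeroUsize, so the forbidden value
/// maps to zero and the zero bit pattern is free to encode `None`.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SketchNonMax(NonZeroUsize);

impl SketchNonMax {
    /// Returns None only for usize::MAX, which wraps around to 0.
    fn new(value: usize) -> Option<SketchNonMax> {
        NonZeroUsize::new(value.wrapping_add(1)).map(SketchNonMax)
    }

    /// Undoes the `+ 1` offset. The stored value is never 0, so the
    /// subtraction cannot underflow.
    fn get(self) -> usize {
        self.0.get() - 1
    }
}
```

With these definitions, SketchIndex::new(7) yields an index whose as_usize() is 7, while SketchNonMax::new(usize::MAX) yields None. Comparing std::mem::size_of::<Option<SketchNonMax>>() with std::mem::size_of::<Option<usize>>() shows the memory saving that the NonMaxUsize entry describes for optional offsets.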

Structs§