Crate lexical_core
Fast lexical conversion routines for a no_std
environment.
lexical-core is a low-level API for number-to-string and string-to-number conversions, without requiring a system allocator. If you would like to use a high-level API that writes to and parses from String and &str, respectively, please look at lexical instead.
Despite its low-level API and focus on performance, lexical-core strives to be simple yet configurable: although it supports nearly every float and integer format available, it exports only 4 write functions and 4 parse functions.
lexical-core is well-tested, has been downloaded more than 5 million times, and currently has no known correctness errors. lexical-core prioritizes performance above all else, and aims to be competitive with or faster than any other float or integer parser and writer.
In addition, despite having a large number of features, extensive configurability, and a focus on performance, we also strive for fast compile times. Recent versions also add support for smaller binary sizes, making the crate well suited to embedded or web environments, where executable bloat can be much more detrimental than performance.
§Getting Started
// String to number using Rust slices.
// The argument is the byte string parsed.
let f: f32 = lexical_core::parse(b"3.5").unwrap(); // 3.5
let i: i32 = lexical_core::parse(b"15").unwrap(); // 15
// All lexical_core parsers are checked: they validate that the
// input data is entirely correct, and stop parsing when invalid data
// is found, or upon numerical overflow.
let r = lexical_core::parse::<u8>(b"256"); // Err(ErrorCode::Overflow.into())
let r = lexical_core::parse::<u8>(b"1a5"); // Err(ErrorCode::InvalidDigit.into())
// In order to extract and parse a number from a substring of the input
// data, use `parse_partial`. These functions return the parsed value and
// the number of bytes processed, allowing you to extract and parse the
// number in a single pass.
let r = lexical_core::parse_partial::<i8>(b"3a5"); // Ok((3, 1))
// If an insufficiently long buffer is passed, the serializer will panic.
// PANICS
let mut buf = [b'0'; 1];
//let slc = lexical_core::write::<i64>(15, &mut buf);
// In order to guarantee the buffer is long enough, always ensure there
// are at least `T::FORMATTED_SIZE` bytes, which requires the
// `lexical_core::FormattedSize` trait to be in scope.
use lexical_core::FormattedSize;
let mut buf = [b'0'; f64::FORMATTED_SIZE];
let slc = lexical_core::write::<f64>(15.1, &mut buf);
assert_eq!(slc, b"15.1");
// When the `radix` feature is enabled, for decimal floats, using
// `T::FORMATTED_SIZE` may significantly overestimate the space
// required to format the number. Therefore, the
// `T::FORMATTED_SIZE_DECIMAL` constants allow you to get a much
// tighter bound on the space required.
let mut buf = [b'0'; f64::FORMATTED_SIZE_DECIMAL];
let slc = lexical_core::write::<f64>(15.1, &mut buf);
assert_eq!(slc, b"15.1");
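The difference between a complete and a partial checked parse can be illustrated without the crate. The following is a minimal std-only sketch, not lexical-core's implementation: `SketchError`, `parse_partial_u8`, and `parse_complete_u8` are hypothetical names introduced here, and the error variants only loosely mirror the overflow and invalid-digit cases mentioned above.

```rust
/// Hypothetical error type for this sketch (not lexical-core's `Error`).
#[derive(Debug, PartialEq)]
enum SketchError {
    /// No digits were found at the start of the input.
    Empty,
    /// The value does not fit in a `u8`.
    Overflow,
    /// A non-digit byte was found at this index (complete parse only).
    InvalidDigit(usize),
}

/// Parse the leading ASCII decimal digits of `bytes` as a `u8`,
/// returning the value and the number of bytes consumed.
fn parse_partial_u8(bytes: &[u8]) -> Result<(u8, usize), SketchError> {
    let mut value: u8 = 0;
    let mut count = 0;
    for &b in bytes {
        if !b.is_ascii_digit() {
            break; // stop at the first byte that is not a digit
        }
        // Checked arithmetic reports overflow instead of wrapping.
        value = value
            .checked_mul(10)
            .and_then(|v| v.checked_add(b - b'0'))
            .ok_or(SketchError::Overflow)?;
        count += 1;
    }
    if count == 0 {
        return Err(SketchError::Empty);
    }
    Ok((value, count))
}

/// A complete parse is a partial parse that must consume every byte.
fn parse_complete_u8(bytes: &[u8]) -> Result<u8, SketchError> {
    let (value, count) = parse_partial_u8(bytes)?;
    if count == bytes.len() {
        Ok(value)
    } else {
        Err(SketchError::InvalidDigit(count))
    }
}
```

With these helpers, `parse_partial_u8(b"3a5")` yields `Ok((3, 1))`, while `parse_complete_u8(b"1a5")` and `parse_complete_u8(b"256")` fail with an invalid-digit and an overflow error respectively.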
§Conversion API
Write
From String
§Features
In accordance with the Rust ethos, all features are additive: the crate may be built with --all-features without issue. The following features are enabled by default:
std
write-integers
write-floats
parse-integers
parse-floats
A complete description of supported features includes:
§std
Enable use of the standard library. Currently, the standard library is not used for any functionality, and may be disabled without any change in functionality on stable.
§write-integers
Enable support for writing integers to string.
§write-floats
Enable support for writing floating-point numbers to string.
§parse-integers
Enable support for parsing integers from string.
§parse-floats
Enable support for parsing floating-point numbers from string.
§format
Adds support for the entire format API (using NumberFormatBuilder). This allows extensive configurability for parsing and writing numbers in custom formats, with different valid syntax requirements.
For example, in JSON, the following floats are valid or invalid:
-1 // valid
+1 // invalid
1 // valid
1. // invalid
.1 // invalid
0.1 // valid
nan // invalid
inf // invalid
Infinity // invalid
All of the finite numbers are valid in Rust, and Rust provides constants for non-finite floats. In order to parse standard-conforming JSON floats using lexical, you may use the following approach:
use lexical_core::{format, parse_with_options, ParseFloatOptions, Result};

fn parse_json_float<Bytes: AsRef<[u8]>>(bytes: Bytes) -> Result<f64> {
    let options = ParseFloatOptions::new();
    parse_with_options::<_, { format::JSON }>(bytes.as_ref(), &options)
}
See the Number Format section below for more information.
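The valid and invalid JSON floats listed above follow from the JSON number grammar (an optional minus sign, an integer part without leading zeros, an optional fraction with at least one digit, and an optional exponent). As a rough illustration of those syntax rules, here is a std-only sketch that only validates syntax and does not parse a value. `is_json_number` and `skip_digits` are hypothetical helpers written for this example; they are independent of how lexical-core's format::JSON is implemented.

```rust
/// Advance `i` past any run of ASCII digits and return the new index.
fn skip_digits(bytes: &[u8], mut i: usize) -> usize {
    while i < bytes.len() && bytes[i].is_ascii_digit() {
        i += 1;
    }
    i
}

/// Hypothetical helper: returns true if `bytes` is a syntactically
/// valid JSON number. Special values such as `nan` are rejected.
fn is_json_number(bytes: &[u8]) -> bool {
    let mut i = 0;
    // Optional minus sign; a leading plus sign is not allowed.
    if bytes.get(i).copied() == Some(b'-') {
        i += 1;
    }
    // Integer part: a lone `0`, or a non-zero digit followed by digits.
    match bytes.get(i).copied() {
        Some(b'0') => i += 1,
        Some(b'1'..=b'9') => i = skip_digits(bytes, i),
        _ => return false,
    }
    // Optional fraction: a `.` must be followed by at least one digit.
    if bytes.get(i).copied() == Some(b'.') {
        let start = i + 1;
        i = skip_digits(bytes, start);
        if i == start {
            return false;
        }
    }
    // Optional exponent: `e` or `E`, optional sign, at least one digit.
    if matches!(bytes.get(i).copied(), Some(b'e') | Some(b'E')) {
        i += 1;
        if matches!(bytes.get(i).copied(), Some(b'+') | Some(b'-')) {
            i += 1;
        }
        let start = i;
        i = skip_digits(bytes, start);
        if i == start {
            return false;
        }
    }
    // The whole input must have been consumed.
    i == bytes.len()
}
```

A format constant expresses the same kind of rules as compile-time flags rather than as hand-written control flow, which is what lets one parser serve many syntaxes.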
§power-of-two
Enable doing numeric conversions to and from strings with power-of-two radixes. This avoids most of the overhead and binary bloat of the radix feature, while enabling support for the most commonly-used radixes.
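One general reason power-of-two radixes tend to be cheap for integers is that each digit corresponds to a fixed group of bits, so digits can be extracted with a mask and a shift instead of division. The sketch below illustrates that idea using only the standard library; `write_pow2_radix` is a hypothetical helper, and this is not a description of how lexical-core implements the feature.

```rust
/// Hypothetical helper: write `value` in radix `2^bits_per_digit`
/// (radix 2, 4, 8, 16, or 32) into the end of `buf`, returning the
/// slice of digits that was written.
fn write_pow2_radix(mut value: u32, bits_per_digit: u32, buf: &mut [u8; 32]) -> &[u8] {
    const DIGITS: &[u8; 32] = b"0123456789ABCDEFGHIJKLMNOPQRSTUV";
    assert!((1..=5).contains(&bits_per_digit));
    // Each digit is the low `bits_per_digit` bits of the remaining value.
    let mask = (1u32 << bits_per_digit) - 1;
    let mut index = buf.len();
    loop {
        index -= 1;
        buf[index] = DIGITS[(value & mask) as usize];
        value >>= bits_per_digit;
        if value == 0 {
            break;
        }
    }
    // A `u32` needs at most 32 digits (radix 2), so `buf` never underflows.
    &buf[index..]
}
```

For example, writing 255 with 4 bits per digit produces the bytes `FF`, and writing 5 with 1 bit per digit produces `101`.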
§radix
Enable doing numeric conversions to and from strings for all radixes. This requires substantially more static storage than power-of-two, and increases compile times by a fair amount, but can be quite useful for esoteric programming languages which use duodecimal floats, for example.
§compact
Reduce the generated code size at the cost of performance. This minimizes the number of static tables, inlining, and generics used, drastically reducing the size of the generated binaries.
§safe
This replaces most unchecked indexing, required in cases where the compiler cannot elide the check, with checked indexing. However, it does not fully replace all unsafe behavior with safe behavior. To minimize the risk of UB and out-of-bounds reads/writes, extensive edge-case tests, property-based tests, and fuzzing are done with the safe feature both enabled and disabled, with the tests verified by Miri and Valgrind.
§Configuration API
Lexical provides two main levels of configuration:
- The NumberFormatBuilder, creating a packed struct with custom formatting options.
- The Options API.
§Number Format
The number format class provides numerous flags to specify number parsing or writing. When the power-of-two feature is enabled, additional flags are added:
- The radix for the significant digits (default 10).
- The radix for the exponent base (default 10).
- The radix for the exponent digits (default 10).
When the format feature is enabled, numerous other syntax and digit separator flags are enabled, including:
- A digit separator character, to group digits for increased legibility.
- Whether leading, trailing, internal, and consecutive digit separators are allowed.
- Toggling required float components, such as digits before the decimal point.
- Toggling whether special floats are allowed or are case-sensitive.
Many pre-defined constants therefore exist to simplify common use-cases, including:
- JSON, XML, TOML, YAML, SQLite, and many more.
- Rust, Python, C#, FORTRAN, COBOL literals and strings, and many more.
§Options API
The Options API provides high-level options to specify number parsing or writing, options not intrinsically tied to a number format. For example, the Options API provides:
- The exponent character (default b'e', or b'^').
- The decimal point character (default b'.').
- Custom NaN, Infinity string representations.
- Whether to trim the fraction component from integral floats.
- The exponent break point for scientific notation.
- The maximum and minimum number of significant digits to write.
- The rounding mode when truncating significant digits while writing.
The available options are:
In addition, pre-defined constants for each category of options may be found in their respective modules.
§Example
An example of creating your own options to parse European-style numbers (which use commas as decimal points, and periods as digit separators) is as follows:
// This creates a format to parse a European-style float number.
// The decimal point is a comma, and the digit separators (optional)
// are periods.
const EUROPEAN: u128 = lexical_core::NumberFormatBuilder::new()
    .digit_separator(b'.')
    .build();
let options = lexical_core::ParseFloatOptions::builder()
    .decimal_point(b',')
    .build()
    .unwrap();
assert_eq!(
    lexical_core::parse_with_options::<f32, EUROPEAN>(b"300,10", &options),
    Ok(300.10)
);
// Another example, using a pre-defined constant for JSON.
const JSON: u128 = lexical_core::format::JSON;
let options = lexical_core::ParseFloatOptions::new();
assert_eq!(
    lexical_core::parse_with_options::<f32, JSON>(b"0e1", &options),
    Ok(0.0)
);
assert_eq!(
    lexical_core::parse_with_options::<f32, JSON>(b"1E+2", &options),
    Ok(100.0)
);
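To make the intent of the European-style example concrete, the same interpretation can be approximated with the standard library alone by normalizing the text before parsing. This is only a sketch under stated assumptions: `parse_european_f32` is a hypothetical helper, it allocates a temporary String, and it is far more lenient about separator placement than a real number format would be.

```rust
/// Hypothetical helper: interpret `.` as a digit separator and `,` as
/// the decimal point, then defer to the standard library's parser.
fn parse_european_f32(text: &str) -> Option<f32> {
    let mut normalized = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            // Digit separators carry no value, so they are dropped.
            '.' => {}
            // The European decimal comma becomes a decimal point.
            ',' => normalized.push('.'),
            _ => normalized.push(ch),
        }
    }
    // Returns `None` if the normalized text is not a valid float,
    // for example when it contains more than one decimal comma.
    normalized.parse::<f32>().ok()
}
```

Here `parse_european_f32("300,10")` gives `Some(300.10)` and `parse_european_f32("1.234,5")` gives `Some(1234.5)`. The normalization pass and the allocation are the kind of overhead that configuring the decimal point and digit separator directly in the parser is meant to avoid.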
§Algorithms
§Benchmarks
A comprehensive analysis of lexical commits and their performance can be found in benchmarks.
§Design
§Version Support
The minimum supported Rust version is 1.63.0, which is required for const generic support. Older versions of lexical support older Rust versions.
§Safety
There is no non-trivial unsafe behavior in lexical itself. However, any incorrect safety invariants in our parsers and writers (lexical-parse-float, lexical-parse-integer, lexical-write-float, and lexical-write-integer) could cause those safety invariants to be broken.
Modules§
- Public API for the number format packed struct.
- Configuration options for parsing floats.
- Configuration options for parsing integers.
- Configuration options for writing floats.
- Configuration options for writing integers.
Structs§
- Build number format from specifications.
- Options to customize parsing floats.
- Builder for Options.
- Immutable options to customize parsing integers.
- Builder for Options.
- Options to customize writing floats.
- Builder for Options.
- Immutable options to customize writing integers.
- Builder for Options.
Enums§
- Error code during parsing, indicating failure type.
Constants§
- Maximum number of bytes required to serialize any number to string.
Traits§
- The size, in bytes, of formatted values.
- Trait for numerical types that can be parsed from bytes.
- Trait for numerical types that can be parsed from bytes with custom options.
- Shared trait for all parser options.
- Trait for numerical types that can be serialized to bytes.
- Trait for numerical types that can be serialized to bytes with custom options.
- Shared trait for all writer options.
Functions§
- Get the error type from the format packed struct.
- Determine if the format packed struct is valid.
- Parse complete number from string.
- Parse partial number from string.
- Parse partial number from string with custom parsing options.
- Parse complete number from string with custom parsing options.
- Write number to string.
- Write number to string with custom options.
Type Aliases§
- A specialized Result type for lexical operations.