Crate icu_properties
Definitions of Unicode Properties and APIs for retrieving property data in an appropriate data structure.
This module is published as its own crate (icu_properties) and as part of the icu crate. See the latter for more details on the ICU4X project.
APIs that return a CodePointSetData exist for binary properties and certain enumerated properties. See the sets module for more details.
APIs that return a CodePointMapData exist for certain enumerated properties. See the maps module for more details.
§Examples
§Property data as CodePointSetDatas
use icu::properties::{maps, sets, GeneralCategory};
// A binary property as a `CodePointSetData`
assert!(sets::emoji().contains('🎃')); // U+1F383 JACK-O-LANTERN
assert!(!sets::emoji().contains('木')); // U+6728
// An individual enumerated property value as a `CodePointSetData`
let line_sep_data = maps::general_category()
    .get_set_for_value(GeneralCategory::LineSeparator);
let line_sep = line_sep_data.as_borrowed();
assert!(line_sep.contains32(0x2028));
assert!(!line_sep.contains32(0x2029));
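To make the relationship between the two data shapes more concrete, here is a minimal std-only sketch of deriving a set from a map for one enumerated value, which is the idea behind the get_set_for_value call above. Everything in it (ToyCategory, ToyMap, ToySet, toy_general_category) is hypothetical and invented for illustration; it is not the icu_properties API and makes no claim about how ICU4X stores its data.

```rust
// Toy model only: none of these types exist in icu_properties.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ToyCategory {
    LineSeparator,
    ParagraphSeparator,
    Other,
}

/// Maps code points to a value. Ranges are inclusive and non-overlapping;
/// any code point not covered by a range gets `default`.
struct ToyMap {
    ranges: Vec<(u32, u32, ToyCategory)>,
    default: ToyCategory,
}

/// A set of code points, stored as inclusive ranges.
struct ToySet {
    ranges: Vec<(u32, u32)>,
}

impl ToyMap {
    /// Look up the value for a single code point.
    fn get32(&self, cp: u32) -> ToyCategory {
        self.ranges
            .iter()
            .find(|&&(start, end, _)| start <= cp && cp <= end)
            .map(|&(_, _, value)| value)
            .unwrap_or(self.default)
    }

    /// Collect every explicitly listed range that carries `value` into a set.
    /// Simplification: code points that only have the default value are not included.
    fn set_for_value(&self, value: ToyCategory) -> ToySet {
        ToySet {
            ranges: self
                .ranges
                .iter()
                .filter(|r| r.2 == value)
                .map(|r| (r.0, r.1))
                .collect(),
        }
    }
}

impl ToySet {
    /// Membership test on a raw code point value.
    fn contains32(&self, cp: u32) -> bool {
        self.ranges
            .iter()
            .any(|&(start, end)| start <= cp && cp <= end)
    }
}

/// A two-entry stand-in for a General_Category map.
fn toy_general_category() -> ToyMap {
    ToyMap {
        ranges: vec![
            (0x2028, 0x2028, ToyCategory::LineSeparator),
            (0x2029, 0x2029, ToyCategory::ParagraphSeparator),
        ],
        default: ToyCategory::Other,
    }
}
```

With this toy data, toy_general_category().set_for_value(ToyCategory::LineSeparator) yields a set that contains U+2028 but not U+2029, mirroring the assertions in the example above.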
§Property data as CodePointMapDatas
use icu::properties::{maps, Script};
assert_eq!(maps::script().get('🎃'), Script::Common); // U+1F383 JACK-O-LANTERN
assert_eq!(maps::script().get('木'), Script::Han); // U+6728
Re-exports§
pub use PropertiesError as Error;
Modules§
- This module exposes tooling for running the Unicode Bidi algorithm using ICU4X data.
- Data and APIs for supporting specific Bidi properties data in an efficient structure.
- This module provides APIs for getting exemplar characters for a locale.
- The functions in this module return a CodePointMapData representing, for each code point in the entire range of code points, the property values for a particular Unicode property.
- Module for working with the names of property values.
- 🚧 [Unstable] Data provider struct definitions for this ICU4X component.
- Data and APIs for supporting both Script and Script_Extensions property values in an efficient structure.
- The functions in this module return a CodePointSetData containing the set of characters with a particular Unicode property.
Structs§
- Enumerated property Bidi_Class.
- Property Canonical_Combining_Class. See UAX #15: https://www.unicode.org/reports/tr15/.
- Enumerated property East_Asian_Width.
- Groupings of multiple General_Category property values.
- Enumerated property Grapheme_Cluster_Break.
- Enumerated property Hangul_Syllable_Type.
- Property Indic_Syllabic_Category. See UAX #44: https://www.unicode.org/reports/tr44/#Indic_Syllabic_Category.
- Enumerated property Joining_Type. See Section 9.2, Arabic Cursive Joining in The Unicode Standard for the summary of each property value.
- Enumerated property Line_Break.
- Enumerated property Script.
- Enumerated property Sentence_Break. See “Default Sentence Boundary Specification” in UAX #29 for the summary of each property value: https://www.unicode.org/reports/tr29/#Default_Sentence_Boundaries.
- Enumerated property Word_Break.
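One way to picture the "groupings of multiple General_Category property values" entry in the list above is as a bitmask with one bit per category value. The sketch below is an assumption-laden toy: ToyGc, ToyGcGroup, TOY_CASED_LETTER and the numeric discriminants are all made up for illustration and are not taken from the icu_properties API.

```rust
// Toy model only: names and discriminant values are hypothetical.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum ToyGc {
    UppercaseLetter = 1,
    LowercaseLetter = 2,
    DecimalNumber = 9,
}

/// A group of categories, stored as one bit per category value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyGcGroup(u32);

impl ToyGcGroup {
    /// A group holding exactly one category.
    const fn of(category: ToyGc) -> Self {
        ToyGcGroup(1 << (category as u32))
    }

    /// The group holding every category from either operand.
    const fn union(self, other: Self) -> Self {
        ToyGcGroup(self.0 | other.0)
    }

    /// Whether `category` belongs to this group.
    const fn contains(self, category: ToyGc) -> bool {
        (self.0 & (1 << (category as u32))) != 0
    }
}

/// Example group: both letter cases, but no numbers.
const TOY_CASED_LETTER: ToyGcGroup =
    ToyGcGroup::of(ToyGc::UppercaseLetter).union(ToyGcGroup::of(ToyGc::LowercaseLetter));
```

The appeal of this layout is that membership tests and unions are single bitwise operations and groups can be built in const contexts.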
Enums§
- Enumerated property General_Category.
- A list of error outcomes for various operations in this module.