unic_ucd_segment::word_break

Enum WordBreak

pub enum WordBreak {
    CR, LF, Newline, Extend, ZWJ, RegionalIndicator, Format, Katakana, HebrewLetter, ALetter, SingleQuote, DoubleQuote, MidNumLet, MidLetter, MidNum, Numeric, ExtendNumLet, EBase, EModifier, GlueAfterZwj, EBaseGAZ, Other,
}

Variants§

§

CR

U+000D CARRIAGE RETURN (CR)
§

LF

U+000A LINE FEED (LF)
§

Newline

U+000B LINE TABULATION
U+000C FORM FEED (FF)
U+0085 NEXT LINE (NEL)
U+2028 LINE SEPARATOR
U+2029 PARAGRAPH SEPARATOR
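The three line-ending variants above are defined purely by explicit code points, so they can be illustrated without any Unicode data tables. The following is a minimal sketch, not the crate's implementation; `LineEnding` and `line_ending` are hypothetical names introduced only for this example.

```rust
/// Hypothetical stand-in for the CR, LF and Newline variants described above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum LineEnding {
    Cr,
    Lf,
    Newline,
}

/// Classifies `ch` using only the code points listed for CR, LF and Newline.
/// Returns `None` for every other character.
fn line_ending(ch: char) -> Option<LineEnding> {
    match ch {
        '\u{000D}' => Some(LineEnding::Cr),
        '\u{000A}' => Some(LineEnding::Lf),
        // LINE TABULATION, FORM FEED, NEXT LINE, LINE SEPARATOR, PARAGRAPH SEPARATOR
        '\u{000B}' | '\u{000C}' | '\u{0085}' | '\u{2028}' | '\u{2029}' => {
            Some(LineEnding::Newline)
        }
        _ => None,
    }
}
```

With this sketch, `line_ending('\r')` yields `Some(LineEnding::Cr)` and `line_ending('a')` yields `None`.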
§

Extend

Grapheme_Extend = Yes, or
General_Category = Spacing_Mark
and not U+200D ZERO WIDTH JOINER (ZWJ)
§

ZWJ

U+200D ZERO WIDTH JOINER
§

RegionalIndicator

Regional_Indicator = Yes

This consists of the range:

U+1F1E6 REGIONAL INDICATOR SYMBOL LETTER A
..U+1F1FF REGIONAL INDICATOR SYMBOL LETTER Z
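Because the Regional_Indicator range is contiguous, membership reduces to a single range check. The sketch below shows that check, plus the mapping from an ASCII letter to its indicator symbol; the function and constant names are hypothetical and not part of this crate.

```rust
/// First and last code points of the range listed above.
const RI_FIRST: u32 = 0x1F1E6; // REGIONAL INDICATOR SYMBOL LETTER A
const RI_LAST: u32 = 0x1F1FF; // REGIONAL INDICATOR SYMBOL LETTER Z

/// Returns true if `ch` lies in U+1F1E6..=U+1F1FF.
fn is_regional_indicator(ch: char) -> bool {
    (RI_FIRST..=RI_LAST).contains(&(ch as u32))
}

/// Maps an ASCII letter (either case) to the matching regional indicator
/// symbol, or `None` if `letter` is not an ASCII letter.
fn regional_indicator_for(letter: char) -> Option<char> {
    if !letter.is_ascii_alphabetic() {
        return None;
    }
    let offset = letter.to_ascii_uppercase() as u32 - 'A' as u32;
    char::from_u32(RI_FIRST + offset)
}
```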
§

Format

General_Category = Format
and not U+200B ZERO WIDTH SPACE (ZWSP)
and not U+200C ZERO WIDTH NON-JOINER (ZWNJ)
and not U+200D ZERO WIDTH JOINER (ZWJ)
§

Katakana

Script = KATAKANA, or
any of the following:
U+3031 ( 〱 ) VERTICAL KANA REPEAT MARK
U+3032 ( 〲 ) VERTICAL KANA REPEAT WITH VOICED SOUND MARK
U+3033 ( 〳 ) VERTICAL KANA REPEAT MARK UPPER HALF
U+3034 ( 〴 ) VERTICAL KANA REPEAT WITH VOICED SOUND MARK UPPER HALF
U+3035 ( 〵 ) VERTICAL KANA REPEAT MARK LOWER HALF
U+309B ( ゛ ) KATAKANA-HIRAGANA VOICED SOUND MARK
U+309C ( ゜ ) KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
U+30A0 ( ゠ ) KATAKANA-HIRAGANA DOUBLE HYPHEN
U+30FC ( ー ) KATAKANA-HIRAGANA PROLONGED SOUND MARK
U+FF70 ( ｰ ) HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK
§

HebrewLetter

Script = Hebrew
and General_Category = Other_Letter
§

ALetter

Alphabetic = Yes, or
any of the following 36 characters:
U+02C2 ( ˂ ) MODIFIER LETTER LEFT ARROWHEAD
..U+02C5 ( ˅ ) MODIFIER LETTER DOWN ARROWHEAD
U+02D2 ( ˒ ) MODIFIER LETTER CENTRED RIGHT HALF RING
..U+02D7 ( ˗ ) MODIFIER LETTER MINUS SIGN
U+02DE ( ˞ ) MODIFIER LETTER RHOTIC HOOK
U+02DF ( ˟ ) MODIFIER LETTER CROSS ACCENT
U+02ED ( ˭ ) MODIFIER LETTER UNASPIRATED
U+02EF ( ˯ ) MODIFIER LETTER LOW DOWN ARROWHEAD
..U+02FF ( ˿ ) MODIFIER LETTER LOW LEFT ARROW
U+05F3 ( ׳ ) HEBREW PUNCTUATION GERESH
U+A720 ( ꜠ ) MODIFIER LETTER STRESS AND HIGH TONE
U+A721 ( ꜡ ) MODIFIER LETTER STRESS AND LOW TONE
U+A789 ( ꞉ ) MODIFIER LETTER COLON
U+A78A ( ꞊ ) MODIFIER LETTER SHORT EQUALS SIGN
U+AB5B ( ꭛ ) MODIFIER BREVE WITH INVERTED BREVE
and Ideographic = No
and Word_Break ≠ Katakana
and Line_Break ≠ Complex_Context (SA)
and Script ≠ Hiragana
and Word_Break ≠ Extend
and Word_Break ≠ Hebrew_Letter
§

SingleQuote

U+0027 ( ' ) APOSTROPHE
§

DoubleQuote

U+0022 ( " ) QUOTATION MARK
§

MidNumLet

U+002E ( . ) FULL STOP
U+2018 ( ‘ ) LEFT SINGLE QUOTATION MARK
U+2019 ( ’ ) RIGHT SINGLE QUOTATION MARK
U+2024 ( ․ ) ONE DOT LEADER
U+FE52 ( ﹒ ) SMALL FULL STOP
U+FF07 ( ＇ ) FULLWIDTH APOSTROPHE
U+FF0E ( ． ) FULLWIDTH FULL STOP
§

MidLetter

U+00B7 ( · ) MIDDLE DOT
U+0387 ( · ) GREEK ANO TELEIA
U+05F4 ( ״ ) HEBREW PUNCTUATION GERSHAYIM
U+2027 ( ‧ ) HYPHENATION POINT
U+003A ( : ) COLON (used in Swedish)
U+FE13 ( ︓ ) PRESENTATION FORM FOR VERTICAL COLON
U+FE55 ( ﹕ ) SMALL COLON
U+FF1A ( ： ) FULLWIDTH COLON
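SingleQuote, DoubleQuote, MidNumLet and MidLetter are all given as explicit code point lists, so the four definitions above can be transcribed directly into a `match`. This is only an illustrative sketch built from the lists on this page; `QuoteOrMid` and `quote_or_mid` are hypothetical names, and the crate itself presumably uses generated lookup tables instead.

```rust
/// Hypothetical stand-in covering the four punctuation variants listed above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum QuoteOrMid {
    SingleQuote,
    DoubleQuote,
    MidNumLet,
    MidLetter,
}

/// Classifies `ch` using only the code points listed on this page.
/// Returns `None` for characters outside those four lists.
fn quote_or_mid(ch: char) -> Option<QuoteOrMid> {
    match ch {
        '\u{0027}' => Some(QuoteOrMid::SingleQuote),
        '\u{0022}' => Some(QuoteOrMid::DoubleQuote),
        '\u{002E}' | '\u{2018}' | '\u{2019}' | '\u{2024}' | '\u{FE52}' | '\u{FF07}'
        | '\u{FF0E}' => Some(QuoteOrMid::MidNumLet),
        '\u{00B7}' | '\u{0387}' | '\u{05F4}' | '\u{2027}' | '\u{003A}' | '\u{FE13}'
        | '\u{FE55}' | '\u{FF1A}' => Some(QuoteOrMid::MidLetter),
        _ => None,
    }
}
```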
§

MidNum

Line_Break = Infix_Numeric, or
any of the following:
U+066C ( ٬ ) ARABIC THOUSANDS SEPARATOR
U+FE50 ( ﹐ ) SMALL COMMA
U+FE54 ( ﹔ ) SMALL SEMICOLON
U+FF0C ( ， ) FULLWIDTH COMMA
U+FF1B ( ； ) FULLWIDTH SEMICOLON
and not U+003A ( : ) COLON
and not U+FE13 ( ︓ ) PRESENTATION FORM FOR VERTICAL COLON
and not U+002E ( . ) FULL STOP
§

Numeric

Line_Break = Numeric
and not U+066C ( ٬ ) ARABIC THOUSANDS SEPARATOR
§

ExtendNumLet

General_Category = Connector_Punctuation, or
U+202F NARROW NO-BREAK SPACE (NNBSP)
§

EBase

Emoji characters listed as Emoji_Modifier_Base=Yes in emoji-data.txt, which do not occur after ZWJ in emoji-zwj-sequences.txt.

See https://www.unicode.org/reports/tr51/.

§

EModifier

Emoji characters listed as Emoji_Modifier=Yes in emoji-data.txt.

See https://www.unicode.org/reports/tr51/.

§

GlueAfterZwj

Emoji characters that do not break from a previous ZWJ in a defined emoji ZWJ sequence, and are not listed as Emoji_Modifier_Base=Yes in emoji-data.txt.

See https://www.unicode.org/reports/tr51/.

§

EBaseGAZ

Emoji characters listed as Emoji_Modifier_Base=Yes in emoji-data.txt, and which also occur after ZWJ in emoji-zwj-sequences.txt.

See https://www.unicode.org/reports/tr51/.

§

Other

All other characters

Implementations§

Source§

impl WordBreak

Source

pub fn of(ch: char) -> WordBreak

Returns the Word_Break property value of the given character.

Trait Implementations§

Source§

impl CharProperty for WordBreak

Source§

fn prop_abbr_name() -> &'static str

The abbreviated name of the property.
Source§

fn prop_long_name() -> &'static str

The long name of the property.
Source§

fn prop_human_name() -> &'static str

The human-readable name of the property.
Source§

impl Clone for WordBreak

Source§

fn clone(&self) -> WordBreak

Returns a copy of the value.
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
Source§

impl Debug for WordBreak

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.
Source§

impl Default for WordBreak

Source§

fn default() -> Self

Returns the “default value” for a type.
Source§

impl Display for WordBreak

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.
Source§

impl EnumeratedCharProperty for WordBreak

Source§

fn all_values() -> &'static [WordBreak]

Exhaustive list of all property values.
Source§

fn abbr_name(&self) -> &'static str

The abbreviated name of the property value.
Source§

fn long_name(&self) -> &'static str

The long name of the property value.
Source§

fn human_name(&self) -> &'static str

The human-readable name of the property value.
Source§

impl FromStr for WordBreak

Source§

type Err = ()

The associated error which can be returned from parsing.
Source§

fn from_str(s: &str) -> Result<Self, Self::Err>

Parses a string s to return a value of this type.
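The `Err = ()` associated type means a failed parse carries no diagnostic detail. The sketch below shows that shape on a hypothetical four-variant stand-in, `MiniBreak`. Matching exactly on the UCD short and long value names (`NL`/`Newline`, `XX`/`Other`) is an assumption made for this example; this page does not state which spellings or letter cases the crate's own `from_str` accepts.

```rust
use std::str::FromStr;

/// Hypothetical stand-in with a handful of variants; not the crate's enum.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum MiniBreak {
    Cr,
    Lf,
    Newline,
    Other,
}

impl FromStr for MiniBreak {
    // Mirrors the unit error type shown above: failure carries no detail.
    type Err = ();

    /// Assumption: exact, case-sensitive match on short or long value names.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "CR" => Ok(MiniBreak::Cr),
            "LF" => Ok(MiniBreak::Lf),
            "NL" | "Newline" => Ok(MiniBreak::Newline),
            "XX" | "Other" => Ok(MiniBreak::Other),
            _ => Err(()),
        }
    }
}
```

Callers that need a descriptive error can map the unit value themselves, for example with `.map_err(|()| "unknown Word_Break value")`.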
Source§

impl Hash for WordBreak

Source§

fn hash<__H: Hasher>(&self, state: &mut __H)

Feeds this value into the given Hasher.
1.3.0 · Source§

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher.
Source§

impl PartialEq for WordBreak

Source§

fn eq(&self, other: &WordBreak) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
Source§

impl TotalCharProperty for WordBreak

Source§

fn of(ch: char) -> Self

The property value for the character.
Source§

impl Copy for WordBreak

Source§

impl Eq for WordBreak

Source§

impl StructuralPartialEq for WordBreak

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dst: *mut T)

🔬 This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> PartialCharProperty for T

Source§

fn of(ch: char) -> Option<T>

The property value for the character, or None.
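The relationship between the total `of(ch) -> Self` and the partial `of(ch) -> Option<T>` can be sketched with two local traits and a blanket impl. All names here (`TotalProp`, `PartialProp`, `MiniWordBreak`) are hypothetical stand-ins, and the `T: TotalProp` bound on the blanket impl is an assumption about how such an impl would typically be written, since no bound is shown on this page.

```rust
/// Hypothetical stand-in for a total property: every char has a value.
trait TotalProp: Sized {
    fn of(ch: char) -> Self;
}

/// Hypothetical stand-in for a partial property: some chars have no value.
trait PartialProp: Sized {
    fn of(ch: char) -> Option<Self>;
}

/// Any total property is trivially a partial one that never returns `None`.
impl<T: TotalProp> PartialProp for T {
    fn of(ch: char) -> Option<Self> {
        Some(<T as TotalProp>::of(ch))
    }
}

/// Tiny example property with a catch-all variant, like `Other` above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum MiniWordBreak {
    Cr,
    Lf,
    Other,
}

impl TotalProp for MiniWordBreak {
    fn of(ch: char) -> Self {
        match ch {
            '\u{000D}' => MiniWordBreak::Cr,
            '\u{000A}' => MiniWordBreak::Lf,
            _ => MiniWordBreak::Other,
        }
    }
}
```

In this sketch both traits define `of`, so a call must be disambiguated with fully qualified syntax such as `<MiniWordBreak as PartialProp>::of('x')`. A type that also has an inherent `of`, as `WordBreak` does, avoids this for plain `Type::of(ch)` calls because inherent associated functions take precedence over trait ones.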
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.