unicode_bidi::utf16

Struct BidiInfo

pub struct BidiInfo<'text> {
    pub text: &'text [u16],
    pub original_classes: Vec<BidiClass>,
    pub levels: Vec<Level>,
    pub paragraphs: Vec<ParagraphInfo>,
}

Bidi information of the text (UTF-16 version).

The original_classes and levels vectors are indexed by code unit offsets into the text. If a character is multiple code units wide, then its class and level will appear multiple times in these vectors.
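The repetition described above can be illustrated with the standard library alone. This is a minimal sketch, not the crate's code: `per_code_unit` is a hypothetical helper, and the labels stand in for `BidiClass` or `Level` values.

```rust
/// Expands one label per character into one label per UTF-16 code unit.
///
/// Hypothetical helper for illustration; `labels` must hold exactly one
/// entry per `char` of `text`.
fn per_code_unit<T: Clone>(text: &str, labels: &[T]) -> Vec<T> {
    assert_eq!(text.chars().count(), labels.len());
    let mut out = Vec::with_capacity(text.encode_utf16().count());
    for (ch, label) in text.chars().zip(labels) {
        // A character outside the Basic Multilingual Plane takes two code
        // units (a surrogate pair), so its label is pushed twice.
        for _ in 0..ch.len_utf16() {
            out.push(label.clone());
        }
    }
    out
}
```

For the text `"a\u{1F600}b"` with labels `[0, 1, 2]`, the result has four entries, `[0, 1, 1, 2]`, because U+1F600 occupies two code units.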

Fields§

§text: &'text [u16]

The text

§original_classes: Vec<BidiClass>

The BidiClass of the character at each code unit in the text.

§levels: Vec<Level>

The directional embedding level of each code unit in the text.

§paragraphs: Vec<ParagraphInfo>

The boundaries and paragraph embedding level of each paragraph within the text.

TODO: Use SmallVec or similar to avoid overhead when there are only one or two paragraphs? Or just don’t include the first paragraph, which always starts at 0?

Implementations§

impl<'text> BidiInfo<'text>

pub fn new(text: &[u16], default_para_level: Option<Level>) -> BidiInfo<'_>

Split the text into paragraphs and determine the bidi embedding levels for each paragraph.

The hardcoded-data Cargo feature (enabled by default) must be enabled to use this.

TODO: In early steps, check for special cases that allow later steps to be skipped, such as text that is entirely LTR. See the nsBidi class from Gecko for comparison.

TODO: Support auto-RTL base direction

pub fn new_with_data_source<'a, D: BidiDataSource>( data_source: &D, text: &'a [u16], default_para_level: Option<Level>, ) -> BidiInfo<'a>

Split the text into paragraphs and determine the bidi embedding levels for each paragraph, with a custom BidiDataSource for Bidi data. If you just wish to use the hardcoded Bidi data, please use BidiInfo::new() instead (enabled with the default hardcoded-data Cargo feature).

TODO: In early steps, check for special cases that allow later steps to be skipped, such as text that is entirely LTR. See the nsBidi class from Gecko for comparison.

TODO: Support auto-RTL base direction

pub fn reordered_levels( &self, para: &ParagraphInfo, line: Range<usize>, ) -> Vec<Level>

Produce the levels for this paragraph as needed for reordering, one level per code unit in the paragraph. The returned vector includes code units that are not included in the line, but will not adjust them.

This runs Rule L1; you can run Rule L2 by calling Self::reorder_visual(). If doing so, you may prefer to use Self::reordered_levels_per_char() instead to avoid working with code-unit indices.

For an all-in-one reordering solution, consider using Self::reorder_visual().

pub fn reordered_levels_per_char( &self, para: &ParagraphInfo, line: Range<usize>, ) -> Vec<Level>

Produce the levels for this paragraph as needed for reordering, one level per character in the paragraph. The returned vector includes characters that are not included in the line, but will not adjust them.

This runs Rule L1; you can run Rule L2 by calling Self::reorder_visual(). Because this method returns one level per character, the result can be passed to Self::reorder_visual() without the code units of a character being reordered.

For an all-in-one reordering solution, consider using Self::reorder_visual().
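The relationship between per-code-unit and per-character levels can be sketched with the standard library. This is an illustration under stated assumptions, not the crate's implementation: `levels_per_char` is a hypothetical helper, and plain `u8` numbers stand in for `Level` values.

```rust
use std::char::decode_utf16;

/// Keeps one level per character from a vector that holds one level per
/// UTF-16 code unit. Hypothetical helper; `u8` stands in for `Level`.
fn levels_per_char(text: &[u16], levels: &[u8]) -> Vec<u8> {
    assert_eq!(text.len(), levels.len());
    let mut out = Vec::new();
    let mut offset = 0; // code-unit offset of the current character
    for decoded in decode_utf16(text.iter().copied()) {
        // An unpaired surrogate decodes to an error and spans one code unit.
        let width = decoded.map_or(1, |ch| ch.len_utf16());
        // Every code unit of a character carries the same level, so the
        // first one is representative.
        out.push(levels[offset]);
        offset += width;
    }
    out
}
```

For the UTF-16 encoding of `"a\u{1F600}b"` with code-unit levels `[0, 1, 1, 0]`, this yields the three per-character levels `[0, 1, 0]`.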

pub fn reorder_line( &self, para: &ParagraphInfo, line: Range<usize>, ) -> Cow<'text, [u16]>

Re-order a line based on resolved levels and return the line in display order.

This does not apply Rule L3 or Rule L4 around combining characters or mirroring.

pub fn reorder_visual(levels: &[Level]) -> Vec<usize>

Reorders pre-calculated levels of a sequence of characters.

NOTE: This is a convenience method that does not use a Paragraph object. It is intended to be used when an application has determined the levels of the objects (character sequences) and just needs to have them reordered.

The resulting index map satisfies index_map[visual_index] == logical_index.

This only runs Rule L2 as it does not have information about the actual text.
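Rule L2 of the Unicode Bidirectional Algorithm reverses, from the highest level down to the lowest odd level, every contiguous sequence of positions at that level or higher. The following is a minimal sketch of that rule on plain `u8` level numbers; `reorder_visual_sketch` is a hypothetical name and this is not necessarily how the crate implements it.

```rust
/// Rule L2 on raw level numbers: returns a map where
/// `map[visual_index] == logical_index`.
/// Hypothetical sketch; `u8` stands in for `Level`.
fn reorder_visual_sketch(levels: &[u8]) -> Vec<usize> {
    let mut map: Vec<usize> = (0..levels.len()).collect();
    let max = match levels.iter().copied().max() {
        Some(max) => max,
        None => return map, // empty input
    };
    // Lowest odd level present; if there is none, nothing is reversed.
    let min_odd = match levels.iter().copied().filter(|&l| l % 2 == 1).min() {
        Some(min_odd) => min_odd,
        None => return map,
    };
    // From the highest level down to the lowest odd level, reverse every
    // maximal run of positions whose level is at least the current one.
    for current in (min_odd..=max).rev() {
        let mut start = 0;
        while start < map.len() {
            if levels[map[start]] < current {
                start += 1;
                continue;
            }
            let mut end = start;
            while end < map.len() && levels[map[end]] >= current {
                end += 1;
            }
            map[start..end].reverse();
            start = end;
        }
    }
    map
}
```

For the levels `[0, 0, 0, 1, 1, 1, 2, 2]` the sketch returns `[0, 1, 2, 6, 7, 5, 4, 3]`, the same result as the second assertion in this method's example.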

Furthermore, if levels is an array that is aligned with code units, the code units within a code point may be reversed. You may need to fix up the map to deal with this. Alternatively, only pass in arrays where each Level is for a single code point.
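One possible fix-up for a map computed from code-unit-aligned levels is to swap any surrogate pair that came out in reverse. The sketch below is an assumption about how an application might do this, not part of the crate; `fix_surrogate_order` and `apply_index_map` are hypothetical helpers.

```rust
/// Restores surrogate-pair order in a visual-to-logical index map that was
/// computed from one level per UTF-16 code unit. Hypothetical helper.
fn fix_surrogate_order(text: &[u16], index_map: &mut [usize]) {
    let is_high = |u: u16| (0xD800..=0xDBFF).contains(&u);
    let is_low = |u: u16| (0xDC00..=0xDFFF).contains(&u);
    let mut i = 0;
    while i + 1 < index_map.len() {
        let (first, second) = (index_map[i], index_map[i + 1]);
        // A reversed pair shows up as the low surrogate at logical offset
        // `n + 1` immediately followed by its high surrogate at offset `n`.
        if second + 1 == first && is_high(text[second]) && is_low(text[first]) {
            index_map.swap(i, i + 1);
            i += 2;
        } else {
            i += 1;
        }
    }
}

/// Collects the code units of `text` in the order given by `index_map`.
fn apply_index_map(text: &[u16], index_map: &[usize]) -> Vec<u16> {
    index_map.iter().map(|&logical| text[logical]).collect()
}
```

For the UTF-16 encoding of `"a\u{1F600}"` with every code unit at level 1, the raw map is `[2, 1, 0]`; after the fix-up it is `[1, 2, 0]`, which keeps the high surrogate ahead of the low one.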

§Example
use unicode_bidi::BidiInfo;
use unicode_bidi::Level;

let l0 = Level::from(0);
let l1 = Level::from(1);
let l2 = Level::from(2);

let levels = vec![l0, l0, l0, l0];
let index_map = BidiInfo::reorder_visual(&levels);
assert_eq!(levels.len(), index_map.len());
assert_eq!(index_map, [0, 1, 2, 3]);

let levels: Vec<Level> = vec![l0, l0, l0, l1, l1, l1, l2, l2];
let index_map = BidiInfo::reorder_visual(&levels);
assert_eq!(levels.len(), index_map.len());
assert_eq!(index_map, [0, 1, 2, 6, 7, 5, 4, 3]);
pub fn visual_runs( &self, para: &ParagraphInfo, line: Range<usize>, ) -> (Vec<Level>, Vec<LevelRun>)

Find the level runs within a line and return them in visual order.

line is a range of code-unit indices within levels.

The first return value is a vector of levels used by the reordering algorithm, i.e. the result of Rule L1. The second return value is a vector of level runs, the result of Rule L2, showing the visual order that each level run (a run of text with the same level) should be displayed. Within each run, the display order can be checked against the Level vector.

This does not handle Rule L3 (combining characters) or Rule L4 (mirroring), as that should be handled by the engine using this API.

Conceptually, this is the same as running Self::reordered_levels() followed by Self::reorder_visual(). However, it returns the result as a list of level runs instead of an index map, since one may wish to deal with the fact that this operates on code-unit rather than character indices.

http://www.unicode.org/reports/tr9/#Reordering_Resolved_Levels
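The idea of level runs in visual order can be sketched as follows: split the levels into maximal runs of equal level, then apply Rule L2 to whole runs instead of single positions. This is an illustration only; `level_runs` and `visual_runs_sketch` are hypothetical names, `u8` stands in for `Level`, and Rule L1 is assumed to have been applied already.

```rust
use std::ops::Range;

/// Splits `levels` into maximal runs of equal level, in logical order.
fn level_runs(levels: &[u8]) -> Vec<Range<usize>> {
    let mut runs = Vec::new();
    let mut start = 0;
    for i in 1..=levels.len() {
        // Close the current run at the end of input or when the level changes.
        if i == levels.len() || levels[i] != levels[start] {
            runs.push(start..i);
            start = i;
        }
    }
    runs
}

/// Orders the runs visually by applying Rule L2 to whole runs.
fn visual_runs_sketch(levels: &[u8]) -> Vec<Range<usize>> {
    let mut runs = level_runs(levels);
    let run_level = |run: &Range<usize>| levels[run.start];
    let max = runs.iter().map(&run_level).max();
    let min_odd = runs.iter().map(&run_level).filter(|&l| l % 2 == 1).min();
    let (max, min_odd) = match (max, min_odd) {
        (Some(max), Some(min_odd)) => (max, min_odd),
        // Empty line or no RTL run: logical order is already visual order.
        _ => return runs,
    };
    for current in (min_odd..=max).rev() {
        let mut start = 0;
        while start < runs.len() {
            if run_level(&runs[start]) < current {
                start += 1;
                continue;
            }
            let mut end = start;
            while end < runs.len() && run_level(&runs[end]) >= current {
                end += 1;
            }
            // Reverse the sequence of runs, not the contents of each run.
            runs[start..end].reverse();
            start = end;
        }
    }
    runs
}
```

For the levels `[0, 0, 0, 1, 1, 1, 2, 2]` the runs come out as `0..3`, `6..8`, `3..6`. A renderer would then draw each run left to right or right to left according to whether its level is even or odd.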

pub fn has_rtl(&self) -> bool

Returns true if the processed text has any computed RTL levels.

This information is usually used to skip reordering of text when no RTL level is present.
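In the Unicode Bidirectional Algorithm, odd embedding levels are right-to-left and even levels are left-to-right, so the check amounts to looking for any odd level. A minimal sketch on plain `u8` level numbers (`has_rtl_sketch` is a hypothetical name, not the crate's code):

```rust
/// Returns `true` if any level number is odd, i.e. right-to-left.
/// Hypothetical sketch; `u8` stands in for `Level`.
fn has_rtl_sketch(levels: &[u8]) -> bool {
    levels.iter().any(|&level| level % 2 == 1)
}
```

A caller can use the result as a fast path: when it is `false`, the logical order of the text is already its visual order and reordering can be skipped.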

Trait Implementations§

impl<'text> Debug for BidiInfo<'text>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'text> PartialEq for BidiInfo<'text>

fn eq(&self, other: &BidiInfo<'text>) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl<'text> StructuralPartialEq for BidiInfo<'text>

Auto Trait Implementations§

impl<'text> Freeze for BidiInfo<'text>

impl<'text> RefUnwindSafe for BidiInfo<'text>

impl<'text> Send for BidiInfo<'text>

impl<'text> Sync for BidiInfo<'text>

impl<'text> Unpin for BidiInfo<'text>

impl<'text> UnwindSafe for BidiInfo<'text>

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.