# Yore
A Rust library for decoding and encoding character sets based on OEM code pages.
[![yore at crates.io](https://img.shields.io/badge/crates.io-1.1.0-blue)](https://crates.io/crates/yore)
[![yore at docs.rs](https://docs.rs/yore/badge.svg)](https://docs.rs/yore)
# Features
* [Fast performance](https://bonega.github.io/yore-criterion/report/index.html) [*](#*-benchmarks)
* Minimal memory usage with `Cow` and `shrink_to_fit`
* Easy-to-use API
* Broad range of [supported code pages](#supported-code-pages)
* Handles code pages with redefined ASCII characters (<0x80), such as '٪' in CP864
# Usage
Add `yore` to your `Cargo.toml` file.
```toml
[dependencies]
yore = "1.1.0"
```
# Examples
## Using a specific code page
```rust
use yore::code_pages::{CP857, CP850};
use yore::{DecodeError, EncodeError};
let bytes = vec![116, 101, 120, 116];
let bytes_undefined = vec![116, 101, 120, 116, 32, 231];
assert_eq!(CP850.decode(&bytes), "text");
assert_eq!(CP857.decode(&bytes).unwrap(), "text");
assert!(matches!(CP857.decode(&bytes_undefined), DecodeError));
assert_eq!(CP857.decode_lossy(&bytes_undefined), "text �");
assert_eq!(CP850.encode("text").unwrap(), bytes);
assert!(matches!(CP850.encode("text 🦀"), EncodeError));
assert_eq!(CP850.encode_lossy("text 🦀", 231), bytes_undefined);
```
### Using a trait object
```rust
use yore::CodePage;
fn do_something(code_page: &dyn CodePage, bytes: &[u8]) {
println!("{}", code_page.decode(bytes).unwrap());
}
```
# Supported code pages
| Identifier | Name | Description |
|------------|-------------|---------------------------------------------------------------------------------------------|
| 437 | ibm437 | OEM United States |
| 737 | ibm737 | OEM Greek (formerly 437G); Greek (DOS) |
| 775 | ibm775 | OEM Baltic; Baltic (DOS) |
| 850 | ibm850 | OEM Multilingual Latin 1; Western European (DOS) |
| 852 | ibm852 | OEM Latin 2; Central European (DOS) |
| 855 | ibm855 | OEM Cyrillic (primarily Russian) |
| 857 | ibm857 | OEM Turkish; Turkish (DOS) |
| 860 | ibm860 | OEM Portuguese; Portuguese (DOS) |
| 861 | ibm861 | OEM Icelandic; Icelandic (DOS) |
| 862 | dos-862 | OEM Hebrew; Hebrew (DOS) |
| 863 | ibm863 | OEM French Canadian; French Canadian (DOS) |
| 864 | ibm864 | OEM Arabic; Arabic (864) |
| 865 | ibm865 | OEM Nordic; Nordic (DOS) |
| 866 | cp866 | OEM Russian; Cyrillic (DOS) |
| 869 | ibm869 | OEM Modern Greek; Greek, Modern (DOS) |
| 874 | windows-874 | Thai (Windows) |
| 910 | ibm910 | IBM-PC APL2
| 1250 | windows-1250| ANSI Central European; Central European (Windows) |
| 1251 | windows-1251| ANSI Cyrillic; Cyrillic (Windows) |
| 1252 | windows-1252| ANSI Latin 1; Western European (Windows) |
| 1253 | windows-1253| ANSI Greek; Greek (Windows) |
| 1254 | windows-1254| ANSI Turkish; Turkish (Windows) |
| 1255 | windows-1255| ANSI Hebrew; Hebrew (Windows) |
| 1256 | windows-1256| ANSI Arabic; Arabic (Windows) |
| 1257 | windows-1257| ANSI Baltic; Baltic (Windows) |
| 1258 | windows-1258| ANSI/OEM Vietnamese; Vietnamese (Windows) |
# * Benchmarks
`encoding_rs` supports only a few of the encodings that `oem_cp` and `yore` support. Additionally, `encoding_rs` focuses on streaming use cases.
Refer to the [bench crate](https://github.com/bonega/yore/blob/28198ff8d4e487a8f7e6a477fe7cbc19313618c0/benchmark/README.md) for more details.