Algorithm documentation.
As I don't understand algorithm fully myself,
I'll just document some parts which I do understand.
# Shard
- Reed-Solomon `GF(2^16)` erasure coding works on 16-bit elements ([`GfElement`]).
- A **shard** is a byte-array which is interpreted as an array of [`GfElement`]:s.
A naive implementation could e.g. require shards to be a multiple of **2 bytes**
and then interpret each byte-pair as low/high parts of a single [`GfElement`]:
```text
[ low_0, high_0, low_1, high_1, ...]
```
However that approach isn't good for SIMD optimizations.
Instead shards are required to be a multiple of **64 bytes**.
In each 64-byte block first 32 bytes are low parts of 32 [`GfElement`]:s
and last 32 bytes are high parts of those 32 [`GfElement`]:s.
```text
[ low_0, low_1, ..., low_31, high_0, high_1, ..., high_31 ]
```
A shard then consists of one or more of these 64-byte blocks:
```text
```
## Original shards and recovery shards
- The data which is going to be protected by Reed-Solomon erasure coding
is split into equal-sized **original shards**.
- **`original_count`** is the number of original shards.
- Additional **recovery shards** of same size are then created
which contain recovery data so that original data can be fully restored
from any set of **`original_count`** shards, original or recovery.
- **`recovery_count`** is the number of recovery shards.
Algorithm supports any combination of
1 - 32768 original shards with 1 - 32768 recovery shards.
Up to 65535 original or recovery shards is also possible with following limitations:
| `<= 2^16 - 2^n` | `<= 2^n` |
| `<= 61440` | `<= 4096` |
| `<= 57344` | `<= 8192` |
| `<= 49152` | `<= 16384` |
| **`<= 32768`** | **`<= 32768`** |
| `<= 16384` | `<= 49152` |
| `<= 8192` | `<= 57344` |
| `<= 4096` | `<= 61440` |
| `<= 2^n` | `<= 2^16 - 2^n` |
# Rate
Encoding and decoding both have two variations:
- **High rate** refers to having more original shards than recovery shards.
- High rate must be used when there are over 32768 original shards.
- High rate encoding uses **chunks** of `recovery_count.next_power_of_two()` shards.
- **Low rate** refers to having more recovery shards than original shards.
- Low rate must be used when there are over 32768 recovery shards.
- Low rate encoding uses **chunks** of `original_count.next_power_of_two()` shards.
- Because of padding either rate can be used when there are
at most 32768 original shards and at most 32768 recovery shards.
- High rate and low rate are not compatible with each other,
i.e. decoding must use same rate that encoding used.
- With multiple chunks "correct" rate is generally faster in encoding
and not-slower in decoding.
- With single chunk "wrong" rate is generally faster in decoding
if `original_count` and `recovery_count` differ a lot.
## Benchmarks
- These benchmarks are from `cargo bench rate`
and use similar setup than [main benchmarks],
except with maximum possible shard loss.
| 1024 : 1024 | 1x 1024 | 175 MiB/s | 176 MiB/s | 76 MiB/s | 75 MiB/s |
| 1024 : 1025 (Low) | 2x 1024 | 140 | **153** | 47 | **59** |
| 1025 : 1024 (High) | 2x 1024 | **152** | 132 | **60** | 46 |
| 1024 : 2048 (Low) | 2x 1024 | 157 | **169** | 70 | 70 |
| 2048 : 1024 (High) | 2x 1024 | **167** | 151 | 69 | 68 |
| 1025 : 1025 | 1x 2048 | 125 | 126 | 44 | 43 |
| 1025 : 2048 (Low) | 1x 2048 | 144 | 144 | **65** **!!!** | 53 |
| 2048 : 1025 (High) | 1x 2048 | 144 | 145 | 53 | **62** **!!!** |
| 2048 : 2048 | 1x 2048 | 156 | 157 | 70 | 69 |
[main benchmarks]: crate#benchmarks
# Encoding
Encoding takes original shards as input and generates recovery shards.
## High rate encoding
- Encoding is done in **chunks** of `recovery_count.next_power_of_two()` shards.
- Original shards are split into chunks and last chunk
is padded with zero-filled shards if needed.
- Recovery shards fit into a single chunk
which is padded with unused shards if needed.
- Recovery chunk is generated with following algorithm
```text
recovery_chunk = FFT(
IFFT(original_chunk_0, skew_0) xor
IFFT(original_chunk_1, skew_1) xor
...
)
```
This is implemented in [`HighRateEncoder`].
## Low rate encoding
- Encoding is done in **chunks** of `original_count.next_power_of_two()` shards.
- Original shards fit into a single chunk
which is padded with zero-filled shards if needed.
- Recovery shards are generated in chunks and last chunk
is padded with unused shards if needed.
- Recovery chunks are generated with following algorithm
```text
recovery_chunk_0 = FFT( IFFT(original_chunk), skew_0 )
recovery_chunk_1 = FFT( IFFT(original_chunk), skew_1 )
...
```
This is implemented in [`LowRateEncoder`].
# Decoding
**TODO**
[`GfElement`]: crate::engine::GfElement
[`HighRateEncoder`]: crate::rate::HighRateEncoder
[`LowRateEncoder`]: crate::rate::LowRateEncoder