orc-format 0.3.0

Unofficial implementation of Apache ORC spec in safe Rust
# Read Apache ORC from Rust

[![test](https://github.com/DataEngineeringLabs/orc-format/actions/workflows/test.yml/badge.svg)](https://github.com/DataEngineeringLabs/orc-format/actions/workflows/test.yml)
[![codecov](https://codecov.io/gh/DataEngineeringLabs/orc-format/branch/main/graph/badge.svg?token=AgyTF60R3D)](https://codecov.io/gh/DataEngineeringLabs/orc-format)

Read [Apache ORC](https://orc.apache.org/) in Rust.

This repository is similar to [parquet2](https://github.com/jorgecarleitao/parquet2) and [Avro-schema](https://github.com/DataEngineeringLabs/avro-schema), providing a toolkit to:

* Read ORC files (proto structures)
* Read stripes (the conversion from proto metadata to memory regions)
* Decode stripes (the math of decoding stripes into e.g. booleans, runs of RLE, etc.)
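
To give a feel for the "decode" step, here is a minimal sketch of how a boolean column is decoded according to the ORC specification: the stream is first expanded with byte run-length encoding, and the resulting bytes are then unpacked into bits, most significant bit first. The function names `decode_byte_rle` and `unpack_booleans` are hypothetical and written for illustration only; they are not this crate's API.

```rust
/// Expands an ORC byte-RLE stream into raw bytes (hypothetical helper).
/// Returns `None` if the input ends in the middle of a run.
fn decode_byte_rle(input: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        // The header byte is interpreted as a signed 8-bit integer.
        let header = input[i] as i8;
        i += 1;
        if header >= 0 {
            // Run: the next byte is repeated `header + 3` times.
            let value = *input.get(i)?;
            i += 1;
            out.extend(std::iter::repeat(value).take(header as usize + 3));
        } else {
            // Literals: the next `-header` bytes are copied verbatim.
            let len = -(i16::from(header)) as usize;
            let literals = input.get(i..i + len)?;
            out.extend_from_slice(literals);
            i += len;
        }
    }
    Some(out)
}

/// Unpacks the first `len` booleans from `bytes`, most significant bit first
/// (hypothetical helper).
fn unpack_booleans(bytes: &[u8], len: usize) -> Vec<bool> {
    bytes
        .iter()
        .flat_map(|&byte| (0..8u32).rev().map(move |bit| ((byte >> bit) & 1) == 1))
        .take(len)
        .collect()
}
```

For instance, the two-byte stream `[0xff, 0x80]` is a single literal byte `0x80`, which unpacks to one `true` followed by seven `false` values.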

It currently reads the following (logical) types:

* booleans
* strings
* integers
* floats

What is not yet implemented:

* Snappy and LZO decompression
* RLE v2 `Patched Base` decoding
* RLE v1 decoding
* Utility functions to decode non-native logical types:
    * decimal
    * timestamp
    * struct
    * list
    * union
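
For readers curious what the missing RLE v1 decoder would involve, the following is a rough sketch based on the ORC specification, restricted to unsigned integers (signed columns would additionally need zigzag decoding of each varint). A non-negative control byte announces a run of `control + 3` values built from a signed one-byte delta and a varint base; a negative control byte announces `-control` literal varints. The names `read_varint` and `decode_rle_v1_unsigned` are hypothetical and do not exist in this crate.

```rust
/// Reads one unsigned base-128 varint starting at `*pos` and advances `*pos`
/// (hypothetical helper). Returns `None` on truncated or over-long input.
fn read_varint(input: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *input.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None;
        }
        // The low 7 bits carry data; the high bit marks continuation.
        value |= u64::from(byte & 0x7f) << shift;
        if (byte & 0x80) == 0 {
            return Some(value);
        }
        shift += 7;
    }
}

/// Decodes an unsigned integer RLE v1 stream (hypothetical sketch).
fn decode_rle_v1_unsigned(input: &[u8]) -> Option<Vec<u64>> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < input.len() {
        let control = input[pos] as i8;
        pos += 1;
        if control >= 0 {
            // Run: `control + 3` values, starting at `base`, stepping by `delta`.
            let len = control as usize + 3;
            let delta = i64::from(*input.get(pos)? as i8);
            pos += 1;
            let mut value = read_varint(input, &mut pos)?;
            for _ in 0..len {
                out.push(value);
                // Two's-complement wrapping add applies a negative delta correctly.
                value = value.wrapping_add(delta as u64);
            }
        } else {
            // Literals: `-control` varints follow.
            let len = -(i16::from(control)) as usize;
            for _ in 0..len {
                out.push(read_varint(input, &mut pos)?);
            }
        }
    }
    Some(out)
}
```

With this sketch, `[0x61, 0x00, 0x07]` decodes to one hundred copies of 7, and `[0xfb, 0x02, 0x03, 0x06, 0x07, 0x0b]` decodes to the literals 2, 3, 6, 7, 11.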

## Run tests

```bash
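# Create a virtual environment and install pyorc
python3 -m venv venv
venv/bin/pip install -U pip
venv/bin/pip install -U pyorc
# Run write.py (requires pyorc) before running the Rust tests
venv/bin/python write.py
cargo test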
```