APIs to read from ORC
Reading from ORC essentially consists of the following steps:
- Identify the column type based on the file’s schema
- Read the stripe (or only part of it, under projection pushdown)
- For each column, select the relevant region of the stripe
- Attach an Iterator to the region
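The steps above can be sketched with a toy in-memory layout. Everything in this block (`ToyType`, `ToyStream`, `open_column`, and the byte layout of the stripe) is a hypothetical illustration that uses only the standard library; it is not this crate's API, and it assumes the stripe has already been read into memory.

```rust
use std::io::{Cursor, Read};

/// Hypothetical, simplified physical types; not the crate's own definitions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyType {
    Boolean,
    Int,
}

/// Hypothetical description of where one column's bytes live inside a stripe.
#[derive(Debug, Clone, Copy)]
struct ToyStream {
    column: usize,
    offset: usize,
    length: usize,
}

/// Looks up the column type in the schema, selects the column's region of
/// the stripe, and attaches a reader to that region. Reading the stripe
/// itself is assumed to have happened already (`stripe`).
fn open_column<'a>(
    schema: &[ToyType],
    streams: &[ToyStream],
    stripe: &'a [u8],
    column: usize,
) -> Option<(ToyType, Cursor<&'a [u8]>)> {
    let data_type = *schema.get(column)?;
    let stream = streams.iter().find(|s| s.column == column)?;
    let region = stripe.get(stream.offset..stream.offset + stream.length)?;
    Some((data_type, Cursor::new(region)))
}

/// Usage: open column 1 and iterate over the bytes of its region.
fn example_open_column() -> (ToyType, Vec<u8>) {
    let schema = [ToyType::Boolean, ToyType::Int];
    let streams = [
        ToyStream { column: 0, offset: 0, length: 2 },
        ToyStream { column: 1, offset: 2, length: 3 },
    ];
    let stripe: [u8; 5] = [0xAA, 0xBB, 1, 2, 3];

    let (data_type, reader) = open_column(&schema, &streams, &stripe, 1).unwrap();
    // `Read::bytes` is the simplest possible iterator over the region.
    let values: Vec<u8> = reader.bytes().map(|b| b.unwrap()).collect();
    (data_type, values)
}
```

In this sketch only the bytes of the requested column are touched, which is the idea behind projection pushdown: columns that are not requested never need to be decoded.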
Modules§
- decode: Contains different iterators that receive a reader (std::io::Read) and return values for each of ORC’s physical types (e.g. boolean).
- decompress: Contains Decompressor.
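To illustrate the pattern of an iterator that receives a reader and returns typed values, here is a minimal sketch for booleans. `ToyBoolIter` is a hypothetical name, not a type from the decode module, and the encoding is deliberately simplified: it only unpacks bits, most significant bit first, and ignores the run-length layer that the ORC specification places on top of bit packing.

```rust
use std::io::{self, Read};

/// Hypothetical iterator: yields the bits of each byte read from `reader`,
/// most significant bit first, until `length` values have been produced.
struct ToyBoolIter<R: Read> {
    reader: R,
    current: u8,
    bits_left: u8,
    remaining: usize,
}

impl<R: Read> ToyBoolIter<R> {
    fn new(reader: R, length: usize) -> Self {
        Self { reader, current: 0, bits_left: 0, remaining: length }
    }
}

impl<R: Read> Iterator for ToyBoolIter<R> {
    type Item = io::Result<bool>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            return None;
        }
        if self.bits_left == 0 {
            // Fetch the next byte only when the current one is exhausted.
            let mut byte = [0u8; 1];
            if let Err(e) = self.reader.read_exact(&mut byte) {
                self.remaining = 0; // stop iterating after the first error
                return Some(Err(e));
            }
            self.current = byte[0];
            self.bits_left = 8;
        }
        self.bits_left -= 1;
        self.remaining -= 1;
        Some(Ok((self.current >> self.bits_left) & 1 == 1))
    }
}

/// Usage: `&[u8]` implements `Read`, so a byte slice can stand in for a
/// region of a file.
fn example_bool_iter() -> Vec<bool> {
    let region: &[u8] = &[0b1010_0000];
    ToyBoolIter::new(region, 3).map(|v| v.unwrap()).collect()
}
```

Yielding `io::Result<bool>` rather than `bool` lets a truncated region surface as an error item instead of a panic.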
Structs§
- Column: Helper struct used to access the streams associated with an ORC column. Its main use is Column::get_stream, which returns a stream.
- FileMetadata: The file’s metadata.
Functions§
- read_metadata
- read_stripe_column: Reads column from the stripe into a Column. scratch becomes owned by Column, which you can recover via into_inner.
- read_stripe_footer: Reads, decompresses and deserializes the stripe’s footer as StripeFooter, using scratch as an intermediary memory region.
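The scratch-buffer ownership described for read_stripe_column can be sketched as follows. `ToyColumn` and `read_toy_column` are hypothetical stand-ins written against the standard library only; the crate's real signatures are not reproduced here and may differ.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for a column that owns the buffer it was read into.
struct ToyColumn {
    data: Vec<u8>,
}

impl ToyColumn {
    /// Gives the buffer back so the caller can reuse its allocation.
    fn into_inner(self) -> Vec<u8> {
        self.data
    }
}

/// Reads `length` bytes from `reader` into `scratch` and hands ownership of
/// the buffer to the returned `ToyColumn`.
fn read_toy_column<R: Read>(
    reader: &mut R,
    length: usize,
    mut scratch: Vec<u8>,
) -> io::Result<ToyColumn> {
    scratch.clear();
    scratch.resize(length, 0);
    reader.read_exact(&mut scratch)?;
    Ok(ToyColumn { data: scratch })
}

/// Usage: read two regions in a row while reusing one allocation.
fn example_scratch() -> io::Result<(Vec<u8>, Vec<u8>)> {
    let mut reader: &[u8] = &[1, 2, 3, 4, 5];
    let scratch = Vec::with_capacity(16);

    let first = read_toy_column(&mut reader, 2, scratch)?;
    let first_values = first.data.clone();

    // Recover the buffer and pass it to the next read.
    let scratch = first.into_inner();
    let second = read_toy_column(&mut reader, 3, scratch)?;
    Ok((first_values, second.into_inner()))
}
```

Passing the buffer by value and recovering it afterwards avoids a fresh allocation per column, at the cost of the caller having to thread the buffer through each call.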