Struct datafusion_common::config::ParquetOptions
#[non_exhaustive]
pub struct ParquetOptions {
pub enable_page_index: bool,
pub pruning: bool,
pub skip_metadata: bool,
pub metadata_size_hint: Option<usize>,
pub pushdown_filters: bool,
pub reorder_filters: bool,
pub data_pagesize_limit: usize,
pub write_batch_size: usize,
pub writer_version: String,
pub compression: Option<String>,
pub dictionary_enabled: Option<bool>,
pub dictionary_page_size_limit: usize,
pub statistics_enabled: Option<String>,
pub max_statistics_size: Option<usize>,
pub max_row_group_size: usize,
pub created_by: String,
pub column_index_truncate_length: Option<usize>,
pub data_page_row_count_limit: usize,
pub encoding: Option<String>,
pub bloom_filter_enabled: bool,
pub bloom_filter_fpp: Option<f64>,
pub bloom_filter_ndv: Option<u64>,
}
Options related to parquet files
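A minimal sketch of reader-side configuration, assuming ParquetOptions implements Default (DataFusion's configuration macros normally derive it); the 512 KiB metadata hint is an arbitrary illustrative value, not a recommended default.

use datafusion_common::config::ParquetOptions;

fn reader_opts() -> ParquetOptions {
    // The struct is non-exhaustive, so start from the defaults and
    // overwrite individual public fields.
    let mut opts = ParquetOptions::default();

    // Reader-side tuning: use the page index and push filters into decoding.
    opts.enable_page_index = true;
    opts.pushdown_filters = true;
    opts.reorder_filters = true;

    // Optimistically fetch the last 512 KiB so the footer and metadata can
    // usually be read in a single request (illustrative value).
    opts.metadata_size_hint = Some(512 * 1024);

    opts
}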
Fields (Non-exhaustive)
This struct is marked as non-exhaustive: it cannot be constructed in external crates using the ParquetOptions { .. } syntax, cannot be matched against without a wildcard .., and struct update syntax will not work.
enable_page_index: bool
If true, reads the Parquet data page level metadata (the Page Index), if present, to reduce the I/O and number of rows decoded.
pruning: bool
If true, the parquet reader attempts to skip entire row groups based on the predicate in the query and the metadata (min/max values) stored in the parquet file
skip_metadata: bool
If true, the parquet reader skips the optional embedded metadata that may be in the file Schema. This setting can help avoid schema conflicts when querying multiple parquet files with schemas containing compatible types but different metadata
metadata_size_hint: Option<usize>
If specified, the parquet reader will try to fetch the last size_hint
bytes of the parquet file optimistically. If not specified, two reads are required:
One read to fetch the 8-byte parquet footer and
another to fetch the metadata itself, whose length is encoded in the footer
pushdown_filters: bool
If true, filter expressions are applied during the parquet decoding operation to reduce the number of rows decoded
reorder_filters: bool
If true, filter expressions evaluated during the parquet decoding operation will be reordered heuristically to minimize the cost of evaluation. If false, the filters are applied in the same order as written in the query
data_pagesize_limit: usize
Sets best effort maximum size of data page in bytes
write_batch_size: usize
Sets write_batch_size in rows
writer_version: String
Sets the parquet writer version. Valid values are “1.0” and “2.0”
compression: Option<String>
Sets the default parquet compression codec. Valid values are: uncompressed, snappy, gzip(level), lzo, brotli(level), lz4, zstd(level), and lz4_raw. These values are not case sensitive. If NULL, uses the default parquet writer setting
dictionary_enabled: Option<bool>
Sets if dictionary encoding is enabled. If NULL, uses default parquet writer setting
dictionary_page_size_limit: usize
Sets best effort maximum dictionary page size, in bytes
statistics_enabled: Option<String>
Sets if statistics are enabled for any column. Valid values are: “none”, “chunk”, and “page”. These values are not case sensitive. If NULL, uses the default parquet writer setting
max_statistics_size: Option<usize>
Sets max statistics size for any column. If NULL, uses default parquet writer setting
max_row_group_size: usize
Sets maximum number of rows in a row group
created_by: String
Sets “created by” property
column_index_truncate_length: Option<usize>
Sets column index truncate length
data_page_row_count_limit: usize
Sets best effort maximum number of rows in data page
encoding: Option<String>
Sets the default encoding for any column. Valid values are: plain, plain_dictionary, rle, bit_packed, delta_binary_packed, delta_length_byte_array, delta_byte_array, rle_dictionary, and byte_stream_split. These values are not case sensitive. If NULL, uses the default parquet writer setting
bloom_filter_enabled: bool
Sets if bloom filter is enabled for any column
bloom_filter_fpp: Option<f64>
Sets bloom filter false positive probability. If NULL, uses default parquet writer setting
bloom_filter_ndv: Option<u64>
Sets bloom filter number of distinct values. If NULL, uses default parquet writer setting
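In practice these options are usually reached through the parent ConfigOptions struct rather than built in isolation. The following sketch assumes ConfigOptions exposes them under execution.parquet, which matches recent DataFusion layouts but should be checked against your version; the writer values are the documented example forms, chosen here only for illustration.

use datafusion_common::config::ConfigOptions;

fn main() {
    let mut config = ConfigOptions::new();

    // Writer-side settings from the field list above; string values follow
    // the documented forms and are not case sensitive.
    config.execution.parquet.writer_version = "2.0".to_string();
    config.execution.parquet.compression = Some("zstd(3)".to_string());
    config.execution.parquet.statistics_enabled = Some("page".to_string());
    config.execution.parquet.max_row_group_size = 1_000_000; // illustrative
    config.execution.parquet.bloom_filter_enabled = true;
    config.execution.parquet.bloom_filter_fpp = Some(0.01);
}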
Trait Implementations
impl Clone for ParquetOptions
fn clone(&self) -> ParquetOptions
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.