pub fn infer_json_schema<R: BufRead>(
    reader: R,
    max_read_records: Option<usize>,
) -> Result<(Schema, usize), ArrowError>
Infer the fields of a JSON file by reading the first n records from the buffer, where max_read_records sets the maximum number of records to read.
If max_read_records is not set, the whole file is read to infer its field types.
Returns the inferred schema and the number of records read.
This function does not seek back to the start of the reader; the caller has to manage the original file's cursor. It is useful when the reader's cursor is not available (the reader does not implement Seek), as is the case for compressed stream decoders.
§Examples
use std::fs::File;
use std::io::{BufReader, SeekFrom, Seek};
use flate2::read::GzDecoder;
use arrow_json::reader::infer_json_schema;
let mut file = File::open("test/data/mixed_arrays.json.gz").unwrap();
// file's cursor's offset at 0
let mut reader = BufReader::new(GzDecoder::new(&file));
// the function returns both the schema and the number of records it read
let (inferred_schema, records_read) = infer_json_schema(&mut reader, None).unwrap();
// cursor's offset at end of file
// seek back to start so that the original file is usable again
file.seek(SeekFrom::Start(0)).unwrap();
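
The cursor behaviour described above can also be seen without any Arrow or gzip dependency. The following is a minimal, standard-library-only sketch, not the arrow_json implementation: `count_records` is a hypothetical stand-in for the inference pass (it simply counts newline-delimited records, honouring an optional limit), and `rewind_demo` is a hypothetical driver. It assumes one JSON record per line and an in-memory `Cursor` in place of a real file.

```rust
use std::io::{self, BufRead, BufReader, Cursor, Seek, SeekFrom};

/// Hypothetical stand-in for a schema-inference pass: reads at most
/// `max_read_records` lines (all lines if `None`) and returns how many
/// were read. Like the documented function, it never seeks back.
fn count_records<R: BufRead>(reader: R, max_read_records: Option<usize>) -> io::Result<usize> {
    let limit = max_read_records.unwrap_or(usize::MAX);
    let mut records = 0;
    for line in reader.lines().take(limit) {
        line?; // propagate I/O errors; the line content is not needed here
        records += 1;
    }
    Ok(records)
}

/// Hypothetical driver: runs a limited pass, records where the underlying
/// cursor ended up, rewinds it, and then runs a full pass.
fn rewind_demo() -> io::Result<(usize, u64, usize)> {
    let data = "{\"a\": 1}\n{\"a\": 2}\n{\"a\": 3}\n";
    let mut source = Cursor::new(data.as_bytes().to_vec());

    // First pass: stop after two records.
    let first = count_records(BufReader::new(&mut source), Some(2))?;

    // The underlying cursor has moved, because nothing rewound it.
    let position_after = source.position();

    // The caller is responsible for seeking back before reusing the source.
    source.seek(SeekFrom::Start(0))?;
    let all = count_records(BufReader::new(&mut source), None)?;

    Ok((first, position_after, all))
}
```

Calling `rewind_demo()` returns the record count of the limited pass, the cursor offset left behind by that pass, and the record count of the full pass after the rewind. Note that a `BufReader` may read ahead of the records actually consumed, so the offset left behind generally does not correspond to a record boundary; seeking to a known position such as the start is the reliable way to reuse the source.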