arrow_json::reader

Function infer_json_schema

pub fn infer_json_schema<R: BufRead>(
    reader: R,
    max_read_records: Option<usize>,
) -> Result<(Schema, usize), ArrowError>

Infers the fields of a JSON file by reading records from the start of the reader. max_read_records sets the maximum number of records to read.

If max_read_records is None, the whole file is read to infer its field types.

Returns the inferred schema and the number of records read.

This function does not seek back to the start of the reader; the caller has to manage the original file’s cursor. This makes it usable with readers that do not implement Seek, as is the case for decoders of compressed streams.

§Examples

use std::fs::File;
use std::io::{BufReader, SeekFrom, Seek};
use flate2::read::GzDecoder;
use arrow_json::reader::infer_json_schema;

let mut file = File::open("test/data/mixed_arrays.json.gz").unwrap();

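// The file's cursor starts at offset 0.
let mut reader = BufReader::new(GzDecoder::new(&file));
// The result is a (Schema, usize) tuple: the inferred schema and the number of records read.
let (inferred_schema, records_read) = infer_json_schema(&mut reader, None).unwrap();
// With `None` the whole stream is consumed, so the file's cursor is now at the end of the file.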

// seek back to start so that the original file is usable again
file.seek(SeekFrom::Start(0)).unwrap();
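
The same cursor bookkeeping can be sketched with the standard library alone. In the sketch below, `count_records` and `sample_then_reread` are hypothetical helpers and not part of arrow_json: `count_records` merely stands in for a pass that consumes records from a `BufRead` without rewinding it, and it assumes newline-delimited input. The sketch only illustrates why the caller has to seek the underlying source back before reading it a second time.

```rust
use std::io::{self, BufRead, BufReader, Read, Seek, SeekFrom};

/// Hypothetical stand-in for a schema-inference pass: reads up to
/// `max_records` non-empty lines (all of them if `None`) and returns
/// how many it read. It never rewinds the reader.
fn count_records<R: BufRead>(mut reader: R, max_records: Option<usize>) -> io::Result<usize> {
    let mut line = String::new();
    let mut count = 0;
    while max_records.map_or(true, |max| count < max) {
        line.clear();
        if reader.read_line(&mut line)? == 0 {
            break; // end of input
        }
        if !line.trim().is_empty() {
            count += 1;
        }
    }
    Ok(count)
}

/// Samples the first records, rewinds the underlying source, then
/// counts every record again from the start.
fn sample_then_reread<S: Read + Seek>(
    mut source: S,
    max_records: Option<usize>,
) -> io::Result<(usize, usize)> {
    let sampled = count_records(BufReader::new(&mut source), max_records)?;
    // The BufReader may have read ahead of the sampled records, so the
    // source's cursor position is not meaningful here: rewind explicitly.
    source.seek(SeekFrom::Start(0))?;
    let total = count_records(BufReader::new(&mut source), None)?;
    Ok((sampled, total))
}
```

Any `Read + Seek` source works here, for example a `std::fs::File` or a `std::io::Cursor` over an in-memory buffer. A source that cannot seek, such as a decompression stream, would have to be reopened instead of rewound.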