Module writer

§JSON Writer

This JSON writer converts Arrow RecordBatches into JSON-formatted byte streams, either as line-delimited JSON objects or as a single JSON array.

§Writing JSON formatted byte streams

To serialize RecordBatches into line-delimited JSON bytes, use LineDelimitedWriter:


use std::sync::Arc;
use arrow_array::{Int32Array, RecordBatch};
use arrow_schema::{DataType, Field, Schema};

let schema = Schema::new(vec![Field::new("a", DataType::Int32, false)]);
let a = Int32Array::from(vec![1, 2, 3]);
let batch = RecordBatch::try_new(Arc::new(schema), vec![Arc::new(a)]).unwrap();

// Write the record batch out as line-delimited JSON
let buf = Vec::new();
let mut writer = arrow_json::LineDelimitedWriter::new(buf);
writer.write_batches(&[&batch]).unwrap();
writer.finish().unwrap();

// Get the underlying buffer back
let buf = writer.into_inner();
assert_eq!(r#"{"a":1}
{"a":2}
{"a":3}
"#, String::from_utf8(buf).unwrap());

To serialize RecordBatches into a well formed JSON array, use ArrayWriter:

use std::sync::Arc;
use arrow_array::{Int32Array, RecordBatch};
use arrow_schema::{DataType, Field, Schema};

let schema = Schema::new(vec![Field::new("a", DataType::Int32, false)]);
let a = Int32Array::from(vec![1, 2, 3]);
let batch = RecordBatch::try_new(Arc::new(schema), vec![Arc::new(a)]).unwrap();

// Write the record batch out as a JSON array
let buf = Vec::new();
let mut writer = arrow_json::ArrayWriter::new(buf);
writer.write_batches(&[&batch]).unwrap();
writer.finish().unwrap();

// Get the underlying buffer back
let buf = writer.into_inner();
assert_eq!(r#"[{"a":1},{"a":2},{"a":3}]"#, String::from_utf8(buf).unwrap());

By default, LineDelimitedWriter and ArrayWriter omit keys whose values are null. To write null values explicitly, construct a custom Writer with a WriterBuilder.

§Writing to serde_json JSON Objects

To convert RecordBatches into an array of serde_json objects, write them out as JSON bytes and parse the result with serde_json. Note that this is less efficient than using the Writer API directly.

use std::sync::Arc;
use arrow_array::{Int32Array, RecordBatch};
use arrow_schema::{DataType, Field, Schema};
use serde_json::{Map, Value};

let schema = Schema::new(vec![Field::new("a", DataType::Int32, false)]);
let a = Int32Array::from(vec![1, 2, 3]);
let batch = RecordBatch::try_new(Arc::new(schema), vec![Arc::new(a)]).unwrap();

// Write the record batch out as the bytes of a JSON array
let buf = Vec::new();
let mut writer = arrow_json::ArrayWriter::new(buf);
writer.write_batches(&[&batch]).unwrap();
writer.finish().unwrap();
let json_data = writer.into_inner();

// Parse the bytes using serde_json
let json_rows: Vec<Map<String, Value>> = serde_json::from_reader(json_data.as_slice()).unwrap();
assert_eq!(
    serde_json::Value::Object(json_rows[1].clone()),
    serde_json::json!({"a": 2}),
);

Structs§

JsonArray
Produces JSON output as a single JSON array.
LineDelimited
Produces JSON output with one record per line.
Writer
A JSON writer which serializes RecordBatches to a byte stream of JSON-encoded objects.
WriterBuilder
JSON writer builder.

Traits§

JsonFormat
This trait defines how to format a sequence of JSON objects to a byte stream.

Type Aliases§

ArrayWriter
A JSON writer which serializes RecordBatches to JSON arrays.
LineDelimitedWriter
A JSON writer which serializes RecordBatches to newline delimited JSON objects.