A comparable row-oriented representation of a collection of [Array].
[Row]s are normalized for sorting, and can therefore be very efficiently compared,
using memcmp under the hood, or used in non-comparison sorts such as radix sort.
This makes the row format ideal for implementing efficient multi-column sorting,
grouping, aggregation, windowing and more, as described in more detail
in this blog post.
For example, given three input [Array], [RowConverter] creates byte
sequences that compare the same as when using lexsort.
   ┌─────┐   ┌─────┐   ┌─────┐
   │     │   │     │   │     │
   ├─────┤ ┌ ┼─────┼ ─ ┼─────┼ ┐              ┏━━━━━━━━━━━━━┓
   │     │   │     │   │     │  ─────────────▶┃             ┃
   ├─────┤ └ ┼─────┼ ─ ┼─────┼ ┘              ┗━━━━━━━━━━━━━┛
   │     │   │     │   │     │
   └─────┘   └─────┘   └─────┘
               ...
   ┌─────┐ ┌ ┬─────┬ ─ ┬─────┬ ┐              ┏━━━━━━━━┓
   │     │   │     │   │     │  ─────────────▶┃        ┃
   └─────┘ └ ┴─────┴ ─ ┴─────┴ ┘              ┗━━━━━━━━┛
    UInt64      Utf8     F64

          Input Arrays                          Row Format
           (Columns)
[Rows] must be generated by the same [RowConverter] for the comparison
to be meaningful.
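
To make the idea of memcmp-comparable rows concrete, here is a minimal std-only sketch for an (Int32, Utf8) pair of columns. It is not the encoding this crate actually uses: the helper names `encode_i32`, `encode_str`, `encode_row` and `sort_indices_by_encoded` are hypothetical and exist only for illustration, and nulls and sort options are ignored. The sketch assumes a sign-bit flip plus big-endian layout for integers, and a zero-escaped, zero-terminated layout for strings.

```rust
/// Encode an `i32` so that unsigned byte-wise comparison matches numeric
/// order: flip the sign bit, then write the value big-endian.
fn encode_i32(v: i32, out: &mut Vec<u8>) {
    out.extend_from_slice(&((v as u32) ^ 0x8000_0000).to_be_bytes());
}

/// Encode a string so that byte-wise comparison matches `str` ordering:
/// escape each 0x00 byte as 0x00 0xFF and terminate with 0x00 0x00.
/// (Illustrative scheme only, not the real row format.)
fn encode_str(s: &str, out: &mut Vec<u8>) {
    for &b in s.as_bytes() {
        out.push(b);
        if b == 0 {
            out.push(0xFF);
        }
    }
    out.extend_from_slice(&[0, 0]);
}

/// Encode one (Int32, Utf8) row as a single comparable byte sequence.
fn encode_row(a: i32, b: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + b.len() + 2);
    encode_i32(a, &mut out);
    encode_str(b, &mut out);
    out
}

/// Return row indices in lexicographic order (first column, then second),
/// comparing only the encoded bytes.
fn sort_indices_by_encoded(col1: &[i32], col2: &[&str]) -> Vec<usize> {
    let rows: Vec<Vec<u8>> = col1
        .iter()
        .zip(col2)
        .map(|(&a, &b)| encode_row(a, b))
        .collect();
    let mut idx: Vec<usize> = (0..rows.len()).collect();
    // `Vec<u8>` ordering is plain lexicographic byte comparison.
    idx.sort_by(|&i, &j| rows[i].cmp(&rows[j]));
    idx
}
```

For the columns `[3, -1, 0, -1]` and `["d", "b", "c", "a"]`, `sort_indices_by_encoded` returns `[3, 1, 2, 0]`, the same order a column-by-column lexicographic sort would produce. Because each row is one flat byte string, the comparison never needs to inspect column types at sort time.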
Basic Example
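# use std::sync::Arc;
# use arrow_row::{RowConverter, SortField};
# use arrow_array::{ArrayRef, Int32Array, StringArray};
# use arrow_array::cast::AsArray;
# use arrow_array::types::Int32Type;
# use arrow_schema::DataType;
// Two example columns with five rows each
let a1 = Arc::new(Int32Array::from_iter_values([-1, -1, 0, 3, 3])) as ArrayRef;
let a2 = Arc::new(StringArray::from_iter_values(["a", "b", "c", "d", "d"])) as ArrayRef;
let arrays = vec![a1, a2];
// Convert arrays to rows
let converter = RowConverter::new(vec![
    SortField::new(DataType::Int32),
    SortField::new(DataType::Utf8),
]).unwrap();
let rows = converter.convert_columns(&arrays).unwrap();
// Compare rows
for i in 0..4 {
    assert!(rows.row(i) <= rows.row(i + 1));
}
assert_eq!(rows.row(3), rows.row(4));
// Convert rows back to arrays
let converted = converter.convert_rows(&rows).unwrap();
assert_eq!(arrays, converted);
// Compare rows from different arrays
let a1 = Arc::new(Int32Array::from_iter_values([3, 4])) as ArrayRef;
let a2 = Arc::new(StringArray::from_iter_values(["e", "f"])) as ArrayRef;
let arrays = vec![a1, a2];
let rows2 = converter.convert_columns(&arrays).unwrap();
assert!(rows.row(4) < rows2.row(0));
assert!(rows.row(4) < rows2.row(1));
// Convert selection of rows back to arrays
let selection = [rows.row(0), rows2.row(1), rows.row(2), rows2.row(0)];
let converted = converter.convert_rows(selection).unwrap();
let c1 = converted[0].as_primitive::<Int32Type>();
assert_eq!(c1.values(), &[-1, 4, 0, 3]);
let c2 = converted[1].as_string::<i32>();
let c2_values: Vec<_> = c2.iter().flatten().collect();
assert_eq!(&c2_values, &["a", "f", "c", "e"]);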
Lexsort
The row format can also be used to implement a fast multi-column / lexicographic sort:
# use ;
# use ;