pub fn partition(columns: &[ArrayRef]) -> Result<Partitions, ArrowError>
Given a list of lexicographically sorted columns, computes the Partitions, where a partition consists of the set of consecutive rows with equal values.

Returns an error if no columns are specified or if the columns do not all have the same number of rows.
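The behaviour described above can be illustrated without Arrow at all. The following is a minimal sketch, not the library's implementation: `partition_ranges` is a hypothetical helper that works on plain `i64` slices rather than `ArrayRef`s and reports errors as `String` rather than `ArrowError`. It assumes the input is already sorted, and returning no ranges for zero-row input is a choice made for this sketch.

```rust
use std::ops::Range;

/// Hypothetical stand-in for `partition`, operating on plain `i64` slices
/// instead of Arrow arrays. Assumes the columns are lexicographically sorted.
fn partition_ranges(columns: &[&[i64]]) -> Result<Vec<Range<usize>>, String> {
    // Mirror the documented error cases: no columns, or mismatched row counts.
    let first = columns.first().ok_or("no columns specified")?;
    let num_rows = first.len();
    if columns.iter().any(|c| c.len() != num_rows) {
        return Err("columns have differing row counts".to_string());
    }

    let mut ranges = Vec::new();
    if num_rows == 0 {
        // Sketch-level choice: empty input yields no ranges.
        return Ok(ranges);
    }

    let mut start = 0;
    for row in 1..num_rows {
        // A new partition begins wherever any column changes value
        // relative to the previous row.
        if columns.iter().any(|c| c[row] != c[row - 1]) {
            ranges.push(start..row);
            start = row;
        }
    }
    // Close the final partition.
    ranges.push(start..num_rows);
    Ok(ranges)
}
```

Because each row is compared only with its predecessor, the ranges are meaningful only when equal rows are adjacent, which is why the input must be sorted.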
Example:
Given columns x, y and z, calling partition(&[x, y]) will divide the rows into ranges where the values of (x, y) are equal:
┌ ─ ┬───┬ ─ ─┌───┐─ ─ ┬───┬ ─ ─ ┐
    │ 1 │    │ 1 │    │ A │        Range: 0..1 (x=1, y=1)
├ ─ ┼───┼ ─ ─├───┤─ ─ ┼───┼ ─ ─ ┤
    │ 1 │    │ 2 │    │ B │
│   ├───┤    ├───┤    ├───┤     │
    │ 1 │    │ 2 │    │ C │        Range: 1..4 (x=1, y=2)
│   ├───┤    ├───┤    ├───┤     │
    │ 1 │    │ 2 │    │ D │
├ ─ ┼───┼ ─ ─├───┤─ ─ ┼───┼ ─ ─ ┤
    │ 2 │    │ 1 │    │ E │        Range: 4..5 (x=2, y=1)
├ ─ ┼───┼ ─ ─├───┤─ ─ ┼───┼ ─ ─ ┤
    │ 3 │    │ 1 │    │ F │        Range: 5..6 (x=3, y=1)
└ ─ ┴───┴ ─ ─└───┘─ ─ ┴───┴ ─ ─ ┘
      x        y        z          partition(&[x, y])
Example Code
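use std::sync::Arc;
use arrow_array::{ArrayRef, Int64Array, RecordBatch, StringArray};
use arrow_ord::partition::partition;

// Three columns, already sorted lexicographically by (x, y).
let batch = RecordBatch::try_from_iter(vec![
    ("x", Arc::new(Int64Array::from(vec![1, 1, 1, 1, 2, 3])) as ArrayRef),
    ("y", Arc::new(Int64Array::from(vec![1, 2, 2, 2, 1, 1])) as ArrayRef),
    ("z", Arc::new(StringArray::from(vec!["A", "B", "C", "D", "E", "F"])) as ArrayRef),
]).unwrap();

// Partition on the first two columns (x and y); z does not affect the result.
let ranges = partition(&batch.columns()[..2]).unwrap().ranges();

// Each range covers consecutive rows that share the same (x, y) values.
let expected = vec![
    0..1,
    1..4,
    4..5,
    5..6,
];
assert_eq!(ranges, expected);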
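
Once the ranges are known, a common next step is to use them to pull out the rows of another column that belong to each partition. The sketch below shows the idea on a plain slice; `group_by_ranges` is a hypothetical helper, not part of Arrow. With Arrow arrays, the analogous step would presumably be a call to `slice(offset, length)` with `offset = range.start` and `length = range.end - range.start`.

```rust
use std::ops::Range;

/// Hypothetical helper: borrow the sub-slice of `values` covered by each range.
/// Panics if a range extends past the end of `values`.
fn group_by_ranges<'a, T>(values: &'a [T], ranges: &[Range<usize>]) -> Vec<&'a [T]> {
    ranges
        .iter()
        // `Range<usize>` is `Clone` but not `Copy`, so clone it for indexing.
        .map(|r| &values[r.clone()])
        .collect()
}
```

Applied to the `z` column and the ranges from the example above, this yields the groups `["A"]`, `["B", "C", "D"]`, `["E"]` and `["F"]`.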