This RFC proposes adding support to the array API specification for finding the indices where elements should be inserted in order to maintain order.
Overview
Based on array comparison data, the API is available across all considered libraries.
Furthermore, all considered libraries support the side keyword argument, and all considered libraries, except for TensorFlow, support the sorter keyword argument.
JAX supports an additional kwarg, method, which is used based on device/size performance optimization considerations. PyTorch and TensorFlow support specifying the output data type, but differ in naming conventions.
All array libraries support one-dimensional arrays. PyTorch and TensorFlow generalize to any n-dimensional ndarray (stacking).
Prior Art
Proposal
def searchsorted(x1: array, x2: array, /, *, side="left", sorter=None)
- x1: one-dimensional array. If
sorter is None, x1 must be sorted in ascending order.
- x2: one-dimensional array.
- side: if "left", the returned index
i satisfies x1[i-1] < x2[j] <= x1[i]. Otherwise, if "right", the returned index i satisfies x1[i-1] <= x2[j] < x1[i]. If no suitable index, then i is either 0 or N, respectively, were N is the length of x1.
- sorter: array of integer indices that sort
x1 in ascending order (e.g., as might be produced via argsort).
Questions
- Should the API be extended to support stacking as in PyTorch/TensorFlow?
- Should the API support a scalar value for
x2? NumPy, PyTorch, JAX, Dask support scalars. CuPy and TensorFlow do not.
This RFC proposes adding support to the array API specification for finding the indices where elements should be inserted in order to maintain order.
Overview
Based on array comparison data, the API is available across all considered libraries.
Furthermore, all considered libraries support the
sidekeyword argument, and all considered libraries, except for TensorFlow, support thesorterkeyword argument.JAX supports an additional kwarg,
method, which is used based on device/size performance optimization considerations. PyTorch and TensorFlow support specifying the output data type, but differ in naming conventions.All array libraries support one-dimensional arrays. PyTorch and TensorFlow generalize to any n-dimensional ndarray (stacking).
Prior Art
Proposal
sorterisNone,x1must be sorted in ascending order.isatisfiesx1[i-1] < x2[j] <= x1[i]. Otherwise, if "right", the returned indexisatisfiesx1[i-1] <= x2[j] < x1[i]. If no suitable index, theniis either0orN, respectively, wereNis the length ofx1.x1in ascending order (e.g., as might be produced viaargsort).Questions
x2? NumPy, PyTorch, JAX, Dask support scalars. CuPy and TensorFlow do not.