The standard is not clear about what should happen in cumulative_sum for 0-D inputs: https://data-apis.org/array-api/latest/API_specification/generated/array_api.cumulative_sum.html#cumulative-sum
Note that NumPy and PyTorch have different conventions here:
```python
>>> import numpy as np
>>> np.cumsum(np.asarray(0))
array([0])
>>> import torch
>>> torch.cumsum(torch.asarray(0), dim=0)
tensor(0)
```
torch.cumsum unconditionally requires the dim argument, whereas np.cumsum defaults to computing over a flattened array if axis=None. The standard requires axis if the dimensionality is greater than 1. However, axis=0 doesn't really make sense for a 0-D array. NumPy also allows specifying axis=0 and gives the same result:
```python
>>> np.cumsum(np.asarray(0), axis=0)
array([0])
```
Furthermore, there is ambiguity here on what should happen for a 0-D input when include_initial=True. The standard says:
> if include_initial is True, the returned array must have the same shape as x, except the size of the axis along which to compute the cumulative sum must be N+1.
If the result should be 0-D, then clearly include_initial must do nothing, since there is no way to increase the number of elements in the result.
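To make the shape constraint concrete: for ndim >= 1, include_initial=True grows the reduced axis from N to N+1 by prepending the additive identity. The sketch below emulates that with np.cumsum (which has no include_initial parameter itself); it shows why no analogous construction exists for a 0-D input, where there is no axis to grow.

```python
import numpy as np

# Emulate include_initial=True for a 1-D input: prepend the additive
# identity, growing the axis from N to N+1.
x = np.asarray([1, 2, 3])
with_initial = np.concatenate([np.zeros(1, dtype=x.dtype), np.cumsum(x)])
print(with_initial)  # [0 1 3 6]  -- shape (4,) from shape (3,)

# For a 0-D x there is no axis along which to prepend anything, so a
# 0-D result cannot gain an element.
```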
This doesn't seem to have been discussed in the original pull request #653 or issue #597, and I don't recall it being brought up at the consortium meetings.
My suggested behavior would be:

- The result of cumulative_sum on a 0-D input x should be a 0-D output which is the same as x (i.e., it would work just like sum(x)). This matches the behavior that cumulative_sum always returns an output with the same dimensionality as the input.
- The include_initial flag would do nothing when the input is 0-D. One can read the existing text as already supporting this behavior, since "the axis along which to compute the cumulative sum" is vacuous.
- The axis argument must be None when the input is 0-D, or else the result is an error. This matches the usual "axis must be in the range [-ndim, ndim)" condition, which is not currently spelled out this way for cumulative_sum but is for other functions in the standard.
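The three rules above can be sketched as a hypothetical wrapper around np.cumsum (this mirrors the standard's parameter names but is in no way a reference implementation):

```python
import numpy as np

def cumulative_sum(x, axis=None, include_initial=False):
    """Hypothetical sketch of the proposed 0-D semantics."""
    x = np.asarray(x)
    if x.ndim == 0:
        if axis is not None:
            # The "axis in [-ndim, ndim)" rule is vacuous for ndim == 0.
            raise ValueError("axis must be None for a 0-D input")
        # 0-D in, 0-D out, same as sum(x); include_initial is a no-op.
        return x
    if axis is None:
        if x.ndim > 1:
            raise ValueError("axis is required for inputs with ndim > 1")
        axis = 0
    result = np.cumsum(x, axis=axis)
    if include_initial:
        # Prepend the additive identity along the reduction axis (N -> N+1).
        initial = np.zeros_like(np.take(result, [0], axis=axis))
        result = np.concatenate([initial, result], axis=axis)
    return result

print(cumulative_sum(np.asarray(0)))                              # 0
print(cumulative_sum(np.asarray([1, 2, 3]), include_initial=True))  # [0 1 3 6]
```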
Alternatively, we could leave the behavior unspecified. To me the above makes sense, but this does break with current cumsum conventions. On the other hand, since the name is different, it's not a big deal for libraries to change behavior between cumsum and cumulative_sum (this is at least the approach that NumPy has taken with some of the existing renames with breaking changes).