Implements accumulation functions in dpctl.tensor #1602
…t.cumulative_sum`. The Python bindings for these functions are implemented in a new submodule, `_tensor_accumulation_impl`.
Force-pushed from 53e927d to 761ecd4
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_154 ran successfully.
This resolves hangs in unique functions
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_156 ran successfully.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_158 ran successfully.
… trailing axis. Fixes a bug where, in some cases, output axes were not being permuted.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_161 ran successfully.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_163 ran successfully.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_165 ran successfully.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_167 ran successfully.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_188 ran successfully.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_189 ran successfully.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_190 ran successfully.
Force-pushed from cc3d88f to b88f8ee
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_190 ran successfully.
Force-pushed from b88f8ee to 5fd506c
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_191 ran successfully.
Indexers are made const, integral variables in kernels made const too. Made two-offset instances const references to avoid copying. Got rid of unused get_src_const_ptr methods in stack_t structs. Replaced auto with size_t as appropriate. Added const to make compiler analysis easier (and faster).
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_193 ran successfully.
…ting against closed form
Force-pushed from b1219af to 4bd02b4
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_190 ran successfully.
Returning data from `local_mem_acc` after the group barrier caused a race condition whenever that memory was later overwritten, which was especially obvious on CPU. Now the value is stored in a variable before the barrier and then returned.
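The read-before-barrier pattern described in this commit can be sketched in plain Python as an analogy. This is not the SYCL kernel code from the PR: `broadcast_rounds`, `scratch`, and `seen` are hypothetical names, a one-element list stands in for work-group local memory, and `threading.Barrier` stands in for the group barrier.

```python
import threading


def broadcast_rounds(n_workers=4, n_rounds=3):
    """Analogy only: each worker copies the shared value into a private
    variable *before* the second barrier, so a later overwrite of the
    shared slot cannot change what the worker ends up using."""
    scratch = [None]                         # stand-in for local memory
    barrier = threading.Barrier(n_workers)   # stand-in for a group barrier
    seen = [[] for _ in range(n_workers)]

    def worker(wid):
        for r in range(n_rounds):
            if wid == 0:
                scratch[0] = 100 + r         # leader publishes this round's value
            barrier.wait()                   # the write is now visible to everyone
            value = scratch[0]               # private copy taken before the barrier
            barrier.wait()                   # no overwrite until all copies are taken
            seen[wid].append(value)          # use the copy, never scratch[0] itself

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

If `seen[wid].append(scratch[0])` were used after the second barrier instead, the leader could already have written the next round's value, which is the kind of race the commit message describes.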
…ase size of test_logcumsumexp_basic
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_192 ran successfully.
…ion (#1624). Added comments explaining why barriers are needed.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_193 ran successfully.
Add empty new line after list item to make Sphinx happy.
Docstring edits for accumulation functions
oleksandr-pavlyk left a comment
LGTM! Thank you @ndgrigorian
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_195 ran successfully.
This pull request proposes implementation of `dpctl.tensor.cumulative_sum`, `dpctl.tensor.cumulative_prod`, and `dpctl.tensor.cumulative_logsumexp`.

`cumulative_sum` is already part of the array API standard and is implemented as per the spec. `cumulative_prod` and `cumulative_logsumexp` are implemented with a similar API, including an `include_initial` keyword argument.
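The semantics of the three functions on a 1-D input can be sketched with a pure-Python reference. It does not call dpctl, and the `ref_*` names are hypothetical. The array API standard specifies that `include_initial=True` prepends the additive identity (zero) for `cumulative_sum`. Using 1 for the product and negative infinity for the log-sum-exp is an assumption made here by analogy, not something stated in this PR.

```python
import math
import operator
from itertools import accumulate


def _logaddexp(a, b):
    # Stable log(exp(a) + exp(b)); -inf behaves as the identity element.
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def _ref_accumulate(values, op, identity, include_initial):
    # Running reduction; optionally prepend the identity, giving N + 1 outputs.
    out = list(accumulate(values, op))
    return [identity] + out if include_initial else out


def ref_cumulative_sum(values, include_initial=False):
    return _ref_accumulate(values, operator.add, 0, include_initial)


def ref_cumulative_prod(values, include_initial=False):
    # Identity 1 is an assumption (multiplicative identity).
    return _ref_accumulate(values, operator.mul, 1, include_initial)


def ref_cumulative_logsumexp(values, include_initial=False):
    # Identity -inf is an assumption (log of zero).
    return _ref_accumulate(values, _logaddexp, -math.inf, include_initial)
```

For example, `ref_cumulative_sum([1, 2, 3], include_initial=True)` yields a four-element list whose first entry is the identity, so the output is one element longer than the input along the accumulated axis.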