Merged
View rendered docs @ https://intelpython.github.io/dpctl/pulls/837/index.html
On my development Gen9 box this change reduces the test suite run time from 85 seconds to 54 seconds.
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞
Improved efficiency of pybind11 classes and type casters.

Type casters in pybind11 generated with the `PYBIND11_TYPE_CASTER` macro rely on the default constructor of the C++ type. The default-constructed value is then overwritten by the `load` method. C++ types such as `sycl::device` and `sycl::queue` do a non-trivial amount of work in their default constructors. This work ranked high in certain workloads (such as `example/pybind11/onemkl_gemv/sycl_timing_solver.py`).

The same applies to the auto-generated type casters for the classes `dpctl::memory::usm_memory` and `dpctl::tensor::usm_ndarray`.

This PR introduces the `DPCTL_TYPE_CASTER` macro, which defines a `unique_ptr` to the C++ type rather than the type itself. This avoids the superfluous call to the default constructor altogether.

Instead of redefining the `pyobject_caster` automatically generated for `dpctl::tensor::usm_ndarray` and `dpctl::memory::usm_memory`, a singleton class `dpctl::detail::dpctl_api` is added that creates the objects used by the default constructors of these classes.
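
To illustrate why holding a `unique_ptr` avoids the extra default construction, here is a minimal, pybind11-free sketch. It is not the code from this PR: `ExpensiveHandle`, `ValueCaster`, `PtrCaster` and `default_ctor_calls_for` are hypothetical names, with `ExpensiveHandle` standing in for a type such as `sycl::queue` whose default constructor is costly, and a `std::string` standing in for the Python object being converted.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a type whose default constructor is expensive.
// It counts how many times the default constructor runs.
struct ExpensiveHandle {
    static int &default_ctor_calls() {
        static int n = 0;
        return n;
    }

    std::string name;

    ExpensiveHandle() : name("default") { ++default_ctor_calls(); }
    explicit ExpensiveHandle(std::string n) : name(std::move(n)) {}
};

// Caster that stores the value itself (the layout assumed here for a
// PYBIND11_TYPE_CASTER-style caster): creating the caster
// default-constructs the value, and load() then overwrites it.
struct ValueCaster {
    ExpensiveHandle value; // default-constructed when the caster is created

    bool load(const std::string &src) {
        value = ExpensiveHandle(src); // discards the default-constructed value
        return true;
    }
    ExpensiveHandle &get() { return value; }
};

// Caster that stores a unique_ptr: the pointer starts empty, so no
// ExpensiveHandle exists until load() creates the real one.
struct PtrCaster {
    std::unique_ptr<ExpensiveHandle> value;

    bool load(const std::string &src) {
        value.reset(new ExpensiveHandle(src));
        return true;
    }
    ExpensiveHandle &get() {
        assert(value && "load() must succeed before get()");
        return *value;
    }
};

// Runs one conversion through the given caster type and returns how many
// times the default constructor of ExpensiveHandle was called.
template <typename Caster>
int default_ctor_calls_for(const std::string &src) {
    ExpensiveHandle::default_ctor_calls() = 0;
    Caster caster;
    caster.load(src);
    assert(caster.get().name == src);
    return ExpensiveHandle::default_ctor_calls();
}
```

In this sketch, `default_ctor_calls_for<ValueCaster>("gpu")` reports one default construction per conversion, while `default_ctor_calls_for<PtrCaster>("gpu")` reports none. The trade-off is one heap allocation per conversion in exchange for skipping the default constructor.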