Tags: ROCm/rocPRIM
[rocPRIM][hipCUB][rocThrust][rocRAND] CP Versioning and changelog updates for 7.2 release (#3331)
## Motivation
Changelog entries, internal version numbers, and rocPRIM/rocRAND dependency release branches need to be updated for the 7.2 release.
## Technical Details
Updates the items mentioned above. Note that hipRAND has no noteworthy changes for 7.2.
## Test Plan
Run a build and make sure there are no CMake errors. View the changelogs to make sure there are no formatting errors.
## Test Result
No build issues.
## Submission Checklist

---------
Co-authored-by: Wayne Franz <wayfranz@amd.com>
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
cherry-pick to 7.1.1: feat(rocprim): add environment variable switch to enable atomic block id (#2271)
DO NOT MERGE UNTIL PM APPROVAL GIVEN.
In some cases, using an atomic block ID is desired. This PR introduces a
way to enable/disable this via an environment variable.
This change is quite significant and requires changes to most lookback
scan algorithms to reduce logical complexity in handling kernel
variants.
The changes are as follows:
* In `device/detail/ordered_block_id`: introduce
`check_if_using_atomic_block_id`.
* This is runtime host logic that decides whether an atomic block id is
desired.
* Change multiple algorithms to use `std::variant` and `std::visit` to
encapsulate and abstract multiple compile-time kernel variants.
* In `detail/various.hpp`: introduce `detail::constexpr_type_variant`
and `detail::constexpr_value_variant` to help generate variant types.
* We use a variadic fold expression to generate a switch that selects
compile-time variants at run time. This is equivalent to:
```cpp
if (use_a) {
    f<true>();
} else {
    f<false>();
}
```
Only now the intermediate result is stored as a `std::integral_constant`:
```cpp
if (use_a) {
    my_variant = std::integral_constant<bool, true>{};
} else {
    my_variant = std::integral_constant<bool, false>{};
}
```
Writing this out by hand for every variant is tedious, hence the fold
expression.
* Simplify sleepy lookback scan variant logic in various algorithms.
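The variant-based dispatch described in the list above can be sketched roughly as follows. This is a minimal illustration and not the actual rocPRIM code: `value_variant`, `select_value`, `launch_variant` and `dispatch` are hypothetical names, and the real `detail::constexpr_value_variant` helper may look quite different. It assumes C++17.

```cpp
#include <type_traits>
#include <variant>

// Hypothetical helper: a variant with one std::integral_constant alternative
// per candidate compile-time value.
template<class T, T... Values>
using value_variant = std::variant<std::integral_constant<T, Values>...>;

// Map a run-time value onto the matching compile-time alternative.
// The fold over || acts like a generated switch and stops at the first match.
// If nothing matches, the first alternative is kept (an assumption of this sketch).
template<class T, T... Values>
value_variant<T, Values...> select_value(T runtime_value)
{
    value_variant<T, Values...> result; // holds the first alternative by default
    (void)((runtime_value == Values
                ? (result = std::integral_constant<T, Values>{}, true)
                : false)
           || ...);
    return result;
}

// Hypothetical stand-in for a kernel launch with one instantiation per flag value.
template<bool UseAtomicBlockId>
int launch_variant()
{
    return UseAtomicBlockId ? 1 : 0;
}

// std::visit recovers the compile-time value from whichever alternative is active.
inline int dispatch(bool use_atomic_block_id)
{
    const auto selected = select_value<bool, false, true>(use_atomic_block_id);
    return std::visit(
        [](auto constant) { return launch_variant<decltype(constant)::value>(); },
        selected);
}
```

Calling `dispatch(flag)` instantiates both `launch_variant<false>` and `launch_variant<true>` at compile time and picks one at run time, which is the behaviour the hand-written `if`/`else` has.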
- Verify that disabling atomics works by launching with
`ROCPRIM_USE_ATOMIC_BLOCK_ID=0`.
- Verify that force-enabling atomics works by launching with
`ROCPRIM_USE_ATOMIC_BLOCK_ID=2`.
- Verify default behaviour by launching with no environment variables
and with `ROCPRIM_USE_ATOMIC_BLOCK_ID=1`.
Debug builds that target many device architectures may result in
linker errors due to symbols being too far apart.
- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
---------
Co-authored-by: Nara <nara@streamhpc.com>
Co-authored-by: Cenxuan Tian <cenxuan@streamhpc.com>
Co-authored-by: Nick Breed <nick@streamhpc.com>
docs(rocprim): add missing docs to transform output iterator (#1775) This PR adds missing docs for transform_output_iterator. Co-authored-by: Nara Prasetya <nara@streamhpc.com>
[rocm-libraries] ROCm/rocm-libraries#1401 (commit 609fac3)
Add cstring header include to texture_cache_iterator
Do not merge this PR until given PM approval as it is targeting ROCm 7.0.1.
## Motivation
If a user includes the texture_cache_iterator header in some way without also including cstring, then compilation will fail.
## Technical Details
In https://github.com/ROCm/rocm-libraries/blob/develop/projects/rocprim/rocprim/include/rocprim/iterator/texture_cache_iterator.hpp#L178 rocprim references memset, which requires the cstring header. It is not included in the texture_cache_iterator header, leading to the compilation failure.
## Test Plan
Even an empty program that includes the rocprim header will fail to compile as the main rocprim header includes texture_cache_iterator.hpp:
```
#include <rocprim/rocprim.hpp>
int main() { return 0; }
```
## Test Result
The above compiles with this fix.
## Submission Checklist
- [ ] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
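The general pattern behind this fix, sketched outside rocPRIM: a header that calls `memset` should include `<cstring>` itself rather than rely on the including file to do so. `descriptor_stub` and `make_zeroed_descriptor` are hypothetical names for illustration and do not appear in texture_cache_iterator.hpp.

```cpp
#include <cstring> // declares std::memset; included here so the code does not
                   // depend on whatever the including file happened to pull in

// Hypothetical plain struct standing in for a descriptor that gets zero-filled.
struct descriptor_stub
{
    int width;
    int height;
};

// Zero-initialise the descriptor with memset, the call that needs <cstring>.
inline descriptor_stub make_zeroed_descriptor()
{
    descriptor_stub desc;
    std::memset(&desc, 0, sizeof(desc));
    return desc;
}
```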
[rocm-libraries] ROCm/rocm-libraries#1036 (commit 539ba71)
[rocPRIM][Code Coverage][Cherry Pick] Increase code coverage for rocPRIM (#1036)
In this commit we have added test coverage, made some bug fixes and added a wrapper function for a common pattern in the tests. This `test_kernel_wrapper` function reduces the amount of code used.
In ROCm/rocm-libraries@0318f42 we have added coverage of the `warp_scan_shuffle` by forcing dpp off in cmake on the `warp_scan` test.
In ROCm/rocm-libraries@30d54c1 we have added the missing test cases for the thread_algos in the thread directory (except for thread_operators; these functions are mostly duplicates and probably need to be cleanly moved or deprecated). We also found a bug in `thread_reduce` with the tests and fixed it.
In ROCm/rocm-libraries@48780ff ROCm/rocm-libraries@37aa542 ROCm/rocm-libraries@05a9e12 ROCm/rocm-libraries@f5f1dbe ROCm/rocm-libraries@343e275 ROCm/rocm-libraries@8007fe6 we have added unit tests for the files in the types directory.
In ROCm/rocm-libraries@40ca3e2 we have introduced the `test_kernel_wrapper` and added the missing `PartitionTwoWayFlag` test.
In ROCm/rocm-libraries@7a4e7ba unit tests for the rocprim::tuple type are added.
In ROCm/rocm-libraries@29eac9a we have added extra test coverage by introducing a new type that will go into the untested path. We have also added this type to the benchmark to show [this specialization](ROCm/rocm-libraries@29eac9a#diff-22a70b2ad081732e222004ded43ce0db7145ff196f7b723289425dcb5b6c732dR228) is still needed. We also changed the `std::is_integral` to `rocprim::is_integral` to not include `(u)int128_t` for this specialization; this does not impact performance. The specialization enable_if was also slightly changed to make it clearer which path is chosen (this does not make a difference in the code actually executed). A custom config and an iterator were also added to the tests of merge_sort.
In ROCm/rocm-libraries@f5614d5 test coverage was added for the device_scan_common.hpp file.
In ROCm/rocm-libraries@1d2a40a test coverage was increased by actually using `const fixed_array` and including a `Level` type which goes into the base path of the sample_to_bin_even struct. We also added an iterator as a type to the test, and the new test wrapper was used.
In ROCm/rocm-libraries@c518f42 test coverage was increased for the iterators; a lot of the operators were missing. The tests were also cleaned up, using the wrapper where possible. A bug in the comparator of `arg_index_iterator` and `texture_cache_iterator` was also found and fixed. The `->` operator is not tested for `test_texture_cache`, `arg_index_iterator`, `transform_iterator` and `zip_iterator`. They currently do not seem to work; I am working on a possible fix, which will be added in a later PR.