Use SequentialReductionKernel for tree-reduction as well
1. Renamed a misspelled variable.
2. If reduction_nelems is small, use SequentialReductionKernel for tree reductions, as is already done for atomic reductions.
3. Tweaked the scaling-down logic used when the number of elements to reduce is moderate. max_wg is also used when iter_nelems is very small (one), since choosing max_wg for large iter_nelems may lead to under-utilization of the GPU.
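
The selection logic described in the message can be illustrated with a minimal host-side sketch. This is not the code from the commit: the thresholds `kSequentialThreshold` and `kModerateThreshold`, the divide-by-four scaling factor, and the helper names `choose_strategy` and `choose_work_group_size` are all hypothetical, and the sketch assumes that "scaling down" refers to the work-group size. Only `reduction_nelems`, `iter_nelems`, `max_wg`, and the idea of a sequential versus tree kernel come from the message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Which kind of kernel to launch for one reduction call.
enum class ReductionStrategy { Sequential, Tree };

// Hypothetical cut-offs; the real values are not stated in the commit message.
constexpr std::size_t kSequentialThreshold = 64;
constexpr std::size_t kModerateThreshold = 4096;

// Small reductions: one work-item reduces sequentially, no tree needed.
ReductionStrategy choose_strategy(std::size_t reduction_nelems)
{
    return (reduction_nelems < kSequentialThreshold)
               ? ReductionStrategy::Sequential
               : ReductionStrategy::Tree;
}

// Pick a work-group size for the tree reduction.
std::size_t choose_work_group_size(std::size_t iter_nelems,
                                   std::size_t reduction_nelems,
                                   std::size_t max_wg)
{
    assert(max_wg > 0);

    // A single independent reduction: give it the largest work-group.
    if (iter_nelems == 1) {
        return max_wg;
    }

    // Moderate reduction length with many independent reductions:
    // scale the work-group down (assumed factor of 4) so that more
    // groups can be resident on the device at once.
    if (reduction_nelems < kModerateThreshold) {
        return std::max<std::size_t>(1, max_wg / 4);
    }

    // Long reductions: use the full work-group size.
    return max_wg;
}
```

For example, under these assumed thresholds, `choose_work_group_size(1, 1000, 256)` returns `256`, while `choose_work_group_size(1024, 1000, 256)` returns `64`.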