
Use SequentialReductionKernel for tree-reduction as well
1. Renamed a misspelled variable.
2. If reduction_nelems is small, used SequentialReductionKernel
   for tree reductions, as is already done for atomic reductions.
3. Tweaked the scaling-down logic for a moderately sized number of
   elements to reduce.

   max_wg should also be used when iter_nelems is very small (one),
   since choosing max_wg for large iter_nelems may lead to
   under-utilization of the GPU.
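
The selection logic described in this message can be sketched roughly as follows. This is an illustrative reading only, not the code from the commit: the function names, the enum, and all three constants (`kSequentialThreshold`, `kModerateCutoff`, `kScaledDownWg`) are hypothetical, and the actual thresholds and scaling rule used in the changed file are not stated in the message.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical names and values, chosen only for illustration.
enum class ReductionStrategy { Sequential, Tree };

constexpr std::size_t kSequentialThreshold = 64;   // assumed cutoff for "small"
constexpr std::size_t kModerateCutoff = 4096;      // assumed cutoff for "moderate"
constexpr std::size_t kScaledDownWg = 256;         // assumed scaled-down work-group size

// Item 2: a small reduction is handled by a sequential kernel
// (each work-item reduces its own row) instead of a tree reduction.
ReductionStrategy choose_strategy(std::size_t reduction_nelems)
{
    return (reduction_nelems < kSequentialThreshold)
               ? ReductionStrategy::Sequential
               : ReductionStrategy::Tree;
}

// Item 3: pick a work-group size for the tree reduction.
std::size_t choose_work_group_size(std::size_t iter_nelems,
                                   std::size_t reduction_nelems,
                                   std::size_t max_wg)
{
    // A single independent reduction: use the largest work-group available.
    if (iter_nelems == 1) {
        return max_wg;
    }
    // Many independent reductions of moderate size: scale the work-group
    // down so that more groups can be resident on the device at once.
    if (reduction_nelems < kModerateCutoff) {
        return std::min(max_wg, kScaledDownWg);
    }
    return max_wg;
}
```

Under these assumed constants, `choose_work_group_size(1, 100, 1024)` yields `max_wg`, while the same reduction size with `iter_nelems = 1000` yields the scaled-down value.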
oleksandr-pavlyk committed Nov 2, 2023
1 parent 11ecba8 commit c742e79
Showing 1 changed file with 194 additions and 117 deletions.