Commit

Adds note regarding the __syncwarp function call.
dipietrantonio committed Dec 9, 2024
1 parent fb21be4 commit 94df0b3
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions src/form_beam.cpp
@@ -322,6 +322,15 @@ __global__ void vmBeamform_kernel(int nfine_chan,
workspace[thread_sm_idx + 4] = gpuCadd(workspace[thread_sm_idx + 4], workspace[(threadIdx.x + i) * 5 + 4]);
}
#ifdef __NVCC__
/* Cristian's note
This instruction is only needed on NVIDIA GPUs from the Volta architecture onwards,
which introduced independent thread scheduling
(https://stackoverflow.com/questions/70987051/independent-thread-scheduling-since-volta).
On such architectures, threads within a warp can execute independently of one another, so
one of them can "run ahead" of the others, possibly creating a race condition.
AMD GPUs, and NVIDIA GPUs prior to Volta, do not have this feature: all threads in a warp
execute the same instruction in lockstep (or a no-op when threads diverge),
so no thread can run ahead of the others in the same warp. */
__syncwarp();
#endif
}
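
The race described in the note can be illustrated with a minimal, standalone sketch. It is not taken from form_beam.cpp: the kernel `warp_sum_kernel`, the helper `run_warp_sum`, and the 32-element input are hypothetical names and values chosen for illustration, and error checking is omitted for brevity. A single warp reduces 32 integers through shared memory, and `__syncwarp()` separates each round so that no lane reads a partial sum before the lane that owns it has written it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, not part of form_beam.cpp: one warp (32 threads) sums
// 32 integers through shared memory.
__global__ void warp_sum_kernel(const int *in, int *out)
{
    __shared__ int workspace[32];
    const unsigned int lane = threadIdx.x;

    workspace[lane] = in[lane];
    __syncwarp(); // every lane's store must be visible before the first round

    for (unsigned int offset = 16; offset > 0; offset >>= 1) {
        if (lane < offset)
            workspace[lane] += workspace[lane + offset];
        // Without this barrier, on Volta and later a lane could start the next
        // round and read a partial sum that another lane has not written yet.
        __syncwarp();
    }

    if (lane == 0)
        *out = workspace[0];
}

// Hypothetical host helper: sums 1..32 on the GPU and returns the result.
int run_warp_sum()
{
    int h_in[32];
    for (int i = 0; i < 32; ++i)
        h_in[i] = i + 1;

    int *d_in = nullptr;
    int *d_out = nullptr;
    cudaMalloc(&d_in, sizeof(h_in));
    cudaMalloc(&d_out, sizeof(int));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    warp_sum_kernel<<<1, 32>>>(d_in, d_out); // exactly one warp

    int h_out = 0;
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}

int main()
{
    printf("sum = %d\n", run_warp_sum()); // 1 + 2 + ... + 32 = 528
    return 0;
}
```

The barrier sits outside the `if`, so all 32 lanes reach it with the default full mask. In code that must also build for AMD GPUs, the call would sit behind the same `#ifdef __NVCC__` guard used in the commit.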