Lesson 17: __syncthreads() and Race Conditions
When you stage data into shared memory and then read it across different threads, a problem arises: the threads in a block do not run in exact lockstep. One thread may reach the read line before another thread has finished writing the slot it needs — that is a race condition, which returns a stale v
Picture a group of people placing books on a shared shelf, after which each person needs to pick up a book that someone else placed. If anyone starts picking before everyone has finished placing, some will reach for an empty spot. __syncthreads() is the rule "nobody picks up until everyone has placed."
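The shelf analogy maps onto a kernel roughly as follows. This is a minimal sketch, not a canonical implementation: the kernel name reverse_in_block, the helper reverse_matches, the array name shelf, and the block size N = 256 are all hypothetical names chosen for illustration. Each thread writes one slot of shared memory and then reads the slot written by its mirror thread, with the barrier separating the two phases.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int N = 256;  // hypothetical block size, chosen for illustration

// Each thread places one value on the "shelf" (shared memory), then picks up
// the value placed by its mirror thread.
__global__ void reverse_in_block(const int* in, int* out) {
    __shared__ int shelf[N];
    int t = threadIdx.x;

    shelf[t] = in[t];           // write phase: every thread fills its own slot
    __syncthreads();            // nobody reads until every slot is filled
    out[t] = shelf[N - 1 - t];  // read phase: read a slot another thread wrote
}

// Hypothetical host helper: runs the kernel on 0..N-1 with a single block and
// reports whether the output is the exact reverse of the input.
bool reverse_matches() {
    int h_in[N], h_out[N];
    for (int i = 0; i < N; ++i) h_in[i] = i;

    int *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, sizeof(h_in)) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    reverse_in_block<<<1, N>>>(d_in, d_out);

    // The blocking copy back also surfaces errors from the kernel launch.
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < N; ++i) {
        if (h_out[i] != N - 1 - i) return false;
    }
    return true;
}

int main() {
    bool ok = reverse_matches();
    std::printf("reverse %s\n", ok ? "matches" : "MISMATCH");
    return ok ? 0 : 1;
}
```

Removing the __syncthreads() line turns this into the race described above: a thread in one warp may read shelf[N - 1 - t] before the thread in another warp has written it. The failure is timing-dependent, so the broken version may still appear to work on some runs.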
- __syncthreads()
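- A barrier within a block: every thread stops at this point until all threads in the block have arrived, and only then do they continue. Shared-memory writes made before the barrier are visible to every thread in the block after it.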
- race condition
- When the result depends on the order in which threads reach a write or a read. Here: reading a slot of shared memory before the thread responsible for it has finished writing.
- barrier
- A wait point that all threads in the block must reach before any of them proceeds. It prevents races between a write and a later read.
- barrier scope
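- __syncthreads() works only within a single block and has no effect across blocks. It must not be called in a divergent branch, that is, one that only some threads of the block take. If some threads skip the barrier, the behavior is undefined and the kernel will typically hang.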
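The barrier-scope rule can be sketched with an input whose length is not a multiple of the block size. This is an illustrative sketch under assumptions: the names rotate_within_block, rotate_matches, tile, and BLOCK = 128 are hypothetical. Each block owns a private tile, and the bounds check guards the memory accesses rather than the barrier, so every thread of every block still reaches __syncthreads().

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int BLOCK = 128;  // hypothetical block size, chosen for illustration

// Each thread outputs the value staged by the next thread of its own block
// (wrapping inside the block). Out-of-range slots are padded with 0.
__global__ void rotate_within_block(const int* in, int* out, int n) {
    __shared__ int tile[BLOCK];  // one private tile per block
    int t = threadIdx.x;
    int i = blockIdx.x * blockDim.x + t;

    // Unsafe variant, shown only as a comment:
    //   if (i < n) { tile[t] = in[i]; __syncthreads(); ... }
    // In the last block, threads with i >= n would skip the barrier.

    tile[t] = (i < n) ? in[i] : 0;  // guard the access, not the barrier
    __syncthreads();                // reached by every thread of the block
    if (i < n) out[i] = tile[(t + 1) % BLOCK];
}

// Hypothetical host helper: runs the kernel on n elements (n > 0) and checks
// the result against a CPU computation of the same rotation.
bool rotate_matches(int n) {
    std::vector<int> h_in(n), h_out(n, -1);
    for (int i = 0; i < n; ++i) h_in[i] = i + 1;  // nonzero, so padding is distinguishable

    size_t bytes = n * sizeof(int);
    int *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);

    int blocks = (n + BLOCK - 1) / BLOCK;  // last block may be partially filled
    rotate_within_block<<<blocks, BLOCK>>>(d_in, d_out, n);

    cudaError_t err = cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < n; ++i) {
        int base = (i / BLOCK) * BLOCK;           // first index of this block's tile
        int src = base + (i - base + 1) % BLOCK;  // neighbour inside the same tile
        int want = (src < n) ? h_in[src] : 0;     // 0 where the tile was padded
        if (h_out[i] != want) return false;
    }
    return true;
}

int main() {
    bool ok = rotate_matches(300);  // 300 is not a multiple of BLOCK
    std::printf("rotate %s\n", ok ? "matches" : "MISMATCH");
    return ok ? 0 : 1;
}
```

Because each block has its own tile and its own barrier, no block ever waits for, or sees the shared memory of, another block. Anything that needs ordering across blocks has to use a different mechanism, such as separate kernel launches.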