Lesson 18: Shared Memory Bank Conflicts
Shared memory is fast, but it is split into 32 parallel channels called banks. All 32 threads in a warp can read together in a single step, but only if each one targets a different bank. When two or more threads target different addresses in the same bank, a bank conflict occurs and the hardware forces them to wait in line. In this less
Imagine 32 checkout lanes in a store and 32 customers. If each customer goes to a different lane, they all pay at once. But if two customers crowd into the same lane, one waits for the other. Padding is like shifting each row of customers over by one lane, so that every customer once again gets a lane to themselves.
- bank
- One of the 32 parallel memory units that make up shared memory. The 32-bit (4-byte) word at index i belongs to bank i % 32.
- bank conflict
- When two or more threads in the same warp access different addresses that fall in the same bank, the hardware serializes the accesses (an n-way conflict takes about n times as long as a conflict-free access).
- broadcast
- When several or all threads in the warp read the exact same shared address, the value is fetched once and broadcast to all of them in one step, with no penalty.
- padding
- Adding a dummy column, tile[N][N+1] with N = 32, that shifts the starting bank of each row by one, so that a column access lands on 32 different banks.