Lesson 9: 2D Grids & Blocks
Matrices and images are naturally two-dimensional, so CUDA lets you organize threads in a 2D grid. Instead of a single x dimension, we have threadIdx.x and threadIdx.y, blockIdx.x and blockIdx.y, plus blockDim.x and blockDim.y. Each thread computes a row and a column: row = blockIdx.y * blockDim.y +
Picture a theater with rows and seats. To give a seat a single running number, count the full rows in front of it: each full row contributes width seats, so the seat number is row times width plus the column number. Every usher (thread) knows exactly its own row and column.
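The seat-numbering rule can be sketched as a small kernel. This is a minimal illustration, not a definitive implementation: the names flatten, write_flat_index, and flat_indices are hypothetical, the 16 x 16 block size is an arbitrary choice, and error handling is reduced to a single check.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: row-major flatten, the "seat number" of (row, col).
__host__ __device__ inline int flatten(int row, int col, int width) {
    return row * width + col;
}

// Each thread finds its own (row, col) and writes its flat index there.
__global__ void write_flat_index(int *out, int width, int height) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;  // y dimension -> row
    int col = blockIdx.x * blockDim.x + threadIdx.x;  // x dimension -> column

    // The grid can overshoot the matrix edges, so out-of-range threads do nothing.
    if (row < height && col < width) {
        int idx = flatten(row, col, width);
        out[idx] = idx;
    }
}

// Hypothetical host wrapper: launches the kernel and copies the result back.
// If device allocation fails, the returned vector still holds -1 everywhere.
std::vector<int> flat_indices(int width, int height) {
    std::vector<int> host(static_cast<std::size_t>(width) * height, -1);
    std::size_t bytes = host.size() * sizeof(int);

    int *dev = nullptr;
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return host;

    dim3 block(16, 16);  // assumed block size: 16 columns by 16 rows
    dim3 grid((width + block.x - 1) / block.x,    // enough blocks to cover every column
              (height + block.y - 1) / block.y);  // enough blocks to cover every row
    write_flat_index<<<grid, block>>>(dev, width, height);

    cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return host;
}
```

With width 8, the thread at row 2, column 3 lands on flat index 2 * 8 + 3 = 19, exactly as in the theater count: two full rows of 8 seats, then 3 more.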
- 2D grid
- Organizing threads in two dimensions (x and y), convenient for matrices and images where each element has a row and a column.
- row and col
- row comes from the y dimension (blockIdx.y, threadIdx.y) and col from the x dimension (blockIdx.x, threadIdx.x). Together they form the thread's 2D coordinate.
- row-major flatten
- Turning (row, col) into a flat index: idx = row * width + col, because the matrix is stored in memory one row after another.
- dim3
- A three-component type (x, y, z) describing block or grid size, for example dim3 block(16, 16). Components you leave out default to 1, so this block has z = 1.
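To make the dim3 entry concrete, here is a small host-side sketch. The helper names grid_for and threads_per_block are hypothetical, introduced only for illustration. It shows how a block shape such as dim3 block(16, 16) might be turned into a grid large enough to cover a width x height matrix.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: the smallest grid of `block`-shaped tiles that
// covers a matrix with `width` columns and `height` rows.
dim3 grid_for(int width, int height, dim3 block) {
    unsigned gx = (width + block.x - 1) / block.x;   // ceiling division along x (columns)
    unsigned gy = (height + block.y - 1) / block.y;  // ceiling division along y (rows)
    return dim3(gx, gy);                             // z is omitted, so it defaults to 1
}

// Hypothetical helper: threads in one block, the product of all three components.
unsigned threads_per_block(dim3 block) {
    return block.x * block.y * block.z;
}
```

For a 100 x 50 matrix and a 16 x 16 block, this gives a 7 x 4 grid: 7 * 16 = 112 columns and 4 * 16 = 64 rows of threads, slightly more than needed, which is why a kernel launched this way should check row < height and col < width before touching memory.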