Lesson 5: 'Performance Mode' — The Two Switches Before You Run

Still at the intuition level, almost no code. Before running a model 'for real', you flip it into 'performance mode' — with two separate switches. One switch makes the answers stable; the other saves memory and speeds up the run. We'll understand what each switch does and why both are needed, and ge

One switch tells the model 'we're performing, not rehearsing' — no random drills. The other says 'don't take notes for later fixing' — no learning notebook.

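Behavior switch (eval): Puts the model into a stable run mode. It turns off training-only randomness such as dropout and makes layers like batch normalization use their stored statistics, so the same input gives the same answer from run to run.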
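Learning switch (no_grad / inference_mode): Stops the bookkeeping needed to compute gradients (the 'learning notes'). Inference involves no learning, so skipping that bookkeeping saves memory and speeds up the run. It does not change how layers behave, which is why the behavior switch is still needed.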
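To see that the two switches really are independent, here is a minimal sketch, assuming PyTorch. The toy model, its layer sizes, and the variable names are all made up for illustration; only `eval()`, `train()`, and `torch.no_grad()` come from the lesson itself.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy model: a dropout layer between two linear layers.
model = nn.Sequential(nn.Linear(4, 64), nn.Dropout(p=0.5), nn.Linear(64, 2))
x = torch.ones(1, 4)

# Both switches on: eval() for behavior, no_grad() for learning.
model.eval()
with torch.no_grad():
    out_a = model(x)
    out_b = model(x)
same_answer = torch.equal(out_a, out_b)      # dropout is off, answers match
keeps_notes = out_a.requires_grad            # no gradient bookkeeping

# Behavior switch only: answers are stable, but notes are still kept.
out_eval_only = model(x)
eval_only_keeps_notes = out_eval_only.requires_grad

# Learning switch only: no notes, but dropout is still random.
model.train()
with torch.no_grad():
    out_c = model(x)
    out_d = model(x)
train_answers_match = torch.equal(out_c, out_d)

print(same_answer, keeps_notes, eval_only_keeps_notes, train_answers_match)
```

With both switches flipped, the two answers match and nothing is recorded for learning. With only one of them, you get either stable answers that still carry notes, or no notes but answers that change from call to call. `torch.inference_mode()` can be used in place of `torch.no_grad()` in the same `with` position.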