Prompt Engineering & Evals

This lesson covers prompt engineering techniques and building an eval harness to test response quality.

Like a writer drafting several versions of an opening paragraph, prompt engineering tries different phrasings of a prompt to find the one that gets the best response from the model.

Few-shot: Including examples in the prompt to help the LLM understand the desired format.
Chain-of-thought: Prompting the LLM to reason step by step before giving its final answer, which tends to improve accuracy on multi-step problems.
Eval harness: An automated system that runs prompts against test cases and measures quality.
LLM-as-judge: Using one LLM to score the responses of another LLM.
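
The four terms above can be tied together in one small sketch. It is illustrative only, not a reference implementation: `build_few_shot_prompt`, `run_eval`, `keyword_judge`, `fake_model`, and the sample data are all hypothetical names invented for this example. The model is a canned stub so the code runs offline, and the judge is a simple substring check standing in for a real LLM-as-judge call.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


def build_few_shot_prompt(
    examples: List[Tuple[str, str]],
    question: str,
    chain_of_thought: bool = False,
) -> str:
    """Few-shot: put worked Q/A pairs ahead of the real question.

    With chain_of_thought=True the prompt ends with a step-by-step cue.
    """
    lines = []
    for example_question, example_answer in examples:
        lines.append(f"Q: {example_question}")
        lines.append(f"A: {example_answer}")
    lines.append(f"Q: {question}")
    lines.append("A: Let's think step by step." if chain_of_thought else "A:")
    return "\n".join(lines)


@dataclass
class EvalCase:
    question: str
    expected: str


def keyword_judge(response: str, expected: str) -> float:
    """Stand-in for LLM-as-judge: 1.0 if the expected text appears, else 0.0.

    A real judge would send the response to a second model with a rubric.
    """
    return 1.0 if expected.strip().lower() in response.strip().lower() else 0.0


def run_eval(
    model: Callable[[str], str],
    cases: List[EvalCase],
    examples: List[Tuple[str, str]],
    judge: Callable[[str, str], float] = keyword_judge,
) -> float:
    """Eval harness: run every test case through the model, return the mean score."""
    scores = []
    for case in cases:
        prompt = build_few_shot_prompt(examples, case.question)
        scores.append(judge(model(prompt), case.expected))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical canned answers; the last one is deliberately wrong.
CANNED = {"2 + 2": "4", "3 + 5": "8", "10 - 7": "2"}


def fake_model(prompt: str) -> str:
    """Stub model: looks up the last question in the prompt (assumption: no real API)."""
    questions = [line[3:] for line in prompt.splitlines() if line.startswith("Q: ")]
    return CANNED.get(questions[-1], "unknown")


EXAMPLES = [("1 + 1", "2")]
CASES = [EvalCase("2 + 2", "4"), EvalCase("3 + 5", "8"), EvalCase("10 - 7", "3")]

if __name__ == "__main__":
    print(build_few_shot_prompt(EXAMPLES, "2 + 2", chain_of_thought=True))
    print(f"mean score: {run_eval(fake_model, CASES, EXAMPLES):.2f}")
```

To compare two phrasings, run `run_eval` once per prompt variant on the same cases and compare the mean scores. Swapping `fake_model` or `keyword_judge` for functions that call a real model leaves the harness itself unchanged.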