Testing And Evaluation

📖 10 lessons · 🔧 1 workshop · 🚀 1 project · ⏱️ ~11 hours
📖 Lessons

1
beginner · 📖 14 min · lesson

Why Testing AI Systems is Different

Understand the unique challenges of testing AI and LLM applications

testing, fundamentals, challenges, philosophy
2
beginner · 📖 15 min · lesson

Unit Testing LLM Applications

Learn practical techniques for unit testing individual components of LLM applications

unit-testing, mocking, components, testing
🔒
intermediate · 📖 16 min · lesson · PRO

Evaluation Metrics for LLMs

Learn how to measure LLM output quality with appropriate metrics

metrics, evaluation, quality, measurement
🔒
intermediate · 📖 15 min · lesson · PRO

Creating Test Datasets

Learn how to design comprehensive test datasets for LLM evaluation

datasets, test-cases, edge-cases, evaluation
🔒
intermediate · 📖 14 min · lesson · PRO

LLM-as-Judge Evaluation

Use LLMs to evaluate other LLM outputs at scale

llm-judge, automated-eval, scaling, metrics
🔒
intermediate · 📖 16 min · lesson · PRO

Human Evaluation

Design and implement effective human evaluation processes

human-eval, rating-scales, rubrics, quality
🔒
advanced · 📖 15 min · lesson · PRO

Integration Testing

Test complete LLM pipelines, RAG systems, and agents end-to-end

integration, pipelines, rag, agents, e2e
🔒
advanced · 📖 25 min · lesson · PRO

Workshop: Building Test Suites

Hands-on workshop building a complete testing framework for LLM applications

workshop, hands-on, testing-framework, automation, ci-cd
🔒
advanced · 📖 16 min · lesson · PRO

Regression Testing

Prevent quality degradation and maintain consistent performance

regression, versioning, baselines, continuous-eval
🔒
advanced · 📖 16 min · lesson · PRO

Production Monitoring

Monitor LLM quality and performance in real-time production environments

monitoring, production, observability, alerting, metrics