📖 Lessons
1
beginner · 📖 14 min · lesson
Why Testing AI Systems is Different
Understand the unique challenges of testing AI and LLM applications
testing, fundamentals, challenges, philosophy
2
beginner · 📖 15 min · lesson
Unit Testing LLM Applications
Learn practical techniques for unit testing individual components of LLM applications
unit-testing, mocking, components, testing
🔒
intermediate · 📖 16 min · lesson · PRO
Evaluation Metrics for LLMs
Learn how to measure LLM output quality with appropriate metrics
metrics, evaluation, quality, measurement
🔒
intermediate · 📖 15 min · lesson · PRO
Creating Test Datasets
Learn how to design comprehensive test datasets for LLM evaluation
datasets, test-cases, edge-cases, evaluation
🔒
intermediate · 📖 14 min · lesson · PRO
LLM-as-Judge Evaluation
Use LLMs to evaluate other LLM outputs at scale
llm-judge, automated-eval, scaling, metrics
🔒
intermediate · 📖 16 min · lesson · PRO
Human Evaluation
Design and implement effective human evaluation processes
human-eval, rating-scales, rubrics, quality
🔒
advanced · 📖 15 min · lesson · PRO
Integration Testing
Test complete LLM pipelines, RAG systems, and agents end-to-end
integration, pipelines, rag, agents, e2e
🔒
advanced · 📖 25 min · lesson · PRO
Workshop: Building Test Suites
Hands-on workshop building a complete testing framework for LLM applications
workshop, hands-on, testing-framework, automation, ci-cd
🔒
advanced · 📖 16 min · lesson · PRO
Regression Testing
Prevent quality degradation and maintain consistent performance
regression, versioning, baselines, continuous-eval
🔒
advanced · 📖 16 min · lesson · PRO
Production Monitoring
Monitor LLM quality and performance in real-time production environments
monitoring, production, observability, alerting, metrics