docs: Add how-to guide for aligning LLM-as-Judge #2348
Merged
Issue Link / Problem Description
Adds documentation and examples for aligning LLM-as-Judge evaluators with human expert judgments. This addresses a common challenge where LLM judges may not align well with human evaluations, leading to unreliable automated assessments.
Changes Made
- `docs/howtos/applications/align-llm-as-judge.md`: New how-to guide for aligning LLM-as-Judge evaluators with human expert judgments
- `examples/ragas_examples/judge_alignment/`
  - `evals.py`: Baseline and improved judge metrics with alignment measurement (a rough sketch of this measurement follows the list)
  - `__init__.py`: Module initialization with main entry points
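The alignment measurement in `evals.py` boils down to comparing the judge's verdicts against human expert labels on the same examples. The sketch below is a minimal, library-agnostic illustration of that idea, assuming simple pass/fail verdicts; the function and variable names are hypothetical and do not mirror the actual example code.

```python
# Minimal sketch of alignment measurement: how often does the LLM judge
# agree with the human expert label on the same examples?
# All names and data here are illustrative, not the PR's actual evals.py.

def alignment_rate(judge_verdicts: list[str], human_labels: list[str]) -> float:
    """Fraction of examples where the judge's verdict matches the human label."""
    if len(judge_verdicts) != len(human_labels):
        raise ValueError("verdicts and labels must have the same length")
    agreements = sum(j == h for j, h in zip(judge_verdicts, human_labels))
    return agreements / len(human_labels)


if __name__ == "__main__":
    # Toy data: the "improved" judge (e.g. after refining the judge prompt
    # with failure cases) agrees with humans more often than the baseline.
    human = ["pass", "fail", "pass", "pass", "fail"]
    baseline_judge = ["pass", "pass", "pass", "fail", "fail"]
    improved_judge = ["pass", "fail", "pass", "fail", "fail"]
    print(f"baseline alignment: {alignment_rate(baseline_judge, human):.0%}")  # 60%
    print(f"improved alignment: {alignment_rate(improved_judge, human):.0%}")  # 80%
```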
Testing
How to Test
- `uv run python -m ragas_examples.judge_alignment` (the entry-point wiring this relies on is sketched below)
- `make serve-docs` and navigate to the new guide
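For `python -m ragas_examples.judge_alignment` to run, the package needs a module-level entry point; the PR describes `__init__.py` as providing the main entry points. Below is a minimal sketch of the usual wiring, where the `__main__.py` file and the `main()` function are assumptions for illustration, not the PR's actual layout.

```python
# ragas_examples/judge_alignment/__main__.py  (hypothetical file)
# Standard pattern: `python -m <package>` executes the package's __main__
# module, which delegates to the entry point exported from __init__.py.
from ragas_examples.judge_alignment import main

if __name__ == "__main__":
    main()
```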