Feature Request: Add Support for Pydantic/Pydantic-AI Output Models in Python API #367

### Summary

Integrate DocETL’s Python API with Pydantic `BaseModel` output schemas so that structured LLM output can be validated and transformed.

### Motivation

Validating and post-processing LLM outputs is more robust when the expected structure is declared as an explicit schema with type checking and business-logic enforcement. Pydantic models (and pydantic-ai’s output models) let users define constraints, enforce allowed values, and encapsulate custom logic, all in Python code. This improves reliability, debuggability, and ease of maintenance for downstream users.

### Proposal

- Allow users to pass a Pydantic `BaseModel` as an output schema when defining DocETL tasks via the Python API.
- DocETL should parse the LLM output and validate/transform it using the provided model.
- On validation failure, users should be able to access detailed error messages or trigger fallback logic.
- Support advanced pydantic features like custom validators, default values, and business logic methods.
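
To make the proposal concrete, here is a minimal sketch of the validation step on its own, assuming Pydantic v2. It does not use or imply any DocETL API: `ReviewSummary` and `parse_llm_output` are hypothetical names chosen for illustration, and how a model would actually be passed to a DocETL task is left open.

```python
from typing import Literal, Optional

from pydantic import BaseModel, Field, ValidationError, field_validator


class ReviewSummary(BaseModel):
    """Hypothetical output schema for a single LLM response."""

    sentiment: Literal["positive", "neutral", "negative"]  # allowed values
    score: float = Field(ge=0.0, le=1.0)  # range constraint
    topics: list[str] = Field(default_factory=list)  # default value

    @field_validator("topics")
    @classmethod
    def normalize_topics(cls, value: list[str]) -> list[str]:
        # Custom validator: trim whitespace, lowercase, drop empty entries.
        return [t.strip().lower() for t in value if t.strip()]

    def is_actionable(self) -> bool:
        # Business-logic method that lives on the model itself.
        return self.sentiment == "negative" and self.score >= 0.5


def parse_llm_output(raw: str) -> tuple[Optional[ReviewSummary], list]:
    """Return (model, []) on success, or (None, error details) on failure."""
    try:
        return ReviewSummary.model_validate_json(raw), []
    except ValidationError as exc:
        # exc.errors() lists one entry per failing field, with location and
        # message, which a caller could log or use to trigger fallback logic.
        return None, exc.errors()
```

Returning the error list instead of raising is only one possible design; it keeps the fallback decision with the caller, which matches the third bullet above.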
