Readiness probes for prefill and decode pods needed for accurate status #345

@namasl

Description

Component

Helm Chart

Desired use case or feature

When decode and prefill pods are stood up with quickstart/llmd-installer.sh, they lack an appropriate readiness probe. As a result, pods are reported as "ready" before vLLM can serve: with large models, it takes multiple minutes after a pod is marked "ready" before it can actually accept requests.

Proposed solution

Add readiness probes that verify vLLM is ready to accept requests (for example, an httpGet probe against /health), along with readiness probes for any sidecars.
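
For illustration, the proposal could look roughly like the fragment below. This is a sketch under stated assumptions, not the chart's actual template: the container names, ports (8000 for vLLM, 8080 for the sidecar), and timing values are placeholders chosen for the example, and the sidecar is hypothetical. Only the httpGet probe against /health comes from the proposal itself.

```yaml
# Sketch only: container names, ports, and timing values are assumptions,
# not values taken from the llm-d Helm chart.
containers:
  - name: vllm                # assumed container name
    ports:
      - containerPort: 8000   # assumed vLLM serving port
    readinessProbe:
      httpGet:
        path: /health         # health endpoint named in the proposal
        port: 8000
      periodSeconds: 10       # probe every 10 seconds
      timeoutSeconds: 5
      failureThreshold: 3     # mark NotReady after 3 consecutive failures
  - name: sidecar             # hypothetical sidecar container
    ports:
      - containerPort: 8080   # assumed sidecar port
    readinessProbe:
      tcpSocket:
        port: 8080            # passes once the port accepts connections
      periodSeconds: 5
```

Kubernetes marks a pod Ready only when every container's readiness probe passes, so with probes like these the pod would stay out of Service endpoints until both vLLM and the sidecar respond. Whether /health only succeeds after the model is fully loaded should be verified against the vLLM version in use.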

Alternatives

No response

Additional context or screenshots

No response

Metadata

Labels

chart: Related to the Helm Chart
