Decode and prefill pods stood up with quickstart/llmd-installer.sh lack an appropriate readiness probe. With large models, the pods are reported as "ready" several minutes before they can actually accept requests.
Proposed solution
Add readiness probes that confirm vLLM is ready to accept requests (for example, an httpGet probe against /health), along with probes for any sidecars.
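
As a rough illustration of the proposal, here is a minimal Python sketch that builds a readinessProbe stanza for a vLLM container and prints it as JSON. The container name `vllm`, port 8000, and the timing values are assumptions for illustration only. They are not taken from the installer or its charts and would need to match the actual pod spec.

```python
import json


def readiness_probe(port, path="/health", period_seconds=10,
                    timeout_seconds=5, failure_threshold=3):
    """Build a Kubernetes readinessProbe that sends an HTTP GET to the container."""
    return {
        "httpGet": {"path": path, "port": port},
        "periodSeconds": period_seconds,        # how often the kubelet probes
        "timeoutSeconds": timeout_seconds,      # per-request timeout
        "failureThreshold": failure_threshold,  # consecutive failures before "not ready"
    }


# Assumed values: adjust to the real container name and serving port.
VLLM_PORT = 8000
container_patch = {
    "name": "vllm",
    "readinessProbe": readiness_probe(VLLM_PORT),
}

# Print the fragment so it can be merged into the container spec.
print(json.dumps(container_patch, indent=2))
```

With a probe like this, the pod should stay out of Service endpoints until /health returns a success status. Sidecars would need their own probes against whatever health endpoints they expose.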