Closed
Description
When deploying vLLM on k8s, a health check is configured for service stability. The options are:
- TCP check
- real LLM request
- health check API

The first two have drawbacks:
- A TCP check is a little too simple
- A real request consumes relatively more resources
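To illustrate why a TCP check is a little too simple, here is a minimal sketch using only the Python standard library. The helper name `tcp_check` and the local listener are hypothetical and exist only for this demonstration. The listener never accepts or answers a request, yet the TCP check still passes.

```python
import socket


def tcp_check(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection can be opened (all a TCP probe verifies)."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


# A socket that listens but never serves any request (stand-in for a hung server).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

# The check passes even though nothing would ever answer a request.
stuck_but_listening = tcp_check("127.0.0.1", port)

listener.close()
# Once the port is closed, the check fails as expected.
closed_port = tcp_check("127.0.0.1", port)
```

The check only proves that the port is open; it says nothing about whether the server can actually do work.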
Can we add a health check endpoint that performs a simple computation, as follows?
    @app.get("/healthz")
    async def health_check():
        """Health check"""
        # a simple compute
If this works, I would like to contribute a PR for it.
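A slightly fuller sketch of the idea, assuming a FastAPI app as the `@app.get` decorator suggests. The names `simple_compute`, `health_status`, `create_app`, and `EXPECTED` are hypothetical, and the sum of squares is only a CPU placeholder for "a simple compute"; a real check would presumably exercise the engine itself.

```python
import asyncio

EXPECTED = 85344  # sum of i*i for i in range(64)


def simple_compute(n: int = 64) -> int:
    """Placeholder for 'a simple compute': sum of squares of 0..n-1."""
    return sum(i * i for i in range(n))


async def health_status(timeout_s: float = 1.0) -> dict:
    """Run the compute in a worker thread and report whether it finished correctly in time."""
    try:
        result = await asyncio.wait_for(
            asyncio.to_thread(simple_compute), timeout_s
        )
    except asyncio.TimeoutError:
        return {"status": "unhealthy", "reason": "compute timed out"}
    if result != EXPECTED:
        return {"status": "unhealthy", "reason": "unexpected result"}
    return {"status": "ok"}


def create_app():
    # Imported lazily so the helpers above work without FastAPI installed.
    from fastapi import FastAPI
    from fastapi.responses import JSONResponse

    app = FastAPI()

    @app.get("/healthz")
    async def health_check():
        """Health check"""
        body = await health_status()
        # 503 tells the k8s probe that the pod is not healthy.
        code = 200 if body["status"] == "ok" else 503
        return JSONResponse(content=body, status_code=code)

    return app
```

A k8s HTTP liveness or readiness probe could then point at `/healthz`, which costs far less than sending a real LLM request.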