I replaced the model URI (hf://xxxxxxxx) with a PVC URI (pvc://pvc_name/model_path), but the PVC is not mounted into the decode and prefill pods that get started. Is there anything else that needs to be configured?
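To make the URI format I am using explicit: I read pvc://pvc_name/model_path as a claim name followed by a path inside that volume. Below is a minimal Python sketch of that split. `parse_pvc_uri` is a hypothetical helper written only for illustration. It is not part of llm-d, and it does not claim to match how the ModelService controller parses the URI.

```python
from urllib.parse import urlsplit


def parse_pvc_uri(uri: str) -> tuple[str, str]:
    """Split a pvc://<pvc_name>/<model_path> URI into (pvc_name, model_path).

    Hypothetical helper for illustration only; the layout
    pvc://<pvc_name>/<model_path> is an assumption taken from this issue.
    """
    parts = urlsplit(uri)
    if parts.scheme != "pvc":
        raise ValueError(f"expected a pvc:// URI, got scheme {parts.scheme!r}")
    if not parts.netloc:
        raise ValueError("missing PVC name in URI")
    # netloc holds the claim name; the rest is the path inside the volume.
    return parts.netloc, parts.path.lstrip("/")


pvc_name, model_path = parse_pvc_uri(
    "pvc://pvc-45690f60056e40698a99669d9d060543/Qwen2.5-VL-3B-Instruct"
)
print(pvc_name)    # claim that I expect to be mounted into the pods
print(model_path)  # directory inside the claim that holds the model
```

With the URI from my spec, I would expect the claim `pvc-45690f60056e40698a99669d9d060543` to be mounted and `Qwen2.5-VL-3B-Instruct` to be the model directory inside it.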
This is the YAML of my ModelService (msvc):
apiVersion: llm-d.ai/v1alpha1
kind: ModelService
metadata:
  creationTimestamp: "2025-07-27T02:16:44Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: Sail
  name: llm-3
  namespace: admin-default
  resourceVersion: "89810862"
  uid: b2194e6d-d622-4050-978a-6cc4b29c5477
spec:
  baseConfigMapRef:
    name: basic-sim-preset
  decode:
    containers:
    - args:
      - --model
      - /models
      name: vllm
      resources:
        limits:
          cpu: 1000m
          memory: 1024Mi
        requests:
          cpu: 1000m
          memory: 1024Mi
    replicas: 1
  decoupleScaling: false
  endpointPicker:
    containers:
    - name: epp
      resources:
        limits:
          cpu: 1000m
          memory: 1024Mi
        requests:
          cpu: 1000m
          memory: 1024Mi
    replicas: 1
  modelArtifacts:
    uri: pvc://pvc-45690f60056e40698a99669d9d060543/Qwen2.5-VL-3B-Instruct
  prefill:
    containers:
    - args:
      - --model
      - /models
      name: vllm
      resources:
        limits:
          cpu: 1000m
          memory: 1024Mi
        requests:
          cpu: 1000m
          memory: 1024Mi
    replicas: 1
  routing:
    modelName: admin-default/Qwen2.5-VL-3B-Instruct
status:
  conditions:
  - lastTransitionTime: "2025-07-27T02:16:51Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: PrefillAvailable
  - lastTransitionTime: "2025-07-27T02:16:51Z"
    message: ReplicaSet "llm-3-prefill-746cb9b4" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: PrefillProgressing
  - lastTransitionTime: "2025-07-27T02:16:51Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: DecodeAvailable
  - lastTransitionTime: "2025-07-27T02:16:51Z"
    message: ReplicaSet "llm-3-decode-5bc7dfd998" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: DecodeProgressing
  - lastTransitionTime: "2025-07-27T02:16:56Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: EppAvailable
  - lastTransitionTime: "2025-07-27T02:16:56Z"
    message: ReplicaSet "llm-3-epp-7569bd4c5c" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: EppProgressing
  decodeAvailable: 1
  decodeDeploymentRef: llm-3-decode
  decodeReady: 1/1
  eppAvailable: 1
  eppDeploymentRef: llm-3-epp
  eppReady: 1/1
  eppRoleBinding: llm-3-epp-rolebinding
  inferenceModelRef: llm-3
  inferencePoolRef: llm-3-inference-pool
  prefillAvailable: 1
  prefillDeploymentRef: llm-3-prefill
  prefillReady: 1/1
  prefillServiceAccountRef: llm-3-epp-sa