How do I use PVC in msvc? #375

@GGGsk

I replaced the model URI (hf://xxxxxxxx) with a PVC URI (pvc://pvc_name/model_path), but the PVC is not mounted into the decode and prefill pods that get started. Is there anything else that needs to be configured?

This is the YAML of my ModelService (msvc):

apiVersion: llm-d.ai/v1alpha1
kind: ModelService
metadata:
  creationTimestamp: "2025-07-27T02:16:44Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: Sail
  name: llm-3
  namespace: admin-default
  resourceVersion: "89810862"
  uid: b2194e6d-d622-4050-978a-6cc4b29c5477
spec:
  baseConfigMapRef:
    name: basic-sim-preset
  decode:
    containers:
    - args:
      - --model
      - /models
      name: vllm
      resources:
        limits:
          cpu: 1000m
          memory: 1024Mi
        requests:
          cpu: 1000m
          memory: 1024Mi
    replicas: 1
  decoupleScaling: false
  endpointPicker:
    containers:
    - name: epp
      resources:
        limits:
          cpu: 1000m
          memory: 1024Mi
        requests:
          cpu: 1000m
          memory: 1024Mi
    replicas: 1
  modelArtifacts:
    uri: pvc://pvc-45690f60056e40698a99669d9d060543/Qwen2.5-VL-3B-Instruct
  prefill:
    containers:
    - args:
      - --model
      - /models
      name: vllm
      resources:
        limits:
          cpu: 1000m
          memory: 1024Mi
        requests:
          cpu: 1000m
          memory: 1024Mi
    replicas: 1
  routing:
    modelName: admin-default/Qwen2.5-VL-3B-Instruct
status:
  conditions:
  - lastTransitionTime: "2025-07-27T02:16:51Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: PrefillAvailable
  - lastTransitionTime: "2025-07-27T02:16:51Z"
    message: ReplicaSet "llm-3-prefill-746cb9b4" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: PrefillProgressing
  - lastTransitionTime: "2025-07-27T02:16:51Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: DecodeAvailable
  - lastTransitionTime: "2025-07-27T02:16:51Z"
    message: ReplicaSet "llm-3-decode-5bc7dfd998" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: DecodeProgressing
  - lastTransitionTime: "2025-07-27T02:16:56Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: EppAvailable
  - lastTransitionTime: "2025-07-27T02:16:56Z"
    message: ReplicaSet "llm-3-epp-7569bd4c5c" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: EppProgressing
  decodeAvailable: 1
  decodeDeploymentRef: llm-3-decode
  decodeReady: 1/1
  eppAvailable: 1
  eppDeploymentRef: llm-3-epp
  eppReady: 1/1
  eppRoleBinding: llm-3-epp-rolebinding
  inferenceModelRef: llm-3
  inferencePoolRef: llm-3-inference-pool
  prefillAvailable: 1
  prefillDeploymentRef: llm-3-prefill
  prefillReady: 1/1
  prefillServiceAccountRef: llm-3-epp-sa
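
For reference, the modelArtifacts.uri above follows the pvc://pvc_name/model_path pattern. Here is a minimal Python sketch of how such a URI splits into a claim name and a path inside the volume. It only assumes ordinary URL parsing; the helper name split_pvc_uri is hypothetical and this is not how llm-d itself necessarily parses the field.

```python
from urllib.parse import urlparse


def split_pvc_uri(uri: str) -> tuple[str, str]:
    """Split 'pvc://<pvc_name>/<model_path>' into (pvc_name, model_path).

    Hypothetical helper for illustration only; not part of llm-d.
    """
    parsed = urlparse(uri)
    # The claim name sits where a hostname normally would (the netloc).
    if parsed.scheme != "pvc" or not parsed.netloc:
        raise ValueError(f"not a pvc:// URI: {uri!r}")
    # Everything after the claim name is the path inside the volume.
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    name, path = split_pvc_uri(
        "pvc://pvc-45690f60056e40698a99669d9d060543/Qwen2.5-VL-3B-Instruct"
    )
    print("claim:", name)
    print("path:", path)
```

With the URI from my spec, the claim name would be pvc-45690f60056e40698a99669d9d060543 and the model path Qwen2.5-VL-3B-Instruct, so I would expect a volume backed by that claim to appear in the decode and prefill pod specs.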
