Description
/kind bug
What steps did you take and what happened:
In the KServe documentation, one option for authenticating with Hugging Face is to add the secret to the InferenceService (full manifest below; it is identical to the documentation):

```yaml
env:
  - name: HF_TOKEN # Option 2 for authenticating with HF_TOKEN
    valueFrom:
      secretKeyRef:
        name: hf-secret
        key: HF_TOKEN
        optional: false
```
This passes the HF_TOKEN to the kserve-container, but not to the storage-initializer, leading to the following error in the storage-initializer container:

```
Access to model meta-llama/Meta-Llama-3-8B-Instruct is restricted. You must have access to it and be authenticated to access it. Please log in.
```
(Note that I have created the secret, and it is used by the kserve-container as expected.)
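To illustrate the underlying mechanism: environment variables in a Kubernetes pod are scoped per container, so a variable set on the predictor container is invisible to the storage-initializer. A minimal Python sketch of this (the container names match the pod, but the env contents are illustrative, not taken from an actual pod spec):

```python
# Sketch: env vars in a Kubernetes pod are scoped per container.
# Setting HF_TOKEN on kserve-container does not expose it to the
# storage-initializer container that downloads the model.
pod_containers = {
    "kserve-container": {"env": {"HF_TOKEN": "hf_xxx"}},  # from the InferenceService env
    "storage-initializer": {"env": {}},                   # nothing injected here
}

def has_hf_token(container: str) -> bool:
    """True if the given container would see HF_TOKEN in its environment."""
    return "HF_TOKEN" in pod_containers[container]["env"]

print(has_hf_token("kserve-container"))     # True
print(has_hf_token("storage-initializer"))  # False -> gated download fails
```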
I can resolve this by adding

```yaml
spec:
  container:
    env:
      - name: HF_TOKEN
        valueFrom:
          secretKeyRef:
            key: HF_TOKEN
            name: hf-secret
            optional: true
```

to the ClusterStorageContainer, in which case the deployment works, but this isn't mentioned in the documentation (and isn't configurable for end users).
What's the InferenceService yaml:
Copied from the KServe documentation:

```yaml
kind: InferenceService
metadata:
  name: huggingface-llama3
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface
      args:
        - --model_name=llama3
        - --model_dir=/mnt/models
      storageUri: hf://meta-llama/meta-llama-3-8b-instruct
      resources:
        limits:
          cpu: "6"
          memory: 24Gi
          nvidia.com/gpu: "1"
        requests:
          cpu: "6"
          memory: 24Gi
          nvidia.com/gpu: "1"
      env:
        - name: HF_TOKEN # Option 2 for authenticating with HF_TOKEN
          valueFrom:
            secretKeyRef:
              name: hf-secret
              key: HF_TOKEN
              optional: false
```
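For completeness, the hf-secret referenced above can be created from a manifest along these lines (a sketch; the stringData value is a placeholder for your own token):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: hf-secret
type: Opaque
stringData:
  HF_TOKEN: <your-huggingface-token>   # placeholder
```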
Environment:
- KServe Version: 0.15.0 (originally on 0.14.1, but tested after upgrading)
Possible I'm missing something obvious here, apologies if so!
(BTW, the meta-llama/Meta-Llama-3-8B-Instruct model used in the documentation also seems to require the user to manually request access on Hugging Face; it may be worth putting a note about this in the docs.)