Commit 52247f4

[OpenVINO] Fix regression from vllm-project#8346
Signed-off-by: Peter Salas <[email protected]>
1 parent 2094062 commit 52247f4

2 files changed, with 12 additions and 2 deletions

.buildkite/run-openvino-test.sh

Lines changed: 1 addition & 1 deletion
@@ -11,4 +11,4 @@ trap remove_docker_container EXIT
 remove_docker_container
 
 # Run the image and launch offline inference
-docker run --network host --env VLLM_OPENVINO_KVCACHE_SPACE=1 --name openvino-test openvino-test python3 /workspace/vllm/examples/offline_inference.py
+docker run --network host --env VLLM_OPENVINO_KVCACHE_SPACE=1 --name openvino-test openvino-test python3 /workspace/examples/offline_inference.py

vllm/attention/backends/openvino.py

Lines changed: 11 additions & 1 deletion
@@ -1,12 +1,13 @@
 from dataclasses import dataclass
-from typing import List, Tuple, Type
+from typing import Dict, List, Optional, Tuple, Type
 
 import openvino as ov
 import torch
 
 from vllm.attention.backends.abstract import (AttentionBackend,
                                               AttentionMetadata)
 from vllm.attention.backends.utils import CommonAttentionState
+from vllm.multimodal import MultiModalPlaceholderMap
 
 
 def copy_cache_block(src_tensor: ov.Tensor, dst_tensor: ov.Tensor,
@@ -128,3 +129,12 @@ class OpenVINOAttentionMetadata:
     # Shape: scalar
     # Type: i32
     max_context_len: torch.Tensor
+
+    # The index maps that relate multi-modal embeddings to the corresponding
+    # placeholders.
+    #
+    # N.B. These aren't really related to attention and don't belong on this
+    # type -- this is just a temporary solution to make them available to
+    # `model_executable`.
+    multi_modal_placeholder_index_maps: Optional[Dict[
+        str, MultiModalPlaceholderMap.IndexMap]]
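
For readers unfamiliar with the shape of the new field, the following is a minimal sketch of how an `Optional[Dict[str, ...]]` field without a default behaves on a dataclass. It does not import vLLM: `IndexMapSketch`, `MetadataSketch`, `placeholder_positions`, and the `src`/`dest` attributes are hypothetical stand-ins chosen for illustration, not the definitions used by `MultiModalPlaceholderMap.IndexMap` or `OpenVINOAttentionMetadata`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class IndexMapSketch:
    """Hypothetical stand-in for MultiModalPlaceholderMap.IndexMap."""
    src: List[int]   # assumed: indices into the multi-modal embeddings
    dest: List[int]  # assumed: indices of the placeholder positions


@dataclass
class MetadataSketch:
    """Hypothetical, trimmed-down stand-in for OpenVINOAttentionMetadata."""
    max_context_len: int
    # No default value, mirroring the diff: every caller that constructs
    # the metadata must pass this argument, even if only as None.
    multi_modal_placeholder_index_maps: Optional[Dict[str, IndexMapSketch]]


def placeholder_positions(meta: MetadataSketch, modality: str) -> List[int]:
    """Return placeholder positions for a modality, or [] if there are none."""
    maps = meta.multi_modal_placeholder_index_maps
    if maps is None or modality not in maps:
        return []
    return maps[modality].dest


# A text-only batch carries no index maps at all.
text_only = MetadataSketch(max_context_len=8,
                           multi_modal_placeholder_index_maps=None)

# A batch with image inputs carries one map keyed by modality name.
with_image = MetadataSketch(
    max_context_len=8,
    multi_modal_placeholder_index_maps={
        "image": IndexMapSketch(src=[0, 1], dest=[3, 4]),
    },
)
```

Because the field is `Optional`, any consumer has to handle `None` before indexing by modality, which is what `placeholder_positions` does in this sketch.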
