
Commit 853ee1a

minpeter authored and mgoin committed
[Bugfix][V1] Fix FlashInfer V1 backend using the wrong VllmConfig (vllm-project#18086)
Signed-off-by: minpeter <[email protected]>
1 parent 3824880 commit 853ee1a

File tree

1 file changed (+2, -3)

vllm/v1/attention/backends/flashinfer.py

Lines changed: 2 additions & 3 deletions
@@ -14,8 +14,7 @@
 from vllm.attention.backends.abstract import (AttentionBackend, AttentionImpl,
                                               AttentionType)
 from vllm.attention.layer import Attention
-from vllm.config import (VllmConfig, get_current_vllm_config,
-                         get_layers_from_vllm_config)
+from vllm.config import VllmConfig, get_layers_from_vllm_config
 from vllm.logger import init_logger
 from vllm.v1.attention.backends.flash_attn import use_cascade_attention
 from vllm.v1.attention.backends.utils import CommonAttentionMetadata
@@ -215,7 +214,7 @@ def __init__(self, runner: GPUModelRunner, kv_cache_spec: AttentionSpec,
         # Global hyperparameters shared by all attention layers
         self.global_hyperparameters: Optional[PerLayerParameters] = None
 
-        self.vllm_config = get_current_vllm_config()
+        self.vllm_config = runner.vllm_config
         self.kv_cache_spec = kv_cache_spec
         self.block_table = block_table

Comments (0)