Add multi-LoRA support for Whisper models #28528

daje0601 · 2025-11-12T07:55:09Z

This commit enables multi-LoRA functionality for Whisper models by:

- Adding SupportsLoRA interface to WhisperForConditionalGeneration
- Defining embedding_modules mapping for decoder embeddings and output projection
- Specifying embedding_padding_modules for proj_out layer
- Setting supports_lora flag to True

Now Whisper models can use the same multi-LoRA capabilities as LLM models, allowing multiple LoRA adapters to be loaded and used simultaneously with --enable-lora and --lora-modules flags.
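
As a rough illustration of the two flags mentioned above, the sketch below assembles a server command line with LoRA enabled. It assumes adapters are passed to --lora-modules as name=path pairs; the helper build_serve_command, the adapter names, and the adapter paths are hypothetical and are not part of this PR.

```python
import shlex


def build_serve_command(base_model, lora_modules):
    """Assemble a `vllm serve` argument list with LoRA enabled.

    `lora_modules` maps an adapter name to its path. Both the names and
    the paths used below are made up for illustration.
    """
    args = ["vllm", "serve", base_model, "--enable-lora"]
    if lora_modules:
        args.append("--lora-modules")
        # Assumption: each adapter is given as a name=path pair after the flag.
        args.extend(f"{name}={path}" for name, path in lora_modules.items())
    return args


cmd = build_serve_command(
    "openai/whisper-large-v3",
    {
        "korean-asr": "/adapters/korean-asr",    # hypothetical adapter
        "medical-asr": "/adapters/medical-asr",  # hypothetical adapter
    },
)
print(shlex.join(cmd))
```

Whether a request can then select one of these adapters by name for transcription depends on the changes in this PR being accepted, so treat the command as a starting point rather than a verified recipe.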

Purpose

Test Plan

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results.
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

github-actions · 2025-11-12T07:55:16Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, a small and essential subset of CI tests that quickly catches errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

gemini-code-assist

Code Review

This pull request adds multi-LoRA support for Whisper models. The changes are straightforward and correctly implement the SupportsLoRA interface for the WhisperForConditionalGeneration model. The new class attributes embedding_modules and embedding_padding_modules are correctly defined to target the decoder's token embeddings and the output projection layer. The implementation follows the existing pattern for enabling LoRA in other models within the vLLM framework. The changes look good and I don't see any issues.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review"

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2025-11-12T07:58:48Z

vllm/model_executor/models/whisper.py

+    embedding_modules = {
+        "model.decoder.embed_tokens": "input_embeddings",
+        "proj_out": "output_embeddings",
+    }


Use base module name in embedding_modules mapping

The new LoRA configuration maps decoder embeddings under "model.decoder.embed_tokens". LoRA utils (e.g. LoRAModel.create_dummy_lora and the checkpoint validator) look up embedding modules by the last segment of the module path (module_name.split(".")[-1]). Because the key here includes the full dotted path, those lookups never match embed_tokens, so dummy LoRA weights are not created for the decoder embeddings and any LoRA checkpoint that contains …embed_tokens will be rejected as “unexpected modules.” Other models register keys like "embed_tokens" to avoid this mismatch. The key should use the base name ("embed_tokens") so the decoder embedding is correctly recognised by the LoRA machinery.
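
The mismatch described in this comment can be reproduced in isolation. The sketch below is not vLLM code: is_embedding_module is a hypothetical stand-in that only mimics the last-segment lookup (module_name.split(".")[-1]) the comment attributes to the LoRA utilities, and it has not been checked against the vLLM source.

```python
def is_embedding_module(module_name, embedding_modules):
    """Hypothetical stand-in for the lookup described in the review.

    Only the last dotted segment of the module path is compared
    against the keys of the embedding_modules mapping.
    """
    return module_name.split(".")[-1] in embedding_modules


# Mapping as submitted in the PR: the first key is a full dotted path.
full_path_keys = {
    "model.decoder.embed_tokens": "input_embeddings",
    "proj_out": "output_embeddings",
}

# Mapping as suggested by the review: keys are base module names.
base_name_keys = {
    "embed_tokens": "input_embeddings",
    "proj_out": "output_embeddings",
}

module_paths = ["model.decoder.embed_tokens", "proj_out"]

matched_full = [p for p in module_paths if is_embedding_module(p, full_path_keys)]
matched_base = [p for p in module_paths if is_embedding_module(p, base_name_keys)]

print(matched_full)  # ['proj_out']
print(matched_base)  # ['model.decoder.embed_tokens', 'proj_out']
```

With the full-path key, the last segment "embed_tokens" is never found among the keys, so only proj_out matches. With base-name keys, both modules match.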

Change xformers requirement from specific build version (0.0.33+5d4b92a5.d20251029) to >=0.0.33 to allow installation from PyPI without requiring custom built wheels. The specific build version is not available on PyPI, causing installation failures. Using >=0.0.33 maintains compatibility while enabling easier setup.

mergify · 2025-11-13T06:49:08Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @daje0601.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

This script automatically patches the installed vllm package to add multi-LoRA support for Whisper models by:

- Adding SupportsLoRA import and interface
- Adding required LoRA attributes (supports_lora, embedding_modules, etc.)
- Creating backup of original file
- Verifying the patch was successful

Usage: python patch_whisper_lora.py
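
The backup, patch, and verify steps listed in this commit message follow a common pattern, sketched generically below. This is not the author's patch_whisper_lora.py, whose contents are not shown on this page; the helper patch_with_backup, the demo file name, and the toy class are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def patch_with_backup(target: Path, anchor: str, addition: str) -> bool:
    """Insert `addition` after the first line containing `anchor`.

    A copy of the original file is kept next to it with a .bak suffix.
    Returns True if the file contains `addition` afterwards.
    """
    source = target.read_text()
    if addition in source:
        return True  # already patched, nothing to do
    if anchor not in source:
        return False  # anchor missing, leave the file untouched

    # Back up the original before modifying anything.
    shutil.copy2(target, target.with_name(target.name + ".bak"))

    lines = source.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if anchor in line:
            if not line.endswith("\n"):
                lines[i] = line + "\n"
            lines.insert(i + 1, addition.rstrip("\n") + "\n")
            break
    target.write_text("".join(lines))

    # Verify by re-reading the file from disk.
    return addition in target.read_text()


original_text = "class WhisperDemo:\n    pass\n"  # toy stand-in, not vLLM code
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "whisper_demo.py"
    demo.write_text(original_text)
    ok = patch_with_backup(demo, "class WhisperDemo:", "    supports_lora = True")
    patched_text = demo.read_text()
    backup_text = (Path(tmp) / "whisper_demo.py.bak").read_text()

print(ok)
print(patched_text)
```

Returning early when the addition is already present makes the helper safe to run twice, which matters for a script meant to be run by hand against an installed package.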

gemini-code-assist bot reviewed Nov 12, 2025

chatgpt-codex-connector bot reviewed Nov 12, 2025

jeejeelee requested a review from NickLucche November 12, 2025 08:24

mergify bot added ci/build nvidia labels Nov 13, 2025

github-project-automation bot added this to NVIDIA Nov 13, 2025

mergify bot added the needs-rebase label Nov 13, 2025

2 participants