
Conversation


@daje0601 daje0601 commented Nov 12, 2025

This commit enables multi-LoRA functionality for Whisper models by:

  • Adding SupportsLoRA interface to WhisperForConditionalGeneration
  • Defining embedding_modules mapping for decoder embeddings and output projection
  • Specifying embedding_padding_modules for proj_out layer
  • Setting supports_lora flag to True

Now Whisper models can use the same multi-LoRA capabilities as LLMs, allowing multiple LoRA adapters to be loaded and used simultaneously via the --enable-lora and --lora-modules flags.
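Based on the bullets above, the class-level change presumably looks something like the sketch below (the import path and surrounding class definition are assumptions from vLLM's usual SupportsLoRA pattern, not the exact diff of this PR):

from torch import nn

from vllm.model_executor.models.interfaces import SupportsLoRA

class WhisperForConditionalGeneration(nn.Module, SupportsLoRA):
    # Opt this model in to vLLM's multi-LoRA machinery.
    supports_lora = True

    # LoRA-targetable embedding modules: decoder token embeddings and the
    # output projection (values match the diff quoted in the review below).
    embedding_modules = {
        "model.decoder.embed_tokens": "input_embeddings",
        "proj_out": "output_embeddings",
    }

    # Modules whose embedding weights may be padded when LoRA adds extra vocab rows.
    embedding_padding_modules = ["proj_out"]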

Purpose

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small, essential subset of CI tests to catch errors quickly.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request adds multi-LoRA support for Whisper models. The changes are straightforward and correctly implement the SupportsLoRA interface for the WhisperForConditionalGeneration model. The new class attributes embedding_modules and embedding_padding_modules are correctly defined to target the decoder's token embeddings and the output projection layer. The implementation follows the existing pattern for enabling LoRA in other models within the vLLM framework. The changes look good and I don't see any issues.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +805 to +808
embedding_modules = {
    "model.decoder.embed_tokens": "input_embeddings",
    "proj_out": "output_embeddings",
}


P1: Use base module name in embedding_modules mapping

The new LoRA configuration maps decoder embeddings under "model.decoder.embed_tokens". LoRA utils (e.g. LoRAModel.create_dummy_lora and the checkpoint validator) look up embedding modules by the last segment of the module path (module_name.split(".")[-1]). Because the key here includes the full dotted path, those lookups never match embed_tokens, so dummy LoRA weights are not created for the decoder embeddings and any LoRA checkpoint that contains …embed_tokens will be rejected as “unexpected modules.” Other models register keys like "embed_tokens" to avoid this mismatch. The key should use the base name ("embed_tokens") so the decoder embedding is correctly recognised by the LoRA machinery.

Useful? React with 👍 / 👎.
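To make the reported mismatch concrete, here is a small standalone illustration (a simplified stand-in for the lookup the review describes, not vLLM's actual code):

# Keys as registered by this PR vs. the base name other models use.
embedding_modules = {
    "model.decoder.embed_tokens": "input_embeddings",
    "proj_out": "output_embeddings",
}

module_name = "model.decoder.embed_tokens"
last_segment = module_name.split(".")[-1]  # -> "embed_tokens"

# The LoRA utils key their lookup on the last path segment,
# so the dotted key never matches:
print(last_segment in embedding_modules)  # False -> module flagged as unexpected

# With a base-name key, as other models register, the lookup succeeds:
print(last_segment in {"embed_tokens": "input_embeddings"})  # True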

@jeejeelee jeejeelee requested a review from NickLucche November 12, 2025 08:24
Change xformers requirement from the specific build version
(0.0.33+5d4b92a5.d20251029) to >=0.0.33 to allow installation
from PyPI without requiring custom-built wheels.

The specific build version is not available on PyPI, causing
installation failures. Using >=0.0.33 maintains compatibility
while enabling easier setup.
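In requirements-file terms, the change presumably amounts to something like this (illustrative; the actual file and pin may differ):

# before: pinned to a custom-built wheel that PyPI does not host
xformers==0.0.33+5d4b92a5.d20251029
# after: accepts any PyPI release at or above 0.0.33
xformers>=0.0.33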

mergify bot commented Nov 13, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @daje0601.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Nov 13, 2025
This script automatically patches the installed vllm package to add
multi-LoRA support for Whisper models by:
- Adding SupportsLoRA import and interface
- Adding required LoRA attributes (supports_lora, embedding_modules, etc.)
- Creating backup of original file
- Verifying the patch was successful

Usage: python patch_whisper_lora.py
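A minimal sketch of that patching flow (hypothetical; the actual patch_whisper_lora.py may differ):

import shutil

def patch_whisper_lora() -> None:
    # Locate the Whisper model file inside the installed vllm package.
    import vllm.model_executor.models.whisper as whisper_mod
    path = whisper_mod.__file__

    # Keep a backup so the patch can be reverted.
    shutil.copy(path, path + ".bak")

    with open(path) as f:
        src = f.read()

    # Idempotence check: skip if the interface is already present.
    if "SupportsLoRA" in src:
        print("Patch already applied.")
        return

    # Here the real script would insert the SupportsLoRA import and the
    # class attributes (supports_lora, embedding_modules,
    # embedding_padding_modules), write the file back, and re-read it
    # to verify the patch succeeded.
    ...

if __name__ == "__main__":
    patch_whisper_lora()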
