Support for multiple LoRA #339

@elevran

We need to support mapping a LoRA adapter to its base model.
Clients use the LoRA name in the request, and that name must be mapped to the relevant base model (and InferencePool).
With the removal of InferenceModel from the IGW API, there is no defined way to map a LoRA name to its base model.
The mapping could be learned from a convention (e.g., a naming scheme such as base#lora), from configuration (e.g., CRDs, probably with a design along the lines of EndpointSlice to improve scale), or from the model servers directly (e.g., /v1/models).
It could be implemented in, or consumed by, BBR, EPP, or both.
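
To make two of these options concrete, here is a rough Python sketch, not a proposal for the actual implementation. It assumes the `base#lora` naming scheme mentioned above, and it assumes the model server's `/v1/models` response lists each adapter with a `parent` field naming its base model (vLLM's OpenAI-compatible server reports this; other servers may not). The function names `split_by_convention`, `build_lora_index`, and `resolve_base_model` are hypothetical, and the sample payload is made up for illustration.

```python
from typing import Optional

SEPARATOR = "#"  # assumed convention: "<base>#<lora>"


def split_by_convention(model_name: str) -> tuple[str, Optional[str]]:
    """Split 'base#lora' into (base, lora). A plain name has no adapter."""
    base, sep, lora = model_name.partition(SEPARATOR)
    if not sep:
        return model_name, None
    if not base or not lora:
        raise ValueError(f"malformed model name: {model_name!r}")
    return base, lora


def build_lora_index(models_response: dict) -> dict[str, str]:
    """Build {adapter name: base model} from a /v1/models-style payload.

    Assumes adapters carry a non-empty 'parent' field and base models do not.
    """
    index: dict[str, str] = {}
    for card in models_response.get("data", []):
        parent = card.get("parent")
        if parent:
            index[card["id"]] = parent
    return index


def resolve_base_model(model_name: str, index: dict[str, str]) -> str:
    """Prefer what the model server reported, then fall back to the convention."""
    if model_name in index:
        return index[model_name]
    base, _ = split_by_convention(model_name)
    return base


# Illustrative payload only; the ids are invented.
sample = {
    "data": [
        {"id": "base-model", "parent": None},
        {"id": "sql-lora", "parent": "base-model"},
    ]
}
index = build_lora_index(sample)
print(resolve_base_model("sql-lora", index))
print(resolve_base_model("base-model#other-lora", index))
```

The lookup order (server-reported mapping first, naming convention second) is only one possible choice; a CRD-based configuration source could feed the same index.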

  • provide working examples based on available functionality
  • explore options for implementation and draft community proposal

Labels

triage/accepted: Indicates an issue or PR is ready to be actively worked on.
