Support for multiple LoRA

We need to support mapping a LoRA to its base model.
Clients reference a LoRA by name in the request, and that name must be mapped to the relevant base model (and InferencePool).
With the removal of InferenceModel from the IGW API, there is no defined way to map a LoRA name to its base model.
The mapping could be learned from convention (e.g., a naming scheme such as `base`#`lora`, or similar), from configuration (e.g., CRDs; a design along the lines of EndpointSlice is probably worth considering to improve scale), or from the model servers directly (e.g., `/v1/models`).
It could be produced/consumed by BBR, EPP, or both.
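
To make the convention and model-server options concrete, here is a minimal Python sketch, not a proposed implementation. The `#` separator comes from the naming-scheme suggestion above. The function names (`split_by_convention`, `index_models`, `resolve_base`) are hypothetical, and the `parent` field on `/v1/models` entries is an assumption about the payload shape (some servers report the base model of an adapter this way; the OpenAI API does not guarantee it).

```python
from typing import Optional

# Assumption: "#" as separator, following the `base`#`lora` idea in this issue.
SEPARATOR = "#"


def split_by_convention(name: str) -> tuple[str, Optional[str]]:
    """Split a requested name into (base, lora); lora is None if no convention match."""
    base, sep, lora = name.partition(SEPARATOR)
    if not sep or not base or not lora:
        return name, None
    return base, lora


def index_models(payload: dict) -> dict[str, str]:
    """Build {served name: base model} from a /v1/models-style payload.

    Assumption: adapter entries carry a "parent" field naming their base model;
    entries without it are treated as base models themselves.
    """
    mapping: dict[str, str] = {}
    for entry in payload.get("data", []):
        model_id = entry["id"]
        mapping[model_id] = entry.get("parent") or model_id
    return mapping


def resolve_base(name: str, discovered: dict[str, str]) -> Optional[str]:
    """Prefer what the model servers reported, then fall back to the convention."""
    if name in discovered:
        return discovered[name]
    base, lora = split_by_convention(name)
    return base if lora is not None else None
```

For example, with a payload listing `llama-3-8b` as a base model and `sql-adapter` with `parent` set to `llama-3-8b`, `resolve_base("sql-adapter", index_models(payload))` yields `llama-3-8b`. A name that is neither discovered nor convention-formatted resolves to `None`, leaving the caller (BBR or EPP) to decide how to handle it.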

- [ ] provide working examples based on available functionality
- [ ] explore options for implementation and draft community proposal

Support for multiple LoRA #339
