Open
Labels
triage/accepted: Indicates an issue or PR is ready to be actively worked on.
Description
We need to support mapping a LoRA to its base model.
Clients use the LoRA name in requests, and this name needs to be mapped to the relevant base model (and InferencePool).
With the removal of InferenceModel from the IGW API, there is no defined way to map a LoRA name to its base model.
This mapping can be learnt from convention (e.g., use base#lora, or similar, as a naming scheme), from configuration (e.g., CRDs; we should probably consider a design along the lines of EndpointSlice to improve scale), or from the model servers directly (e.g., /v1/models).
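To make the convention option concrete, here is a minimal sketch of splitting a requested model name into base and LoRA parts. It assumes "#" as the separator (taken from the base#lora example above); `ModelRef`, `parse_model_name`, `resolve_pool`, and the `POOLS` table are hypothetical names used only for illustration and are not part of any existing IGW API.

```python
from typing import NamedTuple, Optional

# Assumed delimiter, taken from the base#lora example; not a settled convention.
SEPARATOR = "#"


class ModelRef(NamedTuple):
    """Hypothetical result type: the base model and an optional LoRA name."""
    base: str
    lora: Optional[str]


def parse_model_name(name: str, sep: str = SEPARATOR) -> ModelRef:
    """Split a requested model name into (base, lora) on the first separator."""
    base, found, lora = name.partition(sep)
    if not base:
        raise ValueError(f"missing base model in {name!r}")
    if found and not lora:
        raise ValueError(f"missing LoRA name in {name!r}")
    # A name without the separator is treated as a plain base model request.
    return ModelRef(base=base, lora=lora or None)


# Hypothetical static table from base model name to InferencePool name.
POOLS = {"llama-3-8b": "llama-pool"}


def resolve_pool(name: str) -> Optional[str]:
    """Look up the pool for a request by its base model; None if unknown."""
    return POOLS.get(parse_model_name(name).base)
```

With this sketch, `parse_model_name("llama-3-8b#sql-adapter")` yields base `llama-3-8b` and LoRA `sql-adapter`, and a name without the separator is routed as a base model request. The trade-off is that the client has to know the base model name, which the other two options avoid.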
The mapping could be produced and/or consumed by BBR, EPP, or both.
- provide working examples based on available functionality
- explore options for implementation and draft community proposal
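As a starting point for a working example of the model server option, the sketch below builds a LoRA-to-base map from an OpenAI-style /v1/models response body. It assumes each adapter entry carries a `parent` field naming its base model; some OpenAI-compatible servers report this, but treat it as an assumption to verify against the server in use. The function name `lora_to_base` and the sample payload are hypothetical.

```python
import json
from typing import Dict


def lora_to_base(models_payload: str) -> Dict[str, str]:
    """Map each model id that declares a parent to that parent (base) model.

    Assumes an OpenAI-style list response: {"object": "list", "data": [...]}.
    Entries without a "parent" field are treated as base models and skipped.
    """
    doc = json.loads(models_payload)
    mapping: Dict[str, str] = {}
    for entry in doc.get("data", []):
        model_id = entry.get("id")
        parent = entry.get("parent")  # assumed field; not guaranteed by every server
        if model_id and parent:
            mapping[model_id] = parent
    return mapping


# Hypothetical response body with one base model and one adapter.
SAMPLE = """
{
  "object": "list",
  "data": [
    {"id": "llama-3-8b", "object": "model", "parent": null},
    {"id": "sql-adapter", "object": "model", "parent": "llama-3-8b"}
  ]
}
"""

if __name__ == "__main__":
    print(lora_to_base(SAMPLE))
```

Whichever component consumes this (BBR, EPP, or both) would still need to decide how often to refresh the map and how to reconcile differing answers across the endpoints of a pool.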