feat: register models for llama stack #2867
Merged
Signed-off-by: Philippe Martin [email protected]
What does this PR do?
During llama stack container creation, register all local models with the llama stack.
Screenshot / video of UI
(In the demo below, the chat completion with the model `MaziyarPanahi/Mistral-7B-Instruct-v0.3.Q4` fails because the output of `llama-stack-client models list` truncates the names of models; the name to use should have been `MaziyarPanahi/Mistral-7B-Instruct-v0.3.Q4_K_M`.)

You can see in the demo that all models are registered in the llama stack, and inference servers are started only for the models used during a chat completion.
llama-stack-register-all-models.mp4
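The failure shown in the demo comes down to an exact-match lookup against a truncated identifier. A minimal, hypothetical Python sketch (the model names are from the demo; the registry dict and backend names are purely illustrative, not the actual llama stack implementation):

```python
# Illustrative only: a registry mapping full model identifiers to backends.
# An exact-match lookup with a name truncated by a CLI table finds nothing,
# which is why the chat completion in the demo fails.

registered_models = {
    "MaziyarPanahi/Mistral-7B-Instruct-v0.3.Q4_K_M": "inference-server-1",  # hypothetical backend name
}

truncated = "MaziyarPanahi/Mistral-7B-Instruct-v0.3.Q4"    # as displayed by `llama-stack-client models list`
full = "MaziyarPanahi/Mistral-7B-Instruct-v0.3.Q4_K_M"     # the actually registered identifier

print(registered_models.get(truncated))  # None -> model not found, completion fails
print(registered_models.get(full))       # the registered backend is found
```

Copying the full registered name rather than the truncated one from the CLI table avoids the lookup miss.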
What issues does this PR fix or reference?
Fixes #2840
How to test this PR?