Contributor

@feloy commented Apr 15, 2025

Signed-off-by: Philippe Martin [email protected]

What does this PR do?

During llama stack container creation:

  • all models present on disk are auto-registered on the llama stack
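
As a rough illustration of the behavior described above, here is a minimal sketch of registering every downloaded model while skipping those the stack already knows. The `LocalModel` and `StackClient` shapes, and the function names, are hypothetical stand-ins invented for this example; they are not the actual Podman AI Lab or llama stack APIs, and the in-memory client only simulates a stack.

```typescript
// Hypothetical shape of a model known to AI Lab (not the real type).
interface LocalModel {
  id: string; // e.g. "MaziyarPanahi/Mistral-7B-Instruct-v0.3.Q4_K_M"
  file?: { path: string }; // assumed to be set only once the model is on disk
}

// Hypothetical minimal client; a real one would talk to the llama stack.
interface StackClient {
  listModelIds(): Promise<string[]>;
  registerModel(id: string): Promise<void>;
}

// Registers every downloaded model that is not registered yet.
// Returns the ids that were newly registered, in input order.
async function registerDownloadedModels(
  models: LocalModel[],
  client: StackClient,
): Promise<string[]> {
  const known = new Set(await client.listModelIds());
  const registered: string[] = [];
  for (const model of models) {
    if (!model.file) continue; // not downloaded: nothing to register
    if (known.has(model.id)) continue; // already registered: skip
    await client.registerModel(model.id);
    known.add(model.id); // guards against duplicates in the input list
    registered.push(model.id);
  }
  return registered;
}

// In-memory stand-in for a stack, used only to exercise the sketch.
function createInMemoryClient(initial: string[] = []): StackClient & { ids: string[] } {
  const ids = [...initial];
  return {
    ids,
    listModelIds: async () => [...ids],
    registerModel: async (id: string) => {
      ids.push(id);
    },
  };
}
```

Registering by full model id, rather than by a name copied from a listing, avoids depending on how a CLI chooses to display names.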

Screenshot / video of UI

(In the demo below, the chat completion with the model MaziyarPanahi/Mistral-7B-Instruct-v0.3.Q4 fails because `llama-stack-client models list` truncates model names in its output. The name to use should have been MaziyarPanahi/Mistral-7B-Instruct-v0.3.Q4_K_M.)

You can see in the demo that all models are registered in the llama stack, and inference servers are started only for the ones used during a chat completion.

llama-stack-register-all-models.mp4

What issues does this PR fix or reference?

Fixes #2840

How to test this PR?

@feloy added the area/ci label Apr 15, 2025
@feloy marked this pull request as ready for review April 15, 2025 11:26
@feloy requested review from benoitf, jeffmaury and a team as code owners April 15, 2025 11:26
@feloy requested review from axel7083, gastoner and slemeur April 15, 2025 11:26
@feloy removed the area/ci label Apr 15, 2025
@jeffmaury
Collaborator

Sorry, maybe I was not precise enough, but we should register all models from AI Lab (if they are not already registered) after the llama-stack container is started, without user choice, so that the user can then play with any model.

@feloy
Contributor Author

feloy commented Apr 15, 2025

> Sorry, maybe I was not precise enough, but we should register all models from AI Lab (if they are not already registered) after the llama-stack container is started, without user choice, so that the user can then play with any model.

I based my work on what is indicated in the issue:

> As a very basic and preliminary work, we would provide a new tab "Llama Stack" with a screen enabling to simply start a llama-stack from Podman AI Lab and choosing a specific model

@feloy
Contributor Author

feloy commented Apr 15, 2025

I'll change this PR to register all models present in AI Lab (models downloaded)

@feloy marked this pull request as draft April 15, 2025 13:37
@feloy force-pushed the feat-2628/auto-register-model branch from 29e4015 to 85a2f94 April 15, 2025 14:45
@feloy marked this pull request as ready for review April 15, 2025 14:46
@feloy
Contributor Author

feloy commented Apr 15, 2025

> I'll change this PR to register all models present in AI Lab (models downloaded)

Done, ready for review

@feloy changed the title from "feat: register model for llama stack" to "feat: register models for llama stack" Apr 16, 2025
@feloy merged commit 282770b into containers:main Apr 17, 2025
7 checks passed
Development

Successfully merging this pull request may close these issues.

register models when stack is up
3 participants