
Commit a5e4c46

Add models and features supporting matrix.
Signed-off-by: Qiliang Cui <[email protected]>
1 parent 7b1895e commit a5e4c46

3 files changed: +50 −17 lines

docs/.nav.yml

Lines changed: 1 addition & 0 deletions
@@ -39,6 +39,7 @@ nav:
       - models/generative_models.md
       - models/pooling_models.md
       - models/extensions
+      - Hardware Supported Models: models/hardware_supported_models
   - Features:
       - features/compatibility_matrix.md
       - features/*

docs/features/compatibility_matrix.md

Lines changed: 17 additions & 17 deletions
@@ -59,23 +59,23 @@ th:not(:first-child) {

 ## Feature x Hardware

-| Feature | Volta | Turing | Ampere | Ada | Hopper | CPU | AMD |
-|---------|-------|--------|--------|-----|--------|-----|-----|
-| [CP][chunked-prefill] | [❌](gh-issue:2729) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| [APC][automatic-prefix-caching] | [❌](gh-issue:3687) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| [LoRA][lora-adapter] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| <abbr title="Prompt Adapter">prmpt adptr</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | [❌](gh-issue:8475) | ✅ |
-| [SD][spec-decode] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| CUDA graph | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
-| <abbr title="Pooling Models">pooling</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❔ |
-| <abbr title="Encoder-Decoder Models">enc-dec</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
-| <abbr title="Multimodal Inputs">mm</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| <abbr title="Logprobs">logP</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| <abbr title="Prompt Logprobs">prmpt logP</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| <abbr title="Async Output Processing">async output</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
-| multi-step | ✅ | ✅ | ✅ | ✅ | ✅ | [❌](gh-issue:8477) | ✅ |
-| best-of | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| beam-search | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| Feature | Volta | Turing | Ampere | Ada | Hopper | CPU | AMD | TPU |
+|---------|-------|--------|--------|-----|--------|-----|-----|-----|
+| [CP][chunked-prefill] | [❌](gh-issue:2729) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [APC][automatic-prefix-caching] | [❌](gh-issue:3687) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [LoRA][lora-adapter] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| <abbr title="Prompt Adapter">prmpt adptr</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | [❌](gh-issue:8475) | ✅ | ✅ |
+| [SD][spec-decode] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| CUDA graph | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
+| <abbr title="Pooling Models">pooling</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❔ | ❌ |
+| <abbr title="Encoder-Decoder Models">enc-dec</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
+| <abbr title="Multimodal Inputs">mm</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
+| <abbr title="Logprobs">logP</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
+| <abbr title="Prompt Logprobs">prmpt logP</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
+| <abbr title="Async Output Processing">async output</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
+| multi-step | ✅ | ✅ | ✅ | ✅ | ✅ | [❌](gh-issue:8477) | ✅ | ✅ |
+| best-of | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| beam-search | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

 !!! note
     Please refer to [Feature support through NxD Inference backend][feature-support-through-nxd-inference-backend] for features supported on AWS Neuron hardware.
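As a quick illustration (a minimal sketch, not taken from this commit), the snippet below shows how two of the rows above, chunked prefill (CP) and automatic prefix caching (APC), would typically be switched on through vLLM's offline API. The model name and token budget are placeholders, and the keyword arguments assume the engine-argument names used by recent vLLM releases.

```python
# Minimal sketch, assuming recent vLLM engine arguments; model and sizes are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct",   # placeholder model choice
    enable_chunked_prefill=True,        # "CP" row of the matrix
    enable_prefix_caching=True,         # "APC" row of the matrix
    max_num_batched_tokens=2048,        # per-step token budget used by chunked prefill
)

outputs = llm.generate(
    ["Explain chunked prefill in one sentence."],
    SamplingParams(temperature=0.0, max_tokens=64),
)
print(outputs[0].outputs[0].text)
```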
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+---
+title: TPU
+---
+[](){ #tpu-supported-models }
+
+# TPU Supported Models
+## Text-only Language Models
+
+| Model | Supported |
+|-------|-----------|
+| mistralai/Mixtral-8x7B-Instruct-v0.1 | 🟨 |
+| mistralai/Mistral-Small-24B-Instruct-2501 | ✅ |
+| mistralai/Codestral-22B-v0.1 | ✅ |
+| mistralai/Mixtral-8x22B-Instruct-v0.1 | ❌ |
+| meta-llama/Llama-3.3-70B-Instruct | ✅ |
+| meta-llama/Llama-3.1-8B-Instruct | 🟨 |
+| meta-llama/Llama-3.1-70B-Instruct | ✅ |
+| meta-llama/Llama-4-* | ❌ |
+| microsoft/Phi-3-mini-128k-instruct | 🟨 |
+| microsoft/phi-4 | ❌ |
+| google/gemma-3-27b-it | 🟨 |
+| google/gemma-3-4b-it | ❌ |
+| deepseek-ai/DeepSeek-R1 | ❌ |
+| deepseek-ai/DeepSeek-V3 | ❌ |
+| RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8 | 🟨 |
+| RedHatAI/Meta-Llama-3.1-70B-Instruct-quantized.w8a8 | ✅ |
+| Qwen/Qwen3-8B | 🟨 |
+| Qwen/Qwen3-32B | ✅ |
+| Qwen/Qwen2.5-7B-Instruct | ✅ |
+| Qwen/Qwen2.5-32B | ✅ |
+| Qwen/Qwen2.5-14B-Instruct | ✅ |
+| Qwen/Qwen2.5-1.5B-Instruct | 🟨 |
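For context, one way to exercise a model marked ✅ in this table is through the same offline API, as sketched below; this is an illustrative sketch rather than part of the commit. It assumes a vLLM build with the TPU backend installed, and the model choice, context length, and parallelism degree are placeholders, not recommendations.

```python
# Minimal sketch, assuming a vLLM installation whose TPU backend is available.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-32B",   # marked ✅ in the table above
    max_model_len=2048,         # placeholder context length
    tensor_parallel_size=4,     # placeholder: set to the number of TPU chips in use
)

result = llm.generate(["Hello from a TPU host."], SamplingParams(max_tokens=32))
print(result[0].outputs[0].text)
```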
