@@ -59,23 +59,23 @@ th:not(:first-child) {
## Feature x Hardware
- | Feature | Volta | Turing | Ampere | Ada | Hopper | CPU | AMD |
- |-----------------------------------------------------------|--------------------|----------|----------|-------|----------|--------------------|-------|
- | [CP][chunked-prefill] | [❌](gh-issue:2729) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
- | [APC][automatic-prefix-caching] | [❌](gh-issue:3687) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
- | [LoRA][lora-adapter] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
- | <abbr title="Prompt Adapter">prmpt adptr</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | [❌](gh-issue:8475) | ✅ |
- | [SD][spec-decode] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
- | CUDA graph | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
- | <abbr title="Pooling Models">pooling</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❔ |
- | <abbr title="Encoder-Decoder Models">enc-dec</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
- | <abbr title="Multimodal Inputs">mm</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
- | <abbr title="Logprobs">logP</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
- | <abbr title="Prompt Logprobs">prmpt logP</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
- | <abbr title="Async Output Processing">async output</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
- | multi-step | ✅ | ✅ | ✅ | ✅ | ✅ | [❌](gh-issue:8477) | ✅ |
- | best-of | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
- | beam-search | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+ | Feature | Volta | Turing | Ampere | Ada | Hopper | CPU | AMD | TPU |
+ |-----------------------------------------------------------|--------------------- |----------- |----------- |-------- |------------ |--------------------|--------| -----|
+ | [CP][chunked-prefill] | [❌](gh-issue:2729) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |✅ |
+ | [APC][automatic-prefix-caching] | [❌](gh-issue:3687) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |✅ |
+ | [LoRA][lora-adapter] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |✅ |
+ | <abbr title="Prompt Adapter">prmpt adptr</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | [❌](gh-issue:8475) | ✅ |✅ |
+ | [SD][spec-decode] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |✅ |
+ | CUDA graph | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |✅ |
+ | <abbr title="Pooling Models">pooling</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❔ |❌ |
+ | <abbr title="Encoder-Decoder Models">enc-dec</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |❌ |
+ | <abbr title="Multimodal Inputs">mm</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |❌ |
+ | <abbr title="Logprobs">logP</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |❌ |
+ | <abbr title="Prompt Logprobs">prmpt logP</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |❌ |
+ | <abbr title="Async Output Processing">async output</abbr> | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |✅ |
+ | multi-step | ✅ | ✅ | ✅ | ✅ | ✅ | [❌](gh-issue:8477) | ✅ |✅ |
+ | best-of | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |✅ |
+ | beam-search | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |✅ |
!!! note
    Please refer to [Feature support through NxD Inference backend][feature-support-through-nxd-inference-backend] for features supported on AWS Neuron hardware.