[Question] Blackwell (B200) performance

Does anyone have B200 tok/s numbers they can share? I swear I had this working well at some point, but now I'm getting numbers far lower than I would expect.

For olmocr I'm seeing something like:
```
Avg prompt throughput: 3431.0 tokens/s
Avg generation throughput: 1062.0 tokens/s
```

But with a text-only model like Qwen3-30B-A3B-Instruct-2507 I'm getting closer to 45,000 tok/s on prefill and 5,000 tok/s on generation.
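
To put the gap in numbers, here is a minimal sketch that parses the two log lines above and compares them with the approximate text-only figures. The regex assumes the log format is exactly as pasted, and the names `parse_throughput` and `text_only` are made up for illustration. It only gives a ballpark, since the text-only numbers are rounded values from memory.

```python
import re

# Log lines copied from the post; the pattern assumes this exact
# "Avg <kind> throughput: <N> tokens/s" shape.
LOG = """\
Avg prompt throughput: 3431.0 tokens/s
Avg generation throughput: 1062.0 tokens/s
"""

PATTERN = re.compile(r"Avg (prompt|generation) throughput: ([\d.]+) tokens/s")


def parse_throughput(log: str) -> dict[str, float]:
    """Return {'prompt': ..., 'generation': ...} in tokens/s."""
    return {kind: float(value) for kind, value in PATTERN.findall(log)}


olmocr = parse_throughput(LOG)

# Approximate text-only figures quoted in the post (not measured here).
text_only = {"prompt": 45000.0, "generation": 5000.0}

# How many times faster the text-only run is for each phase.
ratios = {kind: text_only[kind] / olmocr[kind] for kind in olmocr}

for kind, ratio in ratios.items():
    print(f"{kind}: text-only is about {ratio:.1f}x faster")
```

So the text-only model is roughly 13x faster on prefill and a bit under 5x faster on generation than what I'm seeing with olmocr.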
