1 parent 03aa275 commit e602835
examples/models/core/llama4/README.md
@@ -8,7 +8,7 @@ This document shows how to run Llama4-Maverick on B200 with PyTorch workflow and
 - [Performance Benchmarks](#performance-benchmarks)
   - [B200 Max-throughput](#b200-max-throughput)
   - [B200 Min-latency](#b200-min-latency)
-  - [B200 Hybrid](#b200-hybrid)
+  - [B200 Balanced](#b200-balanced)
 - [Advanced Configuration](#advanced-configuration)
   - [Configuration tuning](#configuration-tuning)
 - [Troubleshooting](#troubleshooting)
@@ -120,7 +120,7 @@ python -m tensorrt_llm.serve.scripts.benchmark_serving \
   --max-concurrency 1 \
 ```
 
-### B200 Hybrid
+### B200 Balanced
 
 
 #### 1. Prepare TensorRT-LLM extra configs