This page answers FP16 questions about MiniMaxAI/MiniMax-M2.5 with explicit calculations drawn from our model requirement dataset and compatibility speed table.
The FP16 requirement is the exact figure from our model requirement data.
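As a rough illustration of the arithmetic behind an FP16 requirement, the sketch below estimates weight memory from a parameter count, using the fact that FP16 stores each parameter in 2 bytes. The function names and the 10-billion-parameter figure are hypothetical placeholders, not values from the model requirement data, and the estimate covers weights only (no KV cache, activations, or runtime overhead).

```python
def fp16_weight_bytes(num_params: int) -> int:
    """Bytes needed to hold the weights alone at FP16 (2 bytes per parameter)."""
    return num_params * 2


def to_gb(num_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_bytes / 1e9


if __name__ == "__main__":
    # Hypothetical parameter count for illustration only;
    # this is NOT the parameter count of MiniMax-M2.5.
    example_params = 10_000_000_000
    print(f"{to_gb(fp16_weight_bytes(example_params)):.1f} GB for weights alone")
```

Real deployments need headroom beyond this figure, so treat the result as a lower bound rather than a sizing recommendation.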
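The throughput figures below come from the available compatibility measurements and estimates for this model, sorted by tokens per second from fastest to slowest. Rows marked Q4 report throughput at 4-bit quantization rather than FP16.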
Need general guidance? Review full methodology.
| GPU | VRAM | Quantization | Speed | Compatibility |
|---|---|---|---|---|
| AMD Instinct MI300X | 192GB | FP16 | 295 tok/s | View full compatibility |
| NVIDIA H200 SXM 141GB | 141GB | FP16 | 286 tok/s | View full compatibility |
| NVIDIA H100 SXM5 80GB | 80GB | FP16 | 187 tok/s | View full compatibility |
| AMD Instinct MI250X | 128GB | FP16 | 184 tok/s | View full compatibility |
| NVIDIA L40 | 48GB | Q4 | 179 tok/s | View full compatibility |
| NVIDIA RTX 6000 Ada | 48GB | Q4 | 179 tok/s | View full compatibility |
| RTX 4090 | 24GB | Q4 | 174 tok/s | View full compatibility |
| NVIDIA L40S | 48GB | Q4 | 174 tok/s | View full compatibility |
| RTX 5080 | 16GB | Q4 | 168 tok/s | View full compatibility |
| RX 7900 XTX | 24GB | Q4 | 150 tok/s | View full compatibility |
| RTX 3090 | 24GB | Q4 | 143 tok/s | View full compatibility |
| AMD Radeon Pro W7900 | 48GB | Q4 | 142 tok/s | View full compatibility |
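Because the table mixes FP16 and Q4 rows, it can help to filter it by quantization before comparing speeds. The following is a minimal sketch, assuming the rows are copied by hand into a Python list; `ROWS` and `rows_for` are hypothetical names introduced for illustration.

```python
# Rows copied from the table above: (GPU, VRAM in GB, quantization, tok/s).
ROWS = [
    ("AMD Instinct MI300X", 192, "FP16", 295),
    ("NVIDIA H200 SXM 141GB", 141, "FP16", 286),
    ("NVIDIA H100 SXM5 80GB", 80, "FP16", 187),
    ("AMD Instinct MI250X", 128, "FP16", 184),
    ("NVIDIA L40", 48, "Q4", 179),
    ("NVIDIA RTX 6000 Ada", 48, "Q4", 179),
    ("RTX 4090", 24, "Q4", 174),
    ("NVIDIA L40S", 48, "Q4", 174),
    ("RTX 5080", 16, "Q4", 168),
    ("RX 7900 XTX", 24, "Q4", 150),
    ("RTX 3090", 24, "Q4", 143),
    ("AMD Radeon Pro W7900", 48, "Q4", 142),
]


def rows_for(quantization, rows=ROWS):
    """Return the rows matching one quantization, fastest first."""
    matches = [row for row in rows if row[2] == quantization]
    return sorted(matches, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    for gpu, vram_gb, quant, tok_s in rows_for("FP16"):
        print(f"{gpu}: {tok_s} tok/s ({vram_gb}GB, {quant})")
```

Filtering this way keeps FP16 and Q4 figures from being compared directly, since they describe different precision levels.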