This page answers Q5_K_M quantization questions for MiniMaxAI/MiniMax-M2.5, with explicit calculations drawn from our model requirement dataset and compatibility speed table.
Q5_K_M requirements are estimated from the Q4 and Q8 requirement bounds using midpoint interpolation.
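As a rough illustration of midpoint interpolation, the sketch below takes a lower (Q4) and an upper (Q8) bound and returns the value halfway between them. The function name `midpoint_estimate`, the assumption that the bounds are memory requirements in GB, and the placeholder numbers are all hypothetical; they are not values from the requirement dataset for this model.

```python
def midpoint_estimate(q4_gb: float, q8_gb: float) -> float:
    """Return the value halfway between the Q4 and Q8 requirement bounds."""
    if q4_gb > q8_gb:
        raise ValueError("expected the Q4 bound to be no larger than the Q8 bound")
    return q4_gb + (q8_gb - q4_gb) / 2.0


# Placeholder bounds for illustration only; NOT MiniMax-M2.5 figures.
example_q4_gb = 100.0
example_q8_gb = 200.0

example_q5_k_m_gb = midpoint_estimate(example_q4_gb, example_q8_gb)
print(example_q5_k_m_gb)  # 150.0
```

A midpoint is a coarse estimate: it treats Q5_K_M as sitting exactly halfway between the two bounds, which may not match the real size of a given quantized file.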
The throughput figures below come from the available compatibility measurements and estimates for this model, sorted by tokens per second from fastest to slowest.
| GPU | VRAM | Quantization | Speed | Compatibility |
|---|---|---|---|---|
| AMD Instinct MI300X | 192GB | Q4 | 802 tok/s | View full compatibility |
| NVIDIA H200 SXM 141GB | 141GB | Q4 | 645 tok/s | View full compatibility |
| AMD Instinct MI250X | 128GB | Q4 | 470 tok/s | View full compatibility |
| NVIDIA H100 SXM5 80GB | 80GB | Q4 | 469 tok/s | View full compatibility |
| NVIDIA H100 PCIe 80GB | 80GB | Q4 | 343 tok/s | View full compatibility |
| RTX 5090 | 32GB | Q4 | 328 tok/s | View full compatibility |
| NVIDIA A100 80GB SXM4 | 80GB | Q4 | 312 tok/s | View full compatibility |
| AMD Instinct MI210 | 64GB | Q4 | 247 tok/s | View full compatibility |
| NVIDIA A100 40GB PCIe | 40GB | Q4 | 242 tok/s | View full compatibility |
| NVIDIA L40 | 48GB | Q4 | 179 tok/s | View full compatibility |
| NVIDIA RTX 6000 Ada | 48GB | Q4 | 179 tok/s | View full compatibility |
| RTX 4090 | 24GB | Q4 | 174 tok/s | View full compatibility |
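The ordering used in the table can be reproduced with a short script. The sketch below copies four of the rows above, optionally filters them by a minimum VRAM size, and sorts them fastest-first. The `Row` type, the `rank_by_speed` helper, and its `min_vram_gb` parameter are hypothetical names introduced for illustration, not part of any tooling behind this page.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    gpu: str
    vram_gb: int
    tok_per_s: int


# A few rows copied from the table above (all listed at Q4).
rows = [
    Row("RTX 4090", 24, 174),
    Row("AMD Instinct MI300X", 192, 802),
    Row("NVIDIA H100 SXM5 80GB", 80, 469),
    Row("RTX 5090", 32, 328),
]


def rank_by_speed(rows: List[Row], min_vram_gb: int = 0) -> List[Row]:
    """Keep GPUs with at least min_vram_gb of VRAM, sorted fastest-first."""
    eligible = [r for r in rows if r.vram_gb >= min_vram_gb]
    return sorted(eligible, key=lambda r: r.tok_per_s, reverse=True)


for row in rank_by_speed(rows):
    print(f"{row.gpu}: {row.tok_per_s} tok/s")
```

Calling `rank_by_speed(rows, min_vram_gb=80)` would keep only the MI300X and the H100 SXM5 from this sample, which is one way to narrow the list once a memory requirement is known.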