This page covers Q5_K_M quantization of nineninesix/kani-tts-2-en, with explicit calculations drawn from our model requirement dataset and compatibility speed table.
Q5_K_M requirements are estimated from the Q4 and Q8 requirement bounds using midpoint interpolation.
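As a minimal sketch of what midpoint interpolation means here, assuming the Q5_K_M estimate is simply the arithmetic mean of the two bounds: the function name `midpoint_estimate` and the bound values below are hypothetical placeholders, not figures from the requirement dataset.

```python
def midpoint_estimate(q4_bound: float, q8_bound: float) -> float:
    """Estimate an intermediate requirement as the midpoint of two bounds."""
    return (q4_bound + q8_bound) / 2.0


# Hypothetical placeholder bounds in GB, for illustration only.
q4_gb = 1.0
q8_gb = 2.0

print(midpoint_estimate(q4_gb, q8_gb))  # prints 1.5
```

A midpoint is a coarse approximation: it assumes the requirement grows linearly between the two bounds, which may not hold exactly for a specific quantization format.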
The throughput table below draws on available compatibility measurements and estimates for this model, sorted by tokens per second from fastest to slowest. The rows shown are reported at Q4 quantization.
| GPU | VRAM | Quantization | Speed |
|---|---|---|---|
| AMD Instinct MI300X | 192GB | Q4 | 975 tok/s |
| NVIDIA H200 SXM 141GB | 141GB | Q4 | 765 tok/s |
| NVIDIA H100 SXM5 80GB | 80GB | Q4 | 618 tok/s |
| AMD Instinct MI250X | 128GB | Q4 | 586 tok/s |
| NVIDIA H100 PCIe 80GB | 80GB | Q4 | 385 tok/s |
| NVIDIA A100 80GB SXM4 | 80GB | Q4 | 335 tok/s |
| RTX 5090 | 32GB | Q4 | 333 tok/s |
| AMD Instinct MI210 | 64GB | Q4 | 297 tok/s |
| NVIDIA A100 40GB PCIe | 40GB | Q4 | 263 tok/s |
| NVIDIA RTX 6000 Ada | 48GB | Q4 | 204 tok/s |
| RTX 4090 | 24GB | Q4 | 199 tok/s |
| AMD Radeon Pro W7900 | 48GB | Q4 | 189 tok/s |
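
The ordering used in the table can be reproduced with a short sketch like the one below. It copies a few rows from the table above and ranks them by speed; the helper name `rank_by_speed` and the optional `max_vram_gb` filter are illustrative assumptions, not part of the site's tooling.

```python
# A few rows copied from the table above: (GPU, VRAM in GB, speed in tok/s).
rows = [
    ("RTX 4090", 24, 199),
    ("AMD Instinct MI300X", 192, 975),
    ("RTX 5090", 32, 333),
    ("NVIDIA H100 SXM5 80GB", 80, 618),
]


def rank_by_speed(rows, max_vram_gb=None):
    """Return rows sorted fastest first, optionally keeping only GPUs
    whose VRAM is at or below max_vram_gb (hypothetical filter)."""
    if max_vram_gb is not None:
        rows = [r for r in rows if r[1] <= max_vram_gb]
    return sorted(rows, key=lambda r: r[2], reverse=True)


for gpu, vram_gb, tok_s in rank_by_speed(rows, max_vram_gb=80):
    print(f"{gpu}: {vram_gb}GB, {tok_s} tok/s")
```

Filtering by VRAM before sorting is one way to narrow the list to cards within a given hardware budget while keeping the fastest option first.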