Quick Answer: The AMD Instinct MI250X offers 128GB of VRAM and launched at an MSRP of $11,000; current prices vary with the market. It delivers an estimated 716 tokens/sec on Deepseek AI Deepseek Ocr 2 at Q4 quantization, and it is rated at a 560W TDP.

AMD Instinct MI250X


By AMD · Released 2021-11 · MSRP $11,000.00

This GPU offers reliable throughput for local AI workloads. Pair it with the right model quantization to hit your desired tokens/sec, and monitor prices below to catch the best deal.


Specs snapshot

Key hardware metrics for AI workloads.

VRAM: 128GB

Cores: 14,080

TDP: 560W

Architecture: CDNA 2

Key Takeaways

128GB VRAM - runs models up to ~320B parameters
High-end compute for demanding workloads
High power draw (560W) - requires robust PSU (850W+ recommended)
Strong price-to-VRAM value
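
The price-to-VRAM and power takeaways above can be put in rough numbers. The sketch below is an illustration, not part of the original listing: it divides the listed MSRP by the VRAM size, and the 560W TDP by the top estimated throughput of 716.33 tok/s. The function names are hypothetical, and the energy figure assumes the card draws its full TDP while generating, which real workloads may not match.

```python
# Figures taken from the listing on this page.
MSRP_USD = 11_000.00
VRAM_GB = 128
TDP_WATTS = 560
EST_TOKENS_PER_SEC = 716.33  # estimated Q4 throughput, fastest benchmark row


def usd_per_gb(price_usd: float, vram_gb: float) -> float:
    """Price paid per GB of VRAM (hypothetical helper)."""
    if vram_gb <= 0:
        raise ValueError("vram_gb must be positive")
    return price_usd / vram_gb


def joules_per_token(watts: float, tokens_per_sec: float) -> float:
    """Energy per generated token, assuming constant draw at `watts`."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    # Watts are joules per second, so W / (tok/s) = J/tok.
    return watts / tokens_per_sec


if __name__ == "__main__":
    print(f"USD per GB of VRAM: {usd_per_gb(MSRP_USD, VRAM_GB):.2f}")
    print(f"Joules per token:   {joules_per_token(TDP_WATTS, EST_TOKENS_PER_SEC):.2f}")
```

Under these assumptions the card costs about $85.94 per GB of VRAM at MSRP and uses roughly 0.78 joules per generated token on the fastest estimated row.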

What this means for you

With 128GB of VRAM, the AMD Instinct MI250X can run models up to approximately 320B parameters using 4-bit quantization. That covers most popular models, including Llama 3 70B and Mistral 7B, with room for much larger ones.
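
The capacity estimate above follows from simple arithmetic on bits per parameter. The sketch below is a weights-only approximation with hypothetical function names: it assumes 1GB = 10^9 bytes and ignores KV cache, activations, and runtime overhead unless an overhead fraction is passed in. At a flat 4 bits per parameter the weights-only ceiling comes out nearer 256B, so the ~320B figure would correspond to roughly 3.2 bits per parameter on average. Treat either number as an upper bound.

```python
def max_params_billions(vram_gb: float, bits_per_param: float,
                        overhead_fraction: float = 0.0) -> float:
    """Largest parameter count (in billions) whose weights fit in VRAM.

    Weights-only estimate; `overhead_fraction` reserves a share of VRAM
    for KV cache, activations, and runtime buffers (an assumption).
    """
    if bits_per_param <= 0:
        raise ValueError("bits_per_param must be positive")
    if not 0.0 <= overhead_fraction < 1.0:
        raise ValueError("overhead_fraction must be in [0, 1)")
    usable_bytes = vram_gb * 1e9 * (1.0 - overhead_fraction)
    bytes_per_param = bits_per_param / 8.0
    return usable_bytes / bytes_per_param / 1e9


def weights_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate size of the weights in GB at a given precision."""
    if bits_per_param <= 0:
        raise ValueError("bits_per_param must be positive")
    return params_billions * 1e9 * bits_per_param / 8.0 / 1e9


if __name__ == "__main__":
    print(f"Ceiling at 4-bit:   {max_params_billions(128, 4):.0f}B parameters")
    print(f"Ceiling at 3.2-bit: {max_params_billions(128, 3.2):.0f}B parameters")
    print(f"Llama 3 70B at 4-bit: {weights_gb(70, 4):.1f}GB of weights")
    print(f"Mistral 7B at 4-bit:  {weights_gb(7, 4):.1f}GB of weights")
```

By this estimate, Llama 3 70B needs about 35GB for its weights at 4-bit and Mistral 7B about 3.5GB, so both fit with a wide margin.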

Who should buy

Professional AI workloads requiring maximum VRAM
Running 100B+ parameter models with 4-bit quantization

Looking to upgrade?

Consider H100 or MI300X — Maximum VRAM for enterprise workloads.

AI benchmarks

Showing 80 of 80 benchmark rows.

All tokens/sec figures are static estimates (DB-independent).

Model	Size	Quantization	Tokens/sec (estimated)	VRAM used
Deepseek AI Deepseek Ocr 2	Unknown	Q4	716.33 tok/s	1GB
Deepseek AI Deepseek Math V2	Unknown	Q4	716.33 tok/s	1GB
Deepseek AI Deepseek V2 5	Unknown	Q4	716.33 tok/s	1GB
Deepseek AI Deepseek V3	Unknown	Q4	716.33 tok/s	2GB
Deepseek AI Deepseek Coder V2 Lite Instruct	Unknown	Q4	716.33 tok/s	1GB
Deepseek AI Deepseek V3.1	Unknown	Q4	716.33 tok/s	2GB
Deepseek AI Deepseek Coder 1.3B Instruct	1.3B	Q4	716.33 tok/s	1GB
Deepseek AI Deepseek R1 Distill Qwen 1.5B	1.5B	Q4	716.33 tok/s	1GB
Deepseek AI Deepseek Ocr	Unknown	Q4	596.94 tok/s	4GB
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 8bit	8B	Q4	596.94 tok/s	4GB
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 4bit	8B	Q4	596.94 tok/s	4GB
Deepseek AI Deepseek R1	Unknown	Q4	596.94 tok/s	4GB
Deepseek AI Deepseek R1 0528	Unknown	Q4	596.94 tok/s	4GB
Deepseek AI Deepseek R1 Distill Llama 8B	8B	Q4	596.94 tok/s	4GB
Deepseek AI Deepseek R1 Distill Qwen 7B	7B	Q4	596.94 tok/s	4GB
Nineninesix Kani Tts 2 En	Unknown	Q4	573.07 tok/s	1GB
Nanbeige Nanbeige4 1 3B	3B	Q4	573.07 tok/s	2GB
Minimaxai Minimax M2 5	Unknown	Q4	573.07 tok/s	1GB
Minimaxai Minimax M2 1	Unknown	Q4	573.07 tok/s	1GB
Stepfun AI Step 3 5 Flash	Unknown	Q4	573.07 tok/s	2GB
Qwen Qwen3 Coder Next	Unknown	Q4	573.07 tok/s	2GB
Moonshotai Kimi K2 5	Unknown	Q4	573.07 tok/s	1GB
Xiaomimimo Mimo V2 Flash	Unknown	Q4	573.07 tok/s	1GB
Nari Labs Dia2 2B	2B	Q4	573.07 tok/s	1GB
Google Embeddinggemma 300M	Unknown	Q4	573.07 tok/s	1GB
Facebook Sam3	Unknown	Q4	573.07 tok/s	2GB
Black Forest Labs Flux 2 Dev	Unknown	Q4	573.07 tok/s	1GB
Moonshotai Kimi K2 Thinking	Unknown	Q4	573.07 tok/s	1GB
Microsoft Phi 3 5 Mini Instruct	Unknown	Q4	573.07 tok/s	2GB
Meta Llama Llama 3 2 3B Instruct	3B	Q4	573.07 tok/s	2GB
Qwen Qwen3 1.7B Base	1.7B	Q4	573.07 tok/s	1GB
Dicta Il Dictalm2.0 Instruct	Unknown	Q4	573.07 tok/s	1GB
Qwen Qwen2 0.5B Instruct	0.5B	Q4	573.07 tok/s	1GB
Alibaba Nlp Gte Qwen2 1.5B Instruct	1.5B	Q4	573.07 tok/s	1GB
Apple Openelm 1 1B Instruct	1B	Q4	573.07 tok/s	1GB
Qwen Qwen2.5 3B	3B	Q4	573.07 tok/s	2GB
Unsloth Gemma 3 1B It	1B	Q4	573.07 tok/s	1GB
Bigcode Starcoder2 3B	3B	Q4	573.07 tok/s	2GB
Ibm Granite Granite Docling 258M	Unknown	Q4	573.07 tok/s	1GB
Skt Kogpt2 Base V2	Unknown	Q4	573.07 tok/s	1GB
Google Gemma 3 270M It	Unknown	Q4	573.07 tok/s	1GB
Eleutherai Pythia 70M Deduped	Unknown	Q4	573.07 tok/s	1GB
Microsoft Vibevoice 1.5B	1.5B	Q4	573.07 tok/s	1GB
Ibm Granite Granite 3.3 2B Instruct	2B	Q4	573.07 tok/s	1GB
Google Gemma 2B	2B	Q4	573.07 tok/s	1GB
Trl Internal Testing Tiny Llamaforcausallm 3.2	Unknown	Q4	573.07 tok/s	2GB
Llamafactory Tiny Random Llama 3	Unknown	Q4	573.07 tok/s	2GB
Unsloth Llama 3.2 1B Instruct	1B	Q4	573.07 tok/s	1GB
Numind Nuextract 1.5	Unknown	Q4	573.07 tok/s	1GB
Hmellor Tiny Random Llamaforcausallm	Unknown	Q4	573.07 tok/s	1GB
Sshleifer Tiny Gpt2	Unknown	Q4	573.07 tok/s	1GB
Openai Community Gpt2 Xl	Unknown	Q4	573.07 tok/s	1GB
Ibm Research Powermoe 3B	3B	Q4	573.07 tok/s	2GB
Unsloth Llama 3.2 3B Instruct	3B	Q4	573.07 tok/s	2GB
Meta Llama Llama 3.2 3B	3B	Q4	573.07 tok/s	2GB
Eleutherai Gpt Neo 125M	Unknown	Q4	573.07 tok/s	1GB
Meta Llama Llama Guard 3 1B	1B	Q4	573.07 tok/s	1GB
Qwen Qwen2 1.5B Instruct	1.5B	Q4	573.07 tok/s	1GB
Google Gemma 2 2B It	2B	Q4	573.07 tok/s	1GB
Microsoft Phi 3.5 Mini Instruct	Unknown	Q4	573.07 tok/s	2GB
Microsoft Phi 3.5 Vision Instruct	Unknown	Q4	573.07 tok/s	2GB
Rinna Japanese Gpt Neox Small	Unknown	Q4	573.07 tok/s	1GB
Qwen Qwen2.5 Coder 1.5B	1.5B	Q4	573.07 tok/s	1GB
Microsoft Dialogpt Small	Unknown	Q4	573.07 tok/s	1GB
Qwen Qwen3 0.6B Base	0.6B	Q4	573.07 tok/s	1GB
Openai Community Gpt2 Medium	Unknown	Q4	573.07 tok/s	1GB
Trl Internal Testing Tiny Random Llamaforcausallm	Unknown	Q4	573.07 tok/s	1GB
Qwen Qwen2.5 Math 1.5B	1.5B	Q4	573.07 tok/s	1GB
Huggingfacetb Smollm 135M	Unknown	Q4	573.07 tok/s	1GB
Liquidai Lfm2 1.2B	1.2B	Q4	573.07 tok/s	1GB
Qwen Qwen2 0.5B	0.5B	Q4	573.07 tok/s	1GB
Minimaxai Minimax M2	Unknown	Q4	573.07 tok/s	1GB
Huggingfacetb Smollm2 135M	Unknown	Q4	573.07 tok/s	1GB
Microsoft Phi 2	Unknown	Q4	573.07 tok/s	1GB
Qwen Qwen2.5 0.5B	0.5B	Q4	573.07 tok/s	1GB
Qwen Qwen2.5 1.5B	1.5B	Q4	573.07 tok/s	1GB
Qwen Qwen3 Reranker 0.6B	0.6B	Q4	573.07 tok/s	1GB
Google T5 T5 3B	3B	Q4	573.07 tok/s	2GB
Qwen Qwen3 1.7B	1.7B	Q4	573.07 tok/s	1GB
Openai Community Gpt2 Large	Unknown	Q4	573.07 tok/s	1GB
