
Quick Answer: The RX 7900 XTX offers 24GB of VRAM and sells for around $899 as of November 2025 (MSRP $999). It delivers an estimated 192 tokens/sec on WeiboAI/VibeThinker-1.5B (Q4) and typically draws 355W under load.

RX 7900 XTX

In Stock
By AMD · Released December 2022 · MSRP $999.00

The RX 7900 XTX gives AMD builders a 24GB option with competitive throughput for 7B–13B LLMs and diffusion workloads. Use ROCm-compatible stacks such as llama.cpp or vLLM (ROCm build).
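As a concrete starting point, here is a minimal llama-cpp-python sketch (one of the ROCm-compatible stacks mentioned above). It assumes you have installed a HIP/ROCm build of llama-cpp-python and already have a Q4 GGUF file; the model path below is a placeholder, not a real download.

```python
# Minimal sketch: load a Q4 GGUF fully offloaded to the 7900 XTX.
# Assumes a ROCm (HIP) build of llama-cpp-python; the model path is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-3.1-8b-instruct-q4_k_m.gguf",  # hypothetical local file
    n_gpu_layers=-1,  # offload every layer to the GPU
    n_ctx=4096,       # modest context; 24GB leaves ample headroom for KV cache
    verbose=False,
)

out = llm("Explain VRAM in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```

With n_gpu_layers=-1 every layer lives on the GPU, so a ~4GB Q4 8B model leaves most of the 24GB free for KV cache and longer contexts.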

Buy on Amazon ($899) · View Benchmarks
Specs snapshot

Key hardware metrics for AI workloads.

  • VRAM: 24GB
  • Cores: 6,144
  • TDP: 355W
  • Architecture: RDNA 3

Where to Buy

Buy directly on Amazon with fast shipping and reliable customer service.

Amazon · In Stock · $899
Buy on Amazon


💡 Not ready to buy? Try cloud GPUs first

Test RX 7900 XTX performance in the cloud before investing in hardware. Pay by the hour with no commitment.

  • Vast.ai: from $0.20/hr
  • RunPod: from $0.30/hr
  • Lambda Labs: enterprise-grade

AI benchmarks

All figures below are auto-generated estimates.

| Model | Quantization | Tokens/sec (estimated) | VRAM used |
|---|---|---|---|
| WeiboAI/VibeThinker-1.5B | Q4 | 191.98 | 1GB |
| Qwen/Qwen2.5-3B-Instruct | Q4 | 191.65 | 2GB |
| ibm-research/PowerMoE-3b | Q4 | 191.32 | 2GB |
| allenai/OLMo-2-0425-1B | Q4 | 190.23 | 1GB |
| ibm-granite/granite-3.3-2b-instruct | Q4 | 188.87 | 1GB |
| TinyLlama/TinyLlama-1.1B-Chat-v1.0 | Q4 | 187.96 | 1GB |
| unsloth/Llama-3.2-3B-Instruct | Q4 | 186.92 | 2GB |
| tencent/HunyuanOCR | Q4 | 183.83 | 1GB |
| apple/OpenELM-1_1B-Instruct | Q4 | 182.32 | 1GB |
| unsloth/Llama-3.2-1B-Instruct | Q4 | 182.10 | 1GB |
| inference-net/Schematron-3B | Q4 | 182.08 | 2GB |
| meta-llama/Llama-Guard-3-1B | Q4 | 180.34 | 1GB |
| unsloth/gemma-3-1b-it | Q4 | 179.86 | 1GB |
| facebook/sam3 | Q4 | 178.70 | 1GB |
| meta-llama/Llama-3.2-3B-Instruct | Q4 | 177.84 | 2GB |
| google/gemma-2b | Q4 | 177.84 | 1GB |
| google-bert/bert-base-uncased | Q4 | 176.40 | 1GB |
| deepseek-ai/DeepSeek-OCR | Q4 | 173.09 | 2GB |
| meta-llama/Llama-3.2-3B | Q4 | 171.53 | 2GB |
| meta-llama/Llama-3.2-1B | Q4 | 171.50 | 1GB |
| context-labs/meta-llama-Llama-3.2-3B-Instruct-FP16 | Q4 | 170.23 | 2GB |
| Qwen/Qwen2.5-3B | Q4 | 169.92 | 2GB |
| google/gemma-2-2b-it | Q4 | 169.47 | 1GB |
| bigcode/starcoder2-3b | Q4 | 167.23 | 2GB |
| LiquidAI/LFM2-1.2B | Q4 | 166.42 | 1GB |
| google/embeddinggemma-300m | Q4 | 162.44 | 1GB |
| microsoft/Phi-4-multimodal-instruct | Q4 | 159.91 | 4GB |
| GSAI-ML/LLaDA-8B-Instruct | Q4 | 159.65 | 4GB |
| meta-llama/Llama-3.2-1B-Instruct | Q4 | 159.52 | 1GB |
| deepseek-ai/deepseek-coder-1.3b-instruct | Q4 | 159.48 | 2GB |
| petals-team/StableBeluga2 | Q4 | 158.43 | 4GB |
| unsloth/mistral-7b-v0.3-bnb-4bit | Q4 | 158.16 | 4GB |
| google/gemma-3-1b-it | Q4 | 158.14 | 1GB |
| nari-labs/Dia2-2B | Q4 | 157.92 | 2GB |
| Gensyn/Qwen2.5-0.5B-Instruct | Q4 | 157.49 | 3GB |
| google-t5/t5-3b | Q4 | 157.37 | 2GB |
| Qwen/Qwen3-0.6B | Q4 | 156.94 | 3GB |
| microsoft/Phi-3.5-vision-instruct | Q4 | 156.93 | 4GB |
| kaitchup/Phi-3-mini-4k-instruct-gptq-4bit | Q4 | 156.86 | 2GB |
| Qwen/Qwen3-4B-Thinking-2507 | Q4 | 156.77 | 2GB |
| meta-llama/Llama-2-7b-chat-hf | Q4 | 156.70 | 4GB |
| meta-llama/Llama-3.1-8B | Q4 | 156.43 | 4GB |
| HuggingFaceH4/zephyr-7b-beta | Q4 | 156.01 | 4GB |
| skt/kogpt2-base-v2 | Q4 | 155.91 | 4GB |
| microsoft/Phi-3.5-mini-instruct | Q4 | 155.34 | 2GB |
| IlyaGusev/saiga_llama3_8b | Q4 | 154.62 | 4GB |
| Qwen/Qwen3-4B-Thinking-2507-FP8 | Q4 | 154.60 | 2GB |
| Qwen/Qwen2.5-1.5B | Q4 | 154.56 | 3GB |
| meta-llama/Meta-Llama-3-8B | Q4 | 154.46 | 4GB |
| unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit | Q4 | 154.43 | 4GB |

Note: Performance estimates are calculated. Real results may vary. Methodology · Submit real data
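The page links its methodology above; as a rough cross-check, single-stream decode on a dense model is usually memory-bandwidth bound, so tokens/sec is approximately memory bandwidth divided by the bytes read per generated token (about the quantized model size). The sketch below uses the 7900 XTX's 960 GB/s spec and an assumed efficiency factor; it is an illustration, not the site's actual formula:

```python
# First-order decode estimate: single-stream generation is usually
# memory-bandwidth bound, so tok/s is roughly bandwidth / bytes-per-token
# (about the quantized model size). Assumed numbers, for illustration only.

RX_7900_XTX_BANDWIDTH_GB_S = 960  # GB/s memory bandwidth (spec sheet value)

def estimate_tok_per_s(model_size_gb: float, efficiency: float = 0.6) -> float:
    """Crude upper bound; efficiency is an assumed fudge factor (0.5-0.7)
    for kernel overhead and cache effects, not a measured constant."""
    return RX_7900_XTX_BANDWIDTH_GB_S / model_size_gb * efficiency

# A 7B model at Q4 occupies roughly 4 GB:
print(f"{estimate_tok_per_s(4.0):.0f} tok/s")  # ~144, near the ~156 listed above
```

For a ~4GB Q4 7B model this gives roughly 144 tok/s, in the same ballpark as the ~156 tok/s estimates in the table.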

Model compatibility

All speeds are auto-generated estimates; the card has 24GB of VRAM available.

| Model | Quantization | Verdict | Estimated speed (tok/s) | VRAM needed |
|---|---|---|---|---|
| HuggingFaceH4/zephyr-7b-beta | Q4 | Fits comfortably | 156.01 | 4GB |
| liuhaotian/llava-v1.5-7b | Q4 | Fits comfortably | 150.77 | 4GB |
| Qwen/Qwen2.5-72B-Instruct | Q8 | Not supported | 20.20 | 70GB |
| Qwen/Qwen2.5-72B-Instruct | FP16 | Not supported | 11.56 | 141GB |
| BSC-LT/salamandraTA-7b-instruct | Q8 | Fits comfortably | 104.26 | 7GB |
| deepseek-ai/deepseek-coder-33b-instruct | Q4 | Fits comfortably | 50.76 | 17GB |
| deepseek-ai/deepseek-coder-33b-instruct | Q8 | Not supported | 35.83 | 34GB |
| meta-llama/Llama-3.2-3B-Instruct | Q4 | Fits comfortably | 141.65 | 2GB |
| meta-llama/Llama-3.2-3B-Instruct | Q8 | Fits comfortably | 97.68 | 3GB |
| Qwen/Qwen3-Next-80B-A3B-Thinking | Q8 | Not supported | 21.52 | 78GB |
| Qwen/Qwen3-Next-80B-A3B-Thinking | FP16 | Not supported | 10.51 | 156GB |
| meta-llama/Llama-2-13b-chat-hf | Q4 | Fits comfortably | 102.99 | 7GB |
| Qwen/Qwen3-30B-A3B-Thinking-2507 | FP16 | Not supported | 27.41 | 61GB |
| lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-5bit | Q8 | Not supported | 58.16 | 31GB |
| lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-5bit | FP16 | Not supported | 33.16 | 61GB |
| Alibaba-NLP/gte-Qwen2-1.5B-instruct | Q4 | Fits comfortably | 149.48 | 3GB |
| mistralai/Mistral-Small-Instruct-2409 | Q4 | Fits comfortably | 75.77 | 11GB |
| mistralai/Mistral-Small-Instruct-2409 | Q8 | Fits (tight) | 57.28 | 23GB |
| google/gemma-2-27b-it | Q4 | Fits comfortably | 77.91 | 14GB |
| google/gemma-2-27b-it | Q8 | Not supported | 58.13 | 28GB |
| black-forest-labs/FLUX.2-dev | FP16 | Fits comfortably | 60.22 | 16GB |
| OpenPipe/Qwen3-14B-Instruct | Q8 | Fits comfortably | 82.40 | 14GB |
| OpenPipe/Qwen3-14B-Instruct | FP16 | Not supported | 38.62 | 29GB |
| openai/gpt-oss-120b | Q4 | Not supported | 27.67 | 59GB |
| Qwen/Qwen3-4B | Q4 | Fits comfortably | 132.22 | 2GB |
| OpenPipe/Qwen3-14B-Instruct | Q4 | Fits comfortably | 98.86 | 7GB |
| openai-community/gpt2-xl | Q4 | Fits comfortably | 135.22 | 4GB |
| meta-llama/Llama-3.2-3B-Instruct | Q8 | Fits comfortably | 117.75 | 3GB |
| deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct | Q4 | Fits comfortably | 144.19 | 4GB |
| facebook/opt-125m | Q8 | Fits comfortably | 100.64 | 7GB |
| kaitchup/Phi-3-mini-4k-instruct-gptq-4bit | Q4 | Fits comfortably | 156.86 | 2GB |
| kaitchup/Phi-3-mini-4k-instruct-gptq-4bit | FP16 | Fits comfortably | 59.08 | 9GB |
| Qwen/Qwen2.5-1.5B | FP16 | Fits comfortably | 54.08 | 11GB |
| Qwen/Qwen2.5-14B-Instruct | Q4 | Fits comfortably | 102.01 | 7GB |
| Qwen/Qwen2.5-14B-Instruct | Q8 | Fits comfortably | 70.33 | 14GB |
| meta-llama/Llama-3.3-70B-Instruct | FP16 | Not supported | 20.73 | 137GB |
| Qwen/Qwen3-Embedding-8B | Q4 | Fits comfortably | 150.33 | 4GB |
| Qwen/Qwen3-Embedding-8B | Q8 | Fits comfortably | 102.05 | 9GB |
| Qwen/Qwen3-14B | Q8 | Fits comfortably | 74.21 | 14GB |
| Qwen/Qwen3-14B | FP16 | Not supported | 38.58 | 29GB |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | Q4 | Fits comfortably | 152.24 | 4GB |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | Q8 | Fits comfortably | 105.47 | 7GB |
| meta-llama/Llama-2-7b-hf | FP16 | Fits comfortably | 57.52 | 15GB |
| deepseek-ai/DeepSeek-R1-Distill-Llama-8B | Q4 | Fits comfortably | 142.19 | 4GB |
| deepseek-ai/DeepSeek-R1-Distill-Llama-8B | Q8 | Fits comfortably | 109.91 | 9GB |
| Qwen/Qwen2-0.5B | Q4 | Fits comfortably | 139.70 | 3GB |
| Qwen/Qwen2-0.5B | Q8 | Fits comfortably | 101.81 | 5GB |
| deepseek-ai/deepseek-coder-1.3b-instruct | Q4 | Fits comfortably | 159.48 | 2GB |
| deepseek-ai/deepseek-coder-1.3b-instruct | Q8 | Fits comfortably | 121.88 | 3GB |
| Qwen/Qwen3-4B-Thinking-2507 | FP16 | Fits comfortably | 59.02 | 9GB |

Note: Performance estimates are calculated. Real results may vary. Methodology · Submit real data
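The verdicts above reduce to a simple capacity check. The sketch below reconstructs that logic under stated assumptions (weights = params × bits/8, ~20% overhead for KV cache and activations, "tight" within ~10% of capacity); the thresholds are inferred from the table, not published by the site.

```python
# Reconstruction of the fit logic implied by the table above. The overhead
# factor and thresholds are assumptions inferred from the verdicts shown,
# not the site's published rules.

CARD_VRAM_GB = 24
BITS_PER_WEIGHT = {"Q4": 4, "Q8": 8, "FP16": 16}

def required_vram_gb(params_b: float, quant: str, overhead: float = 1.2) -> float:
    """Weights (params x bits/8) plus ~20% assumed for KV cache/activations."""
    return params_b * BITS_PER_WEIGHT[quant] / 8 * overhead

def verdict(required_gb: float, have_gb: float = CARD_VRAM_GB) -> str:
    if required_gb > have_gb:
        return "Not supported"
    if required_gb > 0.9 * have_gb:  # within ~10% of capacity, like 23GB vs 24GB
        return "Fits (tight)"
    return "Fits comfortably"

print(verdict(required_vram_gb(70, "Q8")))  # Not supported (~84 GB needed)
print(verdict(required_vram_gb(7, "Q4")))   # Fits comfortably (~4 GB needed)
```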

GPU FAQs

Data-backed answers pulled from community benchmarks, manufacturer specs, and live pricing.

How does Vulkan compare to ROCm on RX 7900 XTX?

On qwen3-30B Q4, Vulkan decode hits ~117 tok/sec once a 32K context fills, while ROCm drops to ~12 tok/sec—making Vulkan the faster option for long prompts.

Source: Reddit – /r/LocalLLaMA (mrdpho0)
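If you want to reproduce this kind of long-context measurement on your own card, a rough llama-cpp-python sketch follows. The backend (Vulkan vs ROCm) is fixed when llama.cpp is compiled, and the GGUF filename below is a placeholder:

```python
# Rough sketch for measuring long-context decode speed with llama-cpp-python.
# Assumes a Vulkan or ROCm build; the model path is a placeholder.
import time
from llama_cpp import Llama

llm = Llama(model_path="./qwen3-30b-a3b-q4_k_m.gguf",  # hypothetical local file
            n_gpu_layers=-1, n_ctx=32768, verbose=False)

prompt = "lorem ipsum " * 5000  # crude filler to occupy much of the 32K window

first = last = None
n_tokens = 0
for _ in llm(prompt, max_tokens=256, stream=True):
    last = time.perf_counter()
    if first is None:
        first = last  # first streamed token marks the end of prompt prefill
    n_tokens += 1

# Decode-only rate: tokens after the first, over time after the first
# (assumes more than one token was generated).
print(f"{(n_tokens - 1) / (last - first):.1f} tok/s decode")
```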

What prompt-prefill speeds can Vulkan deliver?

The same benchmarks show Vulkan prompt prefill at ~486 tok/s on Windows drivers versus ~432 tok/s on ROCm, highlighting the driver advantage.

Source: Reddit – /r/LocalLLaMA (mrdpho0)

Can AMD hardware host 70B Q8 models?

Yes. Builders highlight Ryzen AI 395 mini-PCs with RX 7900-class GPUs that can load 70B Q8 models, something 24 GB NVIDIA cards can't do, though throughput is slower.

Source: Reddit – /r/LocalLLaMA (mqupq0a)

Does FlashAttention accelerate the 7900 XTX?

Not yet—FlashAttention under Vulkan falls back to the CPU on 7900 XTX, so enabling it doesn’t improve throughput the way it does on NVIDIA cards.

Source: Reddit – /r/LocalLLaMA (mrdpho0)

What are the specs and price snapshot?

The RX 7900 XTX offers 24 GB of GDDR6 and a 355 W total board power. As of November 2025, Amazon listed it at $899, in stock.

Source: TechPowerUp – Radeon RX 7900 XTX Specs

Alternative GPUs

Explore how these cards stack up for local inference workloads:

  • RX 7900 XT (20GB)
  • RX 6900 XT (16GB)
  • RTX 4090 (24GB)
  • RTX 4080 (16GB)
  • RTX 4070 Ti (12GB)

Compare RX 7900 XTX

Side-by-side VRAM, throughput, efficiency, and pricing benchmarks:

  • RX 7900 XTX vs RTX 4090
  • RX 7900 XTX vs RTX 4080
  • RX 7900 XTX vs RX 7900 XT