Can RTX 5090 run Qwen/Qwen3-Next-80B-A3B-Instruct?

Q4 not recommended · 32GB VRAM available · Requires 39GB+

RTX 5090 does not meet the minimum VRAM requirement for Q4 inference of Qwen/Qwen3-Next-80B-A3B-Instruct. Review the quantization breakdown below to see how higher precision settings impact VRAM and throughput.

What this means for you

The RTX 5090 does not have enough VRAM to run Qwen/Qwen3-Next-80B-A3B-Instruct with Q4 quantization.

Your 32GB GPU is 7GB short of the 39GB minimum.
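The shortfall quoted above is a simple subtraction of available VRAM from required VRAM. A minimal sketch of that arithmetic, using the 39GB and 32GB figures from this page (the function name `vram_shortfall_gb` is hypothetical, chosen only for illustration):

```python
def vram_shortfall_gb(required_gb: float, available_gb: float) -> float:
    """Return how many GB of VRAM are missing, or 0 if the model fits."""
    return max(0.0, required_gb - available_gb)


# Figures taken from this page: Q4 requires 39GB, the RTX 5090 has 32GB.
shortfall = vram_shortfall_gb(required_gb=39, available_gb=32)
print(f"Shortfall: {shortfall:.0f}GB")
```

Clamping at zero means a GPU with spare capacity reports no shortfall rather than a negative number.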

Options: (1) Try Q2 or Q3 quantization for lower VRAM requirements, (2) Consider cloud GPU rental, (3) Upgrade to a GPU with at least 39GB VRAM.

Quantization breakdown

Quantization	VRAM needed	VRAM available	Estimated speed	Verdict
Q4	39GB	32GB	59.13 tok/s	❌ Not recommended
Q8	78GB	32GB	44.76 tok/s	❌ Not recommended
FP16	156GB	32GB	24.92 tok/s	❌ Not recommended
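The VRAM figures in the table scale linearly with precision: Q8 needs twice the Q4 figure and FP16 four times. The sketch below reproduces the table's VRAM column and verdicts from the Q4 baseline. It assumes nominal widths of 4, 8, and 16 bits per weight for the three rows; the page does not state its formula, real quantization formats carry some overhead, and the names used here are hypothetical.

```python
Q4_VRAM_GB = 39    # Q4 requirement, taken from the table
AVAILABLE_GB = 32  # RTX 5090 VRAM, taken from the table

# Assumed nominal bits per weight for each row (not stated on the page).
BITS_PER_WEIGHT = {"Q4": 4, "Q8": 8, "FP16": 16}


def estimate_vram_gb(bits: int) -> float:
    """Scale the Q4 baseline linearly with bits per weight."""
    return Q4_VRAM_GB * bits / 4


def fits(bits: int, available_gb: float = AVAILABLE_GB) -> bool:
    """True if the estimated requirement fits in the available VRAM."""
    return estimate_vram_gb(bits) <= available_gb


for name, bits in BITS_PER_WEIGHT.items():
    verdict = "fits" if fits(bits) else "not recommended"
    print(f"{name}: {estimate_vram_gb(bits):.0f}GB needed, {verdict}")
```

Under the same linear assumption, a 3-bit format would need roughly 29GB, which is why lower quantization levels are worth trying on a 32GB card; actual usage also depends on context length and runtime overhead.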

Suitable alternatives

GPU	VRAM	Estimated speed	Price
AMD Instinct MI300X	192GB	167.62 tok/s	not listed
NVIDIA H200 SXM 141GB	141GB	139.18 tok/s	not listed
AMD Instinct MI300X	192GB	117.41 tok/s	not listed
NVIDIA H200 SXM 141GB	141GB	97.60 tok/s	not listed
NVIDIA H100 SXM5 80GB	80GB	97.49 tok/s	not listed
