Can RTX 4090 run Qwen/Qwen3-Next-80B-A3B-Instruct?

Q4 not recommended | 24GB VRAM available | Requires 39GB+

RTX 4090 does not meet the minimum VRAM requirement for Q4 inference of Qwen/Qwen3-Next-80B-A3B-Instruct. Review the quantization breakdown below to see how higher precision settings impact VRAM and throughput.

What this means for you

RTX 4090 does not have enough VRAM to run Qwen/Qwen3-Next-80B-A3B-Instruct with Q4 quantization.

Your 24GB GPU is 15GB short of the 39GB minimum.

Options: (1) Try Q2 or Q3 quantization for lower VRAM requirements, (2) Consider cloud GPU rental, (3) Upgrade to a GPU with at least 39GB VRAM.
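To get a rough feel for option (1), the listed requirements can be extrapolated to lower bit widths. The sketch below assumes VRAM scales linearly with the nominal bits per weight, which matches the Q4, Q8, and FP16 figures on this page (39GB, 78GB, 156GB). The function names are hypothetical, and the Q2 and Q3 numbers are an extrapolation, not figures published here.

```python
# Figures taken from this page; everything else is an assumption.
Q4_VRAM_GB = 39.0   # listed requirement at 4 bits per weight
GPU_VRAM_GB = 24.0  # RTX 4090

def estimate_vram_gb(bits_per_weight: float) -> float:
    """Scale the listed Q4 figure linearly with nominal bit width (assumption)."""
    return Q4_VRAM_GB * bits_per_weight / 4.0

def fits(bits_per_weight: float, available_gb: float = GPU_VRAM_GB) -> bool:
    """True if the estimated requirement does not exceed the available VRAM."""
    return estimate_vram_gb(bits_per_weight) <= available_gb

for name, bits in [("Q2", 2), ("Q3", 3), ("Q4", 4)]:
    need = estimate_vram_gb(bits)
    print(f"{name}: ~{need:.2f}GB, fits in {GPU_VRAM_GB:.0f}GB: {fits(bits)}")
```

Under this assumption Q3 lands near 29GB, still above 24GB, while Q2 lands near 19.5GB. Real Q2 and Q3 formats may store more than their nominal bits per weight, and context length adds memory on top, so treat these as optimistic lower bounds.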

Quantization breakdown

Quantization	VRAM needed	VRAM available	Estimated speed	Verdict
Q4	39GB	24GB	36.40 tok/s	❌ Not recommended
Q8	78GB	24GB	27.37 tok/s	❌ Not recommended
FP16	156GB	24GB	13.50 tok/s	❌ Not recommended
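
The verdict column follows from simple subtraction: VRAM needed minus VRAM available. A minimal sketch using the figures from the table above (the helper name `shortfall_gb` is hypothetical):

```python
GPU_VRAM_GB = 24  # RTX 4090

# (quantization, VRAM needed in GB) as listed in the table
REQUIREMENTS = [("Q4", 39), ("Q8", 78), ("FP16", 156)]

def shortfall_gb(needed_gb: int, available_gb: int = GPU_VRAM_GB) -> int:
    """Return how many GB are missing, or 0 if the model fits."""
    return max(0, needed_gb - available_gb)

for name, needed in REQUIREMENTS:
    missing = shortfall_gb(needed)
    verdict = "fits" if missing == 0 else f"short by {missing}GB"
    print(f"{name}: needs {needed}GB, has {GPU_VRAM_GB}GB, {verdict}")
```

For Q4 this gives the 15GB gap quoted earlier; the gap grows to 54GB for Q8 and 132GB for FP16.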

Best current price

RTX 4090

$1,599.00 on Amazon


Suitable alternatives

GPU	VRAM	Estimated speed	Price
AMD Instinct MI300X	192GB	167.62 tok/s	Not listed
NVIDIA H200 SXM 141GB	141GB	139.18 tok/s	Not listed
AMD Instinct MI300X	192GB	117.41 tok/s	Not listed
NVIDIA H200 SXM 141GB	141GB	97.60 tok/s	Not listed
NVIDIA H100 SXM5 80GB	80GB	97.49 tok/s	Not listed
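
The alternatives can be checked against the VRAM figures from the quantization breakdown in the same way. This sketch assumes a single card and treats the listed "VRAM needed" value as the only constraint; the function name is hypothetical.

```python
# VRAM needed per quantization, in GB, from the quantization breakdown
REQUIREMENTS_GB = {"Q4": 39, "Q8": 78, "FP16": 156}

# VRAM per alternative GPU, in GB, from the list above
ALTERNATIVES_GB = {
    "AMD Instinct MI300X": 192,
    "NVIDIA H200 SXM 141GB": 141,
    "NVIDIA H100 SXM5 80GB": 80,
}

def supported_quantizations(vram_gb: int) -> list[str]:
    """List the quantizations whose listed requirement fits in vram_gb."""
    return [q for q, need in REQUIREMENTS_GB.items() if need <= vram_gb]

for gpu, vram in ALTERNATIVES_GB.items():
    print(f"{gpu} ({vram}GB): {', '.join(supported_quantizations(vram))}")
```

By these figures only the MI300X holds the FP16 weights on one card; the H200 and H100 cover Q4 and Q8.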
