Run Google's open-weight LLM on your hardware
Gemma is Google's family of open-weight language models. Gemma 2 offers excellent performance for its size, rivaling larger models. This guide shows you how to run Gemma locally using Jan.
Jan is a free desktop app with a built-in Model Hub featuring Gemma.
# Download from: https://jan.ai/download
# Available for Windows, macOS, Linux
# Install and launch the app

Tip: Jan auto-detects your GPU and configures optimal settings.
Open Jan and click Model Hub. Search for 'Gemma'.
Available Gemma models:
• Gemma 2 2B - Ultra fast, runs on any GPU
• Gemma 2 9B - Great balance of speed and quality
• Gemma 2 27B - Best quality, needs 16GB+ VRAM

Click download on your chosen model, then start a new chat.
Gemma 2 9B download: ~5GB
Gemma 2 27B download: ~15GB
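As a rough sanity check on these numbers: the download is essentially the quantized weights, and running the model needs that much VRAM plus some headroom for context. Here is a back-of-envelope sketch, assuming ~4.5 bits per weight (a Q4-style quantization; the exact format Jan ships is an assumption) and ~1.5 GB of runtime overhead:

```python
# Back-of-envelope sizing for quantized Gemma 2 models.
# Assumptions (not from this guide): ~4.5 bits per weight,
# ~1.5 GB of runtime overhead for KV cache and buffers.
BITS_PER_WEIGHT = 4.5
OVERHEAD_GB = 1.5

def weights_gb(params_billions):
    """Approximate on-disk size of the quantized weights."""
    return params_billions * 1e9 * BITS_PER_WEIGHT / 8 / 1e9

def vram_gb(params_billions):
    """Approximate VRAM needed to run the model."""
    return weights_gb(params_billions) + OVERHEAD_GB

for name, params in [("Gemma 2 2B", 2), ("Gemma 2 9B", 9), ("Gemma 2 27B", 27)]:
    print(f"{name}: ~{weights_gb(params):.0f} GB download, ~{vram_gb(params):.0f} GB VRAM")
```

Under these assumptions, 9B works out to about 5 GB and 27B to about 15 GB, which lines up with the download sizes above and the 16GB+ VRAM note for 27B.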
Once downloaded, click the model to start chatting!

Problem: Model runs on CPU instead of GPU
Fix: Go to Settings > Advanced and verify GPU acceleration is enabled. Update your GPU drivers.

Problem: Gemma 27B is slow
Fix: Gemma 27B needs 16GB+ VRAM for good speeds. Try Gemma 9B for faster responses on 12GB cards.
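To make that guidance concrete, here is a small helper that picks the largest Gemma 2 variant for a given amount of VRAM. The 16 GB cutoff for 27B comes from this guide; the 8 GB cutoff for 9B is my assumption (roughly 5 GB of weights plus context):

```python
# Pick the largest Gemma 2 variant that should fit in VRAM.
# 16 GB cutoff for 27B: from the guide. 8 GB cutoff for 9B: assumed
# (~5 GB quantized weights plus room for context).
def pick_gemma(vram_gb):
    if vram_gb >= 16:
        return "Gemma 2 27B"
    if vram_gb >= 8:
        return "Gemma 2 9B"
    return "Gemma 2 2B"

print(pick_gemma(24))  # Gemma 2 27B
print(pick_gemma(12))  # Gemma 2 9B
print(pick_gemma(6))   # Gemma 2 2B
```

So on a 12 GB card you land on Gemma 2 9B, matching the fix above.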