10-15 minutes · Beginner

How to Run Whisper Locally

Transcribe audio with OpenAI's Whisper

Whisper is OpenAI's speech recognition model that rivals professional transcription services. Run it locally for free, private transcription of any audio or video.

Hardware Requirements

GPU VRAM: Min 4GB (small model), Rec 8GB (large model). CPU-only works but is 10x slower.

System RAM: Min 8GB, Rec 16GB

Storage: Min 5GB free, Rec 10GB SSD

Step-by-Step Guide

1. Install Whisper

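Install OpenAI's Whisper package. It decodes audio by calling the ffmpeg command-line tool, so ffmpeg must also be installed and on your PATH.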

pip install openai-whisper

# For faster inference, also install:
pip install faster-whisper
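
A quick way to confirm the install worked is to check that the pieces can be found. The sketch below is an illustration rather than an official tool: `check_whisper_setup` is a hypothetical helper name, and it only tests whether the `whisper` and `faster_whisper` modules are importable and whether an `ffmpeg` binary is on the PATH.

```python
import importlib.util
import shutil


def check_whisper_setup():
    """Report whether the pieces this guide relies on can be found."""
    return {
        # openai-whisper installs an importable module named "whisper"
        "whisper_module": importlib.util.find_spec("whisper") is not None,
        # faster-whisper is optional; its module is named "faster_whisper"
        "faster_whisper_module": importlib.util.find_spec("faster_whisper") is not None,
        # openai-whisper calls the ffmpeg binary to decode audio files
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_whisper_setup().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

If any entry is reported as missing, re-run the matching install command before moving on to the next step.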

2. Transcribe Audio

One command to transcribe any audio file.

whisper audio.mp3 --model large-v3

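# faster-whisper is a Python library and does not install a command of its own.
# Call it from Python through its WhisperModel class, or use a separate
# community CLI built on it, such as whisper-ctranslate2.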

💡 First run downloads the model (~3GB for large-v3).
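
The same transcription can be driven from Python, which is handy when you want timestamps or need to process many files. This is a minimal sketch assuming the `openai-whisper` package from step 1 is installed; `transcribe_file`, `format_timestamp`, and `segments_to_lines` are illustrative helper names, not part of the library.

```python
def format_timestamp(seconds):
    """Convert a time in seconds to HH:MM:SS.mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def segments_to_lines(segments):
    """Render Whisper segment dicts as one timestamped line each."""
    return [
        f"[{format_timestamp(s['start'])} -> {format_timestamp(s['end'])}] {s['text'].strip()}"
        for s in segments
    ]


def transcribe_file(path, model_name="large-v3"):
    """Transcribe one audio file and return timestamped lines."""
    import whisper  # imported here so the helpers above work without the package

    model = whisper.load_model(model_name)  # downloads the weights on first use
    result = model.transcribe(path)         # dict with "text", "segments", "language"
    return segments_to_lines(result["segments"])


if __name__ == "__main__":
    import sys

    # Usage: python transcribe.py audio.mp3
    if len(sys.argv) > 1:
        for line in transcribe_file(sys.argv[1]):
            print(line)
```

Passing a smaller `model_name` such as `"small"` trades accuracy for speed in the same way as the `--model` flag.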

3. Choose Your Model Size

Larger models are more accurate but slower.

# Speeds are relative to the large model (1x), not multiples of realtime.
# tiny:   39M params, ~32x relative speed, lower accuracy
# base:   74M params, ~16x relative speed
# small:  244M params, ~6x relative speed
# medium: 769M params, ~2x relative speed
# large:  1550M params, 1x (baseline), best accuracy

whisper audio.mp3 --model medium
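
To make the trade-off concrete, the speed figures above can be turned into a rough time estimate. The sketch below treats them as multipliers relative to the large model, which is an assumption, and real throughput depends on your hardware. `RELATIVE_SPEED`, `estimate_seconds`, and `largest_model_within` are hypothetical names used only for illustration.

```python
# Speed multipliers copied from the list above, relative to the large model.
RELATIVE_SPEED = {"tiny": 32, "base": 16, "small": 6, "medium": 2, "large": 1}

# Ordered from most to least accurate, so the first fit is the best choice.
BY_ACCURACY = ["large", "medium", "small", "base", "tiny"]


def estimate_seconds(model, large_seconds):
    """Estimate run time for `model`, given how long `large` takes on your machine."""
    return large_seconds / RELATIVE_SPEED[model]


def largest_model_within(budget_seconds, large_seconds):
    """Return the most accurate model whose estimated time fits the budget, or None."""
    for model in BY_ACCURACY:
        if estimate_seconds(model, large_seconds) <= budget_seconds:
            return model
    return None


if __name__ == "__main__":
    # Example: large takes 10 minutes on this machine, and we can wait 2.5 minutes.
    print(largest_model_within(budget_seconds=150, large_seconds=600))
```

Measure how long the large model takes on one short clip, then use that number as `large_seconds` for longer files.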

Recommended GPUs

Budget

RTX 3060 12GB

Runs large-v3 at realtime speed.

Recommended

RTX 4070 Ti Super 16GB

3-4x faster than realtime.

Troubleshooting

❓ Transcription is slow

✅ Use faster-whisper, which its maintainers report as up to 4x faster than the original implementation at the same accuracy, or try a smaller model size.
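
Since faster-whisper is used from Python, a minimal sketch of the faster path might look like the following. It assumes `pip install faster-whisper` has been run. `pick_compute_type` and `transcribe_fast` are illustrative helper names, and the float16-on-GPU, int8-on-CPU choice is a common default rather than a requirement.

```python
def pick_compute_type(device):
    """Choose a numeric precision: float16 on GPU, int8 quantization on CPU."""
    return "float16" if device == "cuda" else "int8"


def transcribe_fast(path, model_name="large-v3", device="cuda"):
    """Transcribe with faster-whisper and return (language, segment tuples)."""
    from faster_whisper import WhisperModel  # imported lazily; optional dependency

    model = WhisperModel(
        model_name,
        device=device,
        compute_type=pick_compute_type(device),
    )
    segments, info = model.transcribe(path, beam_size=5)
    # `segments` is a generator: the actual decoding happens while iterating it.
    return info.language, [(s.start, s.end, s.text) for s in segments]


if __name__ == "__main__":
    import sys

    # Usage: python fast_transcribe.py audio.mp3
    if len(sys.argv) > 1:
        language, rows = transcribe_fast(sys.argv[1])
        print(f"Detected language: {language}")
        for start, end, text in rows:
            print(f"[{start:.2f}s -> {end:.2f}s] {text}")
```

On a machine without a GPU, call `transcribe_fast(path, device="cpu")` and consider a smaller model.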

❓ Poor accuracy

✅ Use the large-v3 model, and make sure the input audio is clean, with as little background noise as possible.
