HyperWhisper supports four transcription tiers out of the box via HyperWhisper Cloud, plus bring-your-own-key (BYOK) for direct provider access and fully offline local models.
HyperWhisper Cloud at a glance
HyperWhisper Cloud is built in: no API key, no separate account. Pick a tier based on whether you care more about speed, balance, or accuracy. All four are pay-as-you-go with no markup; you pay what the underlying provider charges.

| Tier | Model | Price | Credits | Notes |
|---|---|---|---|---|
| Medium | Groq Whisper Large v3 | ~$0.11 / hour | 1.85 credits/min | Sub-second latency. Great for English and larger European languages. |
| Medium | Deepgram Nova-3 | ~$0.33 / hour | 5.5 credits/min | Strong English accuracy, low latency, supports custom vocabulary. |
| High | ElevenLabs Scribe v2 | ~$0.59 / hour | 9.83 credits/min | High-accuracy multilingual transcription, retained as the Scribe v2 tier. |
| Highest | Grok STT | $0.10 / hour | 1.67 credits/min | Highest HyperWhisper Cloud accuracy, powered by xAI. |

Credits are billed at 1 credit = $0.001 USD. A Pro license includes 5,000 credits up front, and top-ups are available in $5 / $10 / $20 bundles.
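To get a feel for how far a credit balance goes, the conversion can be sketched in a few lines of Python. This is an illustrative calculation only: the per-minute rates and the 1 credit = $0.001 conversion come from the figures above, while the function names and the assumption that a top-up converts to credits at that same rate are ours, not part of HyperWhisper.

```python
USD_PER_CREDIT = 0.001  # stated above: 1 credit = $0.001 USD

# Credits per minute of speech, copied from the tier listing above.
CREDITS_PER_MIN = {
    "Medium (Groq Whisper Large v3)": 1.85,
    "Medium (Deepgram Nova-3)": 5.5,
    "High (ElevenLabs Scribe v2)": 9.83,
    "Highest (Grok STT)": 1.67,
}


def credits_for_usd(usd: float) -> int:
    """Credits bought for a dollar amount (assumes top-ups use the same rate)."""
    return round(usd / USD_PER_CREDIT)


def speech_hours(credits: float, credits_per_min: float) -> float:
    """Hours of actual speech a credit balance covers on a given tier."""
    return credits / credits_per_min / 60


# Hypothetical balances: the included Pro credits plus each top-up bundle.
balances = {
    "Pro license (included)": 5_000,
    "$5 top-up": credits_for_usd(5),
    "$10 top-up": credits_for_usd(10),
    "$20 top-up": credits_for_usd(20),
}

for label, credits in balances.items():
    print(f"{label}: {credits} credits")
    for tier, rate in CREDITS_PER_MIN.items():
        print(f"  {tier}: about {speech_hours(credits, rate):.1f} hours of speech")
```

Because only detected speech is billed, these hours refer to time spent talking, not time spent recording.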
You only pay for actual speech
HyperWhisper Cloud detects silence and blank audio automatically. If a recording contains no detectable speech, you are charged 0 credits: we don’t bill for dead air at the start of a clip, pauses between thoughts, or an accidentally-triggered empty recording. In practice, across a typical working day of push-to-talk dictation, you’re only billed for the minutes you actually spoke.
Accuracy by language
For English, any tier works. For the best HyperWhisper Cloud quality, use Highest (Grok STT), which is the default for new installs. High (ElevenLabs Scribe v2) remains available as the high-accuracy Scribe tier.
xAI does not publish a per-language WER table for Grok STT. We avoid mixing older third-party benchmark numbers with this tier because they do not measure the same provider.
Cost examples
At 1 credit = $0.001 USD, here’s what each tier costs at typical usage levels. Remember: only actual speech is billed, so “30 min/day” means 30 minutes of talking, not 30 minutes of the app being open.

| Daily speech | Medium (Groq) | Medium (Deepgram) | High | Highest |
|---|---|---|---|---|
| 15 min | ~$0.03 | ~$0.08 | ~$0.15 | ~$0.03 |
| 30 min | ~$0.06 | ~$0.17 | ~$0.30 | ~$0.05 |
| 1 hour | ~$0.11 | ~$0.33 | ~$0.59 | ~$0.10 |
| 2 hours | ~$0.22 | ~$0.66 | ~$1.18 | ~$0.20 |
| 8 hours | ~$0.89 | ~$2.64 | ~$4.72 | ~$0.80 |
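The figures in the table follow from multiplying minutes of speech by the tier's credits-per-minute rate and by $0.001 per credit. The sketch below recomputes the rows under that assumption; the helper name `cost_usd` is hypothetical, and because the published per-minute rates are themselves rounded, a recomputed cell can differ from the table by a cent.

```python
from decimal import Decimal, ROUND_HALF_UP

USD_PER_CREDIT = Decimal("0.001")  # 1 credit = $0.001 USD

# Credits per minute for each column of the table above.
CREDITS_PER_MIN = {
    "Medium (Groq)": Decimal("1.85"),
    "Medium (Deepgram)": Decimal("5.5"),
    "High": Decimal("9.83"),
    "Highest": Decimal("1.67"),
}


def cost_usd(speech_minutes: int, tier: str) -> Decimal:
    """Cost of a given number of spoken minutes, rounded to the nearest cent."""
    raw = Decimal(speech_minutes) * CREDITS_PER_MIN[tier] * USD_PER_CREDIT
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Recompute each row: 15 min, 30 min, 1 hour, 2 hours, 8 hours of speech.
for minutes in (15, 30, 60, 120, 480):
    cells = " | ".join(f"${cost_usd(minutes, tier)}" for tier in CREDITS_PER_MIN)
    print(f"{minutes:>3} min | {cells}")
```

`Decimal` is used instead of floats so that half-cent values round predictably.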
Alternatives
- Bring Your Own Key
- Local / offline
If you already have API credits or want to use your own free tier (Deepgram $200, AssemblyAI $50), plug in a key via API Keys. You pay the provider directly at their published rate.
HyperWhisper also supports Fireworks AI, Mistral, and Google Gemini for BYOK. See API Keys for setup.
| Provider | Model | $/min |
|---|---|---|
| Groq | Whisper Large v3 Turbo | $0.00067 |
| xAI | Grok STT | $0.00167 |
| Deepgram | Nova-3 (batch) | $0.0043 |
| AssemblyAI | Universal | $0.0037 |
| OpenAI | whisper-1 / gpt-4o-transcribe | $0.006 |
| ElevenLabs | Scribe v2 | ~$0.008 |
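As a rough guide to how long a provider's free credit might last, the per-minute rates in the table can be turned into hours of audio. This is a back-of-the-envelope sketch: the rates and free-credit amounts are the ones quoted above, the function names are hypothetical, and it assumes the whole credit is spent at the listed rate on transcription only.

```python
# $/min for the two providers with a free credit mentioned above.
BYOK_USD_PER_MIN = {
    "Deepgram Nova-3 (batch)": 0.0043,
    "AssemblyAI Universal": 0.0037,
}

# Free credits mentioned above, in USD.
FREE_CREDIT_USD = {
    "Deepgram Nova-3 (batch)": 200,
    "AssemblyAI Universal": 50,
}


def usd_per_hour(usd_per_min: float) -> float:
    """Convert a per-minute rate to a per-hour rate."""
    return usd_per_min * 60


def hours_from_credit(credit_usd: float, usd_per_min: float) -> float:
    """Hours of audio a dollar credit covers at a per-minute rate."""
    return credit_usd / usd_per_min / 60


for provider, rate in BYOK_USD_PER_MIN.items():
    hours = hours_from_credit(FREE_CREDIT_USD[provider], rate)
    print(f"{provider}: ${usd_per_hour(rate):.2f}/hour, "
          f"roughly {hours:.0f} hours from the free credit")
```

Provider rates and free-tier terms change, so check the provider's current pricing page before relying on these numbers.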
When you bring your own key, opting your audio out of model training is your responsibility — every provider has its own dashboard setting. See Data Privacy & Model Training for a copy-pasteable LLM prompt that finds the current opt-out for any provider.
Boost accuracy on any provider
- Custom vocabulary — add domain terms (product names, frameworks, jargon, colleagues’ names). Biggest single improvement for technical or professional use.
- Low-noise environment — every model degrades with background noise. See Best Practices.
- Natural pace — overly fast or overly slow speech both hurt accuracy.
When using Deepgram Nova-3 with custom vocabulary, set the language explicitly (not auto): the keyterm parameter is only active in monolingual mode on Nova-3.
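For readers calling Deepgram directly with their own key, the note above roughly corresponds to sending an explicit `language` together with one `keyterm` query parameter per vocabulary term. The sketch below only builds the request URL and makes no network call. The helper `build_listen_url` is hypothetical, the `"auto"` check mirrors the HyperWhisper setting rather than any Deepgram value, and how HyperWhisper itself constructs its requests is not documented here.

```python
from urllib.parse import urlencode

# Deepgram's pre-recorded transcription endpoint.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(language: str, keyterms: list[str]) -> str:
    """Build a Nova-3 request URL with an explicit language and key terms."""
    # Assumption for this sketch: refuse "auto" or an empty language,
    # since key terms only apply in monolingual mode on Nova-3.
    if language in ("", "auto"):
        raise ValueError("set an explicit language (for example 'en') to use key terms")
    params = [("model", "nova-3"), ("language", language)]
    # One keyterm parameter per custom vocabulary entry.
    params += [("keyterm", term) for term in keyterms]
    return f"{DEEPGRAM_LISTEN_URL}?{urlencode(params)}"


url = build_listen_url("en", ["HyperWhisper", "Scribe v2"])
print(url)
```

The audio itself would be sent in the request body with your API key in the `Authorization` header; see Deepgram's documentation for the current request format.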