RealtimeSTT is a Python realtime speech-to-text engine that emphasizes low latency, wake-word detection, voice activity detection (VAD), and automatic speech segmentation. It provides asynchronous callbacks, nanosecond-precision timestamps, and CLI tools, which makes it suitable for building voice assistants, meeting transcribers, and live-caption systems.
Features
- Real-time transcription via microphone
- Wake-word and voice-activity detection
- Asynchronous callback architecture
- Nanosecond timing metadata
- CLI and server modes with VAD filters
- Low latency, suitable for live applications
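
To make the voice-activity detection, segmentation, and callback features above more concrete, here is a minimal standalone sketch of the general idea. It does not use RealtimeSTT and is not its implementation: the names `frame_rms` and `segment_speech`, the energy threshold, and the `hangover` parameter are all hypothetical, chosen only for illustration. The detector here is a simple RMS energy threshold, which is an assumption of this sketch, not a statement about how the library detects speech.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


def frame_rms(frame: Sequence[float]) -> float:
    """Return the root-mean-square energy of one frame of samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def segment_speech(
    frames: Sequence[Sequence[float]],
    threshold: float = 0.1,   # hypothetical energy threshold for "voiced"
    hangover: int = 2,        # silent frames tolerated before closing a segment
    on_start: Optional[Callable[[int], None]] = None,
    on_stop: Optional[Callable[[int], None]] = None,
) -> List[Tuple[int, int]]:
    """Split frames into speech segments as (start, end) frame indices.

    `end` is exclusive. `on_start` fires when a segment opens and
    `on_stop` fires when it closes, mimicking a callback-driven design.
    """
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None
    silent = 0  # consecutive silent frames seen inside the open segment

    for i, frame in enumerate(frames):
        if frame_rms(frame) >= threshold:
            if start is None:
                start = i
                if on_start:
                    on_start(i)
            silent = 0
        elif start is not None:
            silent += 1
            if silent >= hangover:
                end = i - silent + 1  # index of the first trailing silent frame
                segments.append((start, end))
                if on_stop:
                    on_stop(end)
                start, silent = None, 0

    # Close a segment that is still open when the input ends.
    if start is not None:
        end = len(frames) - silent
        segments.append((start, end))
        if on_stop:
            on_stop(end)

    return segments
```

Calling `segment_speech(frames, on_start=..., on_stop=...)` on a list of fixed-size sample frames returns the detected segments and invokes the callbacks as each one opens and closes. The `hangover` count keeps a short pause from splitting one utterance into two, which is the usual trade-off between responsiveness and over-segmentation.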
