feat(backends): add Moonshine streaming STT backend #2115

Open
Theohox wants to merge 1 commit into lemonade-sdk:main from Theohox:moonshine-streaming-backend

Theohox commented Jun 5, 2026

Adds a new 'moonshine' recipe for Moonshine speech-to-text using the moonshine_voice C++ streaming API. It integrates with the existing OpenAI-compatible /v1/audio/transcriptions endpoint and the WebSocket /realtime endpoint.

Includes:

  • MoonshineServer backend (HTTP /inference, TCP streaming)
  • IStreamingTranscriptionServer interface for generic streaming backends
  • TcpJsonlClient for line-delimited JSON over TCP
  • RealtimeSessionManager integration for WebSocket/audio forwarding
  • ModelManager support for moonshine cache resolution and download
  • CPU-only, cross-platform (Linux/Windows x86_64)
  • End-to-end smoke test

The existing Whisper path is unchanged.
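
For readers unfamiliar with the line-delimited JSON framing mentioned for TcpJsonlClient, the following is a minimal sketch of the general technique, not the code in this PR. The class name `JsonlFramer` and its methods are hypothetical. It assumes one JSON message per line, terminated by `\n` (with an optional `\r`), and only handles splitting a TCP byte stream into complete lines; parsing each line as JSON is left to whatever JSON library the project uses.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not taken from the PR): buffers raw bytes read from a
// socket and hands back each complete newline-terminated line.
class JsonlFramer {
public:
    // Append a chunk of bytes and return every complete line it finishes.
    // A trailing partial line stays buffered until a later chunk ends it.
    std::vector<std::string> feed(const std::string& chunk) {
        buffer_ += chunk;
        std::vector<std::string> lines;
        std::size_t start = 0;
        for (;;) {
            const std::size_t nl = buffer_.find('\n', start);
            if (nl == std::string::npos) break;  // no complete line left
            std::string line = buffer_.substr(start, nl - start);
            if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
            if (!line.empty()) lines.push_back(std::move(line));        // skip blank lines
            start = nl + 1;
        }
        buffer_.erase(0, start);  // keep only the unfinished tail
        return lines;
    }

    // Bytes received so far that do not yet form a complete line.
    const std::string& pending() const { return buffer_; }

private:
    std::string buffer_;
};
```

A caller would pass each result of a socket read to `feed` and parse every returned string as one JSON message. Because TCP does not preserve message boundaries, a single read may return half a message or several messages at once, which is why the buffer persists between calls.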

Requesting a review from @bitgamma, per the Discord comments.

bitgamma self-requested a review on Jun 5, 2026, 19:54
Member

bitgamma left a comment

Thanks for your PR!

I see the current implementation relies on a system-wide Python installation, which is not something we can depend on. We generally try to avoid Python-based backends for this reason. The only Python-based backend we have is vLLM, but it has been packaged to be self-contained so that it neither relies on nor interferes with the system Python. Similar work must be done before this can be considered viable.

Additionally, I see the models are not being downloaded from Hugging Face. Adding another download repository is something we also want to avoid; I'd much prefer that you find or re-upload the models on HF instead. I also see voices are being downloaded into the user folder, outside the usual lemonade model folder, which is another thing we want to avoid.

In general, this needs to be reworked to follow the same patterns as the other backends.

Author

Theohox commented Jun 6, 2026

Thank you!

github-actions (bot) added labels on Jun 6, 2026: engine::whispercpp (whisper.cpp backend; audio transcription), area::api (HTTP REST API surface and route handlers), audio, enhancement (New feature or request)

Labels

area::api (HTTP REST API surface and route handlers), audio, engine::whispercpp (whisper.cpp backend; audio transcription), enhancement (New feature or request)

2 participants