fix(tts): prevent double [pause] in xAI auto speech tags (#29417) #32237

Merged
teknium1 merged 2 commits into main from fix/29417-tts-double-pause on May 25, 2026

@teknium1

Contributor

Summary

Multi-paragraph xAI TTS input no longer produces a double [pause] in the audio.

Salvages PR #29417 (@EloquentBrush0x) + adds regression coverage.

Root cause

_apply_xai_auto_speech_tags ran two pause-insertion passes unconditionally:

  1. Paragraph breaks (\n\n) → " [pause] "
  2. First-sentence boundary → " [pause] "

The tag-detection guard at the top of the function only checked the original input. After step 1 injected [pause], step 2 fired again on the now-mutated text, producing "...sentence. [pause] [pause] Next paragraph...".
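
To make the failure concrete, here is a minimal sketch of the pre-fix flow. It assumes the first-sentence pattern quoted in this PR is compiled without flags; `apply_tags_pre_fix` is a hypothetical stand-in, not the real function body, and it omits the top-of-function guard because the input here carries no tags.

```python
import re

# Pattern as quoted in this PR; compiling it without flags is an assumption.
_XAI_FIRST_SENTENCE_RE = re.compile(r"^(.{12,120}?[.!?…])\s+(?=\S)")


def apply_tags_pre_fix(text: str) -> str:
    """Hypothetical stand-in for the pre-fix logic: both passes always run."""
    clean = text.strip()
    clean = re.sub(r"\n\s*\n+", " [pause] ", clean)  # pass 1: paragraph breaks
    clean = re.sub(r"\s*\n\s*", " ", clean)          # collapse single newlines
    # Pass 2 runs even though pass 1 may already have inserted a tag.
    return _XAI_FIRST_SENTENCE_RE.sub(r"\1 [pause] ", clean, count=1)


result = apply_tags_pre_fix(
    "Welcome to the demo of our new product line.\n\nIt has many features."
)
print(result)
# Welcome to the demo of our new product line. [pause] [pause] It has many features.
```

Both passes target the same boundary (the end of the first sentence is also the paragraph break), which is why the two tags land back to back.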

Fix

Re-check the tag guard after the paragraph pass. If [pause] was already injected, skip the first-sentence substitution.

clean = re.sub(r"\n\s*\n+", " [pause] ", clean)
clean = re.sub(r"\s*\n\s*", " ", clean)
if not _XAI_SPEECH_TAG_RE.search(clean):   # ← new guard
    clean = _XAI_FIRST_SENTENCE_RE.sub(r"\1 [pause] ", clean, count=1)

Repro caveat

The PR's quoted repro ("Hello world.\n\nSecond paragraph.") doesn't actually trip the bug. _XAI_FIRST_SENTENCE_RE has a 12-char length floor (r"^(.{12,120}?[.!?…])\s+(?=\S)"), and "Hello world." is exactly 12 chars including its period, so .{12} consumes the whole sentence and leaves nothing for the [.!?…] terminator to match. The bug is real but needs a first sentence with at least 12 characters before its terminator (13 or more in total). The test added below uses "Welcome to the demo of our new product line.\n\nIt has many features.", which does reproduce.
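
The length floor can be checked directly against the quoted pattern. This is a small sketch that assumes the regex is compiled without flags; the sample strings are illustrative and already have the paragraph break rewritten to " [pause] ".

```python
import re

# Pattern as quoted above; compiling it without flags is an assumption.
_XAI_FIRST_SENTENCE_RE = re.compile(r"^(.{12,120}?[.!?…])\s+(?=\S)")

# 11 chars before the period: .{12} swallows the period itself, and the only
# later terminator sits at the end of the string with no whitespace after it.
too_short = _XAI_FIRST_SENTENCE_RE.search("Hello world. [pause] Second paragraph.")

# 12 chars before the period: the smallest first sentence that clears the floor.
just_enough = _XAI_FIRST_SENTENCE_RE.search("Hello, world. [pause] Second paragraph.")

print(too_short)             # None
print(just_enough.group(1))  # Hello, world.
```

A single extra character in the opening sentence is enough to flip the outcome, which explains why a short repro string can hide the bug.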

Validation

3 new tests in tests/tools/test_tts_xai_speech_tags.py:

| Test | Guards against |
| --- | --- |
| multi_paragraph_emits_single_pause | The bug returning |
| single_paragraph_still_gets_first_sentence_pause | The fix being too aggressive (no [pause] from paragraph pass → first-sentence must still fire) |
| single_newline_still_gets_first_sentence_pause | Confusing \n with \n\n |

scripts/run_tests.sh tests/tools/test_tts_xai_speech_tags.py
=== 8 tests passed, 0 failed in 0.4s ===
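
For readers without the repository at hand, the three cases can be sketched as follows. This is not the test code from the PR: `apply_tags` is a hypothetical stand-in for the patched function, the `test_` prefixes are assumed, and `_XAI_SPEECH_TAG_RE` is an assumed pattern that only recognises [pause] (the real one is not shown in this PR and probably covers more tags).

```python
import re

# Assumption: stand-in tag detector; the real pattern is not shown in the PR.
_XAI_SPEECH_TAG_RE = re.compile(r"\[pause\]")
# Pattern as quoted in the PR; compiling it without flags is an assumption.
_XAI_FIRST_SENTENCE_RE = re.compile(r"^(.{12,120}?[.!?…])\s+(?=\S)")


def apply_tags(text: str) -> str:
    """Hypothetical stand-in for the patched _apply_xai_auto_speech_tags."""
    clean = text.strip()
    if _XAI_SPEECH_TAG_RE.search(clean):  # input already tagged: leave it alone
        return clean
    clean = re.sub(r"\n\s*\n+", " [pause] ", clean)
    clean = re.sub(r"\s*\n\s*", " ", clean)
    if not _XAI_SPEECH_TAG_RE.search(clean):  # guard added by the fix
        clean = _XAI_FIRST_SENTENCE_RE.sub(r"\1 [pause] ", clean, count=1)
    return clean


EXPECTED = "Welcome to the demo of our new product line. [pause] It has many features."


def test_multi_paragraph_emits_single_pause():
    text = "Welcome to the demo of our new product line.\n\nIt has many features."
    assert apply_tags(text) == EXPECTED


def test_single_paragraph_still_gets_first_sentence_pause():
    text = "Welcome to the demo of our new product line. It has many features."
    assert apply_tags(text) == EXPECTED


def test_single_newline_still_gets_first_sentence_pause():
    text = "Welcome to the demo of our new product line.\nIt has many features."
    assert apply_tags(text) == EXPECTED
```

All three inputs converge on the same output with exactly one [pause]; they differ only in which pass supplies it.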

Credit

@EloquentBrush0x did the diagnosis and shipped the one-line fix. Follow-up is just the regression tests (their PR included a "test plan" checklist but no test code).

EloquentBrush0x and others added 2 commits May 25, 2026 13:33
…ragraph text

_apply_xai_auto_speech_tags runs two independent transformations:
  1. paragraph breaks (\n\n) → " [pause] "
  2. first-sentence boundary → " [pause] "

Both fired unconditionally, so multi-paragraph input produced
"Hello world. [pause] [pause] Second paragraph." — an unnatural
double pause in the TTS audio.

Guard the first-sentence substitution with _XAI_SPEECH_TAG_RE.search(clean):
if the paragraph pass already inserted a [pause] tag, skip the
first-sentence pass. Single-paragraph behavior is unchanged.

Three new tests in tests/tools/test_tts_xai_speech_tags.py:

- multi_paragraph_emits_single_pause — the headline #29417 case.
  Requires a first sentence with 12+ chars before its terminator to
  clear the _XAI_FIRST_SENTENCE_RE length floor; the PR's quoted
  'Hello world.\n\nSecond paragraph.' repro is too short and dodged
  the bug by accident, which is why it didn't reproduce. Uses the
  longer 'Welcome to the demo of our new product line.\n\nIt has
  many features.' shape that actually trips the bug.
- single_paragraph_still_gets_first_sentence_pause — sanity guard
  that the fix only suppresses the first-sentence pass when a
  paragraph pass injected [pause], so plain single-paragraph input
  still gets its leading pause.
- single_newline_still_gets_first_sentence_pause — single newline
  isn't a paragraph break, no [pause] from the paragraph pass, so
  the first-sentence pause MUST still fire.  Catches over-broad
  fixes.
@github-actions

Contributor

🔎 Lint report: fix/29417-tts-double-pause vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9347 on HEAD, 9347 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4946 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@alt-glitch added labels May 25, 2026: type/bug (Something isn't working); P3 (Low: cosmetic, nice to have); provider/xai (xAI, Grok); tool/tts (Text-to-speech and transcription)
@teknium1 merged commit 5caeb65 into main May 25, 2026
26 checks passed
@teknium1 deleted the fix/29417-tts-double-pause branch May 25, 2026 21:30