fix(tts): prevent double [pause] in xAI auto speech tags (#29417) #32237
Merged
…ragraph text

_apply_xai_auto_speech_tags runs two independent transformations:

1. paragraph breaks (\n\n) → " [pause] "
2. first-sentence boundary → " [pause] "

Both fired unconditionally, so multi-paragraph input produced "Hello world. [pause] [pause] Second paragraph.", an unnatural double pause in the TTS audio.

Guard the first-sentence substitution with _XAI_SPEECH_TAG_RE.search(clean): if the paragraph pass already inserted a [pause] tag, skip the first-sentence pass. Single-paragraph behavior is unchanged.

Three new tests in tests/tools/test_tts_xai_speech_tags.py:

- multi_paragraph_emits_single_pause: the headline #29417 case. The first sentence needs at least 12 characters before its terminator to clear the _XAI_FIRST_SENTENCE_RE length floor; the trivial 'Hello.\n\nWorld.' case dodged the bug by accident, which is why the PR's quoted repro didn't reproduce. Uses the longer 'Welcome to the demo of our new product line.\n\nIt has many features.' shape that actually trips the bug.
- single_paragraph_still_gets_first_sentence_pause: sanity guard that the fix only suppresses the first-sentence pass when the paragraph pass injected [pause], so plain single-paragraph input still gets its first-sentence pause.
- single_newline_still_gets_first_sentence_pause: a single newline isn't a paragraph break, so the paragraph pass adds no [pause] and the first-sentence pause MUST still fire. Catches over-broad fixes.
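A minimal sketch of the guarded two-pass flow described in the commit message, not the project's actual code. The function name apply_auto_speech_tags, the guard flag, the paragraph-break pattern, and the tag pattern are assumptions made for illustration; only the first-sentence pattern is the one quoted in the PR description.

```python
import re

# Assumed pattern: the real _XAI_SPEECH_TAG_RE is not shown in this PR.
_XAI_SPEECH_TAG_RE = re.compile(r"\[[a-z][a-z-]*\]")
# Pattern as quoted in the PR description.
_XAI_FIRST_SENTENCE_RE = re.compile(r"^(.{12,120}?[.!?…])\s+(?=\S)")
# Assumed paragraph-break matcher: a blank line plus surrounding whitespace.
_PARAGRAPH_BREAK_RE = re.compile(r"\s*\n\s*\n\s*")


def apply_auto_speech_tags(text: str, guard: bool = True) -> str:
    """Hypothetical stand-in for _apply_xai_auto_speech_tags.

    guard=False mimics the pre-fix behavior for comparison.
    """
    clean = text.strip()
    # Top-of-function guard: leave text alone if it is empty or already tagged.
    if not clean or _XAI_SPEECH_TAG_RE.search(clean):
        return clean
    # Pass 1: paragraph breaks become a pause.
    clean = _PARAGRAPH_BREAK_RE.sub(" [pause] ", clean)
    # The fix: re-check for tags, and skip pass 2 if pass 1 already added one.
    if guard and _XAI_SPEECH_TAG_RE.search(clean):
        return clean
    # Pass 2: pause after the first sentence.
    return _XAI_FIRST_SENTENCE_RE.sub(r"\1 [pause] ", clean, count=1)


multi = "Welcome to the demo of our new product line.\n\nIt has many features."
print(apply_auto_speech_tags(multi, guard=False))  # pre-fix: two pauses
print(apply_auto_speech_tags(multi))               # post-fix: one pause
```

With guard=False the multi-paragraph input comes back with two consecutive [pause] tags; with the guard it comes back with one, and single-paragraph or single-newline input still gets its first-sentence pause.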
Summary
Multi-paragraph xAI TTS input no longer produces a double [pause] in the audio.
Salvages PR #29417 (@EloquentBrush0x) + adds regression coverage.
Root cause
_apply_xai_auto_speech_tags ran two pause-insertion passes unconditionally:
1. paragraph breaks (\n\n) → " [pause] "
2. first-sentence boundary → " [pause] "
The tag-detection guard at the top of the function only checked the original input. After step 1 injected [pause], step 2 still fired on the now-mutated text, producing "...sentence. [pause] [pause] Next paragraph...".
Fix
Re-check the tag guard after the paragraph pass. If [pause] was already injected, skip the first-sentence substitution.
Repro caveat
The PR's quoted repro ("Hello world.\n\nSecond paragraph.") doesn't actually trip the bug: _XAI_FIRST_SENTENCE_RE has a 12-char length floor (r"^(.{12,120}?[.!?…])\s+(?=\S)"), and "Hello world." is exactly 12 chars total including its period. That leaves only 11 characters before the period, so the sentence cannot satisfy .{12,120}? ahead of the [.!?…] terminator. The bug is real but needs a first sentence with at least 12 characters before the terminator. The test added below uses "Welcome to the demo of our new product line.\n\nIt has many features.", which DOES reproduce.
Validation
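3 new tests in tests/tools/test_tts_xai_speech_tags.py:
- multi_paragraph_emits_single_pause: the headline #29417 case
- single_paragraph_still_gets_first_sentence_pause: single-paragraph input (no [pause] from paragraph pass → first-sentence must still fire)
- single_newline_still_gets_first_sentence_pause: a single newline is not a paragraph break, so the first-sentence pause must still fire; catches over-broad fixes that conflate \n with \n\n
Credit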
@EloquentBrush0x did the diagnosis and shipped the one-line fix. Follow-up is just the regression tests (their PR included a "test plan" checklist but no test code).
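For reference, the length floor described under Repro caveat can be checked in isolation. This is a small sketch using the pattern quoted there; the helper name first_sentence and the sample strings (written as they would look after the paragraph pass has run) are illustrative assumptions.

```python
import re

# Pattern as quoted under "Repro caveat".
FIRST_SENTENCE_RE = re.compile(r"^(.{12,120}?[.!?…])\s+(?=\S)")


def first_sentence(text: str):
    """Return the first-sentence capture, or None if the pattern does not match."""
    m = FIRST_SENTENCE_RE.match(text)
    return m.group(1) if m else None


# Hypothetical inputs, shown after paragraph breaks were turned into [pause].
short = "Hello world. [pause] Second paragraph."
long_ = "Welcome to the demo of our new product line. [pause] It has many features."

print(first_sentence(short))  # None
print(first_sentence(long_))  # Welcome to the demo of our new product line.
```

The short input never matches, so no second [pause] is inserted and the bug stays hidden; the long input matches, which is the case the guard has to suppress.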