<regex>: Avoid generating empty groups when parsing ? quantifiers#6267

Merged
StephanTLavavej merged 1 commit into microsoft:main from muellerj2:regex-avoid-empty-groups-for-zero-one-quantifiers
May 15, 2026

Conversation

@muellerj2


Towards #5962. When the parser represents a zero-or-one quantifier such as ? by an _N_if node with an empty alternative, it currently generates an empty group for that alternative, just as it did for empty alternatives in general before #6249. This PR removes the empty group from the generated NFA because it is completely unnecessary.
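As a rough illustration of the idea only, the following toy model builds a branch node for `a?` in two ways: once with the empty alternative wrapped in an empty group, and once with it linked straight to the join node. Everything here (`Kind`, `Node`, `Nfa`, `build_optional_a`, `empty_path_length`) is hypothetical and far simpler than the real `<regex>` node types; it only shows why the group adds nodes without adding behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical node kinds for a toy NFA; these are NOT the real <regex> internals.
enum class Kind { If, Char, GroupBegin, GroupEnd, EndIf };

struct Node {
    Kind kind;
    Node* next = nullptr; // successor on the current path
    Node* alt  = nullptr; // If nodes only: start of the second alternative
};

struct Nfa {
    std::vector<std::unique_ptr<Node>> nodes; // owns every node
    Node* start = nullptr;

    Node* make(Kind kind) {
        nodes.push_back(std::make_unique<Node>());
        nodes.back()->kind = kind;
        return nodes.back().get();
    }
};

// Builds a toy NFA for "a?". When with_empty_group is true, the empty
// alternative is wrapped in a GroupBegin/GroupEnd pair; otherwise it is
// linked directly to the join node.
Nfa build_optional_a(bool with_empty_group) {
    Nfa nfa;
    Node* branch = nfa.make(Kind::If);
    Node* ch     = nfa.make(Kind::Char);
    Node* join   = nfa.make(Kind::EndIf);
    branch->next = ch;
    ch->next     = join;
    if (with_empty_group) {
        Node* open  = nfa.make(Kind::GroupBegin);
        Node* close = nfa.make(Kind::GroupEnd);
        branch->alt = open;
        open->next  = close;
        close->next = join;
    } else {
        branch->alt = join;
    }
    nfa.start = branch;
    return nfa;
}

// Counts the nodes stepped through on the empty alternative before the join.
std::size_t empty_path_length(const Nfa& nfa) {
    std::size_t steps = 0;
    for (const Node* n = nfa.start->alt; n->kind != Kind::EndIf; n = n->next) {
        ++steps;
    }
    return steps;
}
```

In this toy model, both variants accept the same inputs; the grouped variant simply allocates two extra nodes and takes two extra steps on the empty path.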

Drive-by change: Replace the first swap() call with explicit assignments, and qualify the other swap() call. The nodes are STL-internal types, so ADL is not needed here.
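For readers less familiar with the ADL point, here is a minimal sketch of the difference between an unqualified and a qualified swap call. The namespace `lib`, the type `Widget`, and both helper functions are made up for illustration. The PR itself applies the qualification to internal node types, which have no custom swap overload to find.

```cpp
#include <cassert>
#include <utility>

namespace lib {
    // Hypothetical type with its own swap overload, used only for illustration.
    struct Widget {
        int value = 0;
        bool swapped_by_lib_overload = false;
    };

    void swap(Widget& a, Widget& b) noexcept {
        std::swap(a.value, b.value);
        a.swapped_by_lib_overload = true;
        b.swapped_by_lib_overload = true;
    }
}

// Unqualified call: argument-dependent lookup finds lib::swap.
bool unqualified_call_uses_overload() {
    lib::Widget a{1}, b{2};
    using std::swap;
    swap(a, b);
    return a.swapped_by_lib_overload && a.value == 2;
}

// Qualified call: lookup is restricted to std::swap, so lib::swap is never considered.
bool qualified_call_uses_overload() {
    lib::Widget a{1}, b{2};
    std::swap(a, b);
    return a.swapped_by_lib_overload;
}
```

When no user-provided overload can exist, as with library-internal types, the qualified form states the intent directly and avoids an unnecessary ADL lookup.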

Copilot AI review requested due to automatic review settings April 30, 2026 21:17
@muellerj2 muellerj2 requested a review from a team as a code owner April 30, 2026 21:17
@github-project-automation github-project-automation Bot moved this to Initial Review in STL Code Reviews Apr 30, 2026

Copilot AI left a comment

Contributor

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Removes unnecessary empty-group nodes from the generated NFA for ? quantifiers (including lazy ??), and adds a regression test to lock in the intended behavior.

Changes:

  • Simplify NFA construction for ? by linking the empty alternative directly to _End instead of emitting an empty group.
  • Adjust the non-greedy (??) pointer rewiring logic, using explicit assignments and _STD swap.
  • Add GH-6267 tests for a? vs a?? match/search behavior.
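The greedy versus lazy behavior mentioned above can be observed through the public `<regex>` API alone. The sketch below is not the regression test from the PR. The helper names `first_search_match` and `full_match` are invented here, and the patterns use the default ECMAScript grammar.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the text of the first match found by regex_search,
// or "<no match>" if the pattern does not match anywhere.
std::string first_search_match(const std::string& pattern, const std::string& text) {
    const std::regex re(pattern);
    std::smatch m;
    if (!std::regex_search(text, m, re)) {
        return "<no match>";
    }
    return m.str(0);
}

// Returns whether the whole text matches the pattern.
bool full_match(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}
```

With these helpers, `first_search_match("a?", "a")` yields `"a"` because the greedy quantifier prefers to consume the character, while `first_search_match("a??", "a")` yields an empty string because the lazy quantifier prefers the empty alternative. Under `full_match`, both patterns accept `"a"` and `""`, since the whole input must be consumed either way.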

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/std/tests/VSO_0000000_regex_use/test.cpp | Adds GH-6267 regression coverage for greedy vs lazy optional quantifiers. |
| stl/inc/regex | Removes empty-group nodes for ? and tweaks non-greedy rewiring logic accordingly. |

Comment thread stl/inc/regex
Comment thread tests/std/tests/VSO_0000000_regex_use/test.cpp
@StephanTLavavej StephanTLavavej added the labels "performance" (Must go faster) and "regex" (meow is a substring of homeowner) Apr 30, 2026
@StephanTLavavej StephanTLavavej self-assigned this May 8, 2026
@StephanTLavavej StephanTLavavej removed their assignment May 14, 2026
@StephanTLavavej StephanTLavavej moved this from Initial Review to Ready To Merge in STL Code Reviews May 14, 2026
@StephanTLavavej StephanTLavavej moved this from Ready To Merge to Merging in STL Code Reviews May 14, 2026
@StephanTLavavej

Member

I'm mirroring this to the MSVC-internal repo. Please notify me if any further changes are pushed, otherwise no action is required.

@StephanTLavavej StephanTLavavej merged commit 6b3afcc into microsoft:main May 15, 2026
52 of 53 checks passed
@github-project-automation github-project-automation Bot moved this from Merging to Done in STL Code Reviews May 15, 2026
@StephanTLavavej

Member

Thanks for continuing to find ways to improve <regex>! 💚 😻 💝

Labels

"performance" (Must go faster), "regex" (meow is a substring of homeowner)

Projects

Archived in project

4 participants