Skip to content

fix(autonomous-dev): resume skips post-approval "Review findings:" — approved-PR short-circuit + brittle marker match #188

@zxkane

Description

@zxkane

Summary

On dev-resume, the autonomous-dev agent declares "nothing outstanding" and exits when the linked PR is already APPROVED + CI-green + mergeable, even when a new Review findings: comment with BLOCKING issues was posted to the issue after the approval — so late-arriving review findings are silently skipped.

Steps to Reproduce

  1. Run a feature issue through the pipeline to pending-review; the review fleet (e.g. agy + codex) approves and the PR reaches APPROVED + mergeable. With no-auto-close, the pipeline stops before merge.
  2. After the approval, an operator (or a late/independent review) posts a new comment to the issue beginning with Review findings: that contains BLOCKING (P1) issues, and moves the issue back to pending-dev.
  3. The dispatcher dispatches dev-resume. The agent runs Resume Awareness.

Expected Behavior

The resume should treat the post-approval Review findings: comment as outstanding work: read it, address the BLOCKING items, and NOT exit early — regardless of the PR's current APPROVED/mergeable state. Resume Awareness (autonomous-mode.md) already lists "Read review feedback from issue comments" as step 5; that step should govern the exit decision.

Actual Behavior

The agent posts a "Resume check — nothing outstanding to address" comment citing: review = APPROVED (no inline comments / no change requests), CI = green, PR = MERGEABLE, then exits with no code changes. The standing APPROVED reviewDecision short-circuits the resume before the Review findings: issue comment is acted on. The operator had to manually dismiss the stale approval AND re-post the findings with the exact Review findings: marker to get the next resume to pick them up.

Environment

  • Stage: autonomous pipeline (local execution backend), GH_AUTH_MODE=app
  • Relevant: skills/autonomous-dev/references/autonomous-mode.md "Resume Awareness" (step 5: "Read review feedback from issue comments -- look for 'Review findings:' comments")

Severity

Degraded — blocking review findings can be silently dropped on an approved PR, so a PR with known P1 defects can sit in approved/awaiting-merge looking clean. Data-correctness P1s reached the awaiting-merge state in the observed case.

Possible Cause

Two compounding gaps in the resume decision:

  1. Approved-PR short-circuit. The resume treats reviewDecision == APPROVED + green CI + mergeable as terminal ("nothing outstanding") and exits before fully evaluating issue comments. A newer Review findings: comment posted after the approval timestamp is not weighed against the approval.
  2. Brittle marker matching. Resume Awareness keys off the literal string Review findings:. A findings comment that does not start with that exact marker (e.g. a heading like ## Codex review findings) is not recognized as actionable, even though it is clearly a request for changes. The contract depends on an exact magic string with no fallback.

Suggested direction:

  • On resume, compare the newest Review findings: (or change-request) comment timestamp against the latest approval timestamp; if findings are newer than the approval, do NOT treat the PR as done — even if reviewDecision == APPROVED.
  • Broaden recognition beyond the exact Review findings: prefix (e.g. also match a failed-substantive verdict marker, a [P1]/BLOCKING token, or a configurable set), OR document the exact-marker requirement prominently so operators post the right string.
  • Optionally: when findings are detected post-approval, the resume could dismiss the stale approval itself so the PR's reviewDecision reflects reality.

Testing Requirements

Mandatory: The dev agent MUST create tests that prevent regression of this bug.

Unit Tests

  • Add a regression test for the resume decision: given a PR with a standing APPROVED review at time T1 AND an issue comment Review findings: with a [P1] at time T2 > T1, the resume MUST classify the issue as having outstanding work (not "nothing outstanding").
  • Test the inverse: APPROVED at T1, no newer findings comment → resume correctly treats as done (no false-positive re-work).
  • Test marker recognition: a findings/change-request comment that does NOT start with the exact Review findings: prefix but carries a BLOCKING/[P1] token is still recognized (or, if exact-marker is kept by design, assert the documented contract and that non-matching comments are explicitly logged as ignored).
  • The regression test must fail before the fix and pass after.

Acceptance Criteria

  • A dev-resume whose PR is APPROVED + mergeable but which has a newer post-approval Review findings: (BLOCKING) comment proceeds to address the findings instead of exiting "nothing outstanding".
  • Approval-timestamp vs findings-timestamp ordering governs the resume's done/not-done decision.
  • Findings recognition no longer silently depends on a single exact magic string with no fallback (either broadened matching or a loud "ignored unrecognized comment" log + documented contract).
  • No regression: a genuinely-done approved PR with no newer findings still resumes to "nothing outstanding".

Dependencies

Metadata

Metadata

Assignees

No one assigned

    Labels

    approvedReview passed, PR merged or awaiting manual mergeautonomousbugSomething isn't workingno-auto-closeSkip auto-merge after review passes, requires manual approval

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions