Handle max-iteration stops as tool_limit_reached instead of rendering synthetic summary prompts as user messages #3821

@franksong2702

Problem

When a WebUI-started Hermes Agent run reaches the max tool-calling / iteration budget, the turn can be rendered as if it completed normally, and the Agent's internal summary request can appear in the transcript as a user-authored message.

That summary request is not user input. It is a runtime control prompt used by the Agent to ask the model for a final toolless summary after the iteration budget is exhausted.

This makes the UI misleading in two ways:

  1. The turn can look like a normal completed assistant response even though it stopped because the tool/iteration budget was exhausted.
  2. The transcript can show an internal control instruction as a normal user message.

Current Code Facts

WebUI already passes the configured turn budget into Agent instances:

  • api/streaming.py reads agent.max_turns / root max_turns.
  • api/streaming.py passes max_iterations to AIAgent when supported.
  • api/streaming.py wires status_callback when supported.

Hermes Agent currently exposes useful diagnostic metadata:

  • agent/conversation_loop.py sets turn_exit_reason = "max_iterations_reached(...)".
  • The Agent result includes messages, completed, turn_exit_reason, and budget diagnostics.

Hermes Agent also currently constructs the max-iteration summary request as a synthetic user message:

  • In agent/chat_completion_helpers.py, handle_max_iterations() appends:

    messages.append({"role": "user", "content": summary_request})

where summary_request starts with:

You've reached the maximum number of tool-calling iterations allowed...

The Live-to-Final RFC already names the relevant product states:

  • tool_limit_reached
  • no_response

and says that a turn without a final answer must not look like a normally completed turn.

Expected Behavior

When a run ends because max iterations / iteration budget was reached:

  1. WebUI should classify the turn as tool_limit_reached.
  2. If there is usable assistant final content, WebUI should keep it as the Final Answer.
  3. The Final Answer should show clear status/copy indicating the run stopped because the tool iteration limit was reached.
  4. If there is no usable assistant final content, WebUI should show an explicit terminal state, such as tool_limit_reached / no final answer.
  5. WebUI should not render or persist the synthetic summary request as a user-authored message.
  6. Replay/reconnect should restore the same Worklog, Final Answer if present, and terminal metadata.

Suggested Compatibility Fix

At WebUI settle/persistence/replay boundaries:

  • Detect max-iteration stops via returned Agent metadata:
    • completed == false
    • turn_exit_reason starts with max_iterations_reached
    • or equivalent status callback / diagnostic event
  • Set terminal metadata:
    • terminal_state = "tool_limit_reached"
    • terminal_reason = "max_iterations"
    • has_final_answer = true/false
    • optionally budget_used / budget_max
  • Filter known synthetic max-iteration summary prompts out of both the user-visible transcript rendering and the durable user-authored transcript.
  • Preserve the assistant summary as Final Answer only if it is meaningful user-facing content.
  • Ensure replay/reconnect uses the same terminal metadata rather than inferring normal completion from the presence of assistant text.
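
A minimal sketch of the detection step, assuming the Agent result is a plain dict with the keys listed under Current Code Facts. The function name classify_terminal_state is hypothetical, and "non-empty text in the last assistant message" is only a stand-in for whatever "meaningful user-facing content" ends up meaning. It also assumes that a result explicitly marked completed is never reclassified.

```python
from typing import Any, Dict, Optional

# Prefix named in the issue; the parenthesised detail after it is not parsed.
MAX_ITERATIONS_PREFIX = "max_iterations_reached"


def classify_terminal_state(result: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return terminal metadata for a max-iteration stop, or None otherwise.

    Hypothetical helper: not part of the current WebUI or Agent code.
    """
    # Assumption: a result explicitly marked completed is left alone.
    if result.get("completed") is True:
        return None

    exit_reason = result.get("turn_exit_reason") or ""
    if not exit_reason.startswith(MAX_ITERATIONS_PREFIX):
        return None

    # Placeholder heuristic: the turn has a final answer only if the very
    # last message is assistant-authored, non-empty, plain-string content.
    messages = result.get("messages") or []
    last = messages[-1] if messages else {}
    content = last.get("content")
    has_final_answer = bool(
        last.get("role") == "assistant"
        and isinstance(content, str)
        and content.strip()
    )

    return {
        "terminal_state": "tool_limit_reached",
        "terminal_reason": "max_iterations",
        "has_final_answer": has_final_answer,
    }
```

Returning None for every other exit reason keeps this check from interfering with existing completion handling; budget_used / budget_max could be copied into the same dict if the Agent diagnostics expose them.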

Non-Goals

  • Do not redesign the whole Live-to-Final system.
  • Do not fold this into Auto Compression behavior.
  • Do not require an upstream Hermes Agent fix before WebUI can behave correctly.
  • Do not hide real user-authored messages that happen to mention max iterations.
  • Do not remove the Agent's final toolless summary behavior if it produces a useful answer.
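
To show how the filter could stay within these non-goals, here is a hedged sketch. It only drops a message when the turn is already known to be a max-iteration stop, the message comes after the real user message that started the turn, and its text begins with the prompt prefix quoted above. The function name, the turn_start index, and the hit_tool_limit flag are assumptions made for illustration.

```python
from typing import Any, Dict, List

# Prefix quoted in the issue; the rest of the prompt is elided there,
# so only this leading text is matched.
SUMMARY_PROMPT_PREFIX = (
    "You've reached the maximum number of tool-calling iterations allowed"
)


def strip_synthetic_summary_prompt(
    messages: List[Dict[str, Any]],
    turn_start: int,
    hit_tool_limit: bool,
) -> List[Dict[str, Any]]:
    """Return a copy of messages without the synthetic summary request.

    Hypothetical helper. turn_start is the index of the real user message
    that opened the turn; anything at or before it is never filtered.
    """
    if not hit_tool_limit:
        return list(messages)

    kept = []
    for index, message in enumerate(messages):
        content = message.get("content")
        is_synthetic = (
            index > turn_start
            and message.get("role") == "user"
            and isinstance(content, str)
            and content.startswith(SUMMARY_PROMPT_PREFIX)
        )
        if not is_synthetic:
            kept.append(message)
    return kept
```

Because the function returns a new list, it can be applied at the rendering and persistence boundaries only, leaving the message list that the Agent itself sends to the model untouched.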

Upstream Context

The upstream Agent root issue is tracked in NousResearch/hermes-agent#36239. WebUI should still defensively handle current Agent behavior, because older Agent builds can keep returning the synthetic user-role summary prompt for some time.

Acceptance Criteria

  • A WebUI run that reaches max iterations does not show the synthetic summary request as a user bubble.
  • A run with a usable assistant summary displays that summary as Final Answer plus a clear tool_limit_reached status.
  • A run without usable final assistant content displays an explicit no-final terminal state.
  • Reload/reconnect preserves the same terminal state and final-answer behavior.
  • Tests cover both:
    • final answer present after max-iteration summary;
    • no usable final answer after max-iteration stop.
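
The reload/reconnect criterion can be illustrated with a small round trip. This is a sketch under assumptions: persist_turn and restore_turn are hypothetical names, and JSON is used only as a stand-in for whatever storage WebUI actually uses. The point is that the restored state comes from the stored metadata, never from whether assistant text happens to be present.

```python
import json
from typing import Any, Dict, Optional


def persist_turn(final_answer: Optional[str], terminal: Dict[str, Any]) -> str:
    """Serialize the Final Answer together with its terminal metadata."""
    return json.dumps({"final_answer": final_answer, "terminal": terminal})


def restore_turn(raw: str) -> Dict[str, Any]:
    """Rebuild the turn view from the stored record.

    terminal_state is read from the record as-is; it is None when the
    record carries no terminal metadata, and is never inferred from text.
    """
    record = json.loads(raw)
    terminal = record.get("terminal") or {}
    return {
        "final_answer": record.get("final_answer"),
        "terminal_state": terminal.get("terminal_state"),
        "has_final_answer": bool(terminal.get("has_final_answer", False)),
    }
```

A test for the two required cases could persist one turn with a summary and one with final_answer set to None, then check that both restore with terminal_state equal to "tool_limit_reached".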

Metadata

Labels: bug (Something isn't working), sprint-candidate (Strong candidate for next sprint), streaming (SSE streaming, gateway sync, real-time updates)
