session: add cache-safe summary forking#1932

Open
Rememorio wants to merge 2 commits into trpc-group:main from Rememorio:summary_cache_forking

Conversation

@Rememorio

Collaborator

Summary

  • add opt-in cache-safe summary forking via summary.WithCacheSafeForking(true)
  • clone the parent model request and append a compacting user message for context compaction summaries
  • document the mode in English and Chinese mkdocs session summary docs
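
The second bullet is the heart of the change: the fork keeps the parent's messages as an unchanged prefix and only adds one message at the end, which is what lets a provider-side prompt cache match the prefix. A minimal sketch of that idea follows. The `Message` type and both function names are hypothetical stand-ins, not the trpc-agent-go API.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for a chat message; the real
// model.Request in trpc-agent-go carries many more fields.
type Message struct {
	Role    string
	Content string
}

// forkWithCompactionPrompt copies the parent messages into a new slice and
// appends one user message. The parent slice is never modified.
func forkWithCompactionPrompt(parent []Message, prompt string) []Message {
	forked := make([]Message, len(parent), len(parent)+1)
	copy(forked, parent)
	return append(forked, Message{Role: "user", Content: prompt})
}

// sharedPrefixLen counts how many leading messages are identical in a and b.
func sharedPrefixLen(a, b []Message) int {
	n := 0
	for n < len(a) && n < len(b) && a[n] == b[n] {
		n++
	}
	return n
}

func main() {
	parent := []Message{
		{Role: "system", Content: "You are a helpful assistant."},
		{Role: "user", Content: "Plan a trip to Kyoto."},
		{Role: "assistant", Content: "Here is a three-day plan."},
	}
	forked := forkWithCompactionPrompt(parent, "Summarize the conversation so far.")
	// Parent length, fork length, and number of identical leading messages.
	fmt.Println(len(parent), len(forked), sharedPrefixLen(parent, forked))
}
```

A standalone summary request would instead start with a different first message, so none of the parent's prefix could be reused.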

Tests

  • go test ./session/summary ./internal/flow/llmflow
  • go test ./runner with real API env unset
  • root package sweep passed with real API env unset, excluding local darwin/Go 1.26 unrelated failures in tool/hostexec and internal/skillstage

@coderabbitai

coderabbitai Bot commented Jun 8, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 109b208d-6d29-419d-a707-c753fd435e13

📥 Commits

Reviewing files that changed from the base of the PR and between 060f2c2 and f837841.

📒 Files selected for processing (1)
  • session/summary/cache_safe_fork_test.go

📝 Walkthrough

Summary

Change Overview

This PR introduces cache-safe summary forking, an opt-in feature that optimizes prompt-cache reuse in session summarization:

  • Core mechanism: enable with WithCacheSafeForking(true). When a parent model request is present in context, the summarizer forks it (by cloning) and appends a compacting user message instead of building a standalone summary request.
  • Deep cloning: adds cloneRequestForCacheSafeFork() and the context helpers ContextWithCacheSafeForkRequest and CacheSafeForkRequestFromContext, which attach a *model.Request to the context and deep-copy it (messages, content parts, tool calls, generation config, structured output, extra fields, headers, tools) so that the fork and the parent share no mutable state.
  • Compaction prompt: supports WithCacheSafeForkPrompt(prompt string); default prompt is provided and custom prompts may include {max_summary_words} but must not include {conversation_text}.
  • Request transformation: forked requests force streaming OFF and clear structured output to ensure independent summary generation while forwarding configured tools.
  • Fallback: if no parent request is in context, summarizer falls back to building a standalone summary request (preserving prior behavior).
  • Docs & tests: the English and Chinese mkdocs session summary pages document the new mode; unit tests cover cloning, context helpers, summarizer behavior, and integration with internal/flow/llmflow.
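
The deep-cloning and request-transformation bullets can be illustrated with a small sketch. Every type and function below is hypothetical and far smaller than the real `*model.Request`; the point is only to show what "no shared mutable state" means for nested slices, maps, and pointers, and where the streaming and structured-output overrides are applied.

```go
package main

import "fmt"

// Hypothetical, reduced request types for illustration only.
type ToolCall struct {
	Name string
	Args []byte
}

type Message struct {
	Role      string
	Content   string
	ToolCalls []ToolCall
}

type Request struct {
	Messages         []Message
	Headers          map[string]string
	Stream           bool
	StructuredOutput *string // nil means no structured output
}

// cloneRequest deep-copies every mutable field so the clone and the original
// share no slices, maps, or pointers. src must be non-nil.
func cloneRequest(src *Request) *Request {
	dst := &Request{Stream: src.Stream}
	dst.Messages = make([]Message, len(src.Messages))
	for i, m := range src.Messages {
		cm := Message{Role: m.Role, Content: m.Content}
		if m.ToolCalls != nil {
			cm.ToolCalls = make([]ToolCall, len(m.ToolCalls))
			for j, tc := range m.ToolCalls {
				cm.ToolCalls[j] = ToolCall{Name: tc.Name, Args: append([]byte(nil), tc.Args...)}
			}
		}
		dst.Messages[i] = cm
	}
	if src.Headers != nil {
		dst.Headers = make(map[string]string, len(src.Headers))
		for k, v := range src.Headers {
			dst.Headers[k] = v
		}
	}
	if src.StructuredOutput != nil {
		v := *src.StructuredOutput
		dst.StructuredOutput = &v
	}
	return dst
}

// forkForSummary clones the parent, appends the compaction prompt, and applies
// the two overrides described above: streaming off, structured output cleared.
func forkForSummary(parent *Request, prompt string) *Request {
	fork := cloneRequest(parent)
	fork.Messages = append(fork.Messages, Message{Role: "user", Content: prompt})
	fork.Stream = false
	fork.StructuredOutput = nil
	return fork
}

func main() {
	schema := "json_schema"
	parent := &Request{
		Messages: []Message{{Role: "assistant", ToolCalls: []ToolCall{
			{Name: "search", Args: []byte(`{"q":"go"}`)},
		}}},
		Headers:          map[string]string{"X-Trace": "abc"},
		Stream:           true,
		StructuredOutput: &schema,
	}
	fork := forkForSummary(parent, "Summarize the conversation so far.")

	// Mutating the fork must not leak into the parent.
	fork.Messages[0].ToolCalls[0].Args[0] = '['
	fork.Headers["X-Trace"] = "changed"

	fmt.Println(string(parent.Messages[0].ToolCalls[0].Args), parent.Headers["X-Trace"], parent.Stream)
	fmt.Println(len(fork.Messages), fork.Stream, fork.StructuredOutput == nil)
}
```

A shallow copy (`*dst = *src`) would fail the mutation check above, because the byte slice and the map would still be shared with the parent.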

Compatibility and Behavioral Risks

  • Opt-in: default behavior unchanged unless WithCacheSafeForking(true) is used.
  • Performance: deep-cloning large requests (long histories, many tools, media content) can add CPU/memory overhead and GC pressure — measure under realistic load.
  • Behavior shifts: forked requests disable streaming and structured output; if downstream logic expects those, behavior may differ.
  • Prompt validation: custom fork prompts must avoid forbidden placeholders (e.g., {conversation_text}); invalid prompts may cause API validation failures.
  • Context dependency: forking only occurs when a parent request is explicitly attached to context; absence results in silent fallback to standalone requests.
  • Mutation safety: cloning is intended to prevent parent mutation; ensure tests and real usage confirm no mutation leaks into original request objects.

Recommended Validation Steps

  1. Prompt-cache verification

    • Enable cache-safe forking against a real API with prompt caching.
    • Confirm successive summaries reuse parent-request prefix cache; measure latency and token usage improvements.
  2. Parent immutability

    • Run and review tests (e.g., TestSessionSummarizer_CacheSafeForking, clone tests) to confirm the original request remains unchanged after forking.
    • Add integration checks that mutate cloned requests and assert originals unchanged.
  3. Fallback behavior

    • Test flows without a parent request in context to confirm graceful fallback to standalone summary requests and equivalence of outputs where applicable.
  4. Prompt configuration

    • Validate WithCacheSafeForkPrompt() rejects prompts containing {conversation_text} and substitutes {max_summary_words} correctly.
    • Test default prompt behavior when custom prompt is empty.
  5. Performance profiling

    • Benchmark cloning overhead and memory impact with representative conversation sizes and tools; compare with baseline (forking disabled).
    • Monitor latency, allocations, and GC under production-like load.
  6. Integration & CI

    • Run unit/integration tests: go test ./session/summary ./internal/flow/llmflow ./runner (as done in PR).
    • Verify internal/flow/llmflow uses the context-attached fork request correctly (tests added) and that higher-level workflows behave as expected.
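
For step 4, the prompt rules can be sketched as a small resolver. This is an assumption-laden illustration: the function name, the default prompt text, and the choice to return an error are all hypothetical, since the source only says that {conversation_text} is not allowed and {max_summary_words} is substituted. The placeholder is presumably forbidden because the conversation is already present in the forked request's messages.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// defaultForkPrompt is a made-up default; the real default prompt text is not
// shown in this PR conversation.
const defaultForkPrompt = "Summarize the conversation above in at most {max_summary_words} words."

// resolveForkPrompt picks the default when custom is empty, rejects the
// forbidden placeholder, and substitutes the word limit.
func resolveForkPrompt(custom string, maxWords int) (string, error) {
	prompt := custom
	if strings.TrimSpace(prompt) == "" {
		prompt = defaultForkPrompt
	}
	if strings.Contains(prompt, "{conversation_text}") {
		return "", errors.New("fork prompt must not contain {conversation_text}")
	}
	return strings.ReplaceAll(prompt, "{max_summary_words}", strconv.Itoa(maxWords)), nil
}

func main() {
	p, err := resolveForkPrompt("", 200)
	fmt.Println(p, err)

	_, err = resolveForkPrompt("Recap: {conversation_text}", 200)
	fmt.Println(err)
}
```

A table-driven test over the three cases (empty, valid custom, forbidden placeholder) would cover both bullets of this step.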
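Chinese version (translated)

Summary

Feature Overview

This PR introduces opt-in "cache-safe summary forking" to improve prompt-cache reuse during session summarization:

  • Core mechanism: enabled via WithCacheSafeForking(true). If a parent model request is present in the context, the summarizer forks (clones) that request and appends a compaction user prompt to the end of its messages instead of building a standalone summary request.
  • Deep cloning: adds cloneRequestForCacheSafeFork() and context helpers (ContextWithCacheSafeForkRequest, CacheSafeForkRequestFromContext) that recursively deep-copy the mutable substructures of *model.Request (messages, content parts, tool calls, generation config, structured output, extra fields, headers, and the tools map) to prevent shared mutable state.
  • Compaction prompt: WithCacheSafeForkPrompt(prompt string) sets the appended prompt; a default prompt is provided, and custom prompts may use the {max_summary_words} placeholder but must not use {conversation_text}.
  • Request transformation: forked requests force streaming off and clear structured output so the summary runs independently, while configured tools are kept and forwarded.
  • Fallback: if no parent request is in the context, the summarizer falls back to building a standalone summary request (preserving prior behavior).
  • Docs and tests: adds Chinese and English mkdocs documentation; adds unit tests covering cloning, context helpers, summarizer behavior, and llmflow integration cases.

Compatibility and Behavioral Risks

  • Opt-in: default behavior is unchanged; the feature takes effect only when WithCacheSafeForking(true) is enabled.
  • Performance impact: deep-copying requests that are large or that contain many tools or media content may add significant CPU/memory overhead and GC pressure; evaluate under representative load.
  • Behavior differences: forked requests disable streaming and clear structured output; downstream logic that depends on these features may behave differently.
  • Prompt validation: custom prompts that contain a forbidden placeholder (such as {conversation_text}) may cause API validation errors or call failures.
  • Context dependency: forking is triggered only when a parent request is explicitly attached via ContextWithCacheSafeForkRequest(); when it is missing, the summarizer silently falls back to a standalone summary.
  • Immutability assumption: the implementation is intended to prevent modification of the parent request; tests and real usage should confirm that no modifications flow back to it.

Recommended Validation Steps

  1. Prompt-cache hit verification

    • Enable the feature against a real API that supports prompt caching.
    • Confirm that consecutive summary requests reuse the cache for the parent request prefix, and measure the improvement in latency and token consumption.
  2. Parent request immutability

    • Run and review the relevant tests (e.g., TestSessionSummarizer_CacheSafeForking, clone tests) to confirm the original request is not modified after forking.
    • In integration tests, mutate the cloned object and assert that the original is unaffected.
  3. Fallback behavior

    • Test scenarios with no parent request in the context and confirm the summarizer falls back gracefully to a standalone request without errors.
    • Compare the outputs of standalone and forked summaries on the same input for consistency (where applicable).
  4. Prompt configuration

    • Confirm that WithCacheSafeForkPrompt() rejects invalid prompts containing {conversation_text} at configuration or call time, and substitutes {max_summary_words} correctly.
    • Test that an empty prompt uses the default compaction prompt.
  5. Performance evaluation

    • Measure cloning overhead, memory, and GC impact on production-grade examples (typical message histories and tool sets).
    • Compare performance baselines with the feature enabled and disabled to spot potential regressions.
  6. Integration and CI

    • Run unit and integration tests: go test ./session/summary ./internal/flow/llmflow ./runner (the PR already includes the corresponding test runs).
    • Confirm that the context compaction flow in internal/flow/llmflow uses the attached fork request correctly and behaves as expected in higher-level workflows.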

Walkthrough

EN: Adds an opt-in cache-safe forking path for session summarization: context helpers to attach a parent *model.Request, deep-clone utilities to fork requests, new summarizer options and prompt plumbing, integration into request building and llmflow context compaction, unit and integration tests, and EN/ZH documentation updates.

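ZH (translated): Adds opt-in cache-safe summary forking: attaching the parent request to the context, deep-clone utilities, summarizer options and prompt configuration, request building and llmflow integration, a test suite, and Chinese/English documentation.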

Changes

Cache-Safe Request Forking for Session Summarization

| Layer / File(s) | Summary |
|---|---|
| Request cloning and context infrastructure: session/summary/cache_safe_fork.go | Context helpers and deep-clone implementation (ContextWithCacheSafeForkRequest, CacheSafeForkRequestFromContext, cloneRequestForCacheSafeFork) that copy messages, content parts, tool calls, generation config, structured output, extra fields, headers, and tools to avoid shared mutable state. |
| Summarizer configuration and metadata: session/summary/options.go, session/summary/summarizer.go | Public options WithCacheSafeForking(enable bool) and WithCacheSafeForkPrompt(prompt string); metadata key and summarizer struct fields for fork mode and fork prompt, with validation and default prompt generation. |
| Summary request building with forking path: session/summary/summarizer.go | buildSummaryRequest(ctx,...) now checks context for parent request when fork enabled, clones parent and appends fork-specific user message (forces non-streaming, clears structured output), otherwise falls back to standalone summary request. |
| Context propagation in llmflow: internal/flow/llmflow/llmflow.go | Uses sessionsummary.ContextWithCacheSafeForkRequest(ctx, req) when calling CreateSessionSummary during context compaction so parent request is available to the summarizer. |
| Behavior and unit tests: internal/flow/llmflow/context_compact_test.go, session/summary/cache_safe_fork_test.go, session/summary/summarizer_test.go | Adds tests covering context helpers, deep-clone correctness (deep-copy semantics), summarizer fork-mode request construction and immutability guarantees, llmflow context wiring, and fallback standalone request behavior. |
| User-facing documentation: docs/mkdocs/en/session/summary.md, docs/mkdocs/zh/session/summary.md | New "Cache-Safe Summary Forking" sections in EN/ZH and entries in the Summary Generation options table documenting WithCacheSafeForking and WithCacheSafeForkPrompt. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


Possibly related PRs

  • trpc-group/trpc-agent-go#1578: Modifies session summary request/prompt construction; overlaps with message composition logic extended here.
  • trpc-group/trpc-agent-go#1565: Changes around maybeCompactContextBeforeLLM and session summary reconstruction; related to llmflow context compaction wiring.

Suggested reviewers

  • sandyskies
  • hyprh
  • WineChord
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'session: add cache-safe summary forking' directly and clearly describes the main feature added across the changeset. |
| Description check | ✅ Passed | The description accurately outlines the key additions: opt-in cache-safe forking via WithCacheSafeForking(true), parent request cloning with compacting user messages, documentation updates, and test coverage details. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@codecov

codecov Bot commented Jun 8, 2026

Codecov Report

❌ Patch coverage is 96.22642% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.03291%. Comparing base (63e87f0) to head (f837841).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| session/summary/summarizer.go | 86.66667% | 3 Missing and 3 partials ⚠️ |
Additional details and impacted files
@@                 Coverage Diff                 @@
##                main       #1932         +/-   ##
===================================================
+ Coverage   90.02527%   90.03291%   +0.00763%     
===================================================
  Files           1023        1024          +1     
  Lines         172155      172307        +152     
===================================================
+ Hits          154983      155133        +150     
- Misses         10786       10787          +1     
- Partials        6386        6387          +1     
| Flag | Coverage Δ |
|---|---|
| unittests | 90.03291% <96.22642%> (+0.00763%) ⬆️ |

@Rememorio

Collaborator Author

fixed
