perf(dashboard): gzip static files + long-term cache headers + plugin cache-bust #1

Merged
sea-monsters merged 4 commits into main from feat/dashboard-performance-optimizations
May 19, 2026

@sea-monsters
Owner

Summary

Three coordinated changes to reduce dashboard load time and bandwidth for remote/SSH deployments:

  1. Gzip compression for JS/CSS bundles (hermes_cli/web_server.py):

    • Replaces plain StaticFiles with _OptimizedStaticFiles subclass
    • On-the-fly gzip for .js/.css when client accepts gzip encoding
    • Compresses 1.5 MB main bundle → ~450 KB (70% reduction)
    • Only compresses files >1 KB where compression is beneficial
    • Falls back to uncompressed for clients without gzip support
  2. Long-term Cache-Control headers (hermes_cli/web_server.py):

    • Adds public, max-age=31536000, immutable to all hashed assets
    • Vite-generated filenames already contain content hashes (index-XXXX.js)
    • Eliminates redundant re-downloads on every dashboard load
    • Combined with gzip: bandwidth reduction persists across sessions
  3. Plugin cache-busting (web/src/plugins/usePlugins.ts):

    • Extends cache-bust from dev-only to both dev and production
    • Appends ?v=${Date.now()} to plugin JS bundle URLs
    • Ensures plugin updates are picked up without hard browser refresh
    • Removes DEV/PROD branch — single consistent behavior
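
The first two changes can be sketched without any web framework. The following is a minimal illustration only, not the PR's `_OptimizedStaticFiles` code: the function name `optimize_asset`, the constant names, and the "only keep the compressed form if it is smaller" rule are assumptions made for the example. The 1 KB threshold and the header values come from the summary above.

```python
import gzip

# Hypothetical names; the threshold and header value are taken from the summary.
MIN_GZIP_SIZE = 1024
CACHE_CONTROL = "public, max-age=31536000, immutable"
COMPRESSIBLE_SUFFIXES = (".js", ".css")


def optimize_asset(path: str, body: bytes, client_accepts_gzip: bool):
    """Return (headers, body) for a static asset, gzipping when worthwhile."""
    headers = {}
    if not path.endswith(COMPRESSIBLE_SUFFIXES):
        return headers, body  # other file types are served untouched

    # Content-hashed filenames never change, so a one-year immutable cache is safe.
    headers["cache-control"] = CACHE_CONTROL

    if client_accepts_gzip and len(body) > MIN_GZIP_SIZE:
        compressed = gzip.compress(body)
        # Assumption: keep the compressed variant only if it is actually smaller.
        if len(compressed) < len(body):
            headers["content-encoding"] = "gzip"
            headers["vary"] = "accept-encoding"
            body = compressed

    headers["content-length"] = str(len(body))
    return headers, body
```

A caller would pass the file bytes and the result of its Accept-Encoding check, then copy the returned headers onto the outgoing response.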

Test Plan

  • Start dashboard: hermes dashboard --port 8089
  • Open browser DevTools → Network tab
  • Verify .js/.css responses have content-encoding: gzip
  • Verify .js/.css responses have cache-control: public, max-age=31536000, immutable
  • Verify plugin JS URLs include ?v= timestamp query param
  • Load dashboard twice: second load should serve from disk cache

Backward Compatibility

  • No config changes required
  • No API changes
  • Falls back gracefully for browsers without gzip support
  • Existing plugin loading behavior preserved (loadedScripts dedup still active)

Related

Closes NousResearch#20451 (bandwidth), NousResearch#18547 (prefix stabilization context)
Co-authored-by: Sea Monsters Dev <dev@sea-monsters.ai>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ffb5232915

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread hermes_cli/web_server.py Outdated
if "content-type" in response.headers:
    response.headers["cache-control"] = "public, max-age=31536000, immutable"
# Gzip compress if client supports it
if "gzip" in accept_encoding and isinstance(response, FileResponse):

P2 Badge Honor q=0 before gzipping assets

When a client or intermediary sends Accept-Encoding: gzip;q=0 to explicitly disable gzip, this substring check still returns a gzip-encoded JS/CSS response. That can make the dashboard fail to load for clients that advertise gzip only with a zero quality value; parse the header quality values or delegate compression to middleware before choosing the gzip variant.

Owner Author

@chatgpt-codex-connector ✅ Fixed. _accepts_gzip() now correctly parses Accept-Encoding quality values: gzip;q=0 → False, gzip;q=0.5 → True. Also supports x-gzip alias (RFC 2616), multi-parameter encodings, and rejects malformed q values. Commit: 12b7148
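
The behaviour described in this reply can be illustrated with a small parser. This is a sketch only: the PR's `_accepts_gzip()` is regex-based (see the later review thread), whereas the hypothetical `accepts_gzip` below splits on commas and semicolons. It does not handle the `*` wildcard coding.

```python
def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the header allows gzip (or x-gzip) with a non-zero q-value."""
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.split(";")]
        coding = parts[0].lower()
        if coding not in ("gzip", "x-gzip"):
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, sep, value = param.partition("=")
            if name.strip().lower() != "q":
                continue  # ignore unrelated parameters such as ext=foo
            try:
                # "q" with no "=value" at all is treated as malformed
                q = float(value.strip()) if sep else -1.0
            except ValueError:
                q = -1.0  # malformed q-value: treat as rejected
        # Only the first gzip entry is considered in this sketch.
        return 0.0 < q <= 1.0
    return False
```

Treating malformed q-values as a rejection mirrors the conservative choice described in the reply: when in doubt, the uncompressed file is always a valid answer.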

Address Codex review feedback on PR #1:
- Replace simple substring check with proper Accept-Encoding parsing
- Parse quality values (q=0, q=0.5, q=1.0) correctly
- Reject gzip when q=0 is explicitly specified
- Conservative handling of malformed q-values (treat as rejected)

Refs: #1 (comment)
@chatgpt-codex-connector

To use Codex here, create an environment for this repo.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0bef8c08cf

Comment thread hermes_cli/web_server.py Outdated
Comment on lines +3771 to +3773
return Response(
    content=compressed,
    headers={

P2 Badge Preserve HEAD semantics for gzipped assets

When a client or intermediary sends HEAD /assets/*.js with Accept-Encoding: gzip, this branch replaces Starlette's FileResponse with a plain Response whose body is content=compressed. FileResponse has request-method-specific handling that suppresses the body for HEAD requests, but this plain response will still emit the compressed bundle body, so cache validators or health checks that use HEAD can download hundreds of KB and receive incorrect HEAD semantics. Check scope['method'] before returning the compressed response, or keep the original file response path for HEAD requests.

Owner Author

@chatgpt-codex-connector ✅ Fixed. For HEAD requests, get_response() now sets gzip-related headers on the original FileResponse instead of returning a Response with body. FileResponse suppresses body for HEAD per HTTP spec, preserving correct HEAD semantics for cache validators and health checks. Commit: 12b7148
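
The HEAD rule can be modelled independently of Starlette. The PR itself keeps the original `FileResponse` for HEAD requests; the hypothetical `gzip_reply` below is only a framework-free sketch of the same contract: identical headers for GET and HEAD, with the body omitted for HEAD.

```python
import gzip


def gzip_reply(method: str, raw: bytes):
    """Return (headers, body); HEAD gets the same headers as GET but no body."""
    # mtime=0 keeps the compressed bytes deterministic across calls.
    compressed = gzip.compress(raw, mtime=0)
    headers = {
        "content-encoding": "gzip",
        "vary": "accept-encoding",
        # Content-Length describes what a GET would return, even for HEAD.
        "content-length": str(len(compressed)),
    }
    body = b"" if method.upper() == "HEAD" else compressed
    return headers, body
```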

Address both Codex review findings on PR #1:

1. Honor q=0 in Accept-Encoding (r3264042023):
   - Replace simple "gzip" in accept_encoding substring check with full
     Accept-Encoding parser that respects quality values (RFC 7231 §5.3.4)
   - Support x-gzip alias (RFC 2616)
   - Handle multi-parameter encodings (gzip;q=0.5;ext=foo)
   - Reject malformed q values (gzip;q=, gzip;q=abc, gzip;q)
   - Case-insensitive matching with word boundary

2. Preserve HEAD semantics for gzipped assets (r3264140246):
   - For HEAD requests, set compression headers on the original
     FileResponse instead of returning a Response with body
   - FileResponse suppresses body for HEAD per HTTP spec
   - Cache validators and health checks using HEAD get correct semantics

Additional hardening:
   - Wrap file I/O in try/except OSError for race safety
   - Merge existing Vary header if already present
   - Use "replace" error handling for latin-1 decode

Refs: #1

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 12b7148e08

Comment thread hermes_cli/web_server.py

async def get_response(self, path: str, scope):
    response = await super().get_response(path, scope)
    if path.endswith(".js") or path.endswith(".css"):

P2 Badge Route CSS through the optimized handler

For /assets/*.css, the earlier @application.get("/assets/{filename}.css") route is registered before the /assets StaticFiles mount, so those requests return the plain rewritten Response and never enter this .css branch. As a result the dashboard CSS still misses the new gzip and long-term cache headers in exactly the prefixed/static path this optimization is meant to cover; move the compression/cache handling into serve_css or otherwise avoid shadowing the optimized mount for CSS.

Owner Author

@chatgpt-codex-connector ✅ Fixed. Extracted _accepts_gzip_static as module-level function shared by both serve_css and _OptimizedStaticFiles. CSS requests now get gzip compression + Cache-Control: public, max-age=31536000, immutable headers. HEAD request body suppression also handled. Commit: 66ead91

Comment thread hermes_cli/web_server.py Outdated
Comment on lines +3753 to +3758
q_match = re.search(r";q=([0-9]+(?:\.[0-9]+)?)(?:;|$|\s)", params)
if q_match:
    return float(q_match.group(1)) > 0
# Check for malformed q (q without valid =value)
if re.search(r";q\b", params):
    return False

P2 Badge Honor spaced gzip q-values

The fresh evidence is that the new parser only matches ;q= with no optional whitespace, but valid Accept-Encoding parameters can be formatted with spaces such as gzip; q=0 or gzip ; q=0. In that case this code misses both q_match and the malformed-q guard and returns True, so clients or proxies that explicitly set gzip quality to zero can still receive a gzipped bundle.

Owner Author

@chatgpt-codex-connector ✅ Fixed. Both _accepts_gzip_static regexes now allow optional whitespace after semicolons (;\s*q= and ;\s*q\b), so spaced forms such as gzip; q=0 and gzip ; q=0 are correctly rejected. Commit: 66ead91

Address Codex round 3 review findings on PR #1:

1. Route CSS through optimized handler (r3264211059):
   - serve_css was registered before the _OptimizedStaticFiles mount,
     shadowing CSS requests so they never got gzip or cache headers
   - Add gzip compression + Cache-Control to serve_css directly
   - Extract _accepts_gzip_static as module-level function shared by
     both serve_css and _OptimizedStaticFiles
   - Handle HEAD requests (no body) in serve_css gzip path

2. Honor spaced q-values (r3264211065):
   - Accept-Encoding parameters may have optional whitespace (gzip; q=0)
   - Update regexes to allow \s* after semicolons: ;\s*q=, ;\s*q\b
   - Ensures clients/proxies that use spaced q=0 syntax do not receive
     gzipped responses
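
The difference between the two patterns is easy to reproduce. In the sketch below, `STRICT_Q` is the regex quoted in the review thread, while `SPACED_Q` is a reconstruction from the `;\s*q=` description above and may not match the committed pattern character for character. The helper `q_value` is hypothetical.

```python
import re

# Pattern quoted in the review thread: no whitespace allowed after ";".
STRICT_Q = re.compile(r";q=([0-9]+(?:\.[0-9]+)?)(?:;|$|\s)")
# Reconstructed fix: optional whitespace between ";" and "q=".
SPACED_Q = re.compile(r";\s*q=([0-9]+(?:\.[0-9]+)?)(?:;|$|\s)")


def q_value(pattern, params: str):
    """Return the parsed q-value as a float, or None if the pattern misses it."""
    match = pattern.search(params)
    return float(match.group(1)) if match else None


for params in (";q=0", "; q=0", " ; q=0.5;ext=foo"):
    print(repr(params), q_value(STRICT_Q, params), q_value(SPACED_Q, params))
```

With the strict pattern, the spaced inputs yield `None`, which is the case the reviewer flagged: a missed q-value falls through to "gzip accepted".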

Refs: #1

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 66ead91f3c

Comment thread hermes_cli/web_server.py
Comment on lines +3732 to +3737
headers = {
    "content-encoding": "gzip",
    "vary": "accept-encoding",
    "content-length": str(len(compressed)),
    "cache-control": "public, max-age=31536000, immutable",
}

P2 Badge Add prefix variance to cached CSS responses

When the dashboard is served through a path-prefix proxy, this handler rewrites the CSS based on X-Forwarded-Prefix, but the new public one-year cache headers only vary on Accept-Encoding (or do not vary at all for the uncompressed fallback). A shared cache between the proxy and backend can therefore reuse /assets/*.css generated for one prefix for a request with a different or absent prefix, leaving font/asset URLs pointing at the wrong path until the immutable cache entry expires; include X-Forwarded-Prefix in Vary or avoid long-lived public caching for prefix-dependent CSS.

Owner Author

@chatgpt-codex-connector ✅ Fixed. When X-Forwarded-Prefix is present, CSS responses now include x-forwarded-prefix in the Vary header so shared caches correctly segment by prefix. Both gzip (Vary: accept-encoding, x-forwarded-prefix) and uncompressed (Vary: x-forwarded-prefix) paths covered. No prefix → no extra vary (unchanged). Commit: 64b94e5
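
The resulting Vary combinations can be sketched as a small pure function. `build_vary` and its parameters are hypothetical names for illustration; merging with an already-present Vary value follows the "merge existing Vary header" hardening mentioned in an earlier commit on this PR.

```python
def build_vary(gzipped: bool, forwarded_prefix, existing: str = "") -> str:
    """Compose a Vary value covering encoding and proxy-prefix variance."""
    tokens = [t.strip() for t in existing.split(",") if t.strip()]
    seen = {t.lower() for t in tokens}

    wanted = []
    if gzipped:
        wanted.append("accept-encoding")
    if forwarded_prefix:
        # CSS body was rewritten for this prefix, so caches must key on it.
        wanted.append("x-forwarded-prefix")

    for name in wanted:
        if name not in seen:
            tokens.append(name)
            seen.add(name)
    return ", ".join(tokens)
```

An empty return value corresponds to the "no prefix, uncompressed" path, where no Vary header is added.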

@sea-monsters sea-monsters merged commit 58d12f0 into main May 19, 2026
sea-monsters pushed a commit that referenced this pull request May 19, 2026
Address Codex review r3264262681:

When serve_css rewrites url() paths based on X-Forwarded-Prefix,
shared caches can serve prefix-A CSS for prefix-B requests because
Cache-Control: immutable does not vary on the prefix. Add
x-forwarded-prefix to Vary header when prefix is present so caches
correctly segment CSS by prefix. Affects both gzip and uncompressed
response paths.

Refs: #1 (comment)
sea-monsters pushed a commit that referenced this pull request Jun 1, 2026
Two CI flakes surfaced on PR NousResearch#34572 (both in files this PR doesn't touch;
pre-existing host-dependent flakes):

1. test_process_registry::TestPopenLeakOnSetupFailure — the failure-cleanup
   tests use a fake proc.pid (8888/9999) and assert proc.kill() runs. But
   spawn_local's primary cleanup is os.killpg(os.getpgid(pid), SIGKILL),
   falling back to proc.kill() only on ProcessLookupError/PermissionError/
   OSError. When the fake PID happens to exist on a busy host, os.getpgid
   succeeds, os.killpg fires against an UNRELATED real process group, and
   proc.kill() is never reached -> flaky AssertionError (and a real risk of
   SIGKILLing an innocent process group from a unit test). Patch os.getpgid
   to raise ProcessLookupError so the fallback path runs deterministically
   and no real killpg is ever issued.
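
The patching technique can be shown in isolation. The `cleanup` function below is a hypothetical stand-in for the failure-cleanup path of spawn_local as described above, not the project's code, and the sketch assumes a POSIX host (where `os.getpgid` and `os.killpg` exist).

```python
import os
import signal
from unittest import mock


def cleanup(proc) -> None:
    """Hypothetical stand-in: kill the process group, fall back to proc.kill()."""
    try:
        os.killpg(os.getpgid(proc.pid), signal.SIGKILL)
    except (ProcessLookupError, PermissionError, OSError):
        proc.kill()


def test_cleanup_falls_back_to_proc_kill():
    proc = mock.Mock(pid=8888)  # fake PID that may exist on a busy host
    # Force the lookup to fail so the fallback runs deterministically, and
    # stub killpg as well so no real signal can ever be sent from the test.
    with mock.patch("os.getpgid", side_effect=ProcessLookupError), \
         mock.patch("os.killpg") as killpg:
        cleanup(proc)
    killpg.assert_not_called()
    proc.kill.assert_called_once()
```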

2. test_web_server::test_resize_escape_is_forwarded — the receive loop calls
   the blocking conn.receive_bytes() with no exception guard. Once the child
   prints its winsize and exits, the PTY closes; on a missed-marker run the
   next recv blocks until the 30s pytest-timeout instead of failing fast.
   Add a try/except break (matching the working sibling tests) and bump the
   child's pre-read sleep 0.15s -> 0.5s so the resize reliably lands first.

Verified: 4/4 pass across 3 consecutive runs; root cause for #1 reproduced
(os.getpgid(1) succeeds -> old code skips proc.kill).
sea-monsters pushed a commit that referenced this pull request Jun 3, 2026
Seven Copilot inline review comments on NousResearch#37679, four worth landing
in a polish pass before merge:

1. _dispose_unused_adapter signature: 'BasePlatformAdapter' ->
   'BasePlatformAdapter | None'. The function explicitly handles
   None and the reconnect watcher calls it with None in the
   except arm, so the annotation now matches the actual contract.

2. (duplicate of #1 on a different line) — same fix.

3. except Exception in _dispose_unused_adapter — the reviewer
   asked about asyncio.CancelledError swallowing. On Python 3.8+
   (Hermes requires 3.13, see pyproject.toml), CancelledError
   inherits from BaseException, NOT Exception, so the existing
   'except Exception' does NOT swallow task cancellation. Added
   an explicit comment explaining the contract so future readers
   don't repeat the analysis. We don't re-raise because the
   watcher loop intentionally treats dispose failures as
   best-effort: a failed dispose on an unowned adapter should not
   take down the watcher that's keeping the gateway alive.

4. _response_store = None after close in api_server.py — the
   reviewer flagged this for idempotency. Decided to keep the
   non-None state intentionally: setting it to None cascades
   to ~9 callers that access self._response_store without a
   None check, and 'close() is idempotent on a closed sqlite3
   Connection' means the current code is already safe. The
   type stays stable; LSP doesn't flag a cascade of
   reportOptionalMemberAccess errors. (This matches the
   pre-existing pattern in the codebase — e.g.
   _mark_disconnected doesn't reset state to None either.)

5. _build_adapter_with_store: reviewer worried about
   disconnect() failing on the self.name property if
   __init__ wasn't called. Already handled: we set
   'adapter.platform = Platform.API_SERVER' so the
   'self.platform.value.title()' property returns
   'Api_Server' without raising. The exception-swallowing
   branch in disconnect() does call self.name via the
   logger.debug format, so this is a real path that needs
   the platform attribute, and we have it.

6. test_disconnect_closes_response_store: bare 'pytest.raises(Exception)'
   -> 'pytest.raises(sqlite3.ProgrammingError)'. The bare
   Exception matcher would silently accept AttributeError,
   OperationalError, env-related issues, etc. The specific
   exception type ('Cannot operate on a closed database') is
   the actual signal we want — proves the SQLite conn is
   closed, not just that *something* raised.

7. test_nonretryable_failure_disposes_unowned_adapter:
   assertion tightened from '>= 1' to '== 1' on
   adapter._disconnect_calls. The docstring said 'exactly once',
   the assertion now matches. Catches the hypothetical
   'watcher disposes the same adapter twice' regression that
   '>=' would have missed.
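
The standard-library behaviour that items 3, 4, and 6 rely on can be checked directly. This is a small sketch with hypothetical helper names, using only `asyncio` and `sqlite3`; it assumes Python 3.8 or later.

```python
import asyncio
import sqlite3


def swallowed_by_except_exception(exc_type) -> bool:
    """True if a bare `except Exception` would catch this exception type."""
    return issubclass(exc_type, Exception)


def error_after_close():
    """Close a connection twice, then report what using it raises."""
    conn = sqlite3.connect(":memory:")
    conn.close()
    conn.close()  # closing an already-closed connection does not raise
    try:
        conn.execute("SELECT 1")
    except sqlite3.Error as exc:
        return type(exc)
    return None


print(swallowed_by_except_exception(asyncio.CancelledError))
print(error_after_close().__name__)
```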
sea-monsters pushed a commit that referenced this pull request Jun 6, 2026
…ch#37677)

Anthropic enforces two independent ceilings per image:
1. 5 MB encoded byte size
2. 8000 px longest side

Hermes only guarded #1. A tall screenshot (e.g. 1200x12000 at 0.06 MB)
passes every byte check but fails the pixel check, returning a
non-retryable HTTP 400 that permanently bricks the conversation thread.

Fixes:
- error_classifier: add 'image dimensions exceed' pattern to
  _IMAGE_TOO_LARGE_PATTERNS so the 400 is classified as image_too_large
  and triggers the shrink/retry path instead of falling through to
  non-retryable error.
- conversation_compression: check pixel dimensions (via Pillow) even
  when byte size is under the 4 MB target. If max(dims) > 8000, force
  shrink.
- vision_tools._resize_image_for_vision: add optional max_dimension param.
  When set, images exceeding the pixel cap are downscaled even if they're
  under the byte budget. The resize loop now checks both byte AND pixel
  limits before accepting a candidate.
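
The two-ceiling check can be sketched without Pillow by working on dimensions alone. The function names are hypothetical, and interpreting "5 MB" as 5 * 1024 * 1024 bytes is an assumption; integer arithmetic is used so the scaled sizes are exact.

```python
MAX_BYTES = 5 * 1024 * 1024  # assumption: "5 MB" read as mebibytes
MAX_DIMENSION = 8000         # longest side, in pixels


def needs_shrink(width: int, height: int, encoded_bytes: int) -> bool:
    """An image must be shrunk if it breaks either ceiling, not just the byte one."""
    return encoded_bytes > MAX_BYTES or max(width, height) > MAX_DIMENSION


def fit_within(width: int, height: int, max_dimension: int = MAX_DIMENSION):
    """Scale (width, height) down so the longest side is at most max_dimension."""
    longest = max(width, height)
    if longest <= max_dimension:
        return width, height
    # Integer math avoids float rounding and preserves the aspect ratio closely.
    new_width = max(1, width * max_dimension // longest)
    new_height = max(1, height * max_dimension // longest)
    return new_width, new_height
```

For the tall screenshot in the example above, `needs_shrink(1200, 12000, 62915)` is true even though the byte size is tiny, and `fit_within(1200, 12000)` gives the target size to resize to.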

Closes NousResearch#37677
sea-monsters pushed a commit that referenced this pull request Jun 11, 2026
…bes + test-leak fix (NousResearch#40909)

* fix(gateway,windows): reliability — supervisor task, JOB breakaway, status --deep

Three coordinated fixes for the Windows gateway reliability story:

1. CREATE_BREAKAWAY_FROM_JOB on every detached spawn

   The 'hermes update' triggered from the Electron Desktop GUI ran inside
   Electron's job object. Without breakaway, the post-update gateway
   watcher spawned by update — already DETACHED_PROCESS — was still
   reaped when Electron's job tore down, so the gateway never came back
   after a GUI-initiated update. Adds CREATE_BREAKAWAY_FROM_JOB (0x01000000)
   to:
     - hermes_cli/_subprocess_compat.py::windows_detach_flags() — used by
       every helper that calls windows_detach_popen_kwargs(), including
       launch_detached_profile_gateway_restart()
     - The watcher subprocess's own respawn snippet in
       hermes_cli/gateway.py (inlined flags so the watcher's child
       respawn also breaks away)

   _spawn_detached() in gateway_windows.py already had the flag; this
   change brings the rest of the codebase to parity.

2. Per-minute supervisor Scheduled Task — Windows equivalent of
   systemd Restart=always

   Introduces hermes_cli/gateway_supervisor.py and registers it as a
   second Scheduled Task ('Hermes_Gateway_Supervisor', /SC MINUTE /MO 1,
   LIMITED rights) alongside the existing ONLOGON task. Every minute,
   the supervisor uses the same gateway.status.get_running_pid() probe
   as 'hermes gateway status' and, if no gateway is alive, calls
   gateway_windows._spawn_detached() (which now includes BREAKAWAY) to
   bring one back.

   Covers every crash mode, not just 'machine rebooted': taskkill,
   OOM, GUI update SIGTERM, parent job teardown. Cheap — one pythonw
   startup per minute when down, one PID-existence check per minute
   when up.

   Wired into both the schtasks-success and Startup-folder-fallback
   install paths via _install_supervisor_best_effort(), and removed in
   uninstall(). Best-effort: a failing supervisor install logs a
   warning but doesn't roll back the primary install.

3. 'hermes gateway status --deep' shows per-probe PASS/FAIL

   Replaces the existing terse '--deep' output (which only printed
   paths) with an actual diagnostic table:
     [1] PID file present
     [2] Lock file held by a live process
     [3] get_running_pid() result
     [4] _pid_exists(pid) — OS-level liveness
     [5] gateway_state.json (state + age)
     [6] Last lifecycle event from gateway-exit-diag.log

   When the high-level summary disagrees with reality, the user can
   see exactly which signal is lying.

Test-leak fix
-------------

tests/hermes_cli/test_gateway_wsl.py::TestGatewayCommandWSLMessages
monkey-patched is_linux/is_wsl/supports_systemd_services to simulate
WSL but did NOT stub is_windows(). On a Windows host, the dispatcher
in _gateway_command_inner takes the is_windows() branch BEFORE the
WSL guidance branch, so the test invoked gateway_windows.install()
for real. install() writes to %APPDATA%\...\Startup\Hermes_Gateway.cmd
— the REAL user Startup folder, never sandboxed by tmp_path — pointing
at the test's pytest-of-<user>/pytest-<N>/.../gateway-service/ wrapper.
When pytest tore down the tmp_path, every subsequent Windows login
flashed a cmd.exe window that failed to find the missing target.

Stubs is_windows=False on all four affected tests:
  test_install_wsl_no_systemd
  test_start_wsl_no_systemd
  test_status_wsl_running_manual
  test_status_wsl_not_running

Defense-in-depth: _build_startup_launcher() now prefixes the launcher
with 'if not exist <target> exit /b 0', so any future stale Startup
entry silently no-ops instead of flashing a console window.

Status enhancements
-------------------

- status() now reports supervisor task presence alongside the existing
  schtasks/Startup info, and nudges the user to reinstall if the
  supervisor isn't registered.
- Deep mode dumps both the supervisor task name + script path.

* fix(gateway,windows): drop the per-minute supervisor task — keep breakaway + deep probes

Earlier in this branch we added a per-minute schtasks-based supervisor to
respawn the gateway after crashes / GUI-update SIGTERMs. The implementation
flashed a brief console window on every firing, which stole window focus.
We tried several variants:

  - cmd.exe wrapper invoking pythonw  -> flashes (cmd.exe is console-subsystem)
  - schtasks /TR pointing at pythonw  -> flashes (uv venv launcher pythonw is
    actually subsystem=Console, not GUI; it respawns the real pythonw)
  - schtasks /TR pointing at base uv  -> still flashes (Task Scheduler-side
    conhost preallocation; documented Windows quirk)
  - XML registration with <Hidden>true</Hidden>  -> still flashes (<Hidden> only hides
    the task in the Task Scheduler UI, not the spawned window)

Researched what leading projects do:

  - Ollama: GUI-subsystem tray exe + Startup-folder shortcut. No supervisor.
  - Tailscale: real Windows Service via SCM. Session 0, no console possible.
  - Syncthing: --no-console flag inside the binary + Startup folder.
  - openclaw: VBS Run(..., 0, False) wrapper. Suppresses the *window* but
    Super User Q971162 confirms focus-steal still occurs in some cases.

None of these use a per-minute polling scheduled task. The 'auto-restart on
crash' responsibility belongs INSIDE the daemon (Tailscale's in-process
recovery / Ollama's monitor+worker pair) OR is delegated to the Windows
Service Control Manager — not Task Scheduler.

So this commit drops the supervisor entirely. The CREATE_BREAKAWAY_FROM_JOB
fix in _subprocess_compat.py (from commit c1e5fa4) survives — that is the
*real* fix for problem #2 (GUI-update kills gateway): the post-update
watcher in launch_detached_profile_gateway_restart() now breaks out of
Electron's job object, so the gateway respawn watcher survives the GUI
quit and successfully respawns the gateway.
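
A minimal sketch of how such flags are typically combined follows. The function names are hypothetical and this is not the body of `windows_detach_flags()`; which other flags the real helper sets is an assumption. The numeric values are the documented Win32 process-creation constants, spelled out so the snippet also imports on non-Windows hosts.

```python
# Win32 process-creation flags (the stdlib exposes these on Windows only).
DETACHED_PROCESS = 0x00000008
CREATE_NEW_PROCESS_GROUP = 0x00000200
CREATE_BREAKAWAY_FROM_JOB = 0x01000000  # value quoted in the commit message


def detach_flags() -> int:
    """Flags for a child that should outlive the parent's job object."""
    return DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP | CREATE_BREAKAWAY_FROM_JOB


def detach_popen_kwargs(platform: str) -> dict:
    """Keyword arguments for subprocess.Popen, chosen per platform."""
    if platform != "win32":
        # POSIX analogue: put the child in its own session.
        return {"start_new_session": True}
    return {"creationflags": detach_flags()}
```

Usage would look like `subprocess.Popen(cmd, **detach_popen_kwargs(sys.platform))`.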

Surviving from c1e5fa4:
  * CREATE_BREAKAWAY_FROM_JOB in hermes_cli/_subprocess_compat.py (fixes #2)
  * Inlined breakaway flag in the watcher respawn snippet in gateway.py
  * hermes gateway status --deep PASS/FAIL probes (fixes #1 — visibility)
  * 'if not exist <target> exit /b 0' guard in _build_startup_launcher
    (fixes #3 — silent no-op for stale Startup entries)
  * tests/hermes_cli/test_gateway_wsl.py is_windows=False stubs (root cause
    of #3 — pytest WSL tests no longer leak Startup entries on Win hosts)

Removed in this commit:
  * hermes_cli/gateway_supervisor.py (entire file)
  * Supervisor section in hermes_cli/gateway_windows.py (~180 lines):
      get_supervisor_task_name, get_supervisor_script_path,
      _build_supervisor_cmd_script, _write_supervisor_script,
      _install_supervisor_task, is_supervisor_task_registered,
      _install_supervisor_best_effort
  * _install_supervisor_best_effort() calls in install() (3 spots)
  * supervisor cleanup block in uninstall()
  * supervisor display lines in status() / status(deep=True)

Future direction (out of scope for this PR): the right place for Windows
'Restart=always' semantics is a real Windows Service installed via
pywin32's win32serviceutil.ServiceFramework — session-0 isolation, SCM
auto-restart, no console window possible. That's a meaningful next-PR
project, not a band-aid.

Tests: 51 pass / 2 pre-existing failures in
tests/hermes_cli/test_gateway_{windows,wsl}.py (the 2 failures are
TestSupportsSystemdServicesWSL cases that fail on origin/main too —
unrelated to this PR).
sea-monsters pushed a commit that referenced this pull request Jun 13, 2026
Add an official, production-grade WhatsApp integration via Meta's
Business Cloud API as a complement to the existing Baileys bridge.
No bridge subprocess, no QR codes, no account-ban risk — at the cost
of a Meta Business account and a public HTTPS webhook URL.

Setup is fully wizard-driven: 'hermes whatsapp-cloud' walks through
every credential with paste-time validation (catches the #1 trap of
pasting a phone number into the Phone Number ID field), generates a
verify token, and ends with copy-paste instructions for the
cloudflared / Meta-dashboard / Business Manager pieces that can't be
automated. The wizard also points users at Meta's Business Manager
for setting the bot's display name and profile picture.

Feature set:

- Inbound: text, images (with native-vision routing), voice notes
  (STT), documents (small text inlined, larger cached), reply context.
- Outbound: text with WhatsApp-flavored markdown conversion, images,
  videos, documents, opus voice notes via ffmpeg with MP3 fallback.
- Native interactive buttons for clarify, dangerous-command approval,
  and slash-command confirmation flows — matches the Telegram /
  Discord UX, gracefully degrades to plain text.
- Read receipts (blue double-checkmarks) and typing indicator,
  using Meta's combined endpoint so they fire in a single API call.
- Webhook security: X-Hub-Signature-256 HMAC verification (raw body,
  constant-time), wamid deduplication, group-shaped-message refusal
  (groups deferred to v2 — Baileys still covers them).
- Full integration with the gateway's session, cron, display-tier,
  prompt-hint, and auth-allowlist systems. Cloud and Baileys can run
  side-by-side against different phone numbers.
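
The webhook signature check mentioned in the list above generally looks like the sketch below. `verify_signature` and its parameter names are hypothetical and this is not the adapter's code; the `sha256=<hex digest>` header format is the one Meta documents for X-Hub-Signature-256.

```python
import hashlib
import hmac


def verify_signature(app_secret: str, raw_body: bytes, header: str) -> bool:
    """Check an X-Hub-Signature-256 header against the raw request body."""
    prefix = "sha256="
    if not header or not header.startswith(prefix):
        return False
    # The digest must be computed over the raw bytes, before any JSON parsing.
    expected = hmac.new(app_secret.encode(), raw_body, hashlib.sha256).hexdigest()
    received = header[len(prefix):]
    # compare_digest runs in constant time to avoid leaking match position.
    return hmac.compare_digest(expected.encode(), received.encode())
```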

Also wires STT (speech-to-text) through Nous's managed audio gateway
for Nous subscribers — previously the default stt.provider=local
required a separate faster-whisper install. New subscribers now get
voice-note transcription out of the box.

Docs: 418-line user guide at website/docs/user-guide/messaging/
whatsapp-cloud.md, sidebar entry, environment-variables reference,
ADDING_A_PLATFORM.md updated with the optional interactive-UX
contract for future adapter authors.

Tests: 100 dedicated tests for the adapter, 32 for the setup wizard,
20 for the Nous subscription STT wiring, plus regression coverage
across display_config, prompt_builder, and the cron scheduler.

Known limitations (deferred until clear demand signal):
- Group chats — use the Baileys bridge if you need them.
- Message templates for 24-hour-window outside-conversation sends —
  reactive chat is unaffected; cron / delegate_task with gaps > 24h
  will fail with a clear error. The agent's system prompt warns the
  model about this so it knows to mention it when scheduling delayed
  messages.
sea-monsters pushed a commit that referenced this pull request Jun 20, 2026
Phase 1 of the pluggable cron-scheduler refactor (Axis B — the trigger).
No call-site changes; this phase only makes the abstraction exist + tested
in isolation.

Task 1.1: cron/scheduler_provider.py — the EXPERIMENTAL CronScheduler ABC.
  Required surface is name + start; is_available()/stop() carry safe defaults.
  is_available has a no-network invariant. Docstring marks it experimental
  until the Chronos provider (Phase 4) validates the shape.
Task 1.2: InProcessCronScheduler wraps the historical 60s ticker loop, calling
  cron.scheduler.tick(sync=False) exactly as the raw ticker does. Uses
  stop_event.wait(interval) for responsive stop (both raw tickers already do).

Tests: ABC-is-abstract, default-is_available, the InProcess loop drives tick
and stops, stop() no-op, and test_abc_growth_stays_additive (the forward-compat
guard: required abstractmethods must stay exactly {name, start}, so the three
Phase-4 hooks land as NON-abstract additions).
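
The shape described above might look roughly like this. It is a sketch under assumptions, not the contents of cron/scheduler_provider.py: the `start(stop_event)` signature, the injected `tick` callable, the tick-before-wait ordering, and the `"in-process"` name are all illustrative choices.

```python
import threading
from abc import ABC, abstractmethod
from typing import Callable


class CronScheduler(ABC):
    """Required surface is name + start; the rest has safe defaults."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    def start(self, stop_event: threading.Event) -> None: ...

    def is_available(self) -> bool:
        return True  # default performs no network access

    def stop(self) -> None:
        return None  # default is a no-op


class InProcessCronScheduler(CronScheduler):
    def __init__(self, tick: Callable[[], None], interval: float = 60.0):
        self._tick = tick
        self._interval = interval

    @property
    def name(self) -> str:
        return "in-process"

    def start(self, stop_event: threading.Event) -> None:
        while not stop_event.is_set():
            self._tick()
            # wait() returns as soon as the event is set, so stop is responsive.
            stop_event.wait(self._interval)
```

A forward-compatibility guard like the one described can then assert that `CronScheduler.__abstractmethods__` stays equal to `{"name", "start"}`.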

tick() internals in cron/scheduler.py are byte-unchanged (only new file added).
Phase 0 characterization tests still green. Full tests/cron/: 445 passed.