
chore: merge dev branch fix/macos backend-install-ux #8197

Merged
qnixsynapse merged 33 commits into main from fix/macos-backend-install-ux on May 27, 2026

qnixsynapse (Contributor) commented May 26, 2026

Describe Your Changes

Sampling & chat composer

  • Restrict sampler UI to local/custom providers — predefined remote providers (OpenAI, Anthropic, Gemini, Groq, etc.) reject unknown sampler fields; hide the composer SamplerPopover for them and strip paramsSettings keys from the request body so assistant overrides set on a local model can't leak into a remote request. (2147fea)
  • Provider-aware sampler popover — sampler capability matrix per built-in provider, plus permissive defaults for custom OpenAI-compatible endpoints. (3182940)
  • Align reasoning trigger with sibling icon buttons in ChatInput — drop the inline on/off/auto label so the row keeps a consistent rhythm; state still readable via icon color + tooltip + dropdown checkmark. (bd1ccba)
  • Allow media-only messages — send images/audio without requiring text. (1c1c8aa)

Threads & error UX

  • Persist metadata.error across restart and dedupe the error UI — errors now survive thread reload. (5eeced0, f6e4911)
  • Stop persisting empty assistant rows; pin errors to the user message. (827d9dd)
  • Scope llamacpp router error banner to llamacpp threads — no more cross-provider banner pollution. (ea4206d)
  • Persist OOM/backend banner per-thread across restart — the banner state survives reload instead of being stuck on whichever thread was open when the error fired. (11f0f28)

llamacpp

  • Include mmproj.gguf in VRAM precheck. (20b705e)
  • Format Error/object log args instead of dumping [object Object]. (1780ff7)
  • Split version/backend selectors and overhaul settings UX. (5e79398)
  • Backend dependency verification: add a toggle (956fc13), scope to GPU backend lib + skip on flatpak (ae49fed), and skip the startup check when auto-update is off (f5de440).
  • Allow .tar.gz backends in the macOS file picker. (2f03053)
  • Name imported GGUFs from general.name — read the metadata, trim and dash-join whitespace; fall back to the file basename (sans .gguf), then modelId. (ec0d85d)
  • Cap embedding-slot bonus at +1 in router models_max — was inflating by N per installed embedder, but only one embedding loads at a time; the phantom slots stalled chat-model eviction. (a4ac378)
  • Emit onSettingUpdate from overridden updateSettings so the router restarts when version/backend changes via the API path, not just the UI. (a4047c9)

MCP

Models / providers

  • Anthropic-compatible custom providers — Add Provider dialog gets an API-format selector (OpenAI / Anthropic). api_type discriminant on ProviderObject routes the model factory and ai-model.ts through @ai-sdk/anthropic when set, so LiteLLM/Bedrock proxies and self-hosted Claude gateways work without a built-in entry. Built-in Anthropic backfilled via a v15→v16 store migration. Custom providers now require an API key (matches existing backend gates). (b9c3fbf)
  • Replace dropdown compat indicator with hub estimator for faster, richer model-fit info (6d94961); drop the ctx value from the tooltip to reduce noise (c9ba4d8).
  • Don't carry stale metadata across provider refetches. (18f1281)
  • Render single-option dropdowns as inert text — no fake interactivity when there's nothing to pick. (7c372cb)

Reliability

  • Isolate onLoad failures so one bad extension can't gate the entire UI. (1c7658c)
  • Repair broken test mocks after the sampling/router refactors. (34e376e)
  • Tighten hardware ACL + gate sampler UI on hydration — restrict refresh_system_info permission and avoid rendering sampler controls before settings hydrate (no more flash of stale defaults). (99b075f)

Fixes Issues

Self Checklist

  • Added relevant comments, esp in complex areas
  • Updated docs (for bug fixes / features)
  • Created issues for follow-up changes or refactoring needed

github-actions Bot (Contributor) commented May 26, 2026

Barecheck - Code coverage report

Total: 49.23%

Your code coverage diff: 0.60% ▴

Uncovered files and lines
FileLines
web-app/src/containers/ChatInput.tsx154, 213-214, 261-262, 285, 290-292, 314-315, 317-319, 349-351, 361-363, 365-379, 381-384, 386, 388-392, 394-402, 404, 407, 409-425, 429-431, 433, 435-443, 446-449, 452-455, 457-461, 463, 469, 474-477, 480-483, 503-504, 527-532, 535-543, 553, 558-562, 564-575, 577, 579-588, 590-593, 597-606, 616-626, 628-638, 640-647, 649-651, 653-659, 661-664, 666-670, 672-677, 679-684, 686-687, 689-692, 694-701, 703-720, 722-732, 734-748, 750-756, 758-772, 775-795, 797-800, 802-812, 814-815, 817-822, 824-825, 827-833, 835-838, 840-844, 846-854, 857, 860-865, 867-881, 883-886, 889-899, 902-908, 911-913, 915-916, 919, 921-924, 927-928, 931-934, 936-937, 940-943, 945-964, 967-971, 973-974, 976-978, 980-991, 993-1009, 1011-1013, 1015-1026, 1028-1031, 1033-1035, 1037-1073, 1076-1080, 1082-1087, 1089-1093, 1095-1096, 1098-1103, 1107, 1109-1110, 1112-1115, 1117-1120, 1123-1136, 1140-1143, 1145-1183, 1185-1197, 1199-1201, 1203-1223, 1228-1234, 1237-1272, 1277-1287, 1289-1291, 1293-1294, 1296-1297, 1300-1303, 1305-1314, 1316-1325, 1327-1333, 1335-1338, 1340-1341, 1347-1352, 1355-1356, 1359-1363, 1366-1371, 1374-1376, 1378-1382, 1384-1385, 1387-1395, 1397, 1399-1400, 1402-1413, 1416-1437, 1439-1441, 1444-1447, 1449-1450, 1452-1453, 1455-1460, 1463-1469, 1471-1475, 1478, 1480-1484, 1487-1494, 1496-1499, 1501-1503, 1505-1516, 1518-1524, 1526-1532, 1535-1538, 1540, 1587-1589, 1610-1616, 1618, 1620-1625, 1627, 1642-1648, 1650-1653, 1657-1661, 1693-1696, 1786-1796, 1799-1810, 1818-1821, 1830, 1857, 1861-1871, 1874-1878, 1880-1886, 1888-1899, 1903-1907, 1909-1918, 1923, 1925-1929, 1931-1933, 1936-1938, 1940-1942, 1944-1950, 1952-1959, 1961-1966, 1968-1974, 1976-1983, 2018-2030, 2035-2072, 2076-2098, 2100-2116, 2118-2119, 2121, 2123-2124, 2126-2127, 2129, 2131-2132, 2134-2135, 2137, 2139-2141, 2143, 2149-2158, 2169-2176, 2209, 2211-2214, 2221-2228
web-app/src/containers/McpRouterModelPicker.tsx1-2, 7-12, 36-50, 52, 56-74, 76-85, 87-94, 96-102, 104-111, 113-120, 122-130, 132-141, 143-148, 150, 152-155, 157-160, 162-178, 180, 182-188, 190-195, 197-200, 202-207, 209-213, 215-225, 227-230, 232-238, 240-241, 243-244, 246-247, 249-252, 254
web-app/src/containers/MessageItem.tsx137, 175-182, 223-224, 236-237, 239-240, 281-285, 312-317, 319-324, 326, 341, 351-358, 360-361, 363, 365, 370-371, 462, 472, 484-489, 491-492, 499, 533, 552, 582-589, 591-602, 604-606, 608, 628-631, 635, 659-661, 663-669, 676-682, 684-690, 692
web-app/src/containers/ModelSetting.tsx1-3, 5, 13-22, 24, 26-30, 37-44, 49-63, 69, 71-125, 127-128, 131-134, 136-140, 143-155, 158, 160, 162, 165, 168-170, 174-182, 184-192, 198-202, 204-219, 221-224, 226-243, 245-247, 249-264, 266-271, 273-276, 279-280, 282-286, 288-304, 306-324, 326-330, 332, 342-345, 348-351, 353-367, 369-373, 375-376, 381-389, 391, 393, 395-409, 411-415, 417-438, 440-470, 472, 474, 476-485, 494-515, 517
web-app/src/containers/ModelSupportStatus.tsx1-3, 9-11, 24-32, 34-42, 44-66, 68, 70-75, 77-78, 80-97, 99
web-app/src/containers/ParametersSection.tsx79, 88-108, 110-113, 138-144, 148-152, 194-195, 202-206, 209-213, 236-240, 243-244, 257-274, 276-299, 301-303, 305, 341-343, 352-353, 355, 357, 386-389, 397-401, 403-407, 433-435, 444-447, 456-460, 462-470
web-app/src/containers/SamplerPopover.tsx71, 79-81, 84-91, 94-95, 98-101, 104-105, 108-111, 114-119, 125-127, 149-152, 177, 179, 210, 212, 224-235, 237, 239-242, 245-247, 249-255, 257-263, 265-270, 272-275, 277
web-app/src/containers/SettingsMenu.tsx44-83, 128, 313-314, 324-327
web-app/src/containers/ThreadList.tsx82-85, 88, 94-97, 115-117, 123-129, 131-136, 138-140, 154, 159-161, 168, 206-212, 214-219, 224-239, 241-244, 251-254, 293
web-app/src/containers/dialogs/AddEditAssistant.tsx66-74, 77-78, 110-111, 114-119, 122-130, 133-134, 137-142, 188-192, 207-213
web-app/src/containers/dialogs/AddProviderDialog.tsx42-47, 50-61, 64-66, 71, 74, 83-85, 125-127, 142-147, 151
web-app/src/containers/dialogs/BackendUpdater.tsx1-2, 4-5, 7-10, 12-18, 20-29, 34-37, 39-42, 44-49, 52-54, 57-59, 62-67, 69-75, 77-95, 97-103, 105-110, 112-120, 122, 124, 126
web-app/src/containers/dialogs/LlamacppOomListener.tsx8-26, 28-45, 50-74
web-app/src/containers/dynamicControllerSetting/DropdownControl.tsx21-26, 28-32, 34-35, 37, 39-57, 59-63, 65
web-app/src/containers/dynamicControllerSetting/SliderControl.tsx52-58, 61-70, 88, 100, 108-110
web-app/src/containers/dynamicControllerSetting/index.tsx67-74, 76-82, 85-89, 92-98, 101-108, 125-129, 131
web-app/src/hooks/useAssistant.ts26-27, 63-64, 84-85, 99-100, 121, 141, 152
web-app/src/hooks/useBackendUpdater.ts35-42, 44-46, 48-49, 82-86, 109-111, 116-120, 122-127, 129-130, 149-157, 160-170, 184-186, 194-195, 198-202, 235, 237-247, 267, 269-281, 283, 285-287, 289-292, 295-297, 299, 301, 303-308, 310-312, 314-315, 317-323, 325, 329, 331, 333-335, 337, 339-343, 345-347, 349-353, 356-357, 359, 361-374, 378-386, 388-399, 403, 405-407, 409, 411, 413-418, 420-422, 424-425, 427-429, 432-433, 438-446
web-app/src/hooks/useModelProvider.ts54, 85, 97-99, 103, 171-173, 182, 188, 205-206, 213, 215, 218-219, 251, 255-256, 263-272, 275-297, 302-305, 384-394, 447-449, 559-563, 566-585, 670-672, 677-679
web-app/src/lib/ai-model.ts1-2, 4, 24-40, 42-64, 73-80, 84, 86-92, 94-95, 98-101, 106-119, 122-126, 128-131, 133-139, 141-142, 144-145
web-app/src/lib/custom-chat-transport.ts99-106, 109-118, 359-360, 485-500, 508-536, 580-586, 589-590, 599-602, 604, 607-611, 613-616, 618-619, 622-624, 626-628, 630-647, 649-651, 654-657, 659, 661-676, 680-684, 686-713, 715-738, 740-741, 747-751, 753-756, 758-764, 766-772, 774-788, 806, 814-816, 820-826, 828, 830-833, 835-836, 838-841, 846, 848-857, 859-860, 865, 869-891, 893, 900-906, 908-909, 911-912, 914-916, 918-922, 925-932, 935, 937-943, 945, 947, 949-957, 960-963, 965-970, 972-978, 981-987, 989-991, 993-1020, 1022-1035, 1039-1043, 1046-1048, 1050-1052, 1054-1062, 1064-1065, 1067-1074, 1076-1083, 1086-1087, 1092-1094, 1097-1098, 1101-1107, 1109-1126, 1128-1146, 1148-1163, 1166-1172, 1177-1179, 1181-1182, 1201-1221, 1236-1264, 1271-1287, 1299, 1318
web-app/src/lib/extension.ts132-134, 236-237, 254-255
web-app/src/lib/messages.ts40, 66, 99, 246-250, 353-362, 364-370, 372-375, 378-393, 455-465
web-app/src/lib/model-factory.ts128-129, 131-132, 139-140, 142-152, 232-233, 313-315, 325-326, 360-361, 399, 401-409, 411, 415-423, 428-441, 469-474, 476-479, 533-534, 536-555, 568, 593-594, 649, 696-706, 905-914, 916-925, 927-932, 934-935, 998, 1034-1039
web-app/src/lib/predefinedParams.ts125-129, 140-144, 155, 210, 221, 278, 298, 319, 330, 341, 379-383, 450-451
web-app/src/lib/providerCaps.ts231-232, 244-248, 250, 254-256, 258-261, 263-265, 294-297, 299-300
web-app/src/providers/ExtensionProvider.tsx1-6, 12-18, 20-22, 24-26, 28-31, 33-36, 41-49, 51-52, 54-59, 61-62
web-app/src/providers/GlobalEventHandler.tsx1-6, 12-15, 19-20, 22-29, 31, 33-42, 44-46, 49-50, 53-54, 56-69, 76, 79, 82-85, 88-89
web-app/src/routes/settings/providers/$providerName.tsx127, 129-131, 134-135, 137-139, 141-142, 166-168, 170, 172, 175-180, 182-184, 186-192, 197-198, 200-207, 209-210, 212-224, 226-227, 242-248, 254-256, 258-259, 313, 316, 349-352, 378, 380, 382, 397, 436-437, 449, 458-464, 482-483, 490-492, 519-529, 560-563, 565-570, 572-582, 633, 675, 753-761, 794-795, 801-804, 830, 832-833, 899, 909-914, 916, 918, 933, 1101-1102, 1222-1232, 1234, 1243-1246, 1248-1252, 1269-1271, 1273-1274, 1367-1376, 1378-1383, 1385, 1389-1393, 1395-1413, 1415-1418, 1420-1423, 1425, 1427-1431, 1433-1436, 1438-1441, 1443-1451, 1453-1456, 1458-1459, 1461, 1463-1473, 1475-1479, 1481, 1483-1484
web-app/src/routes/settings/providers/index.tsx37-78, 114-127, 129-142, 144-148, 150-165, 167
web-app/src/routes/threads/$threadId.tsx122-129, 147, 205-206, 211-213, 216-221, 223, 227-237, 239, 244-250, 255-266, 268-271, 273-277, 279-288, 291-292, 295-296, 299-300, 302-304, 306-307, 310-314, 316, 318-325, 327, 330-343, 345-348, 350-364, 366-376, 379-381, 383-388, 390-451, 453-454, 472-483, 523-524, 529-530, 549-550, 557-559, 561, 564-568, 570-575, 580-590, 592, 594-602, 604-607, 628-633, 653-660, 675-677, 684-693, 703-717, 720-731, 733-743, 748-749, 772-779, 805-807, 809-814, 844-845, 851-860, 892-893, 895-896, 1025, 1027-1029, 1031-1034, 1036, 1039-1042, 1044-1051, 1053-1057, 1059-1071, 1073-1074, 1076-1078, 1084-1099, 1104-1109, 1111-1113, 1128-1133, 1143-1144, 1146-1147, 1162-1169, 1175-1176, 1183-1191, 1198-1206, 1215-1225, 1277-1292, 1295-1297, 1302, 1308-1325, 1327, 1329-1338, 1340-1358, 1360, 1362, 1364-1368, 1370-1372, 1374-1376
web-app/src/stores/message-errors.ts14-18, 22-24

@qnixsynapse qnixsynapse force-pushed the fix/macos-backend-install-ux branch from f9aebb3 to 41f6781 Compare May 26, 2026 12:28
NSOpenPanel resolves filter extensions via UTType.typeWithFilenameExtension:,
which only handles single-component extensions. `tar.gz` returns nil and
`gz` resolves to a sibling UTI of `.tar.gz` files, so neither enables them
in the picker. Skip the filter on macOS and rely on the existing extension
check in installBackend() to reject invalid picks.
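A minimal sketch of the picker change described above, assuming a dialog API that takes a list of extension filters. The names `backendPickerFilters`, `isBackendArchive`, and the suffix list are hypothetical; the PR does not show the real dialog call or the exact set of accepted archive types.

```typescript
// Hypothetical names throughout: the PR does not show the real dialog call.
type Platform = 'macos' | 'windows' | 'linux'

interface DialogFilter {
  name: string
  extensions: string[]
}

// Archive suffixes accepted by the post-pick check (assumed list).
const BACKEND_ARCHIVE_SUFFIXES = ['.tar.gz', '.zip']

// On macOS, return no filters so files with multi-component extensions stay selectable.
function backendPickerFilters(platform: Platform): DialogFilter[] {
  if (platform === 'macos') return []
  return [{ name: 'Backend archives', extensions: ['tar.gz', 'zip'] }]
}

// Stand-in for the extension check that runs after the user picks a file.
function isBackendArchive(path: string): boolean {
  const lower = path.toLowerCase()
  return BACKEND_ARCHIVE_SUFFIXES.some((suffix) => lower.endsWith(suffix))
}
```

With the filter skipped, the post-pick check is the only gate on macOS, so it has to reject anything that is not a supported archive.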
…ate the UI

Promise.all rejected the whole load() on the first failing onLoad, and
ExtensionProvider only set finishedSetup on success — so a single throw
(e.g. ggml_backend_error from llamacpp's configureBackends) hid the
entire app, blocking MLX/OpenAI/Anthropic/etc. providers that had
otherwise loaded fine.

- ExtensionManager.load() uses Promise.allSettled and logs each rejected
  extension by name.
- ExtensionProvider wraps setup in try/finally so setFinishedSetup(true)
  always fires.
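The two changes above can be sketched roughly as follows. The `Extension` interface, `loadAll`, and `setupExtensions` are simplified stand-ins, not the real ExtensionManager or ExtensionProvider code.

```typescript
// Simplified stand-in for an extension with an async onLoad hook.
interface Extension {
  name: string
  onLoad(): Promise<void>
}

// Run every onLoad to completion and return the names of the extensions that failed.
async function loadAll(extensions: Extension[]): Promise<string[]> {
  const results = await Promise.allSettled(extensions.map((ext) => ext.onLoad()))
  const failed: string[] = []
  results.forEach((result, i) => {
    if (result.status === 'rejected') {
      console.error(`Extension ${extensions[i].name} failed to load:`, result.reason)
      failed.push(extensions[i].name)
    }
  })
  return failed
}

// The "finished" flag flips in finally, so it is set even if setup throws.
async function setupExtensions(
  extensions: Extension[],
  setFinishedSetup: (done: boolean) => void
): Promise<string[]> {
  try {
    return await loadAll(extensions)
  } finally {
    setFinishedSetup(true)
  }
}
```

Unlike `Promise.all`, `Promise.allSettled` never rejects, so one throwing extension shows up as a single `rejected` entry instead of aborting the whole load.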
oomError/backendError are populated from global Tauri events emitted by
the llamacpp router, but the chat route rendered the banner regardless
of which provider the current model belonged to. A router-side Metal
init crash would surface as "GGML backend encountered an error" on top
of an MLX/OpenAI/Anthropic chat that was otherwise working.

Mask the raw values behind selectedProvider === 'llamacpp' so the
banner, the implicit stop() on error, and the Reload button only fire
for llamacpp-backed threads. Store contents are untouched; switching
back to a llamacpp model resurfaces the error as before.
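As a rough illustration of the masking, assuming the store holds the two error strings and the route knows the selected provider. `BannerState` and `visibleRouterErrors` are hypothetical names.

```typescript
// Hypothetical shape of the two router error fields held in the store.
interface BannerState {
  oomError: string | null
  backendError: string | null
}

// Return the errors the chat route should act on; the store itself is not modified.
function visibleRouterErrors(store: BannerState, selectedProvider: string): BannerState {
  if (selectedProvider !== 'llamacpp') {
    return { oomError: null, backendError: null }
  }
  return { oomError: store.oomError, backendError: store.backendError }
}
```

Because the store is only read, switching back to a llamacpp model makes the same error visible again.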
…tpak

Previous flow ran lddtree against llama-server itself, whose transitive
tree (libc, libstdc++, libcurl, libgomp, conditional codepaths) produced
mostly-noise "missing" entries. The user-actionable question is whether
the GPU backend (ggml-cuda / ggml-vulkan) can resolve its system deps
at dlopen time — that's what actually fails when CUDA/Vulkan runtimes
are absent.

- verify_backend_dependencies now scans bin_dir for files whose name
  contains "cuda" or "vulkan" and has a .so/.so.N/.dll extension and
  passes only those to the analyzer. llama-server is no longer
  analyzed. CPU-only backends produce an empty path list and verify
  trivially.
- verify_backend_installation short-circuits to verified=true on Linux
  flatpak (jan_utils::system::is_flatpak()); sandbox library layout
  makes lddtree results meaningless and the checker was firing false
  positives there.
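The real filter lives in the Rust plugin; the TypeScript sketch below only illustrates the file-name rule described in the first bullet. `isGpuBackendLibrary` and `selectLibrariesToVerify` are hypothetical names, and the pattern also accepts multi-part versions such as `.so.1.2`, which is an assumption beyond what the commit states.

```typescript
// Matches .so, versioned .so.N (and .so.N.M), and .dll at the end of the name.
const SHARED_LIB_PATTERN = /\.(so(\.\d+)*|dll)$/

// True when the name mentions a GPU backend and looks like a shared library.
function isGpuBackendLibrary(fileName: string): boolean {
  const lower = fileName.toLowerCase()
  const mentionsGpu = lower.includes('cuda') || lower.includes('vulkan')
  return mentionsGpu && SHARED_LIB_PATTERN.test(lower)
}

// Only these files would be handed to the dependency analyzer.
function selectLibrariesToVerify(fileNames: string[]): string[] {
  return fileNames.filter(isGpuBackendLibrary)
}
```

A CPU-only backend directory yields an empty list here, which corresponds to the "verify trivially" case.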
Lets users opt out of the startup GPU backend library check. Defaults to
enabled to preserve current behavior.

The BackendUpdater dialog ran checkForUpdate() unconditionally on mount,
hitting the network even when the user had disabled auto-update. Gate
the call on autoUpdateEnabled; the manual "Check for Updates" button in
provider settings remains the opt-in path.

The dropdown's ModelSupportStatus called read_gguf_metadata + the heavy
isModelSupported probe every time the selected model changed. Swap to
the estimateModelFit heuristic the hub already uses (file size + KV
heuristic against RAM/VRAM) — cheap, synchronous after the one-shot
sizeBytes lookup, and good enough for an at-a-glance indicator.

Tooltip now labels the result as an estimate.

Move sampler editing out of the model sidebar into a composer-anchored
popover scoped to the active assistant, and gate the whole surface on a
provider/model capability table so users can't accidentally send params
the backend will reject.

- New `providerCaps.ts` maps each built-in provider to supported/maybe
  sampler capabilities; custom providers default to permissive. Adds
  `isModelLevelRejected` for family-specific rejections (OpenAI o-series
  and gpt-5* reject temperature/top_p/penalties; grok-3-mini/4 quirks)
  and `getMutualExclusionDrops` for cross-param conflicts (Anthropic's
  temperature+top_p).
- `ModelFactory.createModel` strips unsupported samplers at the single
  dispatch chokepoint so the wire request never carries rejected keys.
- `createCustomFetch` now retries once with all injected sampling params
  stripped when the upstream returns a sampling-rejection error, and
  toasts the user so they know their overrides were dropped this turn.
- `predefinedParams.ts` gains typed `ParamDef` schema with capability,
  controller props, `disabledBy` for live mutual-exclusion gating
  (mirostat shadows top-k/top-p/min-p, dynatemp gates its exponent, etc),
  effect hints, and category/group metadata.
- `ParametersSection` replaces the chip wall with a categorized
  "Add parameter" menu and grouped blocks for coupled samplers (mirostat,
  dry, xtc, dynatemp). Active rows render in stable canonical order with
  inline warnings when the current provider/model rejects the key.
- `SamplerPopover` adds an assistant switcher to its header (subsumes the
  legacy + > Use Assistant submenu and the standalone bot avatar button
  in ChatInput) plus a gear-icon shortcut to the assistant settings
  route. Bounded to `--radix-popover-content-available-height` so the
  header stays anchored when the body overflows.
- `ModelSetting` sidebar skips keys present in `paramsSettings` so
  sampler rows no longer duplicate between the sidebar and the popover.
- `SliderControl` drops the negative-margin hack that clipped the value
  input, inlines min/max scale labels, and supports a `warnAbove`/
  `warnBelow` band to tint the slider range for risky values.
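To make the capability gating concrete, here is a small sketch of stripping unsupported samplers and applying a mutual-exclusion rule. The table contents, `stripUnsupported`, and `applyMutualExclusion` are illustrative only; the real `providerCaps.ts` entries and helper signatures are not shown in the PR, and the choice to drop `top_p` rather than `temperature` is an assumption.

```typescript
type SamplerKey = 'temperature' | 'top_p' | 'top_k' | 'min_p'
type Params = Partial<Record<SamplerKey, number>>

// Illustrative capability table, not the real providerCaps.ts contents.
const SUPPORTED: Record<string, SamplerKey[] | undefined> = {
  anthropic: ['temperature', 'top_p', 'top_k'],
  llamacpp: ['temperature', 'top_p', 'top_k', 'min_p'],
}

// Keep only the keys the provider is known to accept.
function stripUnsupported(provider: string, params: Params): Params {
  const allowed = SUPPORTED[provider]
  // Unknown (custom) providers default to permissive: pass everything through.
  if (!allowed) return { ...params }
  const out: Params = {}
  for (const key of allowed) {
    const value = params[key]
    if (value !== undefined) out[key] = value
  }
  return out
}

// Anthropic-style conflict: when both are set, drop top_p (assumed choice).
function applyMutualExclusion(provider: string, params: Params): Params {
  const out: Params = { ...params }
  if (provider === 'anthropic' && out.temperature !== undefined && out.top_p !== undefined) {
    delete out.top_p
  }
  return out
}
```

Running both steps at one dispatch point means no caller has to remember which keys a given backend rejects.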
Provider `models` arrays can carry duplicates (registry + locally
imported, or upstream `/v1/models` returning the same id twice),
producing duplicate React keys in the picker rows.
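One way to avoid the duplicate keys is to dedupe by id before rendering. This is a sketch under that assumption; `dedupeModelsById` is a hypothetical helper, and keeping the first occurrence is an arbitrary choice.

```typescript
interface ModelEntry {
  id: string
}

// Keep the first occurrence of each id and preserve the original order.
function dedupeModelsById<T extends ModelEntry>(models: T[]): T[] {
  const seen = new Set<string>()
  return models.filter((model) => {
    if (seen.has(model.id)) return false
    seen.add(model.id)
    return true
  })
}
```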
The auto-reconnect monitor introduced in #7791 ran list_all_tools every
2s against every connected server, and the stderr forwarder logged
every line at WARN regardless of the server's reported level. STDIO
servers (Python MCPs in particular) echo a ListToolsRequest INFO line
to stderr on each probe, drowning the log in WARN entries.

- Health probe interval 2s → 30s. Reconnect signal still races the
  timer via tokio::select! so explicit reconnects stay instant.
- Route stderr lines through the level token the server itself prints
  (ERROR/WARN/DEBUG/TRACE), defaulting to INFO when no token is
  present — stderr != error on most servers.
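The forwarder itself is Rust; this TypeScript sketch only illustrates the level-token rule from the second bullet. `levelForStderrLine` is a hypothetical name, and matching the token anywhere in the line (rather than at a fixed position) is an assumption.

```typescript
type LogLevel = 'ERROR' | 'WARN' | 'INFO' | 'DEBUG' | 'TRACE'

// Look for a standalone, upper-case level token anywhere in the line.
const LEVEL_TOKEN = /\b(ERROR|WARN|DEBUG|TRACE)\b/

// Use the level the server printed; fall back to INFO when there is none.
function levelForStderrLine(line: string): LogLevel {
  const match = LEVEL_TOKEN.exec(line)
  return match ? (match[1] as LogLevel) : 'INFO'
}
```

A line such as `INFO:mcp.server: Processing request of type ListToolsRequest` carries no matching token and is therefore logged at INFO instead of WARN.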
The Zustand merge of incoming settings into the providers store spread
the existing controller_props on top of the fresh ones, so metadata
the extension recomputes (recommended, options) survived across
refetches and outlived the underlying setting. The user's `value`
selection is the only field that should be preserved — keep that and
let everything else come from the fresh fetch.
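A sketch of the corrected merge, assuming settings are matched by key. The `Setting` and `ControllerProps` shapes are simplified, and `mergeSettings` is a hypothetical name for the store's merge step.

```typescript
// Simplified shapes; the real store types carry more fields.
interface ControllerProps {
  value?: string | number | boolean
  recommended?: string
  options?: { value: string; name: string }[]
}

interface Setting {
  key: string
  controller_props: ControllerProps
}

// Fresh settings win for everything except the user's stored value.
function mergeSettings(existing: Setting[], fresh: Setting[]): Setting[] {
  return fresh.map((setting): Setting => {
    const previous = existing.find((s) => s.key === setting.key)
    const keptValue = previous ? previous.controller_props.value : undefined
    return {
      ...setting,
      controller_props: {
        ...setting.controller_props,
        // Only the user's selection carries over from the old entry.
        ...(keptValue !== undefined ? { value: keptValue } : {}),
      },
    }
  })
}
```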
A dropdown with one option implies a choice the user doesn't have.
Render the value as plain text when options.length <= 1 — no chevron,
no popover. Applies globally to DropdownControl.

- Schema: replace the single composite version_backend setting with
  two independent llamacpp_version + llamacpp_backend selectors.
  Migration in onLoad splits any prior version_backend string.
  Internal version_backend lives on as a derived field so the Rust
  plugin and router commands stay untouched.

- New check_for_updates toggle (default on) gates the remote release
  fetch. auto_update_engine now requires it. Lets users disable update
  checks without hiding the dropdown options they already have, and
  the manual "Check for Updates" button always hits the network.

- Recommended backend is computed from the upstream-released set only
  via the new fetchRemoteBackends helper. Side-loaded custom backends
  from "Install from File" no longer bias the recommendation. When
  remote is unavailable (offline / check_for_updates off) no hint is
  surfaced — the previous "fall back to merged" behavior caused
  custom installs to be recommended over official ones.

- Persistence moves from localStorage to <jan_data>/llamacpp/settings.json.
  Atomic-ish writes via tmp + mv, serialized through a single
  writeChain so concurrent writes don't interleave. One-shot
  idempotent migration on onLoad: file presence is the marker;
  localStorage is only cleared after a successful parse + file write.
  Survives localStorage wipes, lets users inspect / edit the file.

- Drops core's "preserve old recommended" surprise — registerSettings
  is overridden to write through the file store and the unawaited call
  in configureBackends is now awaited so persistRecommended doesn't
  race the merge.

Tests: 106/106 (extension) + provider settings + hooks unchanged.
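The serialized tmp-then-rename write described in the persistence bullet can be sketched as below. `FileOps` and `SettingsFileStore` are hypothetical; file access is injected so the sketch does not assume any particular filesystem API.

```typescript
// Minimal file operations the store needs; a real implementation would wrap the platform's fs API.
interface FileOps {
  writeFile(path: string, data: string): Promise<void>
  rename(from: string, to: string): Promise<void>
}

class SettingsFileStore {
  // Tail of the queue: each write starts only after the previous one settles.
  private writeChain: Promise<void> = Promise.resolve()
  private readonly fs: FileOps
  private readonly path: string

  constructor(fs: FileOps, path: string) {
    this.fs = fs
    this.path = path
  }

  save(settings: unknown): Promise<void> {
    const data = JSON.stringify(settings, null, 2)
    const run = async (): Promise<void> => {
      const tmp = `${this.path}.tmp`
      await this.fs.writeFile(tmp, data) // write the full payload to a temp file
      await this.fs.rename(tmp, this.path) // then swap it into place
    }
    const next = this.writeChain.then(run)
    // Keep the chain usable after a failed write; the caller still sees the rejection via `next`.
    this.writeChain = next.catch(() => undefined)
    return next
  }
}
```

Chaining through a single promise means two `save()` calls issued back to back run strictly one after the other, so their temp-file writes cannot interleave.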
…r message

Errored generations were being persisted as empty assistant messages
because extractContentPartsFromUIMessage always padded to length 1 with
an empty-text fallback. After a few reload-and-retry cycles, threads
would render N empty rows with timestamps and action icons but no
content.

- onFinish now gates persist on uiMessageHasMeaningfulContent, which
  inspects the raw UIMessage parts (ignoring empty-text fallbacks and
  bare tool stubs). Empty assistant messages never reach disk.
- On status === 'error', stamp the most recent user ThreadMessage with
  metadata.error so the failure survives reload and thread navigation.
- A successful assistant onFinish strips metadata.error from prior
  messages — forward progress clears stale errors. Editing the user
  message clears it too.
- On thread load, drop and delete any persisted assistant rows matching
  threadMessageIsEmpty. Lossless one-shot cleanup for users who already
  hit the bug.
- MessageItem renders an inline destructive-tinted error card under
  user messages with metadata.error, with a Regenerate button wired to
  the existing onRegenerate flow.
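A rough sketch of the content check in the first bullet. The `Part` union is a simplified stand-in for the AI SDK's UIMessage parts, and treating a tool part without output as a "bare stub" is an assumption about what the real `uiMessageHasMeaningfulContent` inspects.

```typescript
// Simplified stand-in for the AI SDK's UIMessage part union.
type Part =
  | { type: 'text'; text: string }
  | { type: 'tool'; output?: unknown }
  | { type: 'file'; url: string }

// True when at least one part carries something worth persisting.
function hasMeaningfulContent(parts: Part[]): boolean {
  return parts.some((part) => {
    switch (part.type) {
      case 'text':
        return part.text.trim().length > 0 // empty-text fallbacks do not count
      case 'tool':
        return part.output !== undefined // a bare tool stub has no output yet
      case 'file':
        return true
      default:
        return false
    }
  })
}
```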
ChatInput.test: added providers:[] to the useModelProvider mock so the
new SamplerPopover doesn't blow up reading providers.find, and stubbed
Link on the @tanstack/react-router mock.

AddEditAssistant.test: extended the @/lib/predefinedParams mock with
the exports introduced by the sampler refactor (paramCategories,
paramGroups, LLAMACPP_ONLY_PARAM_KEYS, evaluateDisabled,
isGroupedParamKey), added a ResizeObserver polyfill needed by Radix
sliders, and removed three obsolete tests that drove the pre-refactor
chip-palette UI; the surviving cases cover save/edit/validation.

The isModelSupported gate counted model weights + KV cache only, so
multimodal models with a sibling mmproj.gguf got greenlit on systems
where the projector pushed total allocation past free VRAM, then
crashed with a CUDA OOM during slot init.

Stat mmproj.gguf next to model.gguf (local paths only) and add its
size to total_required.
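The arithmetic behind the fix, as a sketch. `VramEstimate`, `totalRequiredBytes`, and `fitsInVram` are hypothetical names, and the byte counts in the usage note are made-up values for illustration.

```typescript
interface VramEstimate {
  weightsBytes: number
  kvCacheBytes: number
  // Size of a sibling mmproj.gguf, when one exists next to the model file.
  mmprojBytes?: number
}

// Sum every allocation the load will need, including the projector.
function totalRequiredBytes(estimate: VramEstimate): number {
  return estimate.weightsBytes + estimate.kvCacheBytes + (estimate.mmprojBytes ?? 0)
}

function fitsInVram(estimate: VramEstimate, freeVramBytes: number): boolean {
  return totalRequiredBytes(estimate) <= freeVramBytes
}
```

For example, with 6 GiB of weights, 1 GiB of KV cache, and 8 GiB free, the model passes on its own but fails once a 1.5 GiB projector is counted.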
The extension's logger template-stringed every arg into the file log,
so `logger.error('Error in load command:', err)` wrote
"[object Object]" — useless when diagnosing model-load crashes.

Format Errors as `message\nstack`, objects as JSON, primitives as-is.
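The formatting rule can be sketched like this. `formatLogArg` and `formatLogLine` are hypothetical names; the fallback to `String(arg)` for values JSON cannot serialize is an assumption.

```typescript
// Format one log argument: Errors as message + stack, objects as JSON, primitives as-is.
function formatLogArg(arg: unknown): string {
  if (arg instanceof Error) {
    return arg.stack ? `${arg.message}\n${arg.stack}` : arg.message
  }
  if (typeof arg === 'object' && arg !== null) {
    try {
      return JSON.stringify(arg)
    } catch {
      return String(arg) // e.g. circular structures that JSON cannot serialize
    }
  }
  return String(arg)
}

// Join all formatted arguments into a single log line.
function formatLogLine(...args: unknown[]): string {
  return args.map(formatLogArg).join(' ')
}
```

With this, `formatLogLine('Error in load command:', { code: 1 })` yields `Error in load command: {"code":1}` rather than `[object Object]`.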
The send button and Enter handler required non-empty prompt text, so
users couldn't ask a multimodal model to describe an image (or
transcribe audio) without typing a placeholder. Permit submit when
any image/audio attachment is ready, even with empty prompt.
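A small sketch of the relaxed submit rule. The `Attachment` shape and `canSubmit` are hypothetical; the real ChatInput state is more involved.

```typescript
// Hypothetical attachment shape with a readiness flag.
interface Attachment {
  kind: 'image' | 'audio' | 'document'
  status: 'processing' | 'ready'
}

// Submit is allowed with text, or with at least one ready image/audio attachment.
function canSubmit(prompt: string, attachments: Attachment[]): boolean {
  if (prompt.trim().length > 0) return true
  return attachments.some(
    (a) => (a.kind === 'image' || a.kind === 'audio') && a.status === 'ready'
  )
}
```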
create_message and modify_message could race when the UI stamps
metadata onto a freshly-sent user message immediately after sending
(e.g. metadata.error after a CUDA-OOM model-load failure). If the
modify's UPDATE landed first, it silently affected 0 rows; the create
then INSERTed the original row, dropping the metadata edit.

- modify_message now UPSERTs (ON CONFLICT(id) DO UPDATE) so a stamp
  ahead of the create still lands on disk.
- create_message uses INSERT OR IGNORE so a late create does not
  clobber the row already inserted by the modify.

Stack of fixes for the per-turn error UI.

1) JSONL race. Desktop persistence is messages.jsonl (SQLite is mobile
   only). modify_message bailed silently when the message id was not
   yet in the file — exactly the case when the UI stamps metadata onto
   a freshly-sent user message before create_message acquired the
   per-thread lock. Edit was dropped, restart found metadata=null.
   - modify_message upserts when no row matches the id.
   - create_message dedupes by id under the lock so a late create after
     a modify-upsert does not duplicate or clobber.

2) Per-message error store. The old approach stamped errors onto the
   AI SDK UIMessage's custom metadata in chatMessages. AI SDK reshapes
   chatMessages on subsequent sendMessage calls, dropping the custom
   field, so the inline card vanished as soon as the user sent another
   message. New useMessageErrors Zustand store keys errors by id and is
   immune to that reshaping. ThreadMessage's metadata.error is still
   written for restart restoration; the thread-load path hydrates the
   store from those persisted fields. MessageItem reads from the store.

3) ThreadList wipe. ThreadList eagerly fetched messages on mount and
   wrote them back via setMessages(threadId, fetchedMessages). The
   truthy guard `if (fetchedMessages)` always passed because an empty
   array is truthy, so for a brand-new thread (empty on disk) it raced
   the optimistic addMessage write and clobbered it with []. Gate on
   length > 0.

4) Banner dedupe. Narrow the global error banner to the three llamacpp
   signals (oom / backend / context-limit) which have unique UI; the
   inline card owns generic useChat errors. No more duplicate cards on
   regenerate-after-error.
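Item 1 above (the modify-before-create race) can be illustrated with an in-memory list standing in for messages.jsonl. `modifyMessage` and `createMessage` here are simplified sketches, not the real commands, and the per-thread lock is omitted.

```typescript
interface ThreadMessage {
  id: string
  content: string
  metadata?: Record<string, unknown>
}

// modify: replace the row when the id exists, otherwise append it (upsert).
function modifyMessage(rows: ThreadMessage[], message: ThreadMessage): ThreadMessage[] {
  const index = rows.findIndex((row) => row.id === message.id)
  if (index === -1) return [...rows, message]
  const next = rows.slice()
  next[index] = message
  return next
}

// create: skip the append when a row with this id is already present.
function createMessage(rows: ThreadMessage[], message: ThreadMessage): ThreadMessage[] {
  return rows.some((row) => row.id === message.id) ? rows : [...rows, message]
}
```

If the modify arrives first, it inserts the stamped row and the late create becomes a no-op, so the metadata edit is neither dropped nor duplicated.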
Predefined remote providers expose a fixed sampling surface and reject
unknown JSON fields (e.g. Gemini 400s on temperature/top_p/top_k). Hide
the composer SamplerPopover for those providers and strip all
paramsSettings keys from the request body so stored assistant overrides
set while on a local model can't leak into a remote request.

Reasoning toggle used size=sm with an inline on/off/auto label, making
it taller and wider than its neighbors and breaking the row rhythm.
Drop the inline label and switch to icon-xs; state is still conveyed
via icon color/opacity, tooltip, and the dropdown's checkmark.

tsc -b (project-references build) was rejecting the unknown-typed
existingValue spread into controller_props.value as {} | null |
undefined. Narrow to string | number | boolean | undefined so it
matches ProviderSetting.
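A sketch of the narrowing, assuming a runtime type guard is acceptable here. `SettingValue` and `narrowSettingValue` are hypothetical names; the actual change may use a cast or a different guard.

```typescript
type SettingValue = string | number | boolean | undefined

// Narrow an unknown stored value to the types a setting value may hold.
function narrowSettingValue(value: unknown): SettingValue {
  if (typeof value === 'string' || typeof value === 'number' || typeof value === 'boolean') {
    return value
  }
  return undefined
}
```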
ESLint forbids _-prefixed unused vars. Use object spread + delete to
drop a key instead of destructuring it into an underscore-named bind.
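For illustration, the spread-then-delete pattern looks like this. `omitKey` is a hypothetical helper, not a function from the PR.

```typescript
type Params = Record<string, unknown>

// Copy the object, then delete the key from the copy; no unused binding is created.
function omitKey(source: Params, key: string): Params {
  const copy: Params = { ...source }
  delete copy[key]
  return copy
}
```

The destructuring alternative, `const { stream: _stream, ...rest } = source`, produces the same result but leaves an unused underscore-prefixed variable for the linter to flag.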
useMessages selector reached s.messages[threadId]; the test mock has
no messages key, so the optional chain prevents a TypeError without
changing prod behavior.

Drop three tests that asserted the old global error banner for generic
chat errors. That banner was narrowed to contextLimitError/oomError/
backendError when error UI moved to per-message via useMessageErrors;
the tests asserted DOM that no longer renders.

Sweep across the main src-tauri crate and 5 plugins (hardware,
llamacpp, mlx, rag, vector-db). ~88 warnings eliminated. Idiomatic
fixes throughout: Default impls, &Path over &PathBuf, .as_deref(),
strip_prefix, .is_ok()/.is_some_and(), collapsed if-let chains, ?
over manual None-returning matches, trait-bound consolidation in
generics, etc.

10 #[allow(clippy::too_many_arguments)] left on stable internal and
Tauri-command signatures where refactoring to argument structs is
out of scope for a lint sweep. One #[allow(clippy::zombie_processes)]
on the intentionally-detached setsid child in jan-cli.

No new unsafe. No --fix autofixes.
`cargo clippy --all-targets -- -D warnings` is clean across every crate;
`cargo build` succeeds.

Read general.name from gguf metadata, trim and replace whitespace runs
with '-'. Fall back to the model file basename (sans .gguf), then to
modelId if metadata is unavailable.
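The fallback chain can be sketched as follows. `displayNameForImport` is a hypothetical name, and the path handling (splitting on `/` or `\`) is an assumption.

```typescript
// Pick a display name: metadata first, then file basename, then model id.
function displayNameForImport(
  generalName: string | undefined,
  filePath: string | undefined,
  modelId: string
): string {
  // Trim, then collapse each run of whitespace into a single dash.
  const fromMetadata = (generalName ?? '').trim().replace(/\s+/g, '-')
  if (fromMetadata) return fromMetadata

  // File basename without the .gguf suffix.
  const base = (filePath ?? '').split(/[\\/]/).pop() ?? ''
  const fromFile = base.replace(/\.gguf$/i, '')
  if (fromFile) return fromFile

  return modelId
}
```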
…I on hydration

- Register refresh_system_info in the hardware plugin's COMMANDS and
  default permission set; without it the visibility-change handler hit
  'Command plugin:hardware|refresh_system_info not allowed by ACL'.
- Disable SamplerPopover trigger while useAssistant is loading so edits
  can't land on the hardcoded in-memory default and get clobbered when
  setAssistants() resolves from disk.
- Drop hardcoded sampler params from the in-memory defaultAssistant;
  the assistant-extension seeds them to disk on first run and the file
  is the only source of truth.

The file-backed updateSettings override dropped the base class's
onSettingUpdate dispatch loop, so changes to ctx_size, n_gpu_layers,
flash_attn, and other PRESET_AFFECTING_KEYS were persisted but never
propagated to this.config or scheduleRouterRestart — the live router
kept serving requests with the old preset until the next app restart.

Diff the new value against the persisted one, write the file, then
invoke onSettingUpdate for each actually-changed key. The existing
isUpdatingBackend guard keeps the version_backend recursion safe.

Regression introduced in 5e79398 (localStorage → file persistence).
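The diff-write-dispatch order described above, as a sketch. `applySettingUpdates` and its callback parameters are hypothetical, and comparing with strict equality assumes setting values are primitives.

```typescript
type SettingMap = Record<string, unknown>

// Diff against the persisted values, write the file, then notify per changed key.
function applySettingUpdates(
  persisted: SettingMap,
  incoming: SettingMap,
  writeFile: (next: SettingMap) => void,
  onSettingUpdate: (key: string, value: unknown) => void
): string[] {
  // Strict equality is enough for primitive setting values (assumption).
  const changed = Object.keys(incoming).filter((key) => persisted[key] !== incoming[key])
  writeFile({ ...persisted, ...incoming })
  for (const key of changed) {
    onSettingUpdate(key, incoming[key])
  }
  return changed
}
```

Dispatching only for keys that actually changed avoids scheduling a router restart when a save leaves every value as it was.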
The llamacpp OOM/backend banner lived only in useAppState (in-memory),
so restart wiped it and switching threads leaked the banner from the
offending thread onto unrelated llamacpp threads.

- LlamacppOomListener stamps metadata.oomError / metadata.backendError
  onto the last user message of currentStreamThreadId at error time
  and persists via useMessages.updateMessage (modify_message upsert
  from 5eeced0).
- $threadId derives useAppState.oomError/backendError from the active
  thread's message metadata on every thread switch — single source of
  truth, no cross-thread leak.
- handleSubmit / handleRegenerate now also strip the stamped metadata
  so the derive effect doesn't resurrect a dismissed banner.
- Test fixture extended with setOomError / setBackendError mocks.
@qnixsynapse qnixsynapse force-pushed the fix/macos-backend-install-ux branch from 41f6781 to 11f0f28 Compare May 27, 2026 01:31
Custom providers can now speak the Anthropic Messages API wire format
(LiteLLM, Bedrock proxies, self-hosted Claude gateways) in addition to
OpenAI-compatible. Picks the right SDK at dispatch time via a new
`api_type` discriminant on ProviderObject; built-in 'anthropic' is
backfilled by a v15->v16 zustand migration.

- AddProviderDialog gets an API-format selector and now requires an
  API key (matches existing backend gates that silently refuse to
  register or list models for key-less providers).
- model-factory + ai-model branch to @ai-sdk/anthropic when
  api_type === 'anthropic', regardless of provider name.
- custom-chat-transport's serial-tool-use repair keys off api_type
  instead of the provider name, so custom Anthropic proxies get the
  same fixup.
- providerCaps clamps samplers (top_p/temperature mutex) for any
  Anthropic-wire provider.
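A sketch of the discriminant and the backfill. `ProviderObject` is reduced to two fields, and `backfillApiType` and `sdkFor` are hypothetical names; the returned labels are placeholders for the real SDK factories, which are not shown in the PR.

```typescript
type ApiType = 'openai' | 'anthropic'

// Reduced to the two fields this sketch needs.
interface ProviderObject {
  provider: string
  api_type?: ApiType
}

// Migration-style backfill: tag the built-in Anthropic entry, leave others untouched.
function backfillApiType(providers: ProviderObject[]): ProviderObject[] {
  return providers.map((p): ProviderObject =>
    p.provider === 'anthropic' && p.api_type === undefined ? { ...p, api_type: 'anthropic' } : p
  )
}

// Dispatch on the wire format, not on the provider name.
function sdkFor(p: ProviderObject): 'anthropic-sdk' | 'openai-compatible-sdk' {
  return p.api_type === 'anthropic' ? 'anthropic-sdk' : 'openai-compatible-sdk'
}
```

Keying on `api_type` is what lets a custom provider named, say, `my-litellm` take the Anthropic path.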
The router was inflating models_max by the number of installed embedders
(`+N embedding`), but only one embedding is ever loaded at a time (RAG
issues one load() per request). The phantom slots prevented eviction of
stale chat models until the pool was genuinely full.

Cap the bonus at +1 when any embedder is installed; the log still
reports the installed count for diagnostics.
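The capped bonus amounts to the following. `routerModelsMax` and `chatSlots` are hypothetical; how the router derives its base slot count is not shown in the PR.

```typescript
// chatSlots stands in for whatever base limit the router computes for chat models.
function routerModelsMax(chatSlots: number, installedEmbedders: number): number {
  // At most one embedding model is loaded at a time, so the bonus never exceeds 1.
  const embeddingBonus = installedEmbedders > 0 ? 1 : 0
  return chatSlots + embeddingBonus
}
```

With 2 chat slots and 3 installed embedders, this gives 3 rather than the previous 5.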
@qnixsynapse qnixsynapse merged commit 582a012 into main May 27, 2026
18 checks passed
@qnixsynapse qnixsynapse deleted the fix/macos-backend-install-ux branch May 27, 2026 04:38
@github-project-automation github-project-automation Bot moved this to QA in Jan May 27, 2026