-
-
Notifications
You must be signed in to change notification settings - Fork 8.7k
.pr_agent_accepted_suggestions
| PR 17690 (2026-06-19) |
[maintainability] Import order not formatted
Import order not formatted
Some changed BiDi classes place `import org.openqa.selenium.Beta;` before `java.*` imports (e.g., `HasBiDi`), which is inconsistent with the repository’s google-java-format output. The CI “Format” job runs `./go format` (google-java-format) and fails if it produces diffs, so this will block CI until imports are reformatted.Some modified Java files have import blocks that are not in the canonical order produced by the repo’s formatter (google-java-format). This causes CI’s format check to modify files and then fail the job because the working tree is dirty.
CI runs ./go format, which applies google-java-format to all java/**.java files, and then fails if git diff is non-empty.
Run ./go format and commit the resulting changes, or manually reorder imports to match google-java-format output (then re-run ./go format to confirm no changes).
- java/src/org/openqa/selenium/bidi/HasBiDi.java[18-23]
- java/src/org/openqa/selenium/bidi/browser/ClientWindowInfo.java[18-23]
- java/src/org/openqa/selenium/bidi/browser/SetDownloadBehaviorParameters.java[18-27]
| PR 17678 (2026-06-12) |
[reliability] `clearAll` deletes completion marker
`clearAll` deletes completion marker
`RedisBackedNewSessionQueue.addToQueue` now unconditionally calls `safelyClearRedisState(reqId)` in a `finally` block, and that helper deletes the Redis completion marker (`completedKey`) and stored result keys via `clearAll`. Removing these guard/result keys can let a later/retried `complete(reqId, ...)` “win” again and/or return `tracked=false`, which can lead to incorrect result overwrites or upstream teardown decisions in concurrent/failover scenarios.addToQueue now always runs Redis cleanup in a finally block via safelyClearRedisState(reqId), and that cleanup calls clearAll(reqId) which deletes completedKey(reqId) and the result:* keys. These keys are required for the “winner-takes-all”/idempotent completion semantics used by complete() and for communicating the terminal result across retries/replicas; deleting them can allow late/duplicate complete() calls to succeed again or return tracked=false, triggering incorrect result overwrites or upstream teardown decisions.
The completion marker (completedKey) is intended to ensure only one terminal completion wins when timeouts race with success and to keep complete() idempotent across retries/replicas via a SET-NX guard. The recent change makes addToQueue exception-safe by always cleaning up Redis state, but the cleanup is now too aggressive because it removes the completion marker and stored result keys, effectively re-opening the request for a second completion in concurrent/failover scenarios.
- java/src/org/openqa/selenium/grid/sessionqueue/redis/RedisBackedNewSessionQueue.java[245-269]
- java/src/org/openqa/selenium/grid/sessionqueue/redis/RedisBackedNewSessionQueue.java[608-636]
[reliability] `addToQueue` not exception-safe
`addToQueue` not exception-safe
`RedisBackedNewSessionQueue.addToQueue` registers per-request state in in-memory registries (`waiters`, `contexts`) and performs multiple Redis side effects, but the cleanup of those registries/keys only happens on the normal path after `awaitResult` returns. If any Redis write or other runtime error occurs before that point, the method can leak per-request registrations and leave partial/stale Redis queue state behind, especially during transient Redis failures/outages.RedisBackedNewSessionQueue.addToQueue performs multi-step side effects (registering request state in contexts/waiters, writing Redis keys, pushing a queue entry) but only cleans up that state on the happy path after awaitResult returns. If a Redis operation (or any runtime exception) occurs before the normal cleanup block, the method can exit without removing the in-memory tracking entries and without best-effort cleanup of partially-created Redis keys/list entries, leaking memory and leaving stale queue state.
This is concurrent/resource-heavy, Redis-backed queue code expected to run in failure-prone environments (timeouts, connection resets, closed connections), so cleanup must be bounded and exception-safe across all terminal paths, including partial failures while enqueuing. The current implementation lacks a try/finally to guarantee removal from waiters/contexts, and does not ensure best-effort Redis cleanup when failures happen early.
- java/src/org/openqa/selenium/grid/sessionqueue/redis/RedisBackedNewSessionQueue.java[227-256]
| PR 17657 (2026-06-08) |
[reliability] Artifact outputs can collide
Artifact outputs can collide
generate_bidi_library hardcodes the shared artifact output filenames (bidi-ast.json / bidi-model.json) independent of the macro instance name. A second invocation of this macro in the same Bazel package would attempt to declare the same output paths and fail Bazel analysis due to output collisions.generate_bidi_library always emits bidi-ast.json and bidi-model.json as output filenames. If the macro is ever instantiated more than once in the same Bazel package, Bazel will report output path collisions.
This PR introduces shared artifacts via js_run_binary (*_ast and *_json targets). These outputs are currently fixed strings instead of being scoped by the macro instance name.
- javascript/selenium-webdriver/private/generate_bidi.bzl[157-191]
- Scope outputs by the macro instance name, e.g.:
-
ast_out = name + "-bidi-ast.json"(orname + "_ast.json") model_out = name + "-bidi-model.json"
-
- Keep the public artifact names stable if needed by adding a thin
filegroup(or alias target) that re-exports the uniquely named file under a stable label, rather than a stable filename.
| PR 17650 (2026-06-06) |
[reliability] Empty driver path crash
Empty driver path crash
delete-browsers-drivers.ps1 now builds $paths only from WebDriver env vars and calls Remove-Item unconditionally. If none of those env vars are set on the Windows runner, $paths is empty and PowerShell parameter binding fails before -ErrorAction applies, breaking the workflow step and leaving preinstalled drivers in place.On Windows, Remove-Item -Path $paths ... is called even when $paths is empty (because it’s filtered from env vars only). PowerShell throws a parameter-binding error for an empty collection before -ErrorAction SilentlyContinue can suppress anything, causing the step to fail.
The workflow does not set these env vars explicitly for Windows; the step relies on the runner to provide them.
- scripts/github-actions/delete-browsers-drivers.ps1[12-20]
- .github/workflows/bazel.yml[202-208]
- Add a guard:
- If
$paths.Count -eq 0, print a message andexit 0(or discover driver locations viaGet-Command chromedriver/msedgedriver/geckodriverand delete those paths).
- If
- Optionally: set these env vars in the workflow step (if known stable locations exist), or add fallback known locations used by GitHub Windows runners.
[maintainability] Misleading script filename
Misleading script filename
The Windows step and script comments now indicate only drivers are deleted, but the script filename remains delete-browsers-drivers.ps1. This mismatch can mislead future changes/reviews into assuming browsers are still deleted on Windows.The Windows deletion behavior was narrowed to drivers-only, but the script name still suggests it deletes browsers too.
This is a maintainability/clarity issue (not functional) that could cause confusion later.
- .github/workflows/bazel.yml[205-208]
- scripts/github-actions/delete-browsers-drivers.ps1[1-20]
- Rename the script to
delete-drivers.ps1(or similar) and update the workflow invocation accordingly; or - Keep the filename but add a short comment in the workflow step noting that the script is drivers-only on Windows.
| PR 17645 (2026-06-05) |
[reliability] Quoted driver path breaks
Quoted driver path breaks
The driver_finder fixture strips quotes from --browser-binary but does not strip quotes from --driver-binary before passing it into the Service. If the driver path is single-quoted (a pattern already handled elsewhere in this repo), DriverFinder will treat it as a literal path including quotes and fail file validation.driver_finder normalizes --browser-binary by stripping surrounding single quotes, but it passes --driver-binary through without stripping. When Service.path contains quotes, DriverFinder validates it with Path(path).is_file() and will fail.
The existing test harness already anticipates quoted executable paths and strips them in other code paths.
- py/test/selenium/webdriver/common/driver_finder_tests.py[46-53]
Normalize the executable path the same way as the browser binary:
executable = request.config.option.executable
exe_path = _resolve_bazel_path(executable).strip("'") if executable else None
service = module.service.Service(executable_path=exe_path)(Optionally, consider a small helper to avoid duplicating quote-stripping logic.)
| PR 17644 (2026-06-05) |
[reliability] `xdpyinfo` loop never fails
`xdpyinfo` loop never fails
The workflow’s X server readiness wait loop can time out without failing, so the step still proceeds to start `fluxbox` (backgrounded) even when `xdpyinfo` never succeeds and no display is actually available. This violates the requirement to validate preconditions and to have deterministic timeout failure paths for async waits, and can preserve the same intermittent/silent failure mode the wait was meant to eliminate.The X server (Xvfb) readiness polling loop does not have a deterministic failure path: if xdpyinfo never succeeds within the timeout, the workflow still starts fluxbox in the background and continues, allowing silent/intermittent failures when no display is actually ready.
This step is intended to eliminate a race where fluxbox starts before Xvfb is ready. CI/workflow scripts must validate preconditions and fail safe on errors, and async waits must have explicit timeouts with actionable failure when the timeout is exceeded; without a post-loop assertion, the race is only reduced, not eliminated.
- .github/workflows/bazel.yml[171-182]
| PR 17627 (2026-06-04) |
[correctness] Null transport factory result
Null transport factory result
BiDiOptionsBuilder.UseTransport(Func) does not validate that the factory returns a non-null ITransport, so a null return will flow into BiDi.ConnectAsync and crash later when Broker calls SendAsync/ReceiveAsync on the transport.UseTransport(Func<ITransport>) validates the factory delegate itself, but not the object it returns. If factory() returns null, BiDi.ConnectAsync will pass it to Broker, leading to a NullReferenceException at runtime.
BiDi.ConnectAsync awaits TransportFactory and immediately constructs Broker(transport, bidi).
- dotnet/src/webdriver/BiDi/BiDiOptionsBuilder.cs[56-63]
Update the wrapper delegate to validate the result:
- Call
var transport = factory(); ArgumentNullException.ThrowIfNull(transport);- Return
Task.FromResult(transport)
Optionally, wrap exceptions from factory() into a faulted Task<ITransport> for consistency with async factory patterns.
| PR 17626 (2026-06-04) |
[correctness] Java rerun blocks other releases
Java rerun blocks other releases
The publish matrix exits with an error on any rerun for the Java entry before it checks whether Java is actually being released, so rerunning a ruby/dotnet/javascript-only release will still fail due to Java. This prevents reruns for non-Java patch releases (and any rerun attempt >1) even when Java would otherwise be skipped.The publish matrix currently fails on rerun attempts (github.run_attempt != '1') for the Java matrix entry even when the workflow is releasing a different language (e.g., selenium-4.28.1-ruby). This happens because the Java rerun guard runs before the “is this language selected?” check.
-
publishruns a matrix over[java, ruby, dotnet, javascript]. - For patch releases,
parse-tagsetsoutputs.languageto the tag suffix (e.g.,ruby). - On reruns, the Java matrix entry should skip when
outputs.language != 'java', but it currently fails early.
Move or gate the Java rerun guard so it only triggers when Java is actually being released:
Option A (recommended): nest the Java guard inside the selected-language branch:
if [ "${{ needs.parse-tag.outputs.language == 'all' || needs.parse-tag.outputs.language == matrix.language }}" = "true" ]; then
if [ "${{ matrix.language == 'java' && github.run_attempt != '1' }}" = "true" ]; then
echo "::error::Java release is not yet rerun-safe — ..."
exit 1
fi
./go ${{ matrix.language }}:release
else
echo skipping
fiOption B: add the selected-language predicate directly into the Java guard condition.
- .github/workflows/release.yml[144-153]
| PR 17625 (2026-06-04) |
[reliability] `UseTransport` ignores cancellationToken
`UseTransport` ignores cancellationToken
`BiDiOptionsBuilder.UseTransport(ITransport transport)` sets `TransportFactory` to return `Task.FromResult(transport)` and discards the provided `CancellationToken`, so `BiDi.ConnectAsync(..., cancellationToken)` cannot be cancelled (and may even succeed when the token is already cancelled) when this injected-transport path is used. This breaks expected .NET cancellation semantics, is inconsistent with the default WebSocket transport path that honors cancellation, and can lead to non-deterministic hangs or delayed cancellation during connection setup.BiDiOptionsBuilder.UseTransport(ITransport transport) creates a TransportFactory that ignores the CancellationToken parameter (returns a completed task via Task.FromResult(transport)), so BiDi.ConnectAsync(..., cancellationToken) cannot cancel transport acquisition and may even proceed successfully when the token is already cancelled.
BiDi.ConnectAsync forwards the caller’s cancellationToken into builder.TransportFactory(...). The default WebSocket transport factory honors cancellation during connection (e.g., via ClientWebSocket.ConnectAsync), so callers expect consistent .NET cancellation semantics across transport configuration paths; however, the injected transport-instance path currently discards the token.
- dotnet/src/webdriver/BiDi/BiDiOptionsBuilder.cs[53-56]
Update the instance-based factory to observe cancellation before returning the provided transport instance, e.g.:
- implement
UseTransport((_, ct) => ct.IsCancellationRequested ? Task.FromCanceled<ITransport>(ct) : Task.FromResult(transport)), or equivalently callct.ThrowIfCancellationRequested()before returning the transport.
[correctness] No transport null-check
No transport null-check
BiDiOptionsBuilder.UseTransport captures the provided transport without validating it, so passing null makes TransportFactory return null and later causes a NullReferenceException when Broker uses the transport. This turns a simple caller mistake into a late, non-actionable runtime crash during connection setup.BiDiOptionsBuilder.UseTransport(ITransport transport) does not validate transport and will happily set TransportFactory to return null when the caller passes null, leading to a runtime NullReferenceException downstream.
BiDi.ConnectAsync calls builder.TransportFactory(...) and passes the resulting transport into Broker, which unconditionally calls SendAsync, ReceiveAsync, and DisposeAsync on that instance.
- Add
ArgumentNullException.ThrowIfNull(transport);at the start ofUseTransport.
- dotnet/src/webdriver/BiDi/BiDiOptionsBuilder.cs[49-58]
| PR 17620 (2026-06-03) |
[reliability] `delete-browsers-drivers.sh` ignores errors
`delete-browsers-drivers.sh` ignores errors
The browser/driver deletion script does not validate inputs or command results, and it can pass empty path arguments to `rm`, producing noisy/undefined behavior while still continuing. This violates the requirement to fail safely and handle external command failures explicitly in CI automation.scripts/github-actions/delete-browsers-drivers.sh does not use strict mode and runs sudo rm -rf with possibly-empty env-derived paths, without validating that deletions succeeded.
In CI, silent cleanup failures can cause later steps to use unintended system browsers/drivers or run out of disk space without a clear signal.
- scripts/github-actions/delete-browsers-drivers.sh[7-21]
[security] rm option injection risk
rm option injection risk
`delete-browsers-drivers.sh` passes env-var-derived paths directly to `sudo rm -rf` without a `--` terminator or safety checks, so values beginning with `-` can be parsed as options and lead to unintended deletions. This is a defensive-scripting regression versus the prior cleanup pattern and can break CI runners/jobs in hard-to-debug ways.sudo rm -rf is invoked with arguments sourced from environment variables (CHROMEWEBDRIVER, EDGEWEBDRIVER, GECKOWEBDRIVER) without a -- end-of-options marker or validation. If any value begins with -, rm can treat it as an option; combined with another argument like /, this can delete unintended paths.
This script is intended to delete specific preinstalled browser/driver binaries on CI runners, so it should be robust to unexpected env var contents and avoid option injection.
- Build a list/array of paths to delete.
- Filter out empty values.
- Reject dangerous values (
/,.,..) and any value starting with-(or force them to be treated as operands via--). - Invoke
rmas:sudo rm -rf -- "${paths[@]}".
- scripts/github-actions/delete-browsers-drivers.sh[11-21]
| PR 17617 (2026-06-03) |
[maintainability] Format guidance mismatches CI
Format guidance mismatches CI
AGENTS.md now claims running `./scripts/format.sh` before pushing avoids CI formatter failures, but the CI format job still runs `./go format` and fails on any diff produced by that task. Since `./go format` runs the full Rake formatting suite unconditionally while `scripts/format.sh` is change-scoped, contributors may follow the new guidance yet still fail CI or receive conflicting instructions.AGENTS.md was updated to recommend ./scripts/format.sh / --pre-push to prevent CI formatter failures, but CI currently runs ./go format and reports failures based on that command. This creates inconsistent guidance and a real risk that contributors follow the documented workflow yet still hit CI failures.
-
./go formatexecutes the Rake:formattask, which formats across all languages unconditionally. -
./scripts/format.shis change-scoped (runs some language formatters only when matching paths changed). - CI currently enforces formatting by running
./go format(and tells contributors to run it).
Pick one clear source-of-truth and make both CI + docs consistent:
- Update CI workflows to run
./scripts/format.sh(optionally--pre-push) and update error messages to reference it, OR - Update
AGENTS.mdto explicitly state CI runs./go formatand recommend./go formatas the fallback/source-of-truth if CI fails.
References:
- AGENTS.md[69-78]
- .github/workflows/ci-lint.yml[47-74]
- Rakefile[128-144]
- scripts/format.sh[1-8], scripts/format.sh[35-137]
| PR 17615 (2026-06-02) |
[reliability] Retry catches too narrowly
Retry catches too narrowly
Driver._start_local_driver retries only on WebDriverException, so any other exception type raised during session creation (e.g., from the HTTP layer) will bypass the retry and fail immediately. This undermines the intended stabilization for intermittent startup/connectivity failures.Driver._start_local_driver only catches WebDriverException, but driver startup can raise other exception types during the initial NEW_SESSION request path (the HTTP request call site is not wrapped here, and RemoteWebDriver.start_session re-raises the original exception type). As a result, retries may not trigger for common startup-connect failures.
RemoteConnection._request directly calls urllib3 without catching exceptions, and RemoteWebDriver.start_session catches generic Exception only to stop the service and then re-raises the same exception.
- py/conftest.py[467-484]
- Expand the retry handler to catch broader exceptions for local startup (e.g.,
Exceptionor a targeted tuple that includesWebDriverExceptionplus networking errors), while still re-raising immediately on the final attempt. - Keep the warning log including exception type/message to preserve debuggability.
| PR 17614 (2026-06-02) |
[reliability] Wrong `AdditionalMessageData` JSON assertion
Wrong `AdditionalMessageData` JSON assertion
`CommandAdditionalMessageDataIsSerializedAsTopLevelFields` asserts a substring that cannot appear in the produced command JSON because it assumes `AdditionalMessageData` fields are the first properties and followed by a trailing comma, while the broker writes `id`, `method`, and `params` first and appends `AdditionalMessageData` after `params` (often as the final property). This makes the test fail (or become ordering-dependent) even when the implementation is correct, blocking CI and undermining the required coverage for the new behavior.CommandAdditionalMessageDataIsSerializedAsTopLevelFields currently asserts the sent JSON contains the substring {"baz":"qux",, which assumes property ordering (that baz is first) and a trailing comma. In reality, the message writer serializes id, method, and params first, then appends CommandOptions.AdditionalMessageData fields after params, so baz will typically not be at the start and may be the last property (no trailing comma), making the test fail even when serialization is correct.
This is a new/updated unit test intended to validate that CommandOptions.AdditionalMessageData is serialized into the top-level command envelope. To satisfy the required test coverage (PR Compliance ID 18) without brittleness, the assertion should validate the semantic behavior (a top-level property baz with value qux) in an order/format-independent way, e.g., by parsing the outgoing JSON or by using a substring match that doesn’t depend on ordering or commas.
- dotnet/test/webdriver/BiDi/SessionUnitTests.cs[158-170]
[correctness] Non-object JSON silently dropped
Non-object JSON silently dropped
`AdditionalData(string json)` accepts any JSON root (including arrays/null/primitives), but the rest of the type treats non-object roots as “empty”, so the provided data is silently ignored and won’t be merged into outgoing commands. This can hide user mistakes and make the new extensibility feature unreliable to debug.AdditionalData(string json) currently accepts any valid JSON, but AdditionalData is used as a property-bag (object) for extension properties. When callers provide a valid non-object JSON (e.g., "null", "[]", "123"), IsEmpty becomes true and the data is silently dropped (never merged/sent).
The API surface suggests the string overload is a convenience for providing additional properties. Silent dropping is surprising and makes debugging extensions difficult.
In AdditionalData(string json), validate doc.RootElement.ValueKind == JsonValueKind.Object.
- If not an object, throw
ArgumentException(orJsonException) with a clear message (e.g., "AdditionalData JSON must be an object").
- dotnet/src/webdriver/BiDi/AdditionalData.cs[39-43]
[correctness] TryGetValue throws on empty
TryGetValue throws on empty
`AdditionalData.TryGetValue` calls `JsonElement.TryGetProperty` without guarding for non-object/empty state, so calling it on `AdditionalData.Empty` (the default) can throw instead of returning `false`.AdditionalData.TryGetValue should follow the .NET Try-pattern and never throw for an empty value. Currently, AdditionalData.Empty is default (non-object JsonElement), but TryGetValue calls _data.TryGetProperty(...) unconditionally.
IsEmpty is defined as _data.ValueKind != JsonValueKind.Object, meaning Empty/default instances are explicitly non-object, yet TryGetValue does not check IsEmpty (or ValueKind) before accessing object-only APIs.
- dotnet/src/webdriver/BiDi/AdditionalData.cs[62-66]
Implement:
if (IsEmpty) { value = default; return false; }- otherwise call
_data.TryGetProperty(key, out value)
Optionally also guard the indexer similarly (or leave it throwing by design, but ensure TryGetValue is safe).
[correctness] Options AdditionalData dropped
Options AdditionalData dropped
CommandOptions.AdditionalData is never merged into the serialized "params" object, so callers setting options.AdditionalData will silently lose those fields in outgoing commands. Since command parameter objects are internal and constructed without copying options.AdditionalData, public APIs effectively cannot send params-level extension fields.CommandOptions.AdditionalData is defined as part of the public options surface, but it is never applied to the serialized command params. As a result, options-based additional params data is silently dropped, and callers cannot practically use params-level extension data because the concrete *Parameters types are internal and created inside modules.
Broker.ExecuteAsync serializes only @params as "params", and only uses options.AdditionalMessageData for envelope-level additions.
- dotnet/src/webdriver/BiDi/Broker.cs[68-109]
- dotnet/src/webdriver/BiDi/Command.cs[50-59]
- In
Broker.ExecuteAsync(centralized fix): ifoptions?.AdditionalDatais non-empty, merge it into@params.RawAdditionalData(or set@params.AdditionalDatavia the backingRawAdditionalData) before callingJsonSerializer.Serialize. - Define merge behavior for key collisions (preferably throw a clear exception if the same key already exists in
@params.RawAdditionalData). - Add/adjust tests to assert that
options.AdditionalDataappears inside the serializedparamsobject.
[reliability] Reserved keys overridable
Reserved keys overridable
Broker.ExecuteAsync writes options.AdditionalMessageData properties at the message root without filtering reserved keys, allowing duplicates of "id", "method", or "params" in the emitted JSON. If the remote end honors the last duplicate name, command correlation can break (timeouts or completing the wrong command).AdditionalMessageData is appended to the outgoing command envelope as arbitrary root-level JSON properties. Without validation, callers can provide keys like id, method, or params, producing duplicate JSON property names and potentially breaking protocol semantics and command correlation.
Outgoing envelope already writes id, method, and params before iterating AdditionalMessageData.
- dotnet/src/webdriver/BiDi/Broker.cs[95-109]
- Add a reserved-name check before writing each
AdditionalMessageDataentry.- At minimum reject:
id,method,params. - Consider also rejecting other envelope keys that appear in incoming messages:
type,result,error,message.
- At minimum reject:
- On collision, throw an
ArgumentException(or similar) with a clear message so failures are deterministic and debuggable. - Add unit/serialization tests ensuring reserved keys are rejected and non-reserved keys serialize correctly.
[maintainability] Whitespace-only line in `AdditionalData`
Whitespace-only line in `AdditionalData`
`AdditionalData(string json)` contains a whitespace-only blank line, which can violate repository formatting standards and trigger formatter/CI churn. This should be normalized (empty line with no trailing spaces).A whitespace-only blank line was introduced, which may violate repo formatting standards and be modified by the formatter.
The new line in AdditionalData(string json) appears to contain indentation/trailing whitespace on an otherwise blank line.
- dotnet/src/webdriver/BiDi/AdditionalData.cs[47-47]
[reliability] `CanGetStatusWithAdditionalData` lacks assertions
`CanGetStatusWithAdditionalData` lacks assertions
The new `CanGetStatusWithAdditionalData` test exercises the API but only asserts that `status` is non-null, so it does not actually validate that `AdditionalData`/`AdditionalMessageData` are serialized/propagated and handled as intended. This leaves the new behavior effectively untested and risks regressions where additional data is silently dropped while the test still passes.CanGetStatusWithAdditionalData() currently acts as a smoke test (it only asserts status is non-null) and does not verify the PR’s intended behavior that AdditionalData/AdditionalMessageData are actually serialized onto the wire and/or propagated, so the feature it claims to test can regress without failing.
PR Compliance ID 19 expects meaningful coverage for changed behavior; this PR adds support for serializing/propagating extra JSON properties, but the test would still pass even if the additional data is ignored/dropped. Since StatusAsync talks to a real remote end, the response may not echo extension fields, so validating serialization typically requires inspecting the outgoing message (e.g., via a stub/mock transport or a broker-level unit test), or alternatively renaming the test to reflect that it only verifies the call succeeds without error.
- dotnet/test/webdriver/BiDi/Session/SessionTests.cs[46-59]
[correctness] ExecuteAsync mutates passed parameters
ExecuteAsync mutates passed parameters
`Broker.ExecuteAsync` merges `options.AdditionalData` by mutating the caller-provided `@params.RawAdditionalData` before serialization, introducing observable side effects and breaking immutability expectations for command inputs. Because `Parameters.RawAdditionalData` is publicly readable and exposes a mutable `Dictionary`, consumers can also mutate parameter state after construction/passing to execution, leading to races and non-deterministic serialization if instances are reused concurrently.Broker.ExecuteAsync currently applies options.AdditionalData by mutating the caller-provided @params.RawAdditionalData prior to serialization, creating observable side effects and potential races if parameters are reused or accessed concurrently. Separately, Parameters.RawAdditionalData is publicly readable and exposes a mutable Dictionary<string, JsonElement>, allowing callers to mutate command inputs after construction, undermining the compliance requirement for immutable command option/input surfaces.
PR Compliance ID 8 requires command option/input types to be immutable from the public API surface to prevent post-creation mutation from affecting execution/serialization. Even if CommandOptions is init-only, mutating the provided parameters during send makes behavior non-deterministic when parameter instances are cached, reused, or concurrently accessed; exposing a mutable dictionary via a public getter similarly enables external mutation after the command parameters have been created or handed off.
- dotnet/src/webdriver/BiDi/Broker.cs[100-107]
- dotnet/src/webdriver/BiDi/Command.cs[38-40]
| PR 17613 (2026-06-02) |
[reliability] Skipped tests now run
Skipped tests now run
scripts/github-actions/ci-build.sh now runs `bazel test --config=rbe-ci //...` without the prior explicit deselection mechanism, so targets not tagged `skip-rbe` will start executing on RBE CI. Many Selenium browser tests/macros add `requires-network` but do not add `skip-rbe`, while the repo documents RBE network limitations, so this change can reintroduce CI flakes/failures.ci-build.sh now runs all tests (//...) under --config=rbe-ci without excluding the targets that were previously being deselected for RBE CI. Because test:rbe only filters -skip-rbe, any RBE-incompatible targets that are not tagged skip-rbe will now execute on RBE.
-
test:rbeuses--test_tag_filters=-skip-rbeand does not excluderequires-network. - Selenium browser test rules commonly include
requires-network(viaCOMMON_TAGS) and are frequently generated by macros (Javaselenium_test, Rubyrb_integration_test), so the safest replacement for the removed deselection list is to tag the formerly-deselected targets (or split suites) withskip-rbe.
- scripts/github-actions/ci-build.sh[15-20]
- .bazelrc.remote[35-37]
- common/browsers.bzl[1-8]
- java/private/selenium_test.bzl[22-56]
- java/test/org/openqa/selenium/chrome/BUILD.bazel[26-47]
- rb/spec/tests.bzl[181-257]
- py/BUILD.bazel[64-76]
-
Preferred: Add
skip-rbetags to the specific targets that were previously deselected (may require splitting macros/suites so only those targets get the tag). - Temporarily restore an explicit exclusion mechanism in CI until tagging is completed.
- If intentional, update RBE filtering logic to exclude the appropriate tag(s) (e.g.,
requires-network)—but only if that matches desired CI coverage.
| PR 17599 (2026-05-30) |
[security] Unmasked WebSocket URI
Unmasked WebSocket URI
JdkHttpClient.openSocket() puts the raw URI into ConnectionFailedException messages without applying the existing credential-masking helper. If the configured base URI contains userinfo (or other sensitive components), those values can be exposed via exception text/logging.openSocket() now embeds the raw URI in several ConnectionFailedException messages, but does not apply the existing maskUrlCredentials(...) logic used elsewhere in this class. This can leak embedded credentials when URIs contain userInfo.
-
maskUrlCredentialsis already implemented and used in other error paths. -
messages.getRawUri(req)returnsURI.create(rawUrl)with userInfo intact.
-
Wrap URIs in
maskUrlCredentials(...)before interpolating into exception messages. -
Avoid calling
messages.getRawUri(request)inside theURISyntaxExceptioncatch in a way that could throw a new exception; use a safe fallback (e.g.,request.getUri()), then mask. -
java/src/org/openqa/selenium/remote/http/jdk/JdkHttpClient.java[157-262]
-
java/src/org/openqa/selenium/remote/http/jdk/JdkHttpClient.java[516-546]
-
java/src/org/openqa/selenium/remote/http/jdk/JdkHttpMessages.java[123-147]
| PR 17597 (2026-05-30) |
[reliability] os-sensitive tag filtered
os-sensitive tag filtered
The new driver_finder_spec target sets `tags = ["os-sensitive"]`, but rb_integration_test filters that tag to only apply to specific browsers, so chrome/firefox/ie variants of this test will not actually be tagged os-sensitive. This breaks tag-based selection/exclusion consistency (some browser variants will be included/excluded while others won’t).driver_finder_spec is configured with tags = ["os-sensitive"], but the rb_integration_test macro treats os-sensitive as a browser-filtered tag. As a result, the driver_finder_spec-{chrome,firefox,ie} targets won’t actually receive the os-sensitive tag, making tag filtering inconsistent across browser variants.
-
rb/spec/integration/selenium/webdriver/BUILD.bazeladds a dedicatedrb_integration_teststanza for_NO_GRIDwithtags = ["os-sensitive"]. -
rb/spec/tests.bzlfiltersos-sensitiveto only apply tochrome-beta,edge,firefox-beta, andsafari.
Choose one of these approaches:
- If
driver_finder_specshould be os-sensitive for all browsers, use a universal tag (one that is not filtered) or adjust tag filtering to include all browsers this target runs on. - If it is only os-sensitive for some browsers, set tags in a way that matches the intended browsers (e.g., update
_BROWSER_TAG_FILTERS["os-sensitive"]or use a new tag name).
- rb/spec/integration/selenium/webdriver/BUILD.bazel[65-76]
- rb/spec/tests.bzl[169-179]
[maintainability] Edge cache test excluded
Edge cache test excluded
The "downloads the browser into the Selenium cache" example is guarded with `except: {browser: %i[safari ie edge]...}`, so it will never run for Edge even though this spec is executed in an Edge browser matrix. This removes coverage for validating Edge browser downloads into `SE_CACHE_PATH` under `SE_FORCE_BROWSER_DOWNLOAD`.driver_finder_spec.rb excludes browser: :edge from the "downloads the browser into the Selenium cache" example. Since the spec target is executed for Edge via the Bazel browser matrix, this prevents validating browser-download caching behavior for Edge at all.
The guard system supports combining multiple conditions (e.g., browser + platform), and the suite already provides :platform and :browser conditions.
- rb/spec/integration/selenium/webdriver/driver_finder_spec.rb[54-56]
Decide the intended behavior:
- If Edge should only be excluded on specific platforms (e.g., Windows where Edge may already be present), make the exclusion conditional, e.g. add
platform: :windows(or the relevant platform(s)). - Otherwise, remove
:edgefrom the exclusion list and/or update the guard reason to reflect the actual rationale for excluding Edge from browser-download caching verification.
[maintainability] Unconditional debug logging
Unconditional debug logging
`SpecSupport::Helpers#includes_path?` unconditionally calls `warn`, emitting debug output (including full paths) to STDERR on every invocation. This will spam CI logs during the browser-matrix integration tests and can obscure real warnings or contribute to log-size related flakiness.includes_path? prints a warn "DEBUG ..." line unconditionally, which pollutes integration test output.
This helper is used by the new DriverFinder integration spec and will be called repeatedly across the browser matrix, so the debug output will be amplified in CI.
- rb/spec/integration/selenium/webdriver/spec_support/helpers.rb[149-151]
Remove the warn line entirely, or guard it behind an opt-in flag (e.g., if ENV['SE_DEBUG_INCLUDES_PATH'] == 'true') so normal CI runs are silent.
[correctness] Cache path check fails
Cache path check fails
`SpecSupport::Helpers#includes_path?` converts paths using `Platform.unix_path` and then checks for a `"#{root}/"` prefix, which will be false for Windows-style paths after `unix_path` turns `/` into `\`. This will cause the new DriverFinder cache assertions to fail on Windows even when Selenium Manager correctly downloads into the cache.includes_path? currently mixes Windows normalization (via Platform.unix_path, which normalizes to OS separators) with a hard-coded '/' prefix check. On Windows this makes the check always return false.
-
Platform.unix_pathnormalizes by convertingFile::ALT_SEPARATORtoFile::SEPARATOR. On Windows, that converts/to\\. -
includes_path?then assumes/separators when doingstart_with?("#{root}/").
- rb/spec/integration/selenium/webdriver/spec_support/helpers.rb[146-150]
Update includes_path? to normalize both path and root to a consistent separator before comparing, e.g.:
- Convert backslashes to forward slashes using
tr('\\', '/')(works cross-platform). -
File.expand_pathboth values to avoid relative-path surprises. - On Windows, consider case-insensitive comparison for drive letters (
downcase) beforestart_with?.
Example implementation:
def includes_path?(path, root)
path = File.expand_path(path).tr('\\', '/')
root = File.expand_path(root).tr('\\', '/').chomp('/')
if WebDriver::Platform.windows?
path = path.downcase
root = root.downcase
end
path.start_with?("#{root}/")
end| PR 17594 (2026-05-30) |
[correctness] pkg_archive deletes extracted tree
pkg_archive deletes extracted tree
`_pkg_archive_impl` unconditionally deletes `pkg_name` after the (optional) `move` loop, so when `move` is empty the rule removes the only extracted payload and leaves the repository with no usable contents. This is a functional regression because `move` is not a mandatory attribute.pkg_archive always deletes the expanded package directory (pkg_name) at the end. Since move is optional, a caller can legally omit it, in which case nothing is moved out and the extracted tree is deleted, leaving an empty external repository.
The rule’s attrs define move as a non-mandatory attr.string_dict(), so an empty dict is a valid configuration.
- common/private/pkg_archive.bzl[50-59]
- common/private/pkg_archive.bzl[60-71]
- Always delete the downloaded archive (
download_name). - Only delete the expanded directory (
pkg_name) ifmoveis non-empty (i.e., you actually moved desired artifacts out), or makemovemandatory / explicitlyfail()whenmoveis empty so the rule can’t silently delete everything.
| PR 17576 (2026-05-27) |
[correctness] `check_error_with_driver_in_path` now errors
`check_error_with_driver_in_path` now errors
`check_error_with_driver_in_path` now suppresses errors only when `*is_driver_in_path` is true and `self.get_browser_path().is_empty()`, whereas it previously suppressed errors whenever `*is_driver_in_path` was true. Because `browser_path` can be populated during auto-detection, this changes user-visible behavior for driver-in-PATH fallback and can cause previously-successful setups to abort, breaking upgrades for consumers relying on the prior non-failing behavior.check_error_with_driver_in_path() has changed fallback semantics: it now propagates errors when *is_driver_in_path is true but get_browser_path() is non-empty, whereas previously driver-in-PATH mode would swallow these errors and continue. Because browser_path is frequently populated by auto-detection (not just explicit user input like --browser-path), this additional predicate can cause setup flows that intend to fall back to a PATH driver to abort during later failures (e.g., driver version discovery), creating a breaking behavior change for upgrades.
-
setup()computesuse_driver_in_pathbased on detecting a driver in PATH and the original browser version being empty, then routes errors from discovery/download/version steps throughcheck_error_with_driver_in_path(). - Browser discovery/version logic calls
detect_browser_path()whenbrowser_pathis empty;detect_browser_path()updates configuration viaset_browser_path(), soget_browser_path()reflects derived/auto-detected state, not solely user-provided state. - The PR’s main goal is switching command execution to argv, but this extra gating changes error/fallback semantics relied on by
use_driver_in_pathflows and may violate the “no breaking changes on upgrade” requirement.
- rust/src/lib.rs[804-854]
- rust/src/lib.rs[950-964]
[observability] Trace logs empty browser path
Trace logs empty browser path
The trace log at line 1177 fires before `detect_browser_path()` is called (lines 1178-1181), so when `get_browser_path()` returns an empty string the trace message shows an empty path, giving a misleading diagnostic when `--trace` is used to debug browser detection failures.The trace log *** Browser path {} is emitted before detect_browser_path() is called, so it always shows the raw (often empty) value from get_browser_path() rather than the resolved path that will actually be used.
In general_discover_browser_version, browser_path starts as self.get_browser_path().to_string() (empty when no --browser-path flag is given). The trace is logged immediately, then detect_browser_path() may populate it. The trace should reflect the final resolved path.
- rust/src/lib.rs[1175-1181]: Move the
.trace(...)call to after theif browser_path.is_empty()block (i.e., after line 1181), so it logs the path that will actually be used for version probing.
[correctness] Broken version argv flag
Broken version argv flag
On non-Windows, `general_discover_browser_version` passes `cmd_version_arg` as a literal argv element, but Chrome/Edge/Firefox callers pass the format templates `DASH_DASH_VERSION`/`DASH_VERSION` (e.g. "{}{}{} --version"), so the browser is invoked with an invalid argument and version discovery will fail.general_discover_browser_version now builds a direct std::process::Command with args = [cmd_version_arg]. However, current callers pass DASH_VERSION / DASH_DASH_VERSION, which are format templates (contain {} placeholders) and were previously only safe because the old code formatted them into a full command string before going through sh -c.
This makes non-Windows browser version discovery invoke the browser with an invalid literal argument like "{}{}{} --version".
-
DASH_VERSION/DASH_DASH_VERSIONare currently defined as format templates. - Chrome/Edge/Firefox pass these constants into
general_discover_browser_version. - The non-Windows path now directly uses
cmd_version_arg.to_string()as an argv value.
- rust/src/lib.rs[87-88]
- rust/src/lib.rs[1169-1211]
- rust/src/chrome.rs[281-287]
- rust/src/edge.rs[165-189]
- rust/src/firefox.rs[194-200]
Option A (minimal): Change the constants to be actual flags:
DASH_VERSION = "-v"DASH_DASH_VERSION = "--version"
Option B (API cleanup): Change general_discover_browser_version to accept Vec<String> (or &[&str]) for version args, and update each manager to pass vec!["--version".into()] (or vec!["-v".into()]).
| PR 17575 (2026-05-27) |
[reliability] Cache triggers miss lockfiles
Cache triggers miss lockfiles
The gh-cache workflow’s push path filter omits pnpm-lock.yaml and multitool.lock.json, so updates to these Bazel dependency inputs won’t trigger cache population for macOS/Windows. Because MODULE.bazel uses these files to generate external repositories, CI may still need to download new artifacts (and hit the same transient 502/cert failures) until the next scheduled run..github/workflows/gh-cache.yml uses a push.paths filter to decide when to repopulate the GitHub cache, but it does not include key dependency inputs (pnpm-lock.yaml, multitool.lock.json) that are used by Bazel to generate external repositories. This means macOS/Windows cache population will not run when those files change, leaving caches stale.
MODULE.bazel references both //:multitool.lock.json (rules_multitool hub) and //:pnpm-lock.yaml (npm_translate_lock). These files directly affect what external assets Bazel will download.
- .github/workflows/gh-cache.yml[4-15]
Add the missing files to the on.push.paths list (at minimum pnpm-lock.yaml and multitool.lock.json). Consider also including other npm-related inputs referenced by npm_translate_lock (e.g. package.json, pnpm-workspace.yaml, .npmrc, and relevant javascript/**/package.json) if you want cache population to run immediately when those change.
| PR 17571 (2026-05-25) |
[maintainability] RBS missing browser_name=
RBS missing browser_name=
`Safari::Options` now defines `browser_name=` at runtime, but `rb/sig/.../safari/options.rbs` only declares `browser_name` and not the setter, so typed consumers cannot call `options.browser_name = ...` without type errors.Selenium::WebDriver::Safari::Options defines a browser_name= setter in Ruby, but the RBS signature file does not declare this method. This creates a mismatch between runtime API and the typed interface.
The PR introduced/updated Safari::Options#browser_name= in Ruby, and updated the RBS to include browser_name, but the setter signature was not added.
- rb/lib/selenium/webdriver/safari/options.rb[39-45]
- rb/sig/lib/selenium/webdriver/safari/options.rbs[13-16]
Add a setter signature in rb/sig/lib/selenium/webdriver/safari/options.rbs, e.g.:
-
def browser_name=: (String value) -> void(or returnStringif you want to model Ruby’s assignment return value).
[reliability] Test clobbers `SE_CHROMEDRIVER`
Test clobbers `SE_CHROMEDRIVER`
The new DriverFinder env-precedence unit test overwrites `ENV['SE_CHROMEDRIVER']` and then unconditionally deletes it in `ensure`, which can clobber a pre-existing value from a developer/CI environment and leak side effects into later examples. This makes the suite potentially order-dependent and flaky across environments, undermining the expectation that tests remain isolated and reliable.The DriverFinder env-precedence spec sets ENV['SE_CHROMEDRIVER'] and then unconditionally calls ENV.delete('SE_CHROMEDRIVER') in ensure, which can wipe a pre-existing value from the parent environment and leak state across examples.
This test is intended to validate env-var precedence, but to comply with the expectation that test changes are reliable across environments (PR Compliance ID 13) and to avoid order-dependent failures, it should not permanently mutate global process environment. Other unit specs in this repo follow a safer pattern by preserving/restoring prior env values rather than always deleting them.
- rb/spec/unit/selenium/webdriver/common/driver_finder_spec.rb[36-48]
[correctness] `Safari.path` now nil
`Safari.path` now nil
`Selenium::WebDriver::Safari.path` now defaults to `nil`, removing the previous behavior of providing/validating a default Safari binary path and raising actionable errors. This is a user-visible behavior change that can break callers relying on a non-nil default or on the earlier deterministic exceptions.Safari.path now memoizes nil, changing the public method’s behavior so it no longer provides a default Safari path nor performs early validation/OS checks. This is a backward-incompatible, user-visible behavior change.
Safari.path is a public singleton method. With the new implementation, any caller expecting a default path (or early, actionable errors) will instead receive nil and may fail later with less clear errors.
- rb/lib/selenium/webdriver/safari.rb[48-50]
- Reintroduce the previous default path and validation/error behavior (e.g., default to the Safari binary on macOS and raise a clear
WebDriverErrorwhen unsupported/unavailable). - If the new
nildefault is intentional, add a deprecation path and update callers/documentation accordingly so upgrades do not silently change behavior.
[correctness] `Service#env_path` removed
`Service#env_path` removed
`Selenium::WebDriver::Service#env_path` was removed without a deprecation path, which is a breaking change for callers and leaves the RBI/RBS contract inconsistent with the runtime implementation. This can trigger runtime `NoMethodError` for existing consumers and also break RBS/Steep validation during upgrades.Selenium::WebDriver::Service#env_path was removed from the Ruby implementation without a deprecation path, but the corresponding RBS signature still declares it, creating an API break for callers and leaving the type/signature contract out of sync with runtime behavior.
- The Ruby signature file still exposes
env_pathas a public method (def env_path). - The
Serviceimplementation no longer definesenv_path, so existing callers will hitNoMethodErrorat runtime. - The mismatch will also fail RBS/Steep (or other signature validation) because the method is declared but not implemented.
- Compliance requires maintaining API compatibility and deprecating public functionality before removal.
- rb/lib/selenium/webdriver/common/service.rb[89-105]
- rb/sig/lib/selenium/webdriver/common/service.rbs[48-60]
[reliability] No test for `env_path`
No test for `env_path`
The new behavior that prioritizes the driver environment variable in `DriverFinder#paths` is not covered by unit tests. This risks regressions and CI behavior differences across environments.DriverFinder now checks an environment variable (DRIVER_PATH_ENV_KEY) when resolving the driver path, but there is no unit test asserting this precedence and behavior.
There are existing unit tests for DriverFinder behavior (class path, proc path, Selenium Manager calls), so adding a small/unit test for ENV precedence is feasible and keeps tests aligned with the changed behavior.
- rb/lib/selenium/webdriver/common/driver_finder.rb[50-68]
- rb/spec/unit/selenium/webdriver/common/driver_finder_spec.rb[24-99]
[correctness] Env var Service init regression
Env var Service init regression
Service#initialize no longer applies the DRIVER_PATH_ENV_KEY environment variable, but existing unit specs expect Service.new.executable_path to reflect that env var immediately. This will fail multiple unit tests and is a behavioral break for any caller reading executable_path before DriverFinder/launch runs.Service#initialize no longer reads the driver-path environment variable (e.g., SE_CHROMEDRIVER) to populate @executable_path. The repo’s unit specs currently assert that Service.new.executable_path is derived from the env var, so the change will cause test failures and breaks the previously-tested behavior.
The env-var lookup was removed from Service#initialize and moved into DriverFinder. That makes env vars apply later (when DriverFinder runs), but it also means service.executable_path stays nil after construction.
Choose one consistent behavior and align code + tests:
-
Preserve existing behavior (least breaking): make
Service#executable_path(reader) fall back to ENV when@executable_pathis nil (or re-introduce env assignment ininitialize). -
If behavior change is intended: update the affected unit specs to assert env usage via
DriverFinderorService#launch, not viaService#executable_pathimmediately afternew.
Relevant locations:
- rb/lib/selenium/webdriver/common/service.rb[69-92]
- rb/spec/unit/selenium/webdriver/chrome/service_spec.rb[153-173]
- rb/spec/unit/selenium/webdriver/safari/service_spec.rb[118-138]
- rb/spec/unit/selenium/webdriver/edge/service_spec.rb[163-183]
- rb/spec/unit/selenium/webdriver/firefox/service_spec.rb[215-236]
- rb/spec/unit/selenium/webdriver/ie/service_spec.rb[153-173]
- rb/spec/unit/selenium/webdriver/common/driver_finder_spec.rb[22-99]
| PR 17570 (2026-05-25) |
[reliability] Cache listing not paginated
Cache listing not paginated
`prune-codeql-caches.sh` fetches only 100 caches (`gh cache list --limit 100`), so any matching caches beyond that are never considered for deletion. Since the script itself notes CodeQL creates a new cache per commit, older caches can accumulate beyond 100 and remain unpruned.The script only lists up to 100 caches and prunes within that partial set, which can leave most stale caches untouched once the repository accumulates more than 100 CodeQL caches.
The script comments say CodeQL creates a new overlay-base-database cache per commit; on an active repository, that can exceed 100 entries quickly.
- scripts/github-actions/prune-codeql-caches.sh[3-6]
- scripts/github-actions/prune-codeql-caches.sh[24-28]
- Replace the single
gh cache list ... --limit 100call with logic that retrieves all matching caches.- Option A: increase
--limitto the maximum supported bygh(if sufficiently high for this repo’s scale). - Option B (more robust): implement pagination via the GitHub API (
gh api) to iterate through all pages of caches, building therowsarray from the full result set.
- Option A: increase
- Keep the rest of the grouping/deletion logic unchanged so behavior stays “keep newest per group”.
| PR 17569 (2026-05-25) |
[reliability] Disk status step can fail
Disk status step can fail
`disk-status.sh` runs `du -sh "$path"/* | sort -h` when the directory exists, which returns a non-zero status when the glob doesn’t match (e.g., empty directory), and this script is sourced directly inside multiple workflow steps. That non-zero can abort those steps, turning an observability checkpoint into a CI failure.scripts/github-actions/disk-status.sh is sourced from workflow steps. Inside measure(), the du pipelines can exit non-zero (e.g., when $path/* doesn’t match because the directory is empty, or due to permissions). When sourced, that failure can propagate and fail the workflow step.
This script is intended for diagnostics only; it should be best-effort and never fail CI.
- scripts/github-actions/disk-status.sh[29-37]
- .github/workflows/bazel.yml[206-214]
- .github/workflows/bazel.yml[239-247]
- .github/workflows/bazel.yml[293-300]
| PR 17565 (2026-05-24) |
[reliability] `with_env` deletes preexisting ENV
`with_env` deletes preexisting ENV
The new tests call `with_env`, which unconditionally deletes ENV keys on cleanup instead of restoring any pre-existing values. This can clobber global process state (e.g., `HTTP_PROXY`/`NO_PROXY`) and cause cross-test side effects or flaky behavior.with_env mutates ENV but its cleanup unconditionally deletes keys, which can remove pre-existing environment variables and leak global state changes across tests.
New tests added in default_spec.rb rely on with_env to set http_proxy and no_proxy/NO_PROXY. The helper currently deletes those keys in ensure rather than restoring prior values (or leaving them unset only if they were originally absent).
- rb/spec/unit/selenium/webdriver/spec_helper.rb[38-43]
- rb/spec/unit/selenium/webdriver/remote/http/default_spec.rb[127-139]
| PR 17559 (2026-05-23) |
[reliability] Unsafe cast of `se:browserName`
Unsafe cast of `se:browserName`
`DriverFinder.toArguments()` directly casts `options.getCapability("se:browserName")` to `String`, which can throw `ClassCastException` if the capability is non-string or protocol-derived and thereby break Selenium Manager-based driver discovery. This violates the requirement to validate external/protocol inputs early and fail with deterministic, actionable exceptions (or safely fall back).DriverFinder.toArguments() reads the se:browserName capability via an unchecked cast to String. Since capability values are Object and may be protocol-/user-provided (including the new Electron override), a non-String value can trigger a ClassCastException, breaking Selenium Manager-based driver discovery instead of producing a deterministic, actionable validation failure or a safe fallback.
Capabilities are externally influenced (protocol/user) and getCapability() returns Object, so DriverFinder should validate type and null/emptiness before using se:browserName. When the value is absent/invalid, it should either ignore it and fall back to options.getBrowserName(), or throw a clear IllegalArgumentException describing the expected type/value; optionally, if the value is present but not a String, log a warning and ignore it.
- java/src/org/openqa/selenium/remote/service/DriverFinder.java[133-138]
[correctness] Brittle Electron detection
Brittle Electron detection
DriverFinder infers Electron mode by substring-matching the Capabilities implementation class name, which can silently break Electron driver discovery when options are subclassed or wrapped (and can misclassify unrelated capabilities). This can cause Selenium Manager to be invoked with the wrong `--browser` value and return an incompatible driver/browser pairing.DriverFinder.toArguments() uses options.getClass().getName().toLowerCase().contains("electron") to decide whether to pass --browser electron to Selenium Manager. This is brittle: it fails for subclasses/wrappers (false negatives) and can misclassify unrelated capabilities classes (false positives).
DriverFinder accepts a generic Capabilities instance, so relying on the implementation type name is not a stable contract.
- java/src/org/openqa/selenium/remote/service/DriverFinder.java[133-138]
- java/src/org/openqa/selenium/chrome/ElectronOptions.java[46-64]
- java/test/org/openqa/selenium/remote/service/DriverFinderTest.java[198-218]
- Add an explicit, Selenium-namespaced marker capability in
ElectronOptions(e.g.,se:electron = true) and ensure it survivesmerge(). - Update
DriverFinder.toArguments()to check that marker (e.g.,Boolean.TRUE.equals(options.getCapability("se:electron"))) instead of class-name matching. - Update/add tests:
- Existing ElectronOptions test should assert the marker capability is present.
- DriverFinderTest should assert
--browser electronis sent when the marker is present, and that wrapping inImmutableCapabilitiesstill works.
| PR 17550 (2026-05-22) |
[security] Unvalidated `targets` shell injection
Unvalidated `targets` shell injection
`workflow_dispatch` introduces a user-provided `targets` input that is later interpolated into a shell command without validation/quoting, enabling command injection and unsafe argument splitting. This violates the CI/script hardening requirement and can lead to unexpected command execution or incorrect Bazel invocation.A workflow_dispatch string input (inputs.targets) is user-controlled and later inserted directly into a bazel test shell command, allowing shell metacharacters (e.g., ;, &&) to be interpreted as commands.
This workflow now accepts targets via manual dispatch. The value is used in the os-tests job’s run script, so it must be handled as data (arguments) rather than executable shell text.
- .github/workflows/ci-ruby.yml[10-15]
- .github/workflows/ci-ruby.yml[73-80]
- Read
inputs.targetsinto a variable inside quotes. - Split into a bash array safely (e.g.,
read -r -a targets_arr <<< "$targets"). - Invoke Bazel with
"${targets_arr[@]}"so each target is a separate argument and shell metacharacters are not executed. - Optionally validate that each entry matches expected Bazel target patterns (e.g., starts with
//).
[correctness] IE remote tests included
IE remote tests included
The Windows `os-tests` job uses `--test_tag_filters=edge,edge-remote,unit,skip-rbe,-ie`, but IE remote targets are tagged `ie-remote` (not `ie`) and also include `skip-rbe`, so they can still be selected and executed on Windows despite the `-ie` exclusion.The Windows os-tests matrix entry tries to exclude IE by adding -ie, but IE remote targets use the tag ie-remote and inherit skip-rbe, so they still match the positive filters.
rb_integration_test generates both local and *-remote test targets, and remote targets inherit the browser-specific tag list (including skip-rbe for IE).
- .github/workflows/ci-ruby.yml[57-60]
Update the Windows tag-filters to also exclude the remote IE tag, e.g.:
edge,edge-remote,unit,skip-rbe,-ie,-ie-remote
(Alternatively, remove skip-rbe from the Windows include list if it is not intended to pull in unrelated skip-rbe tests.)
[reliability] grep exits read-targets
grep exits read-targets
In ci.yml, process_binding assigns lang_targets via a pipeline containing grep; when there are no matching Ruby targets, grep exits non-zero and can fail the entire “Read targets” step instead of simply producing no rb_targets output.process_binding() uses grep inside a command substitution without guarding for the “no matches” exit code. When no Ruby targets exist, this can terminate the step early and fail CI.
The workflow should continue and simply omit rb_targets when there are no matching targets.
- .github/workflows/ci.yml[81-100]
- Make the pipeline tolerant of no matches, e.g. append
|| trueto the whole pipeline, or usegrep ... || :.- Example:
lang_targets=$(echo "$targets" | tr ' ' '\n' | grep -E "^${pattern}[:/]" | tr '\n' ' ' | sed 's/ *$//' || true)
- Example:
- Consider switching
echotoprintf '%s\n' "$targets"to avoid echo edge cases.
[correctness] Dispatch missing targets input
Dispatch missing targets input
ci-ruby.yml uses `${{ inputs.targets }}` in the os-tests Bazel command, but `targets` is only defined under `workflow_call`; a manual `workflow_dispatch` run can expand to an empty target list and cause `bazel test` to fail.os-tests references inputs.targets, but targets is only declared for workflow_call. For workflow_dispatch runs, this input is not defined, so the Bazel command may receive no explicit targets.
The workflow declares workflow_dispatch: with no inputs, but still relies on inputs.targets.
- .github/workflows/ci-ruby.yml[3-11]
- .github/workflows/ci-ruby.yml[47-76]
Add workflow_dispatch.inputs.targets mirroring the workflow_call input (same default //rb/...), or change the command to fall back to //rb/... when inputs.targets is empty.
[reliability] Bazel exit code 4 ignored
Bazel exit code 4 ignored
The `os-tests` job treats `bazel test` exit code `4` as success, which can mask misconfiguration (e.g., filters producing “no tests found”) and allow CI to go green without running any Ruby tests. This violates the requirement to avoid error-swallowing constructs in CI/scripts that hide failures.The os-tests Bazel invocation converts exit code 4 into success (exit 0). This can hide “no tests found” situations caused by incorrect target selection or tag filters, resulting in false-green CI.
bazel test exit code 4 typically indicates that no tests were executed. Treating this as success in PR CI can allow Ruby changes to merge without test coverage.
- .github/workflows/ci-ruby.yml[74-76]
[correctness] Boolean passed as browser
Boolean passed as browser
ci-ruby.yml sets `matrix.browser: true` (boolean) and passes it into bazel.yml where `browser` is declared as a string input; this can break reusable-workflow input validation or lead to unexpected conditional behavior (`inputs.browser != ''`).matrix.browser is set to YAML boolean true for ubuntu/windows, then forwarded to reusable workflow input browser which is typed as string. This risks workflow-call validation errors and also changes behavior of steps gated on inputs.browser != ''.
The called workflow declares browser as type: string.
- .github/workflows/ci-ruby.yml[52-66]
- .github/workflows/bazel.yml[19-28]
- Change
browser: trueto a string value:- If you want “no specific browser but enable UI setup”, use
browser: 'enabled'(string) and update bazel.yml docs accordingly. - If you actually intend Edge on Windows, set
browser: 'edge'. - If Ubuntu is unit-only, set
browser: ''and adjust tag filters accordingly.
- If you want “no specific browser but enable UI setup”, use
- Keep macOS as
browser: safarifor safaridriver enablement.
| PR 17533 (2026-05-20) |
[maintainability] Useless ImmutableArray null-check
Useless ImmutableArray null-check
BiDi.SubscribeAsync/StreamAsync call ArgumentNullException.ThrowIfNull(descriptors) even though ImmutableArray is a value type and cannot be null, so this guard never triggers and can mislead about validation coverage. Callers passing default/empty arrays will fail later with a different exception path instead of a clear upfront argument validation.ArgumentNullException.ThrowIfNull(descriptors) is ineffective for ImmutableArray<T> parameters because ImmutableArray<T> is a struct (non-nullable value type). This can hide the real validation intention and doesn’t provide a clear exception for empty/default arrays.
The affected APIs are public entry points (IBiDi / BiDi) for multi-descriptor subscription/stream creation.
-
Replace
ThrowIfNull(descriptors)with explicit validation for default/empty immutable arrays (e.g.,descriptors.IsDefaultOrEmptyordescriptors.Length == 0) and throwArgumentExceptionwith a clear message. -
Apply the same validation to both
SubscribeAsync(... ImmutableArray<EventDescriptor> ...)overloads andStreamAsync(... ImmutableArray<EventDescriptor> ...). -
dotnet/src/webdriver/BiDi/BiDi.cs[104-132]
| PR 17522 (2026-05-19) |
[reliability] Shared driver disposed unsafely
Shared driver disposed unsafely
DriverTestFixture disposes the shared EnvironmentManager-held driver without clearing EnvironmentManager's internal driver reference, so later GetCurrentDriver() calls can return a disposed driver. ResetOnError() also disposes before calling CreateFreshDriver(), which then calls CloseCurrentDriver()/Quit on an already-disposed instance and can throw, masking the real test failure.DriverTestFixture currently calls driver.Dispose() in [OneTimeTearDown] and in ResetOnError(). Because the driver is owned/tracked by EnvironmentManager (singleton), disposing the driver directly can leave EnvironmentManager holding a disposed instance and cause later calls to reuse or quit a disposed driver.
-
EnvironmentManager.GetCurrentDriver()returns the cacheddriverfield if non-null. -
EnvironmentManager.CloseCurrentDriver()is the method that both quits the driver and clears the cached reference (driver = null).
- dotnet/test/webdriver/DriverTestFixture.cs[128-148]
- In
DriverTestFixture.TearDown()(the[OneTimeTearDown]), replacedriver?.Dispose();withEnvironmentManager.Instance.CloseCurrentDriver();(and optionally setdriver = null;for clarity). - In
ResetOnError(), remove the explicitdriver?.Dispose();and just calldriver = EnvironmentManager.Instance.CreateFreshDriver();(sinceCreateFreshDriver()already closes the current driver viaCloseCurrentDriver()). - Ensure there is no remaining path where
EnvironmentManager.drivercan remain non-null while the underlying driver is disposed.
| PR 17519 (2026-05-19) |
[performance] Repeated Bazel gen runs
Repeated Bazel gen runs
Multi-targeting Selenium.Support causes Selenium.WebDriver to be built once per TFM via ProjectReference, and WebDriver’s GenerateBazelArtifacts runs an unconditional `bazel build ...` before every CoreCompile, multiplying the Bazel invocation count (and build time) for Support builds/packs.dotnet build/pack of Selenium.Support.csproj is now cross-targeting (net462/netstandard2.0/net8.0). Because it has a ProjectReference to Selenium.WebDriver.csproj, WebDriver is built per target framework, and WebDriver’s GenerateBazelArtifacts target executes bazel build ... before every CoreCompile.
This leads to redundant Bazel builds (previously Support was single-TFM), significantly increasing build time and increasing exposure to intermittent Bazel/env failures during normal MSBuild workflows.
- Support is now multi-targeted and project-references WebDriver.
- WebDriver runs Bazel generation unconditionally in a
BeforeTargets="CoreCompile"target.
- Gate Bazel generation so it runs once per multi-target build (outer build) while still running for single-TFM builds.
- A typical approach is to:
- Add a target that runs when
$(IsCrossTargetingBuild) == 'true'(outer build) before inner builds dispatch, generating artifacts once. - Keep a separate path for single-target builds (
$(IsCrossTargetingBuild) != 'true') that runs beforeCoreCompile.
- Add a target that runs when
- Ensure generated files exist before compilation in all cases.
Fix focus areas (files/lines):
- dotnet/src/support/Selenium.Support.csproj[3-41]
- dotnet/src/webdriver/Selenium.WebDriver.csproj[62-69]
| PR 17504 (2026-05-18) |
[reliability] `git ls-remote origin` without checkout
`git ls-remote origin` without checkout
The `detect-branch` job runs `git ls-remote ... origin ...` without checking out the repository or otherwise configuring an `origin` remote, so the step is likely to fail or produce an incorrect `exists` output that prevents downstream `evaluate`/`commit-pins`/`promote` jobs from running even when the target branch exists. This violates the requirement to harden CI/scripts so they behave deterministically and robustly.The detect-branch job calls git ls-remote ... origin ... even though no repository checkout (or equivalent remote setup) occurs, so origin is undefined and the branch-existence detection becomes unreliable; this can incorrectly skip downstream evaluate/commit-pins/promote jobs.
detect-branch runs on a fresh GitHub-hosted runner; without actions/checkout, there is no local git repository and no configured remote named origin, so git ls-remote origin ... can fail or yield incorrect results. Other jobs/workflows that successfully use git ls-remote ... origin do so after actions/checkout, which configures origin, indicating the missing prerequisite here.
- .github/workflows/renovate-dependency-pr.yml[25-44]
- .github/workflows/renovate-dependency-pr.yml[40-44]
[reliability] `gh api` output quoting broken
`gh api` output quoting broken
In `renovate-dependency-pr.yml`, the `detect-branch` step uses nested, unescaped double-quotes inside an `echo "exists=$(...)"` command substitution when writing to `$GITHUB_OUTPUT`, which can break shell parsing and cause the workflow to fail before setting the `exists` output. Because downstream evaluation/pinning/promotion is gated on `steps.detect.outputs.exists`, this can lead to jobs being skipped or failing, violating the requirement to harden CI scripts with safe, deterministic shell handling.The detect-branch step writes exists=... to $GITHUB_OUTPUT using a command substitution containing nested, unescaped double-quotes (via echo "true" / echo "false" inside an outer echo "exists=$(...)"), which can break shell parsing and prevent the exists output from being set, causing downstream jobs gated on this value to skip or fail.
This is part of a CI workflow and must be robust and deterministic (per PR Compliance ID 17) to avoid silent failures or broken runs. The problematic pattern is effectively echo "exists=$( ... && echo "true" || echo "false")", where the inner quotes can prematurely terminate the outer string; downstream logic depends on needs.detect-branch.outputs.exists being 'true' for the evaluate job (and related evaluation/pinning/promotion) to run.
- .github/workflows/renovate-dependency-pr.yml[35-46]
[reliability] PR promotion token missing
PR promotion token missing
`renovate-dependency-pr.yml` declares `SELENIUM_CI_TOKEN` as optional but passes it as the `token` for `peter-evans/create-pull-request@v8` without fallback, which can cause promotion to fail authentication when the secret isn’t set/passed. This is inconsistent with other workflows in the repo that already use `secrets.SELENIUM_CI_TOKEN || github.token` for safe fallback.The promote step uses ${{ secrets.SELENIUM_CI_TOKEN }} directly even though the secret is declared required: false, so the token may be empty and PR creation will fail.
Other workflows (including this PR’s renovate job and the reusable commit-changes.yml) already use a || github.token fallback.
- .github/workflows/renovate-dependency-pr.yml[14-17]
- .github/workflows/renovate-dependency-pr.yml[78-83]
Either:
- Change to
token: ${{ secrets.SELENIUM_CI_TOKEN || github.token }}in thecreate-pull-requeststep, or - Mark
SELENIUM_CI_TOKENasrequired: trueif fallback is not acceptable.
[reliability] Temp branch not ensured
Temp branch not ensured
`renovate-dependencies.yml` always checks out `temp/bazel-updates`, but the reusable `commit-changes.yml` workflow explicitly skips the push when `changes.patch` is missing/empty. When the Bazel update step produces no diff (and `bazel.yml` deletes `changes.patch`), the temp branch may not exist or may remain stale from a previous run, causing checkout failures or Renovate to run on old content.The workflow assumes temp/bazel-updates exists and reflects the current base-ref, but the branch is only pushed when changes.patch is non-empty. If there are no Bazel update changes, the branch is not pushed/created, yet the Renovate job still checks it out.
-
bazel.ymldeleteschanges.patchwhen empty. -
commit-changes.ymlskips commit/push whenchanges.patchis empty. -
renovate-dependencies.ymlalways checks outtemp/bazel-updates.
- .github/workflows/renovate-dependencies.yml[36-63]
- .github/workflows/commit-changes.yml[46-56]
- .github/workflows/bazel.yml[274-287]
Ensure temp/bazel-updates is force-updated to the current base-ref even when there are no patch changes, for example:
- Update
commit-changes.ymlto stillgit push --force origin HEAD:"$PUSH_BRANCH"in theelsebranch (no patch), or - Add a dedicated step/job in
renovate-dependencies.ymlthat force-pushesinputs.base-reftotemp/bazel-updateswhen no patch is present, and gate the Renovate checkout accordingly.
| PR 17503 (2026-05-18) |
[security] `PUSH_BRANCH` defaults to `github.ref_name`
`PUSH_BRANCH` defaults to `github.ref_name`
The workflow checks out `${{ inputs.ref || github.ref }}` but defaults `PUSH_BRANCH` to `github.ref_name`, which can diverge when callers pass an explicit `inputs.ref` (notably on `pull_request`, where `github.ref_name` can be a merge ref). This mismatch can cause `git push --force origin HEAD:"$PUSH_BRANCH"` to push the checked-out HEAD into an unintended branch, risking accidental branch overwrite.The reusable workflow checks out inputs.ref || github.ref but defaults PUSH_BRANCH to inputs.push-branch || github.ref_name, which may not match the ref actually checked out (notably on pull_request, where github.ref_name can be a merge ref name). This can cause the workflow to run git push --force origin HEAD:"$PUSH_BRANCH" against the wrong branch.
- The workflow checks out
${{ inputs.ref || github.ref }}but setsPUSH_BRANCH: ${{ inputs.push-branch || github.ref_name }}; if a caller providesinputs.refthat differs from the called workflow run’sgithub.ref_name, the push target becomes incorrect. -
ci-lint.ymlcallscommit-changes.ymlon PRs withref: ${{ github.event.pull_request.head.ref }}and does not setpush-branch, so the called workflow uses the problematic default.
- .github/workflows/commit-changes.yml[39-40]
- .github/workflows/commit-changes.yml[53-53]
- .github/workflows/commit-changes.yml[59-59]
- .github/workflows/commit-changes.yml[39-59]
[reliability] Missing patch treated as ok
Missing patch treated as ok
The new guard `if [ ! -s changes.patch ]; then` treats a missing `changes.patch` the same as an intentionally empty patch, so an artifact download failure/misconfiguration can be silently skipped as “no changes”. Because the download step is `continue-on-error: true`, this can produce a green job without indicating that the expected artifact wasn’t applied.The workflow currently checks only file size (-s) and exits 0 when the patch is missing/empty. With continue-on-error: true on the download step, an actual artifact download failure can be silently treated as a successful no-op.
This reusable workflow is invoked from release/CI workflows where a missing patch may indicate an upstream failure or wrong artifact name.
- .github/workflows/commit-changes.yml[41-59]
- Emit a
::notice::before exiting when no patch is present. - Optionally, distinguish:
-
if [ ! -f changes.patch ]; then ...(artifact missing) -
elif [ ! -s changes.patch ]; then ...(empty patch) and decide whether “missing” should fail or remain a no-op based on intended caller behavior.
-
- If you want to keep it as a no-op, at minimum log clearly that commit/push is being skipped due to missing/empty patch.
[correctness] Output misrepresents push
Output misrepresents push
The reusable workflow output `changes-committed` is documented as "committed and pushed", but the job can push (when `push-branch` is set) while still emitting `committed=false`, causing callers to make incorrect decisions.The workflow output changes-committed is documented as representing whether changes were "committed and pushed", but the step output is derived solely from whether a commit occurred. With the current logic, a push can occur while committed=false, violating the output contract.
- Output documentation indicates both commit and push.
- Push can happen even without a commit when
PUSH_BRANCHis set.
- .github/workflows/commit-changes.yml[24-27]
- .github/workflows/commit-changes.yml[58-69]
- Either:
- Change logic so
committedbecomes true only after a successful push (and ensure push is only attempted when appropriate), or - Update the output description/name (and optionally add a separate
pushedoutput) so consumers can reliably detect what happened.
- Change logic so
[security] Unconditional `git push --force`
Unconditional `git push --force`
The workflow force-pushes `HEAD` to `PUSH_BRANCH` whenever that input is set, even when `changes.patch` is missing/empty and no commit was created. This can reset/overwrite the remote branch history (accidental data loss), especially because the artifact download is allowed to fail without stopping the job.The workflow currently runs git push --force origin HEAD:"$PUSH_BRANCH" whenever PUSH_BRANCH is set, even if changes.patch is missing/empty and no commit was created. This can rewrite the remote branch pointer and clobber remote history (accidental data loss), particularly when the patch artifact download fails but the job continues.
- The workflow tracks whether it created a commit via a
committed=false/trueflag, but the push-to-PUSH_BRANCHpath is not conditioned oncommitted. - The patch/artifact download is allowed to fail (
continue-on-error: true), and commits occur only whenchanges.patchexists/is non-empty. - As a result, setting
inputs.push-branchcan cause a force-push of the currentHEADeven when this run produced no changes (or the artifact was missing), effectively resetting/overwriting the target branch.
- .github/workflows/commit-changes.yml[47-69]
| PR 17499 (2026-05-18) |
[reliability] `pkg_dir` unquoted on Windows
`pkg_dir` unquoted on Windows
The new `pnpm` invocation passes an absolute `--dir` path that may contain spaces, but `Bazel.execute` builds a Windows command string via `cmd.join(' ')` without quoting. This can break the `node:install` task on Windows and violates the safe-argument-handling requirement for scripts.node:install now computes pkg_dir via File.expand_path(...) and passes it to Bazel.execute as --dir <path>. On Windows, Bazel.execute assembles a single command line with cmd.join(' ') (no quoting), so any space-containing path will break argument parsing.
This change introduces a new space-sensitive argument (--dir) that was not present previously.
- rake_tasks/node.rake[117-118]
- rake_tasks/bazel.rb[26-32]
| PR 17497 (2026-05-18) |
[correctness] Wrong exec_root mapping
Wrong exec_root mapping
`exec_root` is derived from `ctx.bin_dir.path` using `split("/bin/")`, which can leave it pointing at the output bin directory (or include `/bin/` at the end) instead of the Bazel execution root, so `/pathmap` may not match and builds remain non-deterministic. This also breaks when the path ends with `/bin` (no trailing slash), since the split won’t remove the bin segment at all.The rule claims to map the Bazel execution root, but computes exec_root from ctx.bin_dir.path using split("/bin/"). This is fragile (won’t strip when the string ends with /bin) and does not reliably represent the execution root, so the /pathmap prefix may not match the paths that end up embedded in PDBs.
Other code in this repo that truly needs the execution root obtains it at runtime via bazel info execution_root and handles Windows path formatting explicitly.
- dotnet/private/sourcelink.bzl[29-81]
- dotnet/private/docfx.bzl[35-49]
- Stop using
ctx.bin_dir.path.split("/bin/"). - Either:
- Derive the execroot/workspace root by stripping from the
/bazel-out/segment (as other repo code does), using a robust operation likersplit/rpartition(and handle both.../binand.../bin/...cases), or - Move execroot detection to execution time (similar to
dotnet/private/docfx.bzlusingbazel info execution_root) by introducing a small wrapper that expands the correct/pathmapargument before invoking the compiler.
- Derive the execroot/workspace root by stripping from the
[reliability] Windows separator mismatch
Windows separator mismatch
The new `/pathmap` uses POSIX separators and is passed even when `is_windows` selects the `.bat` wrapper, but other Windows rules in this repo explicitly convert `/` to `\` before invoking tools. If the compiler wrapper or compiler sees `\`-style paths, the `/pathmap` prefix won’t match, leaving Windows builds non-deterministic.On Windows, the rule still constructs exec_root/pathmap using forward slashes and the POSIX-only split("/bin/"), but the rest of the repo treats Windows tool invocations as requiring backslash paths. This can prevent /pathmap from applying on Windows.
Existing Windows actions convert Bazel-provided paths (which often contain /) into \ before invoking cmd.exe or writing .bat files.
- dotnet/private/sourcelink.bzl[31-43]
- dotnet/private/sourcelink.bzl[78-81]
- dotnet/private/copy_files.bzl[13-31]
- dotnet/private/docfx.bzl[18-24]
- When
is_windowsis true, build a Windows-normalizedpathmap(e.g., convert the computed prefix to\consistently) or ensure the wrapper uses consistent separator normalization for both the paths being mapped and the/pathmapprefix.
| PR 17493 (2026-05-17) |
[reliability] CI may skip all tests
CI may skip all tests
`ci.yml` downloads the `check-targets` artifact with `digest-mismatch: warn` and then treats a missing/unreadable `bazel-targets.txt` as an empty string, which can cause every language job to be skipped. The final `ci-success` gate only fails on failures/cancellations, so a run with all test jobs skipped can still pass.The Read Targets job can succeed while effectively selecting zero test jobs if bazel-targets.txt is missing/unreadable (it falls back to an empty string). Combined with digest-mismatch: warn, this makes it more likely for a corrupted/invalid artifact to allow the workflow to proceed and still report overall success.
Downstream test jobs are guarded by if: needs.read-targets.outputs.<lang> != '', and ci-success only checks for failures/cancellations (skipped jobs do not fail the workflow).
- .github/workflows/ci.yml[68-92]
- .github/workflows/ci.yml[130-141]
Make the Read Targets job fail (or fall back to running a safe default like the full test suite) when bazel-targets.txt is missing/empty/unparseable. Also consider making digest mismatch fatal for the check-targets artifact specifically, since it drives which tests run.
[security] `digest-mismatch` set to `warn`
`digest-mismatch` set to `warn`
Multiple CI workflows configure `actions/download-artifact@v8` with `digest-mismatch: warn`, allowing artifact integrity mismatches to proceed without failing the job and weakening determinism. In release/publish flows this is especially risky because the workflows can immediately publish or upload the downloaded artifacts, potentially distributing corrupted outputs despite a detected mismatch.Workflows set digest-mismatch: warn when downloading artifacts via actions/download-artifact@v8, allowing digest mismatches to pass without failing the workflow; in release/publish jobs, this can result in publishing or releasing artifacts even when an integrity mismatch was detected.
The PR updates actions/download-artifact to v8 and explicitly configures digest mismatch handling to warning; per build hardening guidance (PR Compliance ID 14), integrity mismatches should fail deterministically rather than be downgraded. This is particularly important for jobs that publish to PyPI or upload assets to GitHub releases (and other release-like flows), because they use the downloaded artifacts for distribution without any subsequent integrity check.
- .github/workflows/ci-build-index.yml[27-30]
- .github/workflows/ci-rust.yml[238-277]
- .github/workflows/ci-rust.yml[245-248]
- .github/workflows/ci.yml[69-72]
- .github/workflows/commit-changes.yml[49-52]
- .github/workflows/nightly.yml[80-102]
- .github/workflows/nightly.yml[89-93]
- .github/workflows/pin-browsers.yml[35-38]
- .github/workflows/pre-release.yml[260-264]
- .github/workflows/release.yml[160-229]
- .github/workflows/release.yml[171-175]
- .github/workflows/release.yml[198-203]
- .github/workflows/release.yml[206-210]
- .github/workflows/update-documentation.yml[99-103]
| PR 17490 (2026-05-16) |
[reliability] Tox groups now missing
Tox groups now missing
`py/tox.ini` still configures `dependency_groups = lint/validate`, but this PR removes the corresponding `[dependency-groups]` definitions from `py/pyproject.toml`. As a result, `tox -e linting` and `tox -e validate` will fail (and won’t install/run `ruff` / `validate-pyproject`).py/tox.ini references dependency groups (dependency_groups = lint / validate) that no longer exist because [dependency-groups] was removed from py/pyproject.toml. This breaks tox -e linting and tox -e validate, and also removes the only declarations of ruff and validate-pyproject used by those envs.
-
toxusesdependency_groupsto install tools likeruffandvalidate-pyproject. - This PR deletes the group definitions from
py/pyproject.toml, and the newpy/requirements.txtdoes not include those tools either.
-
Re-add
[dependency-groups]topy/pyproject.tomlwithlintandvalidateentries (includeruffandvalidate-pyproject, and any other needed pins/ranges). -
Stop using dependency groups in tox:
- Update
py/tox.inivalidateandlintingenvs to install from-r requirements_lock.txt(or a dedicated dev-tools requirements file). - Add
ruffandvalidate-pyproject(and any required companions likepackagingif needed) topy/requirements.txt, then regeneratepy/requirements_lock.txt.
- Update
- py/pyproject.toml[37-56]
- py/tox.ini[8-13]
- py/tox.ini[38-43]
- py/requirements.txt[6-22]
| PR 17483 (2026-05-16) |
[correctness] Dotnet format args reordered
Dotnet format args reordered
The wrapper now calls `dotnet format $@`, but repo tooling invokes `//dotnet:format` with positional commands like `style`/`whitespace`/`analyzers` as the first arguments; this changes the resulting CLI shape from `dotnet format style ... ` to `dotnet format style ...`. This is likely to cause parsing failures and break formatting/lint workflows that call `bazel run //dotnet:format -- style ...`.dotnet/private/dotnet_format.bzl changed the invocation from passing user-provided args before the workspace (... format "$@" "$proj") to passing the workspace first (... format "$SOLUTION" "$@"). In this repo, //dotnet:format is invoked with positional commands like style, whitespace, and analyzers (without leading dashes), which are expected to be the first tokens after format.
Existing callers (rake tasks and format scripts) run bazel run //dotnet:format -- style ... and -- whitespace, so the wrapper must preserve the previous ordering where $@ comes before the workspace.
- dotnet/private/dotnet_format.bzl[61-63]
- dotnet/private/dotnet_format.bzl[102-104]
- Unix: change to
"$DOTNET" format "$@" "$SOLUTION"(and adjust the echo if desired) - Windows: change to
"%DOTNET%" format %* "%SOLUTION%"(keeping quoting correct)
| PR 17469 (2026-05-15) |
[reliability] No HTTP timeouts
No HTTP timeouts
maven_stable_release uses Net::HTTP.get without open/read timeouts, so `rake java:update` can hang indefinitely on network stalls. This makes dependency updates flaky and can block CI/developer workflows.maven_stable_release fetches maven-metadata.xml via Net::HTTP.get without any timeouts, which can cause rake java:update to hang indefinitely if Maven Central is slow or the connection stalls.
This rake file already uses Net::HTTP.start with explicit open_timeout and read_timeout for other HTTP calls, so timeouts are an established reliability pattern here.
- rake_tasks/java.rake[310-323]
- rake_tasks/java.rake[101-114]
Replace Net::HTTP.get(URI(url)) with a Net::HTTP.start call that sets use_ssl: true plus open_timeout / read_timeout, and read the response body via Net::HTTP::Get so the call fails fast instead of hanging.
| PR 17466 (2026-05-15) |
[correctness] Lockfile pins old multer
Lockfile pins old multer
javascript/selenium-webdriver/package.json now requires multer ^2.1.1, but pnpm-lock.yaml still resolves javascript/selenium-webdriver’s devDependency multer to 2.0.2, so Bazel/npm_translate_lock will keep using the old version (or fail due to mismatch). This defeats the intended dependency upgrade and can break CI/reproducible builds.javascript/selenium-webdriver/package.json was updated to require multer: ^2.1.1, but the workspace lockfile (pnpm-lock.yaml) still pins javascript/selenium-webdriver to multer 2.0.2. Because Bazel uses npm_translate_lock with pnpm_lock, the dependency upgrade will not be realized (or lock verification may fail due to specifier mismatch).
Bazel’s npm_translate_lock consumes javascript/selenium-webdriver/package.json together with pnpm-lock.yaml. The lockfile must be updated whenever a workspace package.json changes dependency specifiers.
- pnpm-lock.yaml[179-181]
- pnpm-lock.yaml[3005-3008]
- MODULE.bazel[90-104]
- javascript/selenium-webdriver/package.json[35-50]
- Regenerate/update
pnpm-lock.yamlso theimporters.javascript/selenium-webdriver.devDependencies.multerspecifier matches^2.1.1and the resolved version is >=2.1.1. - Ensure the
packages:section no longer pins onlymulter@2.0.2and includes the updated resolved version entry. - Re-run the Bazel JS dependency translation step (if part of your workflow) to confirm it succeeds with the updated lock.
| PR 17463 (2026-05-15) |
[reliability] Missing report validation
Missing report validation
scripts/update_py_deps.py treats missing packages in pip’s JSON report as “no change”, so it can print “All repo tooling requirements are up to date!” even when version resolution didn’t return entries for some/all requested requirements. This can silently leave py/requirements.txt stale while the command exits successfully.scripts/update_py_deps.py builds installed from report.get("install", []) and then uses installed.get(name_normalized); missing entries are treated as unchanged, and if updates stays empty the script prints a success message and exits. This can mask cases where the report is empty/partial and no versions were actually resolved.
This script is used to update pinned versions in py/requirements.txt and is invoked via Bazel (bazel run //scripts:update_py_deps). A silent success with no updates when resolution data is missing makes dependency maintenance unreliable.
- scripts/update_py_deps.py[54-73]
- After parsing the report, validate that every
name_normalizedfrompackagesexists ininstalled.- If any are missing, raise a
RuntimeErrorthat includes a helpful diagnostic (e.g., mention missing names and includeresult.stderrand a truncatedresult.stdout).
- If any are missing, raise a
- Optionally wrap
json.loads(result.stdout)intry/except json.JSONDecodeErrorand raise a clearer error including stderr/stdout context.
[reliability] `pip --report` JSON unvalidated
`pip --report` JSON unvalidated
The new dependency resolution flow parses `pip --report -` output with `json.loads(result.stdout)` and then assumes specific keys exist, which can crash with non-actionable `JSONDecodeError`/`KeyError` if pip emits non-JSON output or the report schema differs. This violates the requirement to validate external/protocol inputs and raise deterministic, descriptive exceptions.scripts/update_py_deps.py parses pip --report - output without validating that the output is valid JSON or that expected keys exist, which can produce unclear crashes (e.g., JSONDecodeError, KeyError).
This script consumes external/protocol-derived data (pip report JSON). Per compliance, failures should be deterministic and actionable (explicit exceptions indicating what was wrong and how to fix).
- scripts/update_py_deps.py[54-56]
- Wrap
json.loads(result.stdout)intry/except json.JSONDecodeErrorand raise aRuntimeError(or custom exception) that includes a short message and relevantstderr/stdoutcontext. - Validate
reportis a dict and thatreport.get("install")is a list. - When building
installed, use.get()accessors (or explicit checks) and raise a descriptive exception if required fields likemetadata.name/metadata.versionare missing.
[reliability] py:update breaks on Windows
py:update breaks on Windows
The py:update rake task now runs //scripts:update_py_deps, but update_py_deps.py hardcodes the venv pip location as /bin/pip, which does not exist on Windows (venv uses Scripts/). This will make ./go py:update fail on Windows environments with a FileNotFoundError when trying to execute pip.py:update now invokes //scripts:update_py_deps, but scripts/update_py_deps.py assumes a POSIX virtualenv layout (venv/bin/pip). On Windows the venv executables live under venv\Scripts\ (e.g., pip.exe), so the script will fail when it tries to run pip.
This regression is introduced by changing py:update from an alias of py:pin to executing //scripts:update_py_deps.
- scripts/update_py_deps.py[34-71]
- rake_tasks/python.rake[127-136]
Update scripts/update_py_deps.py to invoke pip via the venv’s python in a cross-platform way:
- Determine the venv python path with
Scripts/python.exeon Windows andbin/pythonon POSIX, then runpython -m pip .... - Avoid directly calling
venv_dir / "bin" / "pip".
Example sketch:
import os
bin_dir = "Scripts" if os.name == "nt" else "bin"
python = venv_dir / bin_dir / ("python.exe" if os.name == "nt" else "python")
subprocess.run([str(python), "-m", "pip", "install", "-q", "--upgrade", "pip"], ...)
subprocess.run([str(python), "-m", "pip", "install", "-q", *install_names], ...)
subprocess.run([str(python), "-m", "pip", "list", "--format=json"], ...)| PR 17452 (2026-05-13) |
[maintainability] Docs install wrong requirements
Docs install wrong requirements
README.md and py/docs/.readthedocs.yaml still instruct `pip install -r py/requirements.txt`, but after this PR that file no longer includes selenium runtime dependencies. Users (and the ReadTheDocs build) can end up with missing imports when generating docs or running selenium from source.The PR changes py/requirements.txt to be dev-tooling only, but documentation still instructs installing only that file. This omits selenium runtime dependencies and can break doc generation and local-from-source usage.
py/requirements_lock.txt already contains the full resolved set (including runtime deps coming from py/pyproject.toml). Updating docs to install from the lock file (or otherwise install runtime deps as well) keeps a single workflow aligned with the Bazel lock generation.
- README.md[235-242]
- py/docs/.readthedocs.yaml[10-18]
[reliability] Tox missing runtime deps
Tox missing runtime deps
py/tox.ini installs only py/requirements.txt, but this PR removes runtime dependencies (e.g., urllib3[socks], websocket-client, typing_extensions) from that file, so tox docs/typecheck will run with selenium on PYTHONPATH but without its runtime imports installed. This will cause sphinx/mypy failures due to missing modules (e.g., urllib3, websocket).py/tox.ini currently installs dependencies via -r {toxinidir}/requirements.txt for docs and typecheck, but py/requirements.txt is now dev-tooling only and no longer includes selenium runtime deps (urllib3, websocket-client, typing-extensions, etc.). Since tox runs with PYTHONPATH={toxinidir}/. and Sphinx autodoc/mypy will import selenium modules, these environments will fail with missing-module errors.
The combined pinned set (dev + runtime) is already available in py/requirements_lock.txt (generated from both pyproject.toml and requirements.txt via //py:requirements.update).
- py/tox.ini[15-36]
| PR 17450 (2026-05-13) |
[reliability] Rust version leaves stale lock
Rust version leaves stale lock
`rust:version` no longer invokes `rust:update`, but callers like `release_updates` still call `rust:version` without subsequently regenerating `rust/Cargo.lock`. This leaves `rust/Cargo.lock`’s `selenium-manager` version entry unchanged even after updating `rust/Cargo.toml`, making the version bump incomplete.The PR removes the implicit rust:update invocation from rust:version. At least one existing caller (release_updates) invokes rust:version but does not invoke rust:update, so rust/Cargo.lock will no longer be regenerated/synced as part of that flow.
-
rust:updateis explicitly described/implemented as regeneratingrust/Cargo.lock. -
rust/Cargo.lockcontains aselenium-managerpackage entry with an explicit version. -
release_updatescallsrust:versionbut does not callrust:update.
- Rakefile[104-125]
- rake_tasks/rust.rake[26-33]
- rust/Cargo.lock[1858-1861]
- Update
release_updates(and any other flows that require a syncedCargo.lock) to explicitly invokerust:updateafterrust:version. - If repeated invocations are expected in the same process, mirror the previous approach by re-enabling the task before invoking it.
| PR 17442 (2026-05-12) |
[correctness] Approval workflow skipped
Approval workflow skipped
For whitelisted `github.triggering_actor` values, `get-approval.yml` skips `notify`, and `authorize` is gated on `needs.notify.result == 'success'`, so both jobs are skipped and the reusable workflow can conclude as `skipped` instead of `success`. `restrict-trunk.yml` only proceeds with trunk management when the reusable `get-approval` job result is `success` (unless `skip_approval` is set), which can block trunk restriction at the start of pre-release/release workflows.Whitelisted actors cause the Get Approval reusable workflow to run zero jobs (notify is skipped; authorize is skipped because it requires notify success). This can make the reusable workflow job conclude as skipped, which downstream workflows treat as not-approved.
-
restrict-trunk.ymlrequiresneeds.get-approval.result == 'success'to runmanage-trunkwhenrestrict: true. -
pre-release.yml(and release flows) callrestrict-trunk.ymlwithrestrict: true, so askippedapproval job can prevent trunk restriction and cascade into the rest of the workflow being skipped.
- .github/workflows/get-approval.yml[20-46]
- .github/workflows/restrict-trunk.yml[38-55]
- .github/workflows/pre-release.yml[76-89]
Create an explicit success path for whitelisted actors (e.g., an auto-authorize job without an environment) so the called workflow concludes success when the actor is in the whitelist.
Example structure:
- Keep
notifyonly for non-whitelisted actors. - Keep
authorize(withenvironment: production) only for non-whitelisted actors. - Add
auto-authorize(no environment) for whitelisted actors.
This ensures exactly one of authorize / auto-authorize runs, and the reusable workflow result is success in both cases.
[reliability] No unlock on cancel
No unlock on cancel
`unlock-trunk-on-failure` only runs when `failure()` is true, so if the pre-release workflow is cancelled after `restrict-trunk` succeeds, the auto-unlock path will not trigger and trunk can remain restricted. This can leave the repo locked until someone manually runs the Unlock Trunk workflow.The auto-unlock job in pre-release only triggers on failures (failure()), not on cancellations, so a cancelled run after trunk is restricted can leave trunk locked.
Trunk restriction is applied early in the workflow; cancellation mid-run is a plausible operational scenario (manual cancel, timeouts, concurrency cancellations), and leaving trunk restricted requires manual intervention.
- .github/workflows/pre-release.yml[339-349]
Adjust the condition to also cover cancellations, e.g.:
if: (failure() || cancelled()) && needs.restrict-trunk.result == 'success'
If you find job-level cancellation prevents this job from starting in your environment, consider moving unlock into a separate workflow_run cleanup workflow that triggers on completion of the pre-release workflow and unlocks when the conclusion is failure or cancelled and trunk had been restricted.
| PR 17439 (2026-05-11) |
[reliability] Release check not fail-safe
Release check not fail-safe
The nightly workflow only sets `release-in-progress=true` when the rulesets query succeeds and returns true; if `gh api` fails (e.g., auth/API error), the condition evaluates false and the output is left unset, so nightly releases proceed despite an unknown release state. This can cause the nightly pipeline to run during an active release or when the ruleset check is unavailable.The prepare job’s “Check if release ruleset is active” step only writes release-in-progress=true in the success path. If the gh api call fails, the step produces no output, and downstream logic treats it as not in progress.
needs.prepare.outputs.release-in-progress gates the nightly release job. An unset output is not equal to 'true', so the nightly release will run even when the ruleset check failed.
- .github/workflows/nightly.yml[52-61]
Update the step to be fail-safe and to always write an explicit output value.
For example:
- Call
gh apiand check its exit status. - If the API call fails, emit a warning and set
release-in-progress=true(safe default). - Otherwise set
release-in-progress=true|falsebased on the jq result.
(Any equivalent approach that distinguishes “false” from “error” and always sets the output is fine.)
[reliability] `jq any(.[])` assumes array
`jq any(.[])` assumes array
The ruleset existence check uses `jq -e any(.[]; .name == $n)` against `$existing` without ensuring `$existing` is a single JSON array, so API errors or multi-document output (e.g., from `gh api --paginate`) can cause parse/iteration failures or make `jq` decide based only on the last page, potentially missing an existing ruleset and creating a duplicate that breaks or complicates the trunk restrict/unrestrict workflow.The workflow’s ruleset existence check uses jq -e --arg n "$name" 'any(.[]; .name == $n)' <<<"$existing", which assumes $existing is a single valid JSON array. When gh api ... --paginate emits multiple JSON documents (one per page), or when the API returns unexpected/error output, this can cause parse/iteration failures or make the check effectively depend on only the last page, potentially missing an existing ruleset and creating a duplicate, disrupting trunk restrict/unrestrict behavior.
This job locks/unlocks trunk during release; failures can block releases or leave trunk in the wrong state. Missing an existing ruleset due to pagination can create duplicate rulesets with the same .name, making subsequent deletion/unlock behavior ambiguous or incomplete.
- .github/workflows/restrict-trunk.yml[61-67]
[reliability] `gh api` missing pagination
`gh api` missing pagination
The workflow captures `existing` rulesets via `gh api` without `--paginate`, which can silently omit rulesets beyond the first page. This can cause non-deterministic behavior (e.g., failing to detect or delete some release rulesets), potentially leaving trunk incorrectly restricted/unrestricted.The workflow lists existing rulesets using gh api "repos/$GH_REPO/rulesets" but does not use --paginate, so it may only read the first page of results.
The create/delete logic relies on the existing JSON to determine whether to POST new rulesets and which IDs to DELETE. If the repository has more rulesets than fit in one API page, some relevant rulesets may be missed.
- .github/workflows/restrict-trunk.yml[62-63]
- .github/workflows/restrict-trunk.yml[77-78]
[reliability] Unvalidated `$id` in delete
Unvalidated `$id` in delete
The workflow deletes rulesets by using a `$id` derived from `gh api` name matching without validating that exactly one ruleset ID was found, so empty or multiple matches can yield an invalid DELETE URL and non-deterministic failures that can block trunk unlock. This violates the requirement to harden scripts by validating preconditions and handling external/config inputs deterministically.Harden the ruleset deletion logic so it does not blindly construct and call DELETE .../rulesets/$id from gh api output; instead, it must validate that the name lookup yields a safe, deterministic set of IDs (and handle empty/multiple matches explicitly) to avoid invalid URLs and non-deterministic failures during unlock.
This is release automation: the unlock path (Delete release rulesets) runs when unlock-trunk.yml calls the workflow with restrict: false. Rulesets are created via POST in a loop and can be recreated on reruns, so duplicate names/IDs are possible; if the lookup returns empty output (not found) or multiple newline-separated IDs (duplicates), the current deletion step can fail unpredictably, produce unclear errors, and block trunk unlock. Consider also validating that the glob .github/rulesets/release-*.json matches at least one file before looping, and fail with a clear message if none are found.
- .github/workflows/restrict-trunk.yml[58-64]
- .github/workflows/restrict-trunk.yml[65-73]
- .github/workflows/unlock-trunk.yml[8-16]
[reliability] Ruleset creation duplicates
Ruleset creation duplicates
The restrict path always POSTs new rulesets from JSON files without checking whether rulesets with the same name already exist, so reruns can create duplicates and make later deletion-by-name ambiguous.The Create release rulesets step uses POST /rulesets for every JSON file every time restriction is enabled. This can create duplicate rulesets on reruns and interacts badly with name-based deletion.
The delete step selects by .name, so duplicates will either break deletion (multi-id) or leave extra rulesets behind.
- .github/workflows/restrict-trunk.yml[58-64]
- .github/workflows/restrict-trunk.yml[65-73]
Choose one:
-
Upsert approach: before POST, query existing rulesets by name; if found,
PATCH/PUTthat ID (or delete then recreate). - Delete-then-create approach: on restrict, delete any existing matching rulesets by name first, then create exactly one. In all cases, ensure the script handles 0/1/N matches deterministically.
| PR 17436 (2026-05-11) |
[reliability] Hanging stream LINQ query
Hanging stream LINQ query
NetworkTests.CanDisownData awaits FirstAsync() on an event stream without any cancellation/timeout, so if the matching ResponseCompleted event never arrives the test can hang indefinitely and stall the suite.CanDisownData awaits an async stream LINQ query ending in FirstAsync() without providing a CancellationToken or any timeout. If no matching event is produced, the test can hang indefinitely.
Other tests in this repo bound async stream waits using a CancellationTokenSource(TimeSpan...) (e.g., sub.FirstAsync(cts.Token)) or WaitAsync(TimeSpan...) to avoid CI stalls.
- dotnet/test/webdriver/BiDi/Network/NetworkTests.cs[290-304]
Wrap the FirstAsync with a timeout, e.g.:
- create
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5)); - call
await stream.Where(...).Select(...).FirstAsync(cts.Token);(or equivalently useWaitAsync(TimeSpan.FromSeconds(5))on the returned task/value-task).
| PR 17435 (2026-05-11) |
[reliability] Pending frames leak onClose
Pending frames leak onClose
On a pre-handshake upstream `onClose`, the listener records the close and closes the `HttpClient` but intentionally keeps buffered `WebSocketFrame`s for a future `onUpgrade` drain. If the client-side upgrade never completes (e.g., client disconnects mid-handshake), those buffered ref-counted frames are never released, risking a Netty buffer leak.DirectForwardingListener keeps pre-handshake buffered WebSocketFrames on upstream onClose so they can be drained during onUpgrade. If the client-side upgrade never happens, those buffered (ref-counted) frames never get released, creating a potential memory/leak-detector issue.
The code already treats buffered frames as ref-counted (it calls frame.release() on other terminal paths like overflow), but the pre-handshake onClose path retains the buffer without any guaranteed later release.
- java/src/org/openqa/selenium/grid/router/ProxyWebsocketsIntoGrid.java[333-353]
- java/src/org/openqa/selenium/grid/router/ProxyWebsocketsIntoGrid.java[295-313]
[reliability] Pre-upgrade close not forwarded
Pre-upgrade close not forwarded
If upstream onClose/onError happens before the client-side upgrade completes, the listener sets a terminal "closed" state without retaining close details, and onUpgrade later publishes the channel without closing it—so the client WebSocket can remain open even though upstream is already gone. This can lead to downstream clients hanging until external timeouts/keepalives rather than receiving a close handshake promptly.When the upstream WebSocket closes/errors before the downstream WebSocket upgrade completes, DirectForwardingListener latches closed=true but drops the close details and onUpgrade(Channel) does not react to the terminal state. If the downstream upgrade completes later, the client channel may be published and left open without a close handshake.
- Upstream connection is initiated before the Netty handshake completes, so pre-upgrade upstream terminal events are possible.
-
closeClient()only closes the upstreamHttpClient, not the downstream Netty channel.
- Record a terminal state for pre-upgrade close/error (e.g., store
closeCode/closeReason, or store aCloseWebSocketFrameto be written post-upgrade). - In
onUpgrade(Channel ch), if a terminal state was observed (closed==true), immediately write a close frame (if applicable) and close the channel (or close without sending if protocol requires), instead of publishing the channel for normal forwarding. - Add a unit test that calls
listener.onClose(...)(oronError(...)) beforelistener.onUpgrade(channel)and asserts the channel is closed and/or aCloseWebSocketFrameis emitted.
- java/src/org/openqa/selenium/grid/router/ProxyWebsocketsIntoGrid.java[241-306]
- java/src/org/openqa/selenium/grid/router/ProxyWebsocketsIntoGrid.java[308-345]
- java/test/org/openqa/selenium/grid/router/DirectForwardingListenerTest.java[1-106]
[reliability] Unbounded frame buffer
Unbounded frame buffer
DirectForwardingListener buffers pre-upgrade frames in an unbounded deque with no cap, so if the client upgrade is delayed/stalled while upstream is producing frames, memory can grow without bound (including retaining ref-counted frame buffers). This risks router memory pressure or OOM under pathological conditions.DirectForwardingListener uses an unbounded Deque<WebSocketFrame> to buffer frames arriving before onUpgrade(Channel) is called. If the downstream upgrade stalls while upstream continues sending, the deque can grow without limit, retaining ref-counted frames and risking memory exhaustion.
- Buffering is correct for ordering, but needs a safety valve.
- Add a hard limit (by frame count and/or total bytes) for
pending. - When the limit is exceeded, choose a deterministic policy: drop newest/oldest with
release(), and/or close the downstream channel once available; also log at WARNING with session context. - Consider a time-based cutoff (e.g., if upgrade hasn’t completed within N seconds, discard pending and mark closed).
- java/src/org/openqa/selenium/grid/router/ProxyWebsocketsIntoGrid.java[220-285]
- java/src/org/openqa/selenium/grid/router/ProxyWebsocketsIntoGrid.java[331-336]
| PR 17433 (2026-05-10) |
[correctness] Mypy devtools override mismatch
Mypy devtools override mismatch
`tool.mypy.overrides.module` contains a slash-separated pattern (`selenium/webdriver/common/devtools.*`) that won’t match real Python module names, so the override won’t apply and mypy will still typecheck/flag generated DevTools modules when they exist in-tree.The mypy override intended to ignore generated DevTools modules uses a slash-separated path pattern (selenium/webdriver/common/devtools.*) which does not match Python module names, so ignore_errors will not take effect.
-
py/run_mypy.pyruns mypy against theseleniumpackage. - Generated devtools code lives under
selenium/webdriver/common/devtools/...and is imported asselenium.webdriver.common.devtools....
Update the override module pattern to dotted form, e.g. selenium.webdriver.common.devtools.* (and optionally also include selenium.webdriver.common.devtools to cover the package __init__.py).
- py/pyproject.toml[139-145]
| PR 17432 (2026-05-10) |
[security] PAT used for checkout
PAT used for checkout
The pin-browsers create-pr job checks out the repository using the write-scoped SELENIUM_CI_TOKEN even though checkout itself only needs read access, increasing the exposure surface of that secret in the job workspace/git config. This violates least-privilege and makes accidental leakage more impactful.pin-browsers.yml uses the write-scoped secrets.SELENIUM_CI_TOKEN for actions/checkout, which is broader than necessary for a read-only clone and increases secret exposure.
The same job already provides SELENIUM_CI_TOKEN to peter-evans/create-pull-request, so checkout does not need to be performed with the PAT.
- Change the checkout step to use the default
github.token(or omittoken:entirely). - Optionally set
persist-credentials: falseon checkout to avoid leaving credentials in the local git config.
- .github/workflows/pin-browsers.yml[29-32]
| PR 17426 (2026-05-08) |
[correctness] `assumeThat` not imported
`assumeThat` not imported
The newly added `assumeThat(...)` calls will not compile because there is no static import (or qualified reference) for `assumeThat`. This breaks the test build and prevents the intended Chrome-version gating from executing.assumeThat(...) is used in ContentEditableTest but is not imported or qualified, causing compilation failures and preventing the intended Chrome-version test gating.
Other tests in this repo use import static org.assertj.core.api.Assumptions.assumeThat; (or qualify the call via Assumptions.assumeThat(...)). This file currently only imports assertThat and JUnit's assumeFalse.
- java/test/org/openqa/selenium/ContentEditableTest.java[20-33]
- java/test/org/openqa/selenium/ContentEditableTest.java[101-106]
- java/test/org/openqa/selenium/ContentEditableTest.java[115-123]
| PR 17423 (2026-05-07) |
[reliability] Connected true after close
Connected true after close
The WebSocket 'open' handler unconditionally sets connected=true even if close()/_failPending() already marked the connection closed, which can leave the instance in an inconsistent state (_closed===true but isConnected===true). This can happen when close() is called during CONNECTING and the handshake completes anyway before the close event fires.The WebSocket open handler sets this.connected = true even if _closed was already set by close()/_failPending(), allowing isConnected to become true after the connection is logically closed.
close() calls _failPending() immediately (setting _closed=true), but does not remove/guard the constructor open listener. If open fires after close() is initiated, the state becomes inconsistent.
- javascript/selenium-webdriver/bidi/index.js[39-46]
- javascript/selenium-webdriver/bidi/index.js[96-113]
- javascript/selenium-webdriver/bidi/index.js[275-295]
In the open handler, early-return if this._closed is true (and optionally immediately this._ws.close()/terminate()), so connected cannot be re-enabled after closure.
[maintainability] `waitForConnection()` now rejects
`waitForConnection()` now rejects
`waitForConnection()` previously only resolved (or could hang) but now rejects when the socket is closed or closes before opening, which can break callers that do not handle promise rejection. This changes the public API contract/behavior for a user-facing interface.waitForConnection() now rejects on closed/failed connections, which is a user-visible behavioral change that may break downstream code that previously awaited it without handling rejections.
This class is exported from selenium-webdriver/bidi, so changes to promise rejection behavior can be compatibility-impacting.
- javascript/selenium-webdriver/bidi/index.js[134-155]
[reliability] close() can hang waiters
close() can hang waiters
Index.waitForConnection() depends on a WebSocket 'close'/'error' listener to reject when the socket dies before opening, but Index.close() removes all 'close' listeners before initiating ws.close(), which can remove waitForConnection()’s onFailure handler. If close() is called while a send() is awaiting waitForConnection(), that wait can remain pending indefinitely.waitForConnection() uses this._ws.once('close'|'error', ...) to reject when the socket fails before opening. But close() calls this._ws.removeAllListeners('close') before triggering this._ws.close(), which can remove waitForConnection()’s failure handler if a caller closes while another caller is awaiting connection.
This can lead to an indefinitely pending waitForConnection() (and therefore send()) promise during connection setup/cancellation scenarios.
- javascript/selenium-webdriver/bidi/index.js[133-155]
- javascript/selenium-webdriver/bidi/index.js[272-291]
- Avoid
removeAllListeners('close')before the socket actually closes. Instead:- Remove only the internal close handler(s) you own (store them as named/bound functions so they can be removed explicitly), or
- Defer
removeAllListeners()until inside theonce('close', ...)callback (after other listeners likewaitForConnection()have had a chance to run).
- Optionally, make
_failPending()also reject/resolve a tracked “connect promise” if you introduce one, soclose()can reliably unblockwaitForConnection()without relying on WS listener ordering.
[observability] Generic error on connect
Generic error on connect
waitForConnection() rejects with "BiDi connection closed before opening" even when the WebSocket emits an 'error' event before 'open', discarding the underlying error message. This makes connection failures harder to diagnose for callers awaiting waitForConnection().waitForConnection() uses a single onFailure handler for both 'close' and 'error' before the socket opens, but it always rejects with BiDi connection closed before opening, losing the real connection error detail.
This only affects the pre-open wait path (callers awaiting waitForConnection() / send() while not connected). Including the underlying error message improves diagnosability.
- javascript/selenium-webdriver/bidi/index.js[143-154]
Split the handlers:
-
onClose-> reject with the current close message -
onError(err)-> reject with something likeBiDi connection error before opening: ${err.message}Also keep the listener cleanup symmetric (remove the other listeners on success/failure).
[reliability] send() can hang forever
send() can hang forever
After a socket 'close'/'error', _failPending() sets connected=false, but send() will then await waitForConnection(), which only resolves on a future 'open' event; on a closed/failed WebSocket this can leave new send() calls pending indefinitely. This can stall higher-level BiDi helpers that call bidi.send() without checking connection state.After _failPending() runs, connected becomes false, and future send() calls can await waitForConnection() forever because there will be no future 'open' event on a closed/failed socket.
_failPending() is triggered on the underlying WebSocket 'close'/'error' events and marks the connection closed. Many BiDi helper classes call bidi.send() directly; if the connection drops and a command is attempted afterward, the call can hang indefinitely.
- javascript/selenium-webdriver/bidi/index.js[91-101]
Add a fast-fail guard in send() (or waitForConnection()):
- If
this._closedis true, immediately reject/throw a clear error. - Also consider checking
this._ws.readyStateand rejecting whenCLOSING/CLOSED. - Optionally, have
waitForConnection()reject on'close'/'error'if it is waiting for'open'.
[correctness] Invalid BiDi messages ignored
Invalid BiDi messages ignored
The new shared `message` dispatcher silently drops JSON parse failures and messages without a numeric `id`, which can turn protocol/input issues into misleading request timeouts and makes diagnosis non-actionable. Protocol-derived inputs should be validated with deterministic, actionable errors rather than being accepted/dropped silently.The shared WebSocket message handler swallows protocol/input problems (JSON parse failures or unexpected payload shapes) by returning early, which can cause callers to see unrelated timeouts and makes failures hard to diagnose.
Messages received over the BiDi WebSocket are protocol-derived external inputs. When they are malformed/unexpected, we should surface a deterministic, actionable error (e.g., emit an error event and/or reject impacted pending requests) rather than silently dropping the message.
- javascript/selenium-webdriver/bidi/index.js[45-54]
[reliability] Unexpected close stalls sends
Unexpected close stalls sends
Index only rejects pending send() calls on timeout or explicit close(); it does not handle underlying WebSocket 'close'/'error' events. If the peer disconnects mid-request, callers can hang until RESPONSE_TIMEOUT (30s) instead of failing immediately.Index tracks in-flight requests in _pending, but it does not subscribe to the underlying WebSocket 'close' / 'error' events. If the connection drops without an explicit Index.close() call, pending send() promises will only fail after RESPONSE_TIMEOUT, delaying failure propagation.
-
close()now correctly rejects_pendingwhen invoked, but unexpected disconnects still leave callers waiting up to 30 seconds. - Since there is now a single dispatcher and centralized
_pending, the most reliable place to fail outstanding requests is in a single socket-level close/error handler.
- javascript/selenium-webdriver/bidi/index.js[33-63]
- javascript/selenium-webdriver/bidi/index.js[217-241]
- Add
_ws.on('close', ...)and_ws.on('error', ...)handlers in the constructor. - In those handlers:
- set
this.connected = false - iterate
_pending.values()andclearTimeout(timeoutId)thenreject(new Error(...)) _pending.clear()
- set
- Ensure handlers are idempotent (e.g., guard with a boolean like
this._closed = true) soclose()and'close'event don’t double-reject. - (Optional) extend tests to simulate server-side disconnect and assert pending sends reject promptly (without waiting for the timeout).
| PR 17421 (2026-05-07) |
[reliability] Missing artifact download perms
Missing artifact download perms
The publish-python job sets job-level permissions to only `id-token: write`, overriding the workflow default and stripping all GITHUB_TOKEN scopes. As a result, `actions/download-artifact@v4` may be unable to download `pypi-distributions`, causing the PyPI publish step to fail.publish-python sets permissions to only id-token: write, which overrides the workflow defaults and leaves the job’s GITHUB_TOKEN without the scopes that actions/download-artifact@v4 may require to fetch artifacts. This can break Python publishing because the pypi-distributions artifact may not download.
This workflow already uses a broader permission set in other jobs that perform authenticated GitHub API operations (including artifact downloads).
- .github/workflows/release.yml[96-110]
Update publish-python permissions to include the minimal required scopes for artifact download, e.g.:
permissions:
id-token: write
contents: read
actions: read(Keep id-token: write for PyPI trusted publishing.)
This wiki is not where you want to be! Visit the Wiki Home for more useful links
Getting Involved
Triaging Issues
Releasing Selenium
Ruby Development
Python Bindings
Ruby Bindings
WebDriverJs
This content is being evaluated for where it belongs
Architectural Overview
Automation Atoms
HtmlUnitDriver
Lift Style API
LoadableComponent
Logging
PageFactory
RemoteWebDriver
Xpath In WebDriver
Moved to Official Documentation
Bot Style Tests
Buck
Continuous Integration
Crazy Fun Build
Design Patterns
Desired Capabilities
Developer Tips
Domain Driven Design
Firefox Driver
Firefox Driver Internals
Focus Stealing On Linux
Frequently Asked Questions
Google Summer Of Code
Grid Platforms
History
Internet Explorer Driver
InternetExplorerDriver Internals
Next Steps
PageObjects
RemoteWebDriverServer
Roadmap
Scaling WebDriver
SeIDE Release Notes
Selenium Emulation
Selenium Grid 4
Selenium Help
Shipping Selenium 3
The Team
TLC Meetings
Untrusted SSL Certificates
WebDriver For Mobile Browsers
Writing New Drivers