Upgrade vLLM to 0.17.0 #61598
Code Review
This pull request correctly upgrades vLLM to version 0.17.0 and updates its dependencies accordingly. The code changes are consistent with this upgrade. However, it seems a local configuration for a PyPI index has been accidentally included in many of the dependency lock files. This should be removed to avoid breaking builds.
fb6c09d to 56b154c
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
56b154c to 2b1741a
opentelemetry-proto==1.39.0 \
    --hash=sha256:1e086552ac79acb501485ff0ce75533f70f3382d43d0a30728eeee594f7bf818 \
    --hash=sha256:c1fa48678ad1a1624258698e59be73f990b7fc1f39e73e16a9d08eef65dd838c
opentelemetry-proto==1.34.1 \
I think this is compiled from https://github.com/ray-project/ray/blob/15a473454084a739264ce66290d7d4fc1b3926b4/python/requirements/serve/tracing-reqs.txt + all opentelemetry libraries should have the same version.
hmm... that does not really make sense to me.
- how does `tracing-reqs.txt` get pulled in with this change?
- should that be upgraded to 1.39.0 for consistency?

I think some additional dependency of vllm 0.17 is pulling the version down.
@elliot-barn could you help investigate? like what will happen if we enforce opentelemetry-proto>=1.39.0 as a constraint?
opentelemetry-proto 1.40.0 depends on protobuf<7.0 and >=5.0
opentelemetry-proto 1.39.1 depends on protobuf<7.0 and >=5.0
opentelemetry-proto 1.39.0 depends on protobuf<7.0 and >=5.0
We are still on protobuf 4.25.8, and the py313 dependency upgrade initiative will bring us to 5.29.6.
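The version conflict above can be illustrated with a small check. This is only a sketch: the helper names `parse_version` and `satisfies` are hypothetical, and the tuple comparison handles plain dotted releases only, whereas a real resolver applies full PEP 440 rules.

```python
def parse_version(text: str) -> tuple:
    """Parse a plain dotted release such as '4.25.8' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version: str, lower: str, upper: str) -> bool:
    """Return True if lower <= version < upper."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# Per the thread, opentelemetry-proto 1.39.0 and later need protobuf>=5.0,<7.0.
for protobuf_version in ("4.25.8", "5.29.6"):
    ok = satisfies(protobuf_version, "5.0", "7.0")
    print(f"protobuf {protobuf_version}: {'compatible' if ok else 'incompatible'}")
# prints:
# protobuf 4.25.8: incompatible
# protobuf 5.29.6: compatible
```

With the current protobuf pin the resolver has no way to pick opentelemetry-proto 1.39.0, which would explain why it settles on an older release.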
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
aslonnie left a comment
I would like to understand more on why opentelemetry-proto needs to be downgraded.
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
1. Restore `--index https://download.pytorch.org/whl/${CUDA_CODE}` in rayllm.depsets.yaml. This was accidentally dropped, causing cu128 lockfiles to resolve torchaudio from PyPI instead of the CUDA index.
2. Add numexpr>=2.10 to llm-test-requirements.txt. The CI base Docker image has numexpr compiled against NumPy 1.x, but the lockfile installs NumPy 2.x, causing a binary incompatibility crash. Including numexpr in the lockfile ensures a compatible version overwrites the base image's broken one.
3. Regenerate all 16 LLM lockfiles.

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Master also lacks numexpr in lockfiles with the same NumPy 2.2.6. The CPU test numexpr failure is a base image issue, not caused by this branch. Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
518bd4c to d9d7a78
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
68b5b08 to 9a06636
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
# exact versions, so integrity is maintained through version pinning.
uv pip install --system --no-cache-dir --no-deps \
    --index-strategy unsafe-best-match \
    --no-verify-hashes \
Disabling hash verification weakens supply chain security
Medium Severity
Adding --no-verify-hashes disables integrity checking for all packages installed from the lock file. The lock files still contain hashes, but they are completely ignored during installation. This means a compromised or tampered package on the CUDA index (or any alternate index used via unsafe-best-match) could be installed without detection. While version pinning provides some defense, hash verification is the primary protection against supply chain attacks where an index serves a modified binary for a pinned version. A more targeted fix — such as regenerating hashes from the actual CUDA index, or excluding only the mismatched packages — would preserve integrity checking for the majority of dependencies.
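To make the concern concrete, here is a minimal sketch of what hash verification does conceptually. It is not uv's implementation; `sha256_of` and `verify_artifact` are hypothetical helpers, and the byte strings stand in for two wheel builds of the same pinned version.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, allowed_hashes: set) -> bool:
    """Accept the artifact only if its digest is one the lock file lists."""
    return sha256_of(data) in allowed_hashes


# Stand-ins for the same pinned version as served by two different indexes.
pypi_wheel = b"wheel bytes as served by PyPI"
cuda_wheel = b"wheel bytes as served by the CUDA index"

locked = {sha256_of(pypi_wheel)}  # the lock file only records the PyPI digest

print(verify_artifact(pypi_wheel, locked))  # True
print(verify_artifact(cuda_wheel, locked))  # False
```

The second check fails even though the version matches, which is the mismatch the install step runs into. Skipping the check accepts both artifacts, including one that has been tampered with.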
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 3 total unresolved issues (including 2 from previous reviews).
# exact versions, so integrity is maintained through version pinning.
uv pip install --system --no-cache-dir --no-deps \
    --index-strategy unsafe-best-match \
    --no-verify-hashes \
Missing LD_LIBRARY_PATH fix in production ray-llm Dockerfile
High Severity
The production docker/ray-llm/Dockerfile upgrades to vLLM 0.17.0 but is missing the LD_LIBRARY_PATH=/home/ray/anaconda3/lib environment variable that was added to ci/docker/llm.build.Dockerfile and release/ray_release/byod/byod.Dockerfile. The PR description explains that vLLM 0.17.0 eagerly imports xgrammar, triggering a libstdc++ dependency chain requiring CXXABI_1.3.15 from conda's copy. Without this fix, the production image will fail at runtime on import vllm.
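As an illustration of the ordering issue, the sketch below builds an `LD_LIBRARY_PATH` value that puts one directory first. The helper `prepend_library_path` is hypothetical and is not how the Dockerfiles apply the fix (they set the environment variable directly).

```python
from typing import Optional


def prepend_library_path(directory: str, current: Optional[str]) -> str:
    """Return an LD_LIBRARY_PATH value in which `directory` is searched first."""
    # Drop empty entries and any existing copy of `directory` to avoid duplicates.
    entries = [e for e in (current or "").split(":") if e and e != directory]
    return ":".join([directory, *entries])


conda_lib = "/home/ray/anaconda3/lib"  # path quoted in the review comment
print(prepend_library_path(conda_lib, None))
print(prepend_library_path(conda_lib, "/usr/local/lib:/usr/lib"))
```

The dynamic linker reads `LD_LIBRARY_PATH` when a process starts, so a value like this has to be in the environment before Python launches (for example via a Dockerfile `ENV` line or the `env` argument of `subprocess.run`), not assigned to `os.environ` after the interpreter is already running.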
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
## Description
Upgrade to vLLM 0.17.0 and re-compile dependencies. Here are the primary changes aside from fixing breaking APIs:

### 1. `no-verify-hashes` at `pip install` time
`uv pip compile` generates lock files with hashes from PyPI, but `uv pip install --index-strategy unsafe-best-match` may download the same package (specifically `triton`) from the PyTorch CUDA index instead, which serves a different wheel build with a different hash. The workaround is `--no-verify-hashes`, which disables all hash integrity checking.

#### Alternative approach: Augment hashes after compilation
- At lock file compile time, after `uv pip compile` generates hashes from its resolved index (PyPI), a post-processing step queries every `--extra-index-url` (e.g. PyTorch CUDA index) for each package and appends any additional SHA-256 hashes to the lock file.
- At Docker install time, `--verify-hashes` works again because the lock file now contains hashes from all indexes that `--index-strategy unsafe-best-match` might download from.
- This is more complicated.

### 2. libstdc++ / CXXABI / ICU compatibility
In vLLM 0.16.0, XgrammarBackend was imported lazily inside a method (grammar_init), so the `diskcache` → `sqlite3` → `ICU` → `libstdc++` chain was only triggered when actually using structured output.

In vLLM 0.17.0, XgrammarBackend is imported at the top level of `__init__.py`, meaning `import vllm` now eagerly loads the entire chain: `__init__.py` → `backend_xgrammar.py` → `utils.py` → diskcache → sqlite3 → `_sqlite3.so` → `libicui18n.so.78` → `libstdc++` (CXXABI_1.3.15).

The system `libstdc++` only goes up to CXXABI_1.3.13, but conda's copy (`libstdc++.so.6.0.34`) has CXXABI_1.3.15. Conda installs its own C++ ecosystem (libstdc++, ICU, etc.) but at runtime the dynamic linker finds the system libstdc++ first. The fix is to make sure conda's newer `libstdc++` is found first by setting `LD_LIBRARY_PATH`.

---------

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
Co-authored-by: Lonnie Liu <95255098+aslonnie@users.noreply.github.com>
Co-authored-by: elliot-barn <elliot.barnwell@anyscale.com>
Signed-off-by: 熠欣 <limoxuan.lmx@alibaba-inc.com>
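The "augment hashes after compilation" alternative from the description could look roughly like the sketch below. It covers only the lock-file rewriting step. The function `augment_entry`, the package name, and the digests are all hypothetical, and fetching the extra digests from each index is left out.

```python
def augment_entry(entry: str, extra_hashes: list) -> str:
    """Append missing `--hash=sha256:...` lines to one requirements-style entry."""
    # Strip trailing line continuations so the entry can be re-joined uniformly.
    lines = [line.rstrip(" \\") for line in entry.strip().splitlines()]
    present = {line.strip() for line in lines}
    for digest in extra_hashes:
        option = f"--hash=sha256:{digest}"
        if option not in present:
            lines.append(f"    {option}")
            present.add(option)
    return " \\\n".join(lines)


# Hypothetical entry as a compile step might emit it (digest from PyPI only).
entry = "example-pkg==1.0.0 \\\n    --hash=sha256:" + "a" * 64

# Hypothetical digests collected from an extra index; the first is already locked.
from_extra_index = ["a" * 64, "b" * 64]

print(augment_entry(entry, from_extra_index))
```

Because digests that are already present are skipped, running the step again leaves the entry unchanged. The trade-off named in the description still applies: the post-processing step must query every extra index for every package.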
…roject#61882)

## Description
Fix stale open-telemetry hashes in llm py312 lock files missed by ray-project#61598.

## Related issues
Post-merge failures: https://buildkite.com/ray-project/postmerge/builds/16555/steps/canvas?sid=019d047c-934a-44b9-a9bd-89756bcdc297&tab=output.

## Additional information
Testing

Built locally with wanda:

**export PYTHON=3.12 && export BASE_TYPE=build && export BUILD_VARIANT=build && export RAY_CUDA_CODE=cpu && wanda ci/docker/llm.build.wanda.yaml**

<details>
<summary>Click to see logs</summary>

```
2026/03/19 18:01:32 building oss-ci-base_test-py3.12 (from ci/docker/base.test.wanda.yaml)
2026/03/19 18:01:41 build input digest: sha256:18a8af74139d665a6e758968bfbde633d6378f53fc93c9e43f44d5e44079d5dd
2026/03/19 18:01:41 cache hit: sha256:5a4947e0886491051e825e5818560be5424df55fb702b0490fd80a9f3df69e42
2026/03/19 18:01:41 tag output as cr.ray.io/rayproject/oss-ci-base_test-py3.12
2026/03/19 18:01:41 tag output as localhost:5000/rayci-work:oss-ci-base_test-py3.12
2026/03/19 18:01:41 tag output as localhost:5000/rayci-work:z-18a8af74139d665a6e758968bfbde633d6378f53fc93c9e43f44d5e44079d5dd
2026/03/19 18:01:41 building oss-ci-base_build-py3.12 (from ci/docker/base.build.wanda.yaml)
2026/03/19 18:01:41 build input digest: sha256:350b089d9da1200e6ec5c1ab39401339692fec2edee96f5b1710df664e95830d
2026/03/19 18:01:41 cache hit: sha256:912c6d3aeabd7af2a8aaaa69ae4fae1b6885be3ba89d324c5fdd9802e90e13e9
2026/03/19 18:01:41 tag output as cr.ray.io/rayproject/oss-ci-base_build-py3.12
2026/03/19 18:01:41 tag output as localhost:5000/rayci-work:oss-ci-base_build-py3.12
2026/03/19 18:01:41 tag output as localhost:5000/rayci-work:z-350b089d9da1200e6ec5c1ab39401339692fec2edee96f5b1710df664e95830d
2026/03/19 18:01:41 building llmbuild (from /home/ubuntu/repos/ray/ci/docker/llm.build.wanda.yaml)
2026/03/19 18:01:41 build input digest: sha256:90b7bd4eb00b5344b3ff80a3996023a33183a922639b86e91336c95c1aaf11f3
2026/03/19 18:01:41 cache hit: sha256:75c255fb274c3b7e373735eb20834391f65a20308b53e26e052c17de7920ed5c
2026/03/19 18:01:41 tag output as cr.ray.io/rayproject/llmbuild
2026/03/19 18:01:41 tag output as localhost:5000/rayci-work:llmbuild
2026/03/19 18:01:42 tag output as localhost:5000/rayci-work:z-90b7bd4eb00b5344b3ff80a3996023a33183a922639b86e91336c95c1aaf11f3
```

</details>

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
…ay-project#61929)

We shouldn't hijack `LD_LIBRARY_PATH` in general Ray release images, as was introduced by ray-project#61598.

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>

