Aquileo | [Docs] [Serve] Improve serve autoscaling docs by marwan116 · Pull Request #59475 · ray-project/ray

marwan116 · 2025-12-16T15:26:43Z

Description

Slight improvements to the Ray Serve custom autoscaling guide.

To ease review, each change was performed in a separate commit. Read commit titles for an overview.

gemini-code-assist

Code Review

This pull request improves the Ray Serve autoscaling documentation by clarifying the purpose of AutoscalingContext, adding file name headers to code examples for better readability, and updating an example to use psutil for more realistic custom metrics. The changes are clear and enhance the quality of the documentation.

I've found one issue in an example code snippet for a custom autoscaling policy, where the logic to find the maximum metric value is incorrect. I've left a comment with a suggestion to fix it.

gemini-code-assist · 2025-12-16T15:28:34Z

 ```{literalinclude} ../doc_code/autoscaling_policy.py
 :language: python
 :start-after: __begin_custom_metrics_autoscaling_policy__
 :end-before: __end_custom_metrics_autoscaling_policy__
 ```


The custom_metrics_autoscaling_policy example included here contains incorrect logic for determining the maximum CPU and memory usage. It uses list(some_dict.values())[-1], which does not guarantee returning the maximum value because dictionary values are not ordered by value. This could mislead users who copy this example, leading to incorrect autoscaling behavior.

To fix this, the max() function should be used. Please update the corresponding file doc/source/serve/doc_code/autoscaling_policy.py to correct this.

Here is the corrected code for custom_metrics_autoscaling_policy:

    def custom_metrics_autoscaling_policy(
        ctx: AutoscalingContext,
    ) -> tuple[int, Dict[str, Any]]:
        cpu_usage_metric = ctx.aggregated_metrics.get("cpu_usage", {})
        memory_usage_metric = ctx.aggregated_metrics.get("memory_usage", {})
        max_cpu_usage = max(cpu_usage_metric.values()) if cpu_usage_metric else 0
        max_memory_usage = (
            max(memory_usage_metric.values()) if memory_usage_metric else 0
        )
        if max_cpu_usage > 80 or max_memory_usage > 85:
            return min(ctx.capacity_adjusted_max_replicas, ctx.current_num_replicas + 1), {}
        elif max_cpu_usage < 30 and max_memory_usage < 40:
            return max(ctx.capacity_adjusted_min_replicas, ctx.current_num_replicas - 1), {}
        else:
            return ctx.current_num_replicas, {}
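As a rough illustration of the reviewer's point, the sketch below compares the two ways of reducing a per-replica metric dictionary. The helper names `last_value` and `max_value`, the string replica keys, and the sample numbers are hypothetical and are not taken from the Ray code base; they only stand in for one entry of `ctx.aggregated_metrics`.

```python
from typing import Dict


def last_value(metric: Dict[str, float]) -> float:
    """Pattern flagged in the review: returns the most recently inserted value."""
    return list(metric.values())[-1] if metric else 0


def max_value(metric: Dict[str, float]) -> float:
    """Suggested fix: returns the largest value regardless of insertion order."""
    return max(metric.values()) if metric else 0


# Hypothetical per-replica CPU readings; the busiest replica is listed first.
cpu_usage = {"replica-a": 91.0, "replica-b": 42.0, "replica-c": 17.5}

print(last_value(cpu_usage))  # 17.5, the hot replica is missed
print(max_value(cpu_usage))   # 91.0
```

With these sample values, a policy that scales up above 80 would stay put under the first helper and scale up under the second.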

Signed-off-by: Marwan Sarieddine <sarieddine.marwan@gmail.com>

cursor · 2025-12-16T15:37:09Z

+        # Get memory usage as a percentage of system memory
+        memory_info = self.process.memory_full_info()
+        system_memory = psutil.virtual_memory().total
+        memory_usage = (memory_info.uss / system_memory) * 100


Bug: Documentation example may fail in containerized environments

The memory_full_info() method can raise psutil.AccessDenied on Linux systems where the process lacks permission to read /proc/[pid]/smaps, which is common in containerized environments where Ray Serve is frequently deployed. This could cause the documentation example to fail when users copy it. The more portable memory_info() method with rss attribute would work reliably across all environments.
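One possible way to act on this comment, sketched here under the assumption that `psutil` is installed, is to try `memory_full_info()` first and fall back to `memory_info().rss` when access is denied. The helper name `memory_usage_percent` is hypothetical and is not part of the PR or of Ray.

```python
import psutil


def memory_usage_percent(process: psutil.Process) -> float:
    """Return the process's memory use as a percentage of total system memory."""
    try:
        # USS counts only memory unique to this process, but reading it
        # can require extra permissions on Linux.
        used_bytes = process.memory_full_info().uss
    except psutil.AccessDenied:
        # RSS needs no extra permissions; note that it also counts shared pages.
        used_bytes = process.memory_info().rss
    return used_bytes / psutil.virtual_memory().total * 100


if __name__ == "__main__":
    print(f"{memory_usage_percent(psutil.Process()):.2f}%")
```

Because RSS includes shared pages, the fallback tends to report a somewhat higher figure than USS, so thresholds tuned against one may need adjusting for the other.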

abrarsheikh · 2025-12-16T22:05:14Z

@marwan116 please take a look at failing build

marwan116 · 2025-12-17T06:53:19Z

> @marwan116 please take a look at failing build

@abrarsheikh they all look like flaky failures due to a Gateway timeout to me - let me know if I am missing anything. Unfortunately I don't have permissions to retry or start a rebuild.

abrarsheikh · 2025-12-17T07:22:21Z

> > @marwan116 please take a look at failing build
>
> @abrarsheikh they all look like flaky failures due to a Gateway timeout to me - let me know if I am missing anything. Unfortunately I don't have permissions to retry or start a rebuild.

I suggest pulling master into your branch and doing a git push.

marwan116 requested review from a team as code owners December 16, 2025 15:26

gemini-code-assist Bot reviewed Dec 16, 2025

marwan116 force-pushed the improve-serve-autoscaling-docs branch from 4e42474 to 7828bdb Compare December 16, 2025 15:29

marwan116 added 5 commits December 16, 2025 17:33

Remove irrelevant kwargs for scheduled_batch_processing_policy

23f5943

Signed-off-by: Marwan Sarieddine <sarieddine.marwan@gmail.com>

Clarify the points are about AutoscalingContext

f8e2c6a

Signed-off-by: Marwan Sarieddine <sarieddine.marwan@gmail.com>

Improve type annotation

0d5b499

Signed-off-by: Marwan Sarieddine <sarieddine.marwan@gmail.com>

Add file names prior to code snippets

6fb9539

Signed-off-by: Marwan Sarieddine <sarieddine.marwan@gmail.com>

Make CustomMetricsDeployment example more realistic

bc230a2

Signed-off-by: Marwan Sarieddine <sarieddine.marwan@gmail.com>

marwan116 force-pushed the improve-serve-autoscaling-docs branch from 7828bdb to bc230a2 Compare December 16, 2025 15:34

cursor Bot reviewed Dec 16, 2025

marwan116 requested review from abrarsheikh and akshay-anyscale December 16, 2025 16:13

abrarsheikh added the "go" label (add ONLY when ready to merge, run all tests) Dec 16, 2025

ray-gardener Bot added the "serve" label (Ray Serve Related Issue) Dec 16, 2025

marwan116 added 2 commits December 17, 2025 09:31

Merge remote-tracking branch 'origin/master' into improve-serve-autos…

b57b5b3

…caling-docs

Merge remote-tracking branch 'origin/master' into improve-serve-autos…

cbffdd7

…caling-docs

abrarsheikh approved these changes Dec 17, 2025

abrarsheikh merged commit 17bbf59 into master Dec 17, 2025
6 checks passed

abrarsheikh deleted the improve-serve-autoscaling-docs branch December 17, 2025 19:17

abrarsheikh merged 7 commits into master from improve-serve-autoscaling-docs
