Current Behavior
When I add a consumer (e.g. georg) with an ai-rate-limitin configuration as shown in the example, and then create a second consumer (e.g. martin) with the same plugin configuration but a different key-auth API key, both consumers end up sharing the same rate limit for an instance.
Once georg reaches the configured token limit, request from martin are also rejected with 429 "Configured rate limit reached", even though martin has his own consumer entry and API key.
The rate limitin appears to be applied globally per model, rather than per consumer, as described in the documentation.
{
"username":"georg",
"plugins":{
"key-auth":{
"key":"Bearer "
},
"ai-rate-limiting":{
"instances":[
{
"name":"gpt-oss-120b",
"limit_strategy":"prompt_tokens",
"time_window":60,
"limit":400
},
{
"name":"bge-m3",
"limit_strategy":"prompt_tokens",
"time_window":60,
"limit":10000
},
{
"name":"gpt-oss-120b",
"limit_strategy":"completion_tokens",
"time_window":60,
"limit":400
}
],
"rejected_code":429,
"rejected_msg":"Configured rate limit reached",
"show_limit_quota_header":true
}
}
}
Expected Behavior
Each consumer should have an independent rate limit quota.
When multiple consumers (e.g. georg and marting) are configured with the same ai-rate-limiting plugin settings but different key-auth API keys, the token limit should be enforced per consumer, not shared globally.
If georg reaches his configured token limit, only request from georg should be rejected with 429 ..., while martin should still be able to make requests within his own quota.
Error Logs
No response
Steps to Reproduce
- Create a Consumer named georg with key-auth enabled and configure the ai-rate-limiting plugin with a token limit (e.g. 400 tokens per 60 seconds for an instance).
- Create a second Consumer named martin with the same ai-rate-limiting configuration, but with a different key-auth API key.
- Send requests using georg’s API key until the configured token limit is reached and requests start returning: "429 Configured rate limit reached"
- Immediately send requests using martin’s API key.
- Observe that martin’s requests are also rejected with: "429 Configured rate limit reached"
Environment
-
APISIX version (run apisix version):
3.14.1
-
Operating system (run uname -a):
Linux (Kubernetes container, official APISIX Docker image)
-
OpenResty / Nginx version (run openresty -V or nginx -V):
OpenResty (bundled with APISIX Docker image)
-
etcd version, if relevant (run curl http://127.0.0.1:9090/v1/server_info):
v3.6.0 (self-deployed in Kubernetes)
-
APISIX Dashboard version, if relevant:
Not used
-
Plugin runner version, for issues related to plugin runners:
Not used (using a serverless-post-function)
-
LuaRocks version, for installation issues (run luarocks --version):
Not applicable (using official Docker image)
Current Behavior
When I add a consumer (e.g. georg) with an ai-rate-limitin configuration as shown in the example, and then create a second consumer (e.g. martin) with the same plugin configuration but a different key-auth API key, both consumers end up sharing the same rate limit for an instance.
Once georg reaches the configured token limit, request from martin are also rejected with 429 "Configured rate limit reached", even though martin has his own consumer entry and API key.
The rate limitin appears to be applied globally per model, rather than per consumer, as described in the documentation.
Expected Behavior
Each consumer should have an independent rate limit quota.
When multiple consumers (e.g. georg and marting) are configured with the same ai-rate-limiting plugin settings but different key-auth API keys, the token limit should be enforced per consumer, not shared globally.
If georg reaches his configured token limit, only request from georg should be rejected with 429 ..., while martin should still be able to make requests within his own quota.
Error Logs
No response
Steps to Reproduce
Environment
APISIX version (run
apisix version):3.14.1
Operating system (run
uname -a):Linux (Kubernetes container, official APISIX Docker image)
OpenResty / Nginx version (run
openresty -Vornginx -V):OpenResty (bundled with APISIX Docker image)
etcd version, if relevant (run
curl http://127.0.0.1:9090/v1/server_info):v3.6.0 (self-deployed in Kubernetes)
APISIX Dashboard version, if relevant:
Not used
Plugin runner version, for issues related to plugin runners:
Not used (using a serverless-post-function)
LuaRocks version, for installation issues (run
luarocks --version):Not applicable (using official Docker image)