RateLimit policy

Phase

onRequest	onResponse
X

Description

There are four rate-limit policies:

Quota: configures the number of requests allowed over a period of time (hours, days, weeks, months)
Rate-Limit: configures the number of requests allowed over a limited period of time (seconds, minutes)
Spike-Arrest: throttles the number of requests processed and sends them to the backend to avoid a spike
Token-Bucket: a steady refill rate with room to burst up to a set capacity

Compatibility with APIM

Plugin version	APIM version
1.x	Up to 3.19
2.x	3.20 to 4.5
3.x	4.6 to 4.8
4.x	4.9 to 4.11
5.x	4.12 to latest

Configuration

You can configure the policies with the following options:

Quota

The Quota policy configures the number of requests allowed over a long period of time (from hours to months). This policy does not prevent request spikes.

Property	Required	Description	Type	Default
Failover passthrough	No	In case of failure, the request is forwarded without quota.	Boolean	false
Non-strict mode (async)	No	By activating this option, quota is applied in an asynchronous way meaning that the distributed counter value is not strict (backend can receive more queries than configured).	Boolean	false
key	No	Key to identify a consumer to apply the quota against. Leave it empty to apply the default behavior (plan/subscription pair). Supports Expression Language.	String	null
limit	No	Static limit on the number of requests that can be sent (this limit is used if the value > 0).	integer	0
dynamicLimit	No	Dynamic limit on the number of requests that can be sent (this limit is used if static limit = 0). The dynamic value is based on Expression Language expressions.	string	null
periodTime	Yes	Static Time duration. This value must be empty to use the `dynamicPeriodTime`.	Integer	1
periodTimeUnit	Yes	Time unit (`HOURS`, `DAYS`, `WEEKS`, `MONTHS`)	String	MONTHS
dynamicPeriodTime	No	This expression is used when `periodTime` is empty. Supports EL. The default time unit is `HOURS`.	String	null
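
The interplay of `limit` and `dynamicLimit` described in the table can be pictured as a small resolution step. This is only a sketch: `resolve_limit` is a hypothetical helper, and passing the already-evaluated Expression Language result in as a plain string is an assumption about how the value arrives.

```python
def resolve_limit(static_limit, dynamic_value):
    """Return the effective request limit (hypothetical helper).

    static_limit  -- the `limit` property (integer, default 0)
    dynamic_value -- the evaluated `dynamicLimit` expression, or None
    """
    # The static limit is used whenever it is greater than 0.
    if static_limit > 0:
        return static_limit
    # Otherwise fall back to the dynamic value, if one was provided.
    if dynamic_value is not None:
        return int(dynamic_value)
    # Neither form is set: no usable limit (how the policy reacts is not covered here).
    return 0
```

For instance, `resolve_limit(1000, "50")` keeps the static value, while `resolve_limit(0, "50")` falls back to the dynamic one.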

Configuration example

Static Time Duration

{
  "quota": {
    "limit": 1000,
    "periodTime": 1,
    "periodTimeUnit": "MONTHS"
  }
}

Dynamic Time Duration

{
  "quota": {
    "limit": 1000,
    "dynamicPeriodTime": "{#context.attributes['rate-limit-quota-period']}",
    "periodTimeUnit": "HOURS"
  }
}

Rate-Limit

The Rate-Limit policy configures the number of requests allowed over a limited period of time (from seconds to minutes). This policy does not prevent request spikes.
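
To see why a per-period limit does not prevent spikes, consider a fixed-window counter. The sketch below is an illustration only, not the policy's implementation: the class name `FixedWindowCounter` and the fixed-window approach itself are assumptions.

```python
class FixedWindowCounter:
    """Illustrative fixed-window counter (hypothetical, not the plugin's code)."""

    def __init__(self, limit, period_seconds):
        self.limit = limit
        self.period = period_seconds
        self.window_start = None
        self.count = 0

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) is accepted."""
        # Start a new window when none exists or the current one has expired.
        if self.window_start is None or now - self.window_start >= self.period:
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


counter = FixedWindowCounter(limit=5, period_seconds=1)
# Five requests arrive at the same instant and are all accepted at once.
burst = [counter.allow(now=0.0) for _ in range(5)]
# A sixth request inside the same window is rejected.
sixth = counter.allow(now=0.5)
```

The whole allowance can be consumed in a single instant, which is the spike that the Spike-Arrest policy is meant to smooth out.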

Property	Required	Description	Type	Default
Failover passthrough	No	In case of failure, the request is forwarded without rate limiting.	Boolean	false
Non-strict mode (async)	No	By activating this option, rate-limiting is applied in an asynchronous way meaning that the distributed counter value is not strict (backend can receive more queries than configured).	Boolean	false
Add response headers	No	Add X-Rate-Limit-Limit, X-Rate-Limit-Remaining and X-Rate-Limit-Reset headers in HTTP response.	Boolean	false
key	No	Key to identify a consumer to apply rate-limiting against. Leave it empty to use the default behavior (plan/subscription pair). Supports Expression Language.	String	null
limit	No	Static limit on the number of requests that can be sent (this limit is used if the value > 0).	integer	0
dynamicLimit	No	Dynamic limit on the number of requests that can be sent (this limit is used if static limit = 0). The dynamic value is based on Expression Language expressions.	string	null
periodTime	Yes	Static Time duration. This value must be empty to use the `dynamicPeriodTime`.	Integer	1
periodTimeUnit	Yes	Time unit (`SECONDS`, `MINUTES`)	String	SECONDS
dynamicPeriodTime	No	This expression is used when `periodTime` is empty. Supports EL. The default time unit is `SECONDS`.	String	null

Configuration example

Static Time Duration

{
  "rate": {
    "limit": 1000,
    "periodTime": 1,
    "periodTimeUnit": "SECONDS"
  }
}

Dynamic Time Duration

{
  "rate": {
    "limit": 1000,
    "dynamicPeriodTime": "{#context.attributes['rate-limit-quota-period']}",
    "periodTimeUnit": "SECONDS"
  }
}

Spike Arrest

The Spike-Arrest policy configures the number of requests allowed over a limited period of time (from seconds to minutes). This policy prevents request spikes by throttling incoming requests. For example, a Spike-Arrest policy configured to 2000 requests/second will limit the execution of simultaneous requests to 200 requests per 100 ms.

By default, the SpikeArrest policy is applied to a plan, not a consumer. To apply a spike arrest to a consumer, you need to use the key attribute, which supports Expression Language.
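
The 2000 requests/second example above works out as follows. This sketch assumes the limit is scaled proportionally to a 100 ms slice; the helper name `slice_limit` and the fixed slice length are assumptions made for illustration, and the policy may choose its slice differently.

```python
def slice_limit(limit, period_ms, slice_ms=100):
    """Scale a per-period limit down to a shorter slice (hypothetical helper).

    Integer arithmetic only: the result is rounded down.
    """
    return limit * slice_ms // period_ms


# 2000 requests per second, expressed per 100 ms slice
per_slice = slice_limit(limit=2000, period_ms=1000)
```

With these inputs `per_slice` is 200, matching the figure quoted in the description.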

Property	Required	Description	Type	Default
Failover passthrough	No	In case of failure, the request is forwarded without spike arrest.	Boolean	false
Non-strict mode (async)	No	By activating this option, spike arrest is applied in an asynchronous way meaning that the distributed counter value is not strict (backend can receive more queries than configured).	Boolean	false
key	No	Key to identify a consumer to apply spike arresting against. Leave it empty to use the default behavior. Supports Expression Language (example: `{#request.headers['x-consumer-id']}`).	String	null
limit	No	Static limit on the number of requests that can be sent (this limit is used if the value > 0).	integer	0
dynamicLimit	No	Dynamic limit on the number of requests that can be sent (this limit is used if static limit = 0). The dynamic value is based on Expression Language expressions.	string	null
periodTime	Yes	Static Time duration. This value must be empty to use the `dynamicPeriodTime`.	Integer	1
periodTimeUnit	Yes	Time unit (`SECONDS`, `MINUTES`)	String	SECONDS
dynamicPeriodTime	No	This expression is used when `periodTime` is empty. Supports EL. The default time unit is `SECONDS`.	String	null

Configuration example

Static Time Duration

{
  "spike": {
    "limit": 1000,
    "periodTime": 1,
    "periodTimeUnit": "SECONDS"
  }
}

Dynamic Time Duration

{
  "spike": {
    "limit": 1000,
    "dynamicPeriodTime": "{#context.attributes['rate-limit-quota-period']}",
    "periodTimeUnit": "SECONDS"
  }
}

Token Bucket

The Token-Bucket policy throttles requests with the token bucket algorithm. A bucket holds up to a burst capacity and gains a fixed number of tokens each period. Each request takes one token. When the bucket is empty the request is rejected with 429. A fresh bucket starts full, so the first burst goes through right away, and after that traffic is held at the refill rate.

The refill rate is a whole-token count per period: `refillRate` tokens are added every `refillPeriodTime` `refillPeriodTimeUnit` (for example, 100 tokens per 1 minute). `burstCapacity` is the bucket size, which sets the largest burst allowed. All arithmetic is integer, so there are no fractional rates.
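
The behaviour described above can be sketched as a small integer token bucket. This is a minimal illustration, not the plugin's code: the class `TokenBucket`, its method `try_consume`, a clock that starts at 0, and refills applied in whole periods are all assumptions.

```python
class TokenBucket:
    """Integer token bucket (illustrative sketch, not the plugin's code)."""

    def __init__(self, burst_capacity, refill_rate, refill_period_seconds):
        self.capacity = burst_capacity
        self.refill_rate = refill_rate
        self.period = refill_period_seconds
        self.tokens = burst_capacity  # a fresh bucket starts full
        self.last_refill = 0          # time of the last completed refill period

    def try_consume(self, now):
        """Take one token at time `now` (whole seconds); False means reject."""
        # Add refill_rate tokens for every whole period elapsed, capped at capacity.
        periods = (now - self.last_refill) // self.period
        if periods > 0:
            self.tokens = min(self.capacity, self.tokens + periods * self.refill_rate)
            self.last_refill += periods * self.period
        if self.tokens > 0:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(burst_capacity=3, refill_rate=1, refill_period_seconds=1)
# The first three requests use the initial burst; the fourth finds the bucket empty.
results = [bucket.try_consume(now=0) for _ in range(4)]
# One period later a single token has been added back.
after_refill = bucket.try_consume(now=1)
```

A rejected call here corresponds to the 429 response on an HTTP proxy API.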

It runs on HTTP proxy APIs (V4, and V2 through the v4-emulation engine) and on V4 message APIs. On a message API it meters the request (publish) phase the same way as the rate-limit, quota and spike-arrest policies: one consume per phase, and when the bucket is empty the message stream is interrupted instead of returning an HTTP 429.

Property	Required	Description	Type	Default
burstCapacity	Yes*	Largest number of tokens the bucket can hold, which sets the biggest burst allowed. Provide either this static value (≥ 1) or its dynamic variant `dynamicBurstCapacity`.	Long	none
refillRate	Yes*	Number of tokens added to the bucket each refill period. Provide either this static value (≥ 1) or its dynamic variant `dynamicRefillRate`.	Long	none
refillPeriodTime	Yes	Length of the refill period, combined with the unit below.	Long	1
refillPeriodTimeUnit	Yes	Time unit of the refill period: `SECONDS`, `MINUTES`, `HOURS` or `DAYS`.	String	SECONDS
addHeaders	No	Add `X-Rate-Limit-Limit`, `X-Rate-Limit-Remaining`, `X-Rate-Limit-Reset` (and `Retry-After` on rejection) headers in the HTTP response.	Boolean	false
key	No	Key to identify a consumer to apply the bucket against. Leave it empty to use the default behavior (plan/subscription pair). Supports Expression Language.	String	null
useKeyOnly	No	Only use the custom key to identify the consumer, regardless of the subscription and plan.	Boolean	false
errorStrategy	No	Behaviour when the rate-limit store fails: `FALLBACK_PASS_TROUGH` lets the request through (fail open); `BLOCK_ON_INTERNAL_ERROR` rejects it (fail closed). The default is fail open, so while the store is unavailable every request passes through and throttling is effectively disabled — set `BLOCK_ON_INTERNAL_ERROR` if the store outage must not bypass the limit.	String	FALLBACK_PASS_TROUGH
async	No	By activating this option, rate-limiting is applied in an asynchronous (non-strict) way: the distributed bucket is approximate, so a backend can receive more requests than configured. See the note below.	Boolean	false

Note

* On `burstCapacity` and `refillRate`: each is required, but may be supplied either as the static value above or through its dynamic Expression Language variant (`dynamicBurstCapacity` / `dynamicRefillRate`), evaluated per request. Because the JSON schema cannot express that either/or, it declares no required fields; a configuration where neither form resolves to a positive value is rejected at request time with a 500 (`TOKEN_BUCKET_RATE_LIMIT_SERVER_ERROR`).

By default (async = false) the bucket is strict and exact: every request runs an atomic refill-and-consume against the store. With async = true each gateway node keeps its own local bucket and reconciles it to the store on a short interval. This gives higher throughput and far fewer store round-trips, but the distributed bucket is only approximate: across several nodes a backend can receive more requests than the configured rate. Even on a single node the local bucket is reconciled to the store only periodically, so within a reconcile window the node can admit more than the configured rate before the store corrects it — async is not request-for-request identical to strict mode. The setting mirrors the rate-limit policy’s async option.

Configuration example

{
  "burstCapacity": 100,
  "refillRate": 10,
  "refillPeriodTime": 1,
  "refillPeriodTimeUnit": "SECONDS",
  "addHeaders": true,
  "errorStrategy": "FALLBACK_PASS_TROUGH"
}

This allows an immediate burst of up to 100 requests, then a steady 10 requests per second. refillRate and burstCapacity also accept Expression Language through dynamicRefillRate and dynamicBurstCapacity, which are evaluated on each request when the static value is left unset.
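
As a rough check on those numbers, the most a fresh bucket can admit over a time window is the initial burst plus whatever is refilled in the meantime. The sketch assumes refills land in whole periods and that the client drains the bucket as fast as it can; `max_admitted` is a hypothetical helper, not part of the policy.

```python
def max_admitted(burst_capacity, refill_rate, refill_period_seconds, elapsed_seconds):
    """Upper bound on requests admitted by a fresh bucket within elapsed_seconds."""
    # Only completed refill periods contribute tokens.
    whole_periods = elapsed_seconds // refill_period_seconds
    return burst_capacity + refill_rate * whole_periods


# Example configuration: burst of 100, then 10 tokens per second, over one minute
first_minute = max_admitted(100, 10, 1, 60)
```

For the example configuration this gives 700 requests in the first minute: the burst of 100 plus 60 refills of 10.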

Errors

Default response override

You can use the response template feature to override the default response provided by the policies. These templates must be defined at the API level (see the API Console Response Templates option in the API Proxy menu).

Error keys

The error keys sent by these policies are as follows:

Key	Parameters
RATE_LIMIT_TOO_MANY_REQUESTS	limit - period_time - period_unit
QUOTA_TOO_MANY_REQUESTS	limit - period_time - period_unit
SPIKE_ARREST_TOO_MANY_REQUESTS	limit - period_time - period_unit - slice_limit - slice_period_time - slice_limit_period_unit
TOKEN_BUCKET_RATE_LIMIT_TOO_MANY_REQUESTS	burst_capacity
