Token Bucket Options¶
TokenBucketOptions configures the endpoint-keyed token-bucket limiter used by
Nalix.Runtime.Throttling.TokenBucketLimiter. The limiter enforces burst
capacity, sustained refill rate, soft throttling, optional hard lockout, endpoint
tracking caps, sharded storage, cleanup, and diagnostic report sizing.
Source Mapping¶
src/Nalix.Runtime/Options/TokenBucketOptions.cssrc/Nalix.Runtime/Throttling/TokenBucketLimiter.cssrc/Nalix.Hosting/Bootstrap.cs
Defaults and Validation¶
| Property | Default | Valid range | Runtime effect |
|---|---|---|---|
CapacityTokens |
12 |
1..int.MaxValue |
Maximum whole-token burst capacity for each endpoint. |
RefillTokensPerSecond |
6.0 |
0.001..double.MaxValue |
Sustained refill rate converted to fixed-point microtokens. |
HardLockoutSeconds |
0 |
0..int.MaxValue |
Stopwatch-tick duration for hard lockout after soft violations escalate. |
StaleEntrySeconds |
300 |
1..int.MaxValue |
Idle age before an endpoint can be removed by stale cleanup. |
CleanupIntervalSeconds |
120 |
1..int.MaxValue |
Recurring cleanup cadence registered with TaskManager. |
TokenScale |
1000 |
1..1000000 |
Fixed-point units per whole token. One consumed request costs TokenScale. |
ShardCount |
32 |
1..int.MaxValue; power of two |
Number of endpoint-map shards used by hash masking. |
SoftViolationWindowSeconds |
5 |
1..int.MaxValue |
Window used to accumulate soft-throttle violations. |
MaxSoftViolations |
3 |
1..int.MaxValue |
Violation count at which the endpoint escalates to hard lockout. |
CooldownResetSec |
10 |
1..int.MaxValue |
Configured and reported as cooldown policy; not read by current limiter logic. |
MaxTrackedEndpoints |
10000 |
0..int.MaxValue |
Maximum tracked endpoints; 0 disables the cap. |
InitialTokens |
-1 |
No DataAnnotation range | Initial tokens for new endpoints: <0 full, 0 empty, >0 clamped to capacity. |
MaxEvictionCapacity |
4096 |
64..65536 |
Initial pooled candidate-list capacity cap for over-limit eviction scans. |
MinReportCapacity |
256 |
64..8192 |
Minimum pooled snapshot-list capacity used by report generation. |
Validate() first runs DataAnnotation validation and then enforces:
ShardCountmust be positive;ShardCountmust be a power of two;CapacityTokens * TokenScalemust fit inInt64;InitialTokensmust be<= CapacityTokens.
Construction and Fixed-Point Materialization¶
TokenBucketLimiter accepts an optional TokenBucketOptions instance. If none is
provided, it loads the singleton from ConfigurationManager and validates it.
Construction materializes hot-path constants:
_capacityMicro = (long)_options.CapacityTokens * _options.TokenScale;
_refillPerSecMicro = (long)Math.Round(_options.RefillTokensPerSecond * _options.TokenScale);
_initialBalanceMicro = CalculateInitialBalance();
Each endpoint state stores balances in fixed-point microtokens. A request consumes
exactly _options.TokenScale; whole-token credit in responses and reports is computed
by integer-dividing the microtoken balance by TokenScale and clamping to ushort.
Bootstrap materialization
Bootstrap.Initialize() currently contains the TokenBucketOptions materialization
call as a commented line. In the current source, these options are loaded when a
TokenBucketLimiter instance is constructed, unless a caller passes options
explicitly.
Endpoint Sharding¶
The limiter creates ShardCount shard maps and selects a shard from the endpoint's
GetHashCode() after a Murmur-style finalizer mix:
This bit-mask selection is why ShardCount must be a power of two. It avoids modulo
costs on the hot path while preserving stable endpoint partitioning.
Evaluate Flow¶
Evaluate(INetworkEndpoint key) is the primary runtime API:
flowchart TD
A["Evaluate(endpoint)"] --> D{"limiter disposed?"}
D -- yes --> X["deny: HardLockout"]
D -- no --> V["validate endpoint"]
V --> S["select shard"]
S --> G["get or create endpoint state"]
G --> L{"tracking limit reached?"}
L -- yes --> R["deny: HardLockout + cap retry"]
L -- no --> H{"hard blocked?"}
H -- yes --> B["deny: HardLockout + retry"]
H -- no --> F["refill tokens"]
F --> C{"balance >= TokenScale?"}
C -- yes --> O["consume token + allow"]
C -- no --> T["record soft violation"]
T --> E{"violations >= MaxSoftViolations?"}
E -- yes --> K["set hard block + deny"]
E -- no --> M["deny: SoftThrottle + retry"]
Endpoint validation requires a non-null endpoint and a non-empty Address.
Initial Tokens¶
CalculateInitialBalance() implements the InitialTokens semantics:
< 0: start new endpoints at full capacity (CapacityTokens * TokenScale);0: start empty for cold-start behavior;> 0: convert to microtokens and clamp to[0, capacity].
The default -1 therefore allows a new endpoint to spend the full burst immediately.
Soft Throttle and Hard Lockout¶
When an endpoint has insufficient tokens, the limiter computes the required delay from the missing microtokens and the refill rate. It then records a soft violation:
- if the previous violation is within
SoftViolationWindowSeconds, incrementSoftViolations; - otherwise reset
SoftViolationsto1; - update
LastViolationSwto the current stopwatch tick.
If SoftViolations >= MaxSoftViolations, the endpoint escalates to hard lockout:
HardLockoutSeconds = 0 therefore produces a zero-duration hard-lock timestamp. The
escalation decision can still return HardLockout, but the next evaluation will not
remain blocked because the stored deadline is not greater than the new now tick.
Cooldown reset option
CooldownResetSec is present in TokenBucketOptions, appears in the INI comment,
and is included in high-level policy intent, but the current TokenBucketLimiter
implementation does not read it. Soft violations reset by the soft violation window;
hard lockout expiry is controlled by HardLockoutSeconds.
Endpoint Tracking Cap¶
MaxTrackedEndpoints has two enforcement points:
- before allocating a new endpoint state, the limiter rejects if the current count is already at or above the cap;
- after a successful concurrent add, it double-checks the count and removes the newly added endpoint if the cap was exceeded.
When the cap is reached, evaluation returns a denial decision instead of creating more
state. If MaxTrackedEndpoints <= 0, the cap is disabled and endpoint tracking can
grow until cleanup removes stale entries.
Cleanup and Eviction¶
Construction schedules a recurring cleanup job:
- name:
TaskNaming.Recurring.CleanupJobId("token.bucket", hash); - interval:
CleanupIntervalSeconds; NonReentrant = true;- tag:
TaskNaming.Tags.Service; - jitter:
250 ms; - execution timeout:
2 s; - backoff cap:
15 s.
The cleanup body uses an internal CancellationTokenSource with a 5 s timeout. It:
- removes stale endpoints where
now - LastSeenSw > StaleEntrySeconds; - rotates the starting shard each cleanup run so timeout-limited scans do not always
favor shard
0; - enforces
MaxTrackedEndpointsif the count remains above the cap.
Over-limit eviction collects candidate endpoints into a pooled list with initial capacity:
It sorts candidates by LastSeen and removes the oldest entries first. The current
source uses MaxEvictionCapacity as the pooled list's initial capacity cap; it does
not cap the number of candidates added to that list during collection.
Diagnostics and Reporting¶
GetReport() collects a pooled snapshot of endpoint states with initial capacity at
least MinReportCapacity, counts currently hard-blocked entries, and renders the top
20 endpoints by pressure. The report header includes:
CapacityTokens;RefillPerSecond;TokenScale;Shards;HardLockoutSeconds;StaleEntrySeconds;CleanupIntervalSecs;MaxTrackedEndpoints;TrackedEndpoints;HardBlockedCount.
GetStats() exposes the same core option values and runtime counts as structured
data.
Disposal Contract¶
Dispose() marks the limiter disposed and cancels its recurring cleanup job via
TaskManager.CancelRecurring(). After disposal, Evaluate() does not throw; it
returns a denied RateLimitDecision with reason HardLockout and zero retry delay.
DisposeAsync() delegates to Dispose() and returns ValueTask.CompletedTask.
Tuning Guidance¶
- Use
CapacityTokensfor burst tolerance andRefillTokensPerSecondfor sustained throughput; both are per endpoint. - Keep
ShardCounta power of two and scale it with expected concurrent endpoint cardinality to reduce per-shard contention. - Set
HardLockoutSeconds > 0only when repeated soft throttles should become a persistent block. - Avoid
MaxTrackedEndpoints = 0on public servers unless another layer bounds endpoint cardinality. - Treat
MaxEvictionCapacityas a cleanup allocation sizing knob, not as a strict per-cycle removal limit in the current implementation.