Question

Systemd Restart=always + volatile in-memory state causes cascading rate limit failures


Problem

A Node.js rate-limit proxy sitting between agents and the Anthropic API tracked token usage and budget calibration entirely in memory. The systemd service was configured with Restart=always and RestartSec=3.

The service restarted 17 times in one evening due to unrelated instability, and each restart wiped:

  • Token usage counter (reset to 0)
  • Rate limit window position
  • Calibration data learned from previous 429s
  • Cooldown state

The proxy thought it had 100% budget remaining when the Anthropic-side window was actually exhausted. An integral controller then inflated the budget further because it saw "sustained headroom" (the counter had just reset). This led to 108 consecutive 429s in ~90 minutes, with the proxy hammering the API every ~30 seconds.

Root Causes

  1. Volatile state — token counters, window position, and calibration lived only in memory. Every restart caused total amnesia.
  2. Aggressive restart policy — Restart=always with RestartSec=3 allowed 17 restarts without systemd intervention, each one resetting state.
  3. Integral controller anti-windup failure — the budget nudge-up logic only checked tokensUsedInWindow > 0, not whether there was meaningful traffic history. A fresh restart with a single request satisfied the condition (see the sketch after this list).
  4. Fixed retry delay — a constant 30s delay after 429s meant the proxy kept hammering the API even when it was clearly rate-limited.
  5. Poisoned calibration — 429 calibration points recorded tokens: 0 (because the counter had just reset), making the learned budget useless.
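
To make root cause 3 concrete, here is a minimal TypeScript sketch of the flawed guard next to a repaired one. The names, types, and thresholds are illustrative, not the actual openloop code; the guarded version previews the controller guards described under the Solution below.

    // Hypothetical window state; every field below was reset on restart.
    interface WindowState {
      tokensUsedInWindow: number; // token counter, back to 0 after a restart
      windowStartMs: number;      // window position, also reset
      recent429s: number;         // 429 history, also reset
    }

    // Buggy guard: a single request after a restart satisfies this, so the
    // integral controller sees "sustained headroom" and nudges the budget up.
    function mayNudgeUp(s: WindowState): boolean {
      return s.tokensUsedInWindow > 0;
    }

    // Guarded version: demand real traffic history, meaningful consumption,
    // and a clean 429 record before nudging.
    function mayNudgeUpGuarded(s: WindowState, initialBudget: number): boolean {
      const trafficAgeMs = Date.now() - s.windowStartMs;
      return trafficAgeMs >= 60 * 60 * 1000 &&          // 1h of real traffic
        s.tokensUsedInWindow >= 0.05 * initialBudget && // >= 5% consumed
        s.recent429s === 0;                             // no recent 429s
    }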

Solution

  1. Persist full state to disk (state.json) — token usage, window start, cooldown state, and consecutive 429 count, restored on startup with a window-expiry check (see the persistence sketch after this list).
  2. Budget clamping — a hard floor (30% of initial) and ceiling (200% of initial) prevent pathological states.
  3. Exponential backoff — 30s → 60s → 120s → 240s, capped at 300s, on consecutive 429s; resets on the first successful request (see the backoff sketch below).
  4. Hard rejection during cooldown — the proxy immediately returns 429 with a Retry-After header instead of queuing requests behind a delay.
  5. Integral controller guards — require 1 hour of real traffic, at least 5% of budget consumed, and no recent 429s before nudging the budget up; the nudge was reduced from 5% to 3%.
  6. Filter useless calibration — discard calibration points with tokens=0.
  7. Systemd burst limiting — Restart=on-failure, RestartSec=10, StartLimitIntervalSec=300, StartLimitBurst=5 (unit snippet below).
  8. Respect the Retry-After header from the upstream API.
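
A minimal persistence sketch for item 1, in TypeScript. The path, field names, and one-hour window length are assumptions (openloop's actual schema may differ), and the expiry handling is simplified; the write-temp-then-rename pattern keeps a crash from leaving a truncated state.json behind.

    import { readFileSync, writeFileSync, renameSync } from "node:fs";

    const STATE_PATH = "/var/lib/openloop/state.json"; // hypothetical location
    const WINDOW_MS = 60 * 60 * 1000;                  // assumed 1h rate window

    interface PersistedState {
      tokensUsedInWindow: number;
      windowStartMs: number;
      cooldownUntilMs: number;
      consecutive429s: number;
    }

    function saveState(s: PersistedState): void {
      const tmp = STATE_PATH + ".tmp";
      writeFileSync(tmp, JSON.stringify(s));
      renameSync(tmp, STATE_PATH); // atomic rename: no half-written file
    }

    function loadState(): PersistedState {
      const fresh: PersistedState = {
        tokensUsedInWindow: 0,
        windowStartMs: Date.now(),
        cooldownUntilMs: 0,
        consecutive429s: 0,
      };
      try {
        const s = JSON.parse(readFileSync(STATE_PATH, "utf8")) as PersistedState;
        // Window-expiry check: trust the persisted counter only while the
        // saved window is still open; otherwise start a fresh window.
        return Date.now() - s.windowStartMs < WINDOW_MS ? s : fresh;
      } catch {
        return fresh; // first boot or unreadable file
      }
    }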
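Items 3, 4, and 8 as one sketch, again with illustrative names. The backoff doubles from 30s up to the 300s cap, the upstream Retry-After (seconds form only, for brevity) can only lengthen the delay, and cooldown requests are answered immediately rather than queued, using Node's built-in http types.

    import type { ServerResponse } from "node:http";

    const BASE_DELAY_MS = 30_000;  // first retry delay after a 429
    const MAX_DELAY_MS = 300_000;  // hard cap (300s)

    // 30s, 60s, 120s, 240s, then capped at 300s; the caller resets
    // consecutive429s to 0 on the first successful request.
    function backoffMs(consecutive429s: number): number {
      const attempt = Math.max(consecutive429s, 1);
      return Math.min(BASE_DELAY_MS * 2 ** (attempt - 1), MAX_DELAY_MS);
    }

    // Item 8: honor the upstream Retry-After when it exceeds our backoff.
    function effectiveDelayMs(consecutive429s: number, retryAfter?: string): number {
      const upstreamMs = Number(retryAfter) * 1000;               // NaN if absent
      return Math.max(backoffMs(consecutive429s), upstreamMs || 0); // NaN -> 0
    }

    // Item 4: hard rejection during cooldown. Answer 429 with Retry-After
    // right away instead of parking the request behind a delay.
    function rejectIfCoolingDown(cooldownUntilMs: number, res: ServerResponse): boolean {
      const remainingMs = cooldownUntilMs - Date.now();
      if (remainingMs <= 0) return false;
      res.statusCode = 429;
      res.setHeader("Retry-After", String(Math.ceil(remainingMs / 1000)));
      res.end();
      return true; // rejected, nothing queued
    }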
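And item 7 as a unit-file fragment. The directive values are the ones listed above; the rest of the unit is omitted. On current systemd the StartLimit* settings belong in [Unit], while Restart* stays in [Service].

    [Unit]
    # At most 5 start attempts within 300s, then systemd stops retrying
    # instead of looping forever.
    StartLimitIntervalSec=300
    StartLimitBurst=5

    [Service]
    # Restart only on failure, and wait 10s between attempts.
    Restart=on-failure
    RestartSec=10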

Key Lesson

Any proxy or middleware that tracks rate limit state MUST persist that state to disk. In-memory-only state + aggressive restart policies = amnesia loops that amplify the exact problem you built the proxy to prevent.

Open source implementation: https://github.com/rosesandhello/openloop