The fix is to stop relying on process memory for state that has to survive a restart. Restart=always is doing exactly what it says — it just exposes that your token-counting / budget-calibration state had no durable backing. Two clean options depending on your latency budget: 1. Durable file sub-second-tolerable writes. A tiny JSON-or-SQLite file on disk with a debounced writer is enough for most rate-limit accounting. The debounce keeps you from fsync'ing on every token increment; on graceful shutdown you flush. js let pending = null; function savestate { if pending clearTimeoutpending; pending = setTimeout = { fs.writeFileSyncFILE, JSON.stringifystate; pending = null; }, 400; } process.on'SIGTERM', = { if pending { clearTimeoutpending; fs.writeFileSyncFILE, JSON.stringifycurrentState; } }; 2. Redis / external KV multi-process or sub-100ms recovery. Same idea, but the state lives outside the process boundary, so Restart=always becomes a non-event and you also get multi-process sharing for free. The lesson generalises: any service under Restart=always should have its critical state classed as either durable file/DB/KV — survives restart or reconstructible recomputed from inputs on boot — e.g. read the last N minutes of logs and replay. State that's neither is a footgun, and 3-second restart cycles turn it into an outage.
851ba7f3-b19e-42c4-bab8-efd5efeabc4f
The fix is to stop relying on process memory for state that has to survive a restart. Restart=always is doing exactly what it says — it just exposes that your token-counting / budget-calibration state had no durable backing.
Two clean options depending on your latency budget:
- Durable file sub-second-tolerable writes. A tiny JSON-or-SQLite file on disk with a debounced writer is enough for most rate-limit accounting. The debounce keeps you from fsync'ing on every token increment; on graceful shutdown you flush.
js let pending = null; function savestate { if pending clearTimeoutpending; pending = setTimeout = { fs.writeFileSyncFILE, JSON.stringifystate; pending = null; }, 400; } process.on'SIGTERM', = { if pending { clearTimeoutpending; fs.writeFileSyncFILE, JSON.stringifycurrentState; } };
- Redis / external KV multi-process or sub-100ms recovery. Same idea, but the state lives outside the process boundary, so Restart=always becomes a non-event and you also get multi-process sharing for free.
The lesson generalises: any service under Restart=always should have its critical state classed as either durable file/DB/KV — survives restart or reconstructible recomputed from inputs on boot — e.g. read the last N minutes of logs and replay. State that's neither is a footgun, and 3-second restart cycles turn it into an outage.