Report

Long-running child job loses all state when its parent container is redeployed

a75287e8-b0f7-49da-8511-88566aecc28b

A long-running batch job (hours) was launched as a child process of a long-lived HTTP server, all inside one PaaS container ([REDACTED], but the problem applies to any container PaaS). When the service auto-deployed on a code push, the platform tore down the container, killing the child job mid-run. The job's progress was held in the parent's memory and flushed to durable storage only on graceful completion, so an interrupted run was recorded as "state lost" with no salvageable result. Two multi-hour runs were destroyed back-to-back this way.
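
One possible mitigation, sketched here under stated assumptions rather than taken from the original incident, is to have the job write its own progress to durable storage periodically and when asked to stop, instead of relying on the parent's memory and a graceful finish. The class `CheckpointedJob`, the JSON checkpoint layout, and the placeholder "work" (summing step numbers) are all hypothetical. The sketch also assumes the checkpoint path lives on storage that survives a redeploy (a mounted volume or similar), and that the platform sends SIGTERM before killing the container, which many container platforms do but which should be confirmed for the one in use.

```python
import json
import os
import signal
import tempfile


class CheckpointedJob:
    """Hypothetical batch job that persists its progress as it goes."""

    def __init__(self, path, total_steps, every=10):
        self.path = path                # assumed to be on durable storage
        self.total_steps = total_steps
        self.every = every              # checkpoint interval, in steps
        self.stop_requested = False
        self.state = self._load()

    def _load(self):
        # Resume from the last checkpoint if one exists, else start fresh.
        try:
            with open(self.path) as f:
                return json.load(f)
        except FileNotFoundError:
            return {"next_step": 0, "result": 0}

    def _save(self):
        # Write to a temp file in the same directory, then rename over the
        # target, so a kill mid-write cannot leave a truncated checkpoint.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(self.state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    def request_stop(self, signum=None, frame=None):
        # Signal-handler compatible: only sets a flag; the loop does the I/O.
        self.stop_requested = True

    def run(self):
        """Run until done or asked to stop. Returns True when complete."""
        while (self.state["next_step"] < self.total_steps
               and not self.stop_requested):
            step = self.state["next_step"]
            self.state["result"] += step        # placeholder for real work
            self.state["next_step"] = step + 1
            if self.state["next_step"] % self.every == 0:
                self._save()
        self._save()                            # final or interrupted flush
        return self.state["next_step"] >= self.total_steps


def install_sigterm_handler(job):
    # Must be called from the main thread of the job's own process.
    signal.signal(signal.SIGTERM, job.request_stop)
```

With this shape, a redeploy costs at most `every` steps of work if the process is killed outright, and none if SIGTERM arrives with enough grace time for the current step and the final flush. Starting the job again with the same path resumes from the saved `next_step`. Note that the signal has to reach the child process itself; if only the parent HTTP server receives it, the parent would need to forward it.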
