Report

vesper-journal daemon dual-schema parsing + close-flush bookkeeping

628a222e-9c71-4690-ba97-365b0820efa9

The vesper-journal Python daemon, which snapshots OpenClaw session JSONLs to Chronicle/Postgres, had three stacked memory-capture failures:

(1) The snapshot cadence was 300s, but sessions often die in under 60s.
(2) format_snapshot truncated message text to 120 chars before write, so only ~500 B reached Chronicle.
(3) There was no session-close flush or per-turn capture.

Fixing these uncovered two follow-on bugs.

(A) The daemon now watches both ~/.openclaw/agents/main/sessions/ and ~/.claude/projects/-home-bosh--openclaw-workspace/, but the two directories use different JSONL schemas: OpenClaw uses {type: "message", message: {...}}, while Claude Code (cc) uses {type: "user"|"assistant", message: {...}}. The single-schema parser silently dropped all cc-format messages, leaving close-flush rows at ~370 B (just headers).

(B) The Claude Code projects dir contains 10,000+ historical JSONLs from months of use. The naive idle-close detector treated every old file as "needs flushing", which first flooded Chronicle with thousands of bogus rows and then grew the persisted closed_sessions JSON state file to 424 KB containing ~10,608 session IDs. The fix needs both a max-age filter on the close-flush scanner and a separation between an in-memory-only "ancient" cache (not persisted) and an "actually flushed" set (persisted).

Three coordinated changes:

(1) Accept both wrapper schemas in extract_session_info and _extract_assistant_text: check rec.get("type") in ("message", "user", "assistant") and pull from msg = rec.get("message", {}) in all cases. Role falls back to the top-level type when msg.role is absent.
(2) Add a CLOSE_FLUSH_MAX_AGE_SECONDS env var (default 2h). Sessions whose mtime is older than the cutoff are skipped entirely by the idle-close scanner; they are treated as cold history, not as candidates for flush.
(3) Track ancient sessions in a separate in-memory set (ancient_seen) that is never persisted. Only sessions that actually get flushed are added to the persisted closed_sessions set. This keeps the JSON state file at <1 KB instead of 400+ KB.
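A minimal sketch of change (1), the dual-schema acceptance check. The helper name extract_role_and_text is hypothetical (the daemon's real functions are extract_session_info and _extract_assistant_text, whose bodies are not shown in this report), and the shape of message.content (a plain string, or a list of {"type": "text", "text": ...} blocks) is an assumption rather than something confirmed above.

```python
from typing import Any, Optional, Tuple

# Top-level "type" values accepted as message records:
# "message" for OpenClaw, "user"/"assistant" for Claude Code.
ACCEPTED_TYPES = ("message", "user", "assistant")


def extract_role_and_text(rec: dict) -> Optional[Tuple[str, str]]:
    """Return (role, text) for a message record in either schema, else None.

    Hypothetical helper for illustration only.
    """
    rec_type = rec.get("type")
    if rec_type not in ACCEPTED_TYPES:
        return None  # not a message record in either schema

    msg: Any = rec.get("message") or {}
    if not isinstance(msg, dict):
        return None

    # Role falls back to the top-level type when msg.role is absent.
    role = msg.get("role") or rec_type

    # Assumption: content is a string or a list of text blocks.
    content = msg.get("content", "")
    if isinstance(content, str):
        text = content
    else:
        text = "\n".join(
            block.get("text", "")
            for block in content
            if isinstance(block, dict) and block.get("type") == "text"
        )
    return role, text
```

With this shape, a record whose top-level type is "user" or "assistant" is no longer rejected by a type != "message" check, and a record with no role inside message inherits its role from the wrapper.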
The signature of detect_and_flush_closed gained an ancient_seen: set[str] parameter. The early-skip branch (age > MAX_AGE) adds the session ID to ancient_seen and continues, while the flush-success branch adds it to closed and persists.

The systemd unit file at ~/.config/systemd/user/vesper-journal.service was also found to hardcode Environment=VESPER_JOURNAL_INTERVAL=300, which overrode the script default. Changing the script default alone was therefore insufficient: the service unit must be edited and followed by a daemon-reload.

How the follow-on bugs surfaced:

The first test with cc do "say hello" produced a session_close row of 369 B: header-only, with no message body. Inspecting the JSONL revealed that the cc format has top-level type="user"|"assistant" rather than "message", so the parser's early return on rec.get("type") != "message" dropped everything.

The second issue was log spam of "Session X idle for 2876314s firing close flush" for sessions last touched 33+ days ago. Every JSONL in the watched dir was iterated, and the only filter was "have we seen this in last_seen_mtime yet?": the first observation was skipped, the second fired. An ls of the dir showed 10,604 files. Adding the 2h max-age filter stopped the spam.

Finally, closed_sessions.json turned out to have grown to 424 KB after a single test cycle, because the ancient-skip branch was still calling closed.add(sid) on the persisted set. The code was refactored to keep the ancient cache in memory only.
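The scanner behaviour described above (max-age skip, in-memory ancient_seen, persisted closed set) can be sketched roughly as follows. Only the names detect_and_flush_closed, ancient_seen, closed, last_seen_mtime, and CLOSE_FLUSH_MAX_AGE_SECONDS come from this report; the remaining parameters, the IDLE_CLOSE_SECONDS threshold, and the flush/persist callables are assumptions made so the example is self-contained.

```python
import os
import time
from pathlib import Path
from typing import Callable, Optional

# Default of 2h, overridable via the environment.
CLOSE_FLUSH_MAX_AGE_SECONDS = int(os.environ.get("CLOSE_FLUSH_MAX_AGE_SECONDS", "7200"))
IDLE_CLOSE_SECONDS = 60  # hypothetical idle threshold, not from the report


def detect_and_flush_closed(
    session_files: list[Path],
    last_seen_mtime: dict[str, float],
    closed: set[str],
    ancient_seen: set[str],
    flush: Callable[[Path], bool],
    persist: Callable[[set[str]], None],
    now: Optional[float] = None,
) -> None:
    """Flush sessions that went idle; skip cold history without persisting it."""
    now = time.time() if now is None else now
    for path in session_files:
        sid = path.stem
        if sid in closed or sid in ancient_seen:
            continue
        mtime = path.stat().st_mtime
        age = now - mtime
        if age > CLOSE_FLUSH_MAX_AGE_SECONDS:
            ancient_seen.add(sid)  # in-memory only, never written to disk
            continue
        previous = last_seen_mtime.get(sid)
        last_seen_mtime[sid] = mtime
        if previous is None:
            continue  # first observation: just record the mtime
        if mtime != previous or age < IDLE_CLOSE_SECONDS:
            continue  # file changed since last scan, or not idle long enough
        if flush(path):
            closed.add(sid)  # only real flushes reach the persisted set
            persist(closed)
```

Because ancient session IDs never enter closed, the persisted state grows only with sessions that were actually flushed, which is what keeps the state file small.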