Restart Guard
Safely restart the OpenClaw Gateway with context preservation and automated health verification.
Prerequisites
- commands.restart: true in openclaw.json
- Agent has gateway and exec tools allowed
- Config file ready (copy config.example.yaml, fill in values, pass via --config)
Flow
write_context.py → restart.py → [SIGUSR1] → guardian.py monitors → postcheck.py verifies
1. Write Context
python3 <skill-dir>/scripts/write_context.py \
--config <config-path> \
--reason "config change" \
--verify 'openclaw health --json' 'ok' \
--resume "report restart result to user"
Generates a context file with YAML frontmatter (machine-readable: reason, verify commands, resume steps, rollback path) and Markdown body (human-readable notes).
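As an illustration of the frontmatter-plus-body layout, here is a minimal sketch that renders and splits such a file using only the standard library. The field names (reason, verify, resume, rollback_path) and both function names are assumptions made for this example; the real write_context.py may use a different schema.

```python
import json


def render_context(reason, verify, resume, rollback_path, notes=""):
    """Render a context file: YAML frontmatter followed by a Markdown body."""
    # json.dumps produces double-quoted strings and flow lists, which YAML
    # parsers accept, so no YAML library is needed to write safe values.
    lines = ["---", f"reason: {json.dumps(reason)}", "verify:"]
    for command, expect in verify:
        lines.append(f"  - command: {json.dumps(command)}")
        lines.append(f"    expect: {json.dumps(expect)}")
    lines.append(f"resume: {json.dumps(list(resume))}")
    lines.append(f"rollback_path: {json.dumps(rollback_path)}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + notes + "\n"


def split_frontmatter(text):
    """Return (frontmatter, body) for a document that starts with '---'."""
    if not text.startswith("---\n"):
        raise ValueError("missing frontmatter")
    head, sep, body = text[4:].partition("\n---\n")
    if not sep:
        raise ValueError("unterminated frontmatter")
    return head, body.lstrip("\n")
```

The frontmatter half is what a script would parse after the restart; the body is free-form notes for whoever reads the file later.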
2. Restart
python3 <skill-dir>/scripts/restart.py --config <config-path> --reason "config change"
Validates context → checks cooldown lock → backs up openclaw.json → spawns guardian (detached, survives restart) → sends pre-restart notification → triggers gateway.restart.
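The detached spawn is the step that lets the guardian outlive the process that started it. A minimal sketch, assuming a POSIX system, is shown below; spawn_detached is a hypothetical helper name and the demo command is a placeholder, not the actual guardian.py invocation.

```python
import subprocess
import sys


def spawn_detached(argv):
    """Start argv in its own session so it is not tied to the parent."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX: the child calls setsid() before exec
        close_fds=True,
    )


if __name__ == "__main__":
    # Placeholder child process; a real caller would pass the guardian command.
    child = spawn_detached([sys.executable, "-c", "pass"])
    print("spawned pid", child.pid)
```

Because the child runs in a new session, signals sent to the parent's process group do not reach it.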
3. Post-Restart Verification
After the gateway pings the session back:
python3 <skill-dir>/scripts/postcheck.py --config <config-path>
Reads the verify commands from the context frontmatter, runs each one, and compares its output to the expected value. Prints JSON (with --json) or a human-readable report.
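The run-and-compare step can be sketched as follows. This assumes that "compares output to expected value" means a substring match on stdout plus a zero exit code; postcheck.py may compare differently. The run_check name and the result keys are hypothetical.

```python
import json
import shlex
import subprocess
import sys


def run_check(command, expected, timeout=30):
    """Run one verify command; pass if it exits 0 and stdout contains expected."""
    result = {"command": command, "expected": expected, "passed": False}
    try:
        proc = subprocess.run(
            shlex.split(command), capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Missing binary or hung command counts as a failed check.
        result["error"] = str(exc)
        return result
    result["passed"] = proc.returncode == 0 and expected in proc.stdout
    return result


if __name__ == "__main__":
    # Stand-in command so the demo runs without OpenClaw installed.
    demo = f"{shlex.quote(sys.executable)} -c \"print('ok')\""
    print(json.dumps(run_check(demo, "ok"), indent=2))
```

Treating a missing or hung command as a failed check, rather than raising, keeps the report complete even when one verify command is broken.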
4. Guardian Behavior
Runs independently. Polls openclaw health --json every N seconds.
- Success: sends notification, releases lock, exits 0
- Timeout: runs diagnostics (openclaw doctor, log tail), sends failure notification with diagnostics, releases lock, exits 1
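The poll-until-healthy-or-timeout behavior can be sketched like this. It is an illustration only: wait_until_healthy is a hypothetical name, and the health check is passed in as a callable so the loop stays independent of how openclaw health --json is actually invoked and parsed.

```python
import time


def wait_until_healthy(check, timeout_s, interval_s,
                       sleep=time.sleep, clock=time.monotonic):
    """Call check() every interval_s seconds until it passes or time runs out."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True   # caller would notify, release the lock, exit 0
        if clock() >= deadline:
            return False  # caller would run diagnostics, notify, exit 1
        sleep(interval_s)
```

Using a monotonic clock keeps the timeout correct even if the system clock is adjusted while the gateway restarts.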
Notification priority: OpenClaw message tool (primary) → broadcast to all configured fallback channels (Telegram/Discord/Slack/generic webhook). Multiple channels can be enabled simultaneously.
Safety
- Cooldown lock: minimum interval between restarts (default 600s)
- Consecutive failure limit: stops auto-restart after N failures (default 3)
- Config backup: openclaw.json backed up before each restart
- Guardian detached: start_new_session=True (setsid), not exec background
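The cooldown and failure-limit rules combine into a single gate before any restart. The sketch below uses the defaults stated above (600s, 3 failures); the function name, its parameters, and the idea that the caller supplies the last restart timestamp and failure count are assumptions, since the lock file format is not described here.

```python
import time


def may_restart(last_restart_ts, consecutive_failures, now=None,
                cooldown_s=600, max_failures=3):
    """Return (allowed, reason) for a restart request."""
    now = time.time() if now is None else now
    if consecutive_failures >= max_failures:
        return False, "failure limit reached"
    if last_restart_ts is not None:
        remaining = cooldown_s - (now - last_restart_ts)
        if remaining > 0:
            return False, f"cooldown active, {int(remaining)}s remaining"
    return True, "ok"
```

Checking the failure limit first means a gateway that keeps failing stays blocked even after the cooldown has expired.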
Troubleshooting
See references/troubleshooting.md for common issues (lock cleanup, notification failures, verification mismatches).