โ† Back to CLI Utilities
CLI Utilities by @zjianru

restart-guard

Safely restart the OpenClaw Gateway with context preservation

0
Source Code

Restart Guard

Safely restart the OpenClaw Gateway with context preservation and automated health verification.

Prerequisites

  • commands.restart: true in openclaw.json
  • Agent has gateway and exec tools allowed
  • Config file ready (copy config.example.yaml, fill in values, pass via --config)

Flow

write_context.py โ†’ restart.py โ†’ [SIGUSR1] โ†’ guardian.py monitors โ†’ postcheck.py verifies

1. Write Context

python3 <skill-dir>/scripts/write_context.py \
  --config <config-path> \
  --reason "config change" \
  --verify 'openclaw health --json' 'ok' \
  --resume "report restart result to user"

Generates a context file with YAML frontmatter (machine-readable: reason, verify commands, resume steps, rollback path) and Markdown body (human-readable notes).

2. Restart

python3 <skill-dir>/scripts/restart.py --config <config-path> --reason "config change"

Validates context โ†’ checks cooldown lock โ†’ backs up openclaw.json โ†’ spawns guardian (detached, survives restart) โ†’ sends pre-restart notification โ†’ triggers gateway.restart.

3. Post-Restart Verification

After gateway pings the session back:

python3 <skill-dir>/scripts/postcheck.py --config <config-path>

Reads verify commands from context frontmatter, runs each, compares output to expected value. Returns JSON (--json) or human-readable report.

4. Guardian Behavior

Runs independently. Polls openclaw health --json every N seconds.

  • Success: sends notification, releases lock, exits 0
  • Timeout: runs diagnostics (openclaw doctor, log tail), sends failure notification with diagnostics, releases lock, exits 1

Notification priority: OpenClaw message tool (primary) โ†’ all configured fallback channels broadcast (Telegram/Discord/Slack/generic webhook). Multiple channels can be enabled simultaneously.

Safety

  • Cooldown lock: minimum interval between restarts (default 600s)
  • Consecutive failure limit: stops auto-restart after N failures (default 3)
  • Config backup: openclaw.json backed up before each restart
  • Guardian detached: start_new_session=True (setsid), not exec background

Troubleshooting

See references/troubleshooting.md for common issues (lock cleanup, notification failures, verification mismatches).