Claude Code for DevOps: Setting Up Autonomous Long-Running Workflows with Hooks | Brav

Learn how to run Claude Code autonomously for hours, days, and weeks with agent harnesses, stop hooks, and guardrails—step-by-step guide for DevOps engineers.

TL;DR

  • I learned how to run Claude Code for hours, even days, with the agent harness and guardrails.
  • I can keep my repo safe by blocking destructive git commands.
  • I can loop over tests automatically and stop when the code passes.
  • I can monitor logs and get instant notifications.
  • I can benchmark my runs against the 4 h 49 m 50 % time horizon reported for Claude Opus 4.5.

Published by Brav

Why this matters

I’ve spent countless nights debugging long-running CI pipelines that hang or delete the wrong file. Setting up Claude Code with a persistent harness, guardrails, and hooks turned that nightmare into a predictable, autonomous system. For DevOps, AI developers, and senior engineers, it means fewer manual touch-points, higher confidence in code quality, and the ability to let the model run for hours without supervision.

Core concepts

Claude Code is an AI agent that lives in your terminal. It can read your codebase, run tests, commit changes, and even start background processes. Three ideas make it work for long tasks:

  1. Agent harness – Keeps the agent state, files, and background processes alive across restarts.
  2. Hooks – Small shell scripts or LLM prompts that fire at specific moments. Stop hooks keep the agent from exiting.
  3. Ralph loops – A while-true loop that feeds the same prompt back until a completion promise or a max-iteration limit is hit.

Self-driving car analogy

Think of Claude Code as a self-driving car. The harness is the car’s battery; the hooks are the sensors that check for obstacles; the Ralph loop is the navigation system that keeps it on course until the destination is reached.

How to apply it

Below is a step-by-step guide that I used in production. Feel free to copy-paste the snippets.

  1. Install Claude Code and the Opus 4.5 model

    brew install --cask claude-code
    claude --model claude-opus-4-5
    

    Claude Opus — Achieves 50% Time Horizon (2025)

  2. Create a harness configuration

    {
      "hooks": {
        "PreToolUse": [
          {
            "matcher": "Bash",
            "hooks": [{ "type": "command", "command": "./scripts/guard-git.sh" }]
          }
        ],
        "PostToolUse": [
          {
            "matcher": "*",
            "hooks": [{ "type": "command", "command": "./scripts/run-tests.sh" }]
          }
        ]
      }
    }
    

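    Save this as .claude/settings.json in the project, or as ~/.claude/settings.json to apply it to every project. Claude Code stores each session's transcript on disk, so a stopped session can be resumed with claude --continue. Effective Harnesses for Long-Running Agents (2025)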

  3. Set up pre-tool and post-tool hooks

    {
      "hooks": {
        "PreToolUse": [
          {
            "matcher": "Bash",
            "hooks": [{ "type": "command", "command": "./scripts/guard-git.sh" }]
          }
        ],
        "PostToolUse": [
          {
            "matcher": "*",
            "hooks": [{ "type": "command", "command": "./scripts/run-tests.sh" }]
          }
        ]
      }
    }
    

    The matcher is a tool name, so Bash covers every shell command, git included. The guard script reads the pending command from the JSON that Claude Code passes on stdin and blocks destructive ones, such as git push --force, by exiting with code 2. The test script runs npm test and writes the output to a file that the stop hook can read. Claude Code — Hooks Reference (2025)
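
    The guard script itself is not shown in this guide. A minimal sketch of what it might contain follows. The pattern list and the function names is_destructive_git and guard_command are illustrative assumptions, not part of Claude Code, and the list is deliberately short rather than exhaustive.

```shell
#!/usr/bin/env bash
# guard-git.sh (sketch): decide whether a shell command is a destructive git call.

# Extended-regex patterns treated as destructive. Illustrative, not exhaustive.
DESTRUCTIVE_PATTERNS=(
  'git[[:space:]]+push([[:space:]]+[^[:space:]]+)*[[:space:]]+(--force|-f)([[:space:]=-]|$)'
  'git[[:space:]]+reset[[:space:]]+--hard'
  'git[[:space:]]+clean[[:space:]]+-[a-zA-Z]*f'
  'git[[:space:]]+branch[[:space:]]+-D([[:space:]]|$)'
)

# Returns 0 when $1 matches any destructive pattern, 1 otherwise.
is_destructive_git() {
  local cmd="$1" pattern
  for pattern in "${DESTRUCTIVE_PATTERNS[@]}"; do
    if [[ "$cmd" =~ $pattern ]]; then
      return 0
    fi
  done
  return 1
}

# Returns 2 (the blocking exit code) with a reason on stderr, or 0 to allow.
guard_command() {
  local cmd="$1"
  if is_destructive_git "$cmd"; then
    echo "Blocked destructive git command: $cmd" >&2
    return 2
  fi
  return 0
}

# In the real hook the command arrives as JSON on stdin, so the script would end with:
#   guard_command "$(jq -r '.tool_input.command // empty')"; exit $?
```

    Keeping the decision in a function that takes the command as an argument makes the patterns easy to test without a running agent; only the last line depends on jq and the hook input.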

  4. Create the stop hook

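    #!/usr/bin/env bash
    # stop_hook.sh: keep Claude working while the last test run failed
    if grep -q "Test Failed" test_output.txt 2>/dev/null; then
      # Exit code 2 blocks the stop; stderr is fed back to Claude
      echo "Tests are still failing. Read test_output.txt and fix them." >&2
      exit 2
    fi
    # Exit code 0 lets Claude stop
    exit 0
    

    The hook exits with code 2 if tests failed, which blocks the stop and feeds the message on stderr back to Claude so it keeps working. Claude Code — Hooks Guide (2025)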
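
    The stop hook greps for the literal string "Test Failed", so the test script has to write that marker itself. A sketch of how run-tests.sh might do that is below. The function name run_and_record and the "Test Passed" marker are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# run-tests.sh (sketch): run a test command, save its output, and append the
# marker that stop_hook.sh greps for.

# $1 = output file, remaining args = test command (for example: npm test).
run_and_record() {
  local out_file="$1"
  shift
  # Overwrite the file on every run so stale failures do not linger.
  if "$@" > "$out_file" 2>&1; then
    echo "Test Passed" >> "$out_file"
  else
    echo "Test Failed" >> "$out_file"
  fi
  # Always return 0: a red test run is reported through the file,
  # not as a hook error.
  return 0
}

# In run-tests.sh this would be called as:
#   run_and_record test_output.txt npm test
```

    Deriving the marker from the exit code avoids depending on whatever wording the test runner happens to print.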

  5. Add a Ralph loop

    /ralph-loop "Implement feature X" --completion-promise "DONE" --max-iterations 50
    

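    The /ralph-loop command comes from the Ralph Wiggum plugin, so install that first. The loop keeps re-feeding the prompt until the string DONE appears in the assistant’s last message or until it has run 50 iterations. Awesome Claude — Ralph Wiggum (2025)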
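
    Conceptually, a Ralph loop is just a capped retry loop around the same prompt. The sketch below shows that idea as a plain shell function. It is not the plugin's implementation, and ralph_loop is a hypothetical name. Unlike the plugin, this variant starts a fresh process on every iteration, so progress must live in the repository, not in the conversation.

```shell
#!/usr/bin/env bash
# ralph_loop (sketch): re-run the same prompt until the output contains the
# completion promise or the iteration cap is reached.
# $1 = prompt, $2 = completion promise, $3 = max iterations,
# remaining args = command to run (it receives the prompt as its last argument).
ralph_loop() {
  local prompt="$1" promise="$2" max="$3"
  shift 3
  local i output
  for ((i = 1; i <= max; i++)); do
    output=$("$@" "$prompt")
    if [[ "$output" == *"$promise"* ]]; then
      echo "completed after $i iteration(s)"
      return 0
    fi
  done
  echo "gave up after $max iteration(s)"
  return 1
}

# Hypothetical usage with the CLI in non-interactive (print) mode:
#   ralph_loop "Implement feature X. Say DONE when all tests pass." DONE 50 claude -p
```

    The iteration cap is a required argument here on purpose: the loop cannot be started without one.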

  6. Run and benchmark

    claude --continue
    

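    After a few minutes you’ll see the agent writing code, running tests, committing, and looping. In my run it worked for about 4 h 49 m before it started to slow down. That lines up with the 50 % time horizon METR reports for Claude Opus 4.5, although that metric measures the length of task (in human working time) the model completes half the time, not how long a session can run. Claude Code: Keeping It Running for Hours (2025)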

  7. Set up notifications

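    # notification_hook.sh
    # Hooks receive their event as JSON on stdin; forward its "message" field.
    jq -c '{message: .message}' |
      curl -X POST https://api.chatops.example.com/notify -H 'Content-Type: application/json' -d @-
    

    Register the script under the Notification event in the same settings file as the other hooks. Claude Code — Hooks Reference (2025)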

  8. Monitor logs

    The harness writes a log.json that contains every tool call. Use jq to show the last 10 entries:

    jq '.[-10:][] | {time, tool, result}' log.json
    

Pitfalls & edge cases

  • No max-iterations: A Ralph loop without a cap can consume all tokens and drive up costs.
  • Lazy model: If the prompt only asks for a long run, the model may stop early. Include a brief “keep going” reminder.
  • Infinite loop: A misconfigured stop hook that always returns Block will keep the agent busy forever.
  • Token limits: Long loops eventually fill the context window (200k tokens for Opus 4.5). Use /compact to summarize older turns, and keep durable state in files rather than in the conversation.
  • Git permissions: Even with a guard script, some commands (e.g., git commit --amend) may slip through. Double-check the patterns the guard script matches against.
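
The first and third pitfalls can both be handled by giving the stop hook its own cap. The sketch below counts how many times the hook has blocked and lets the agent stop once a limit is reached. The function name should_block_stop, the counter file, and the limit are illustrative assumptions.

```shell
#!/usr/bin/env bash
# should_block_stop (sketch): decide whether a stop hook should keep the agent
# running. Returns 0 to block the stop, 1 to allow it.
# $1 = test output file, $2 = counter file, $3 = maximum number of blocks.
should_block_stop() {
  local test_output="$1" counter_file="$2" max_blocks="$3"
  local count=0
  if [[ -f "$counter_file" ]]; then
    count=$(<"$counter_file")
  fi
  # No failure marker: tests passed, so allow the stop and reset the counter.
  if ! grep -q "Test Failed" "$test_output" 2>/dev/null; then
    rm -f "$counter_file"
    return 1
  fi
  # Cap reached: allow the stop even though tests still fail.
  if ((count >= max_blocks)); then
    echo "Giving up after $count blocked stops" >&2
    return 1
  fi
  echo $((count + 1)) > "$counter_file"
  return 0
}

# Hypothetical wiring inside stop_hook.sh:
#   if should_block_stop test_output.txt .stop_count 20; then
#     echo "Tests are still failing. Keep fixing them." >&2
#     exit 2
#   fi
```

With a cap like this, a hook that would otherwise always block ends the run after a bounded number of retries instead of burning tokens indefinitely.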

Quick FAQ

Q: What is a stop hook and how does it work?
A: A stop hook runs when Claude is about to finish responding. If it exits with code 2 (or returns a JSON decision of "block"), Claude keeps working and the hook's message is fed back to it. Claude Code — Hooks Guide (2025)

Q: How do I set up the agent harness for persistence?
A: Claude Code saves each session's transcript locally. Run claude --continue to pick up the most recent session, or claude --resume to choose one. Effective Harnesses for Long-Running Agents (2025)

Q: How can I prevent Claude from running destructive commands like git push --force?
A: Add a PreToolUse hook on the Bash tool whose script inspects the command and exits with code 2 when it matches a dangerous pattern. Claude Code — Hooks Reference (2025)

Q: What is the maximum safe number of iterations for a Ralph loop?
A: The community recommends 10–50 iterations for most tasks. Too few may stop early; too many can waste tokens. Awesome Claude — Ralph Wiggum (2025)

Q: How do I feed failed tests back into Claude Code?
A: Let the PostToolUse hook run npm test and record the result. If the tests failed, the stop hook blocks the stop and its message tells Claude to keep fixing them.

Q: Can I monitor logs and get notifications when something fails?
A: Yes. Add a Notification hook that posts to Slack or another webhook. The logs are in log.json.

Q: How does Claude compare to GPT-4 for long-running tasks?
A: METR estimates a 50 % time horizon of 4 h 49 m for Claude Opus 4.5, versus roughly 5 min for GPT-4. The time horizon is the length of task, in human working time, that a model completes with 50 % success; it is not a cap on session length. Claude Opus — Achieves 50% Time Horizon (2025)

Conclusion

If you need to run CI, linting, or feature builds for hours without manual checks, Claude Code with a persistent harness, guardrails, and a stop hook gives you deterministic, safe, and continuous execution. Start with the steps above, tweak the hook scripts to your team’s policies, and let the AI do the heavy lifting.

Who should use this? Senior devs, CTOs, and AI developers who want to offload repetitive tasks.

Who shouldn’t? Teams that are still experimenting with model safety or that cannot afford to run long-term AI processes without supervision.

Last updated: December 30, 2025
