Loop engineering shifts the focus from crafting instructions to designing self-driving systems where the agent prompts itself, corrects its own work, and runs autonomously. The real unlock: running those loops on an always-on agent like Hermes.
Instead of crafting instructions yourself, you design a system that does the prompt engineering for you. The agent defines its own steps, checks its own output, and iterates until complete.
Peter Steinberger (OpenClaw) and Boris (Claude Code) both made the same claim — they don't prompt Claude anymore. Loops do it for them. The human sets the goal and the constraints; the loop handles the iteration.
A loop missing any of these components will spin forever, quit prematurely, or degrade under its own context weight. The minimum viable loop needs five things:
Keep important instructions from getting buried under tool outputs.
Use test runs and screenshots as the loop's feedback signal; garbage in, garbage out.
Clear checkpoints that must pass before the loop proceeds.
Explicit rules so the agent doesn't quit too early or loop forever.
External files to track progress as context grows beyond limits.
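As a rough illustration, the components above can be combined into one small driver. This is a minimal sketch, not any particular agent's implementation: `LoopConfig`, `run_loop`, the `step` and `checkpoint` callables, and the `loop_state.json` file name are all hypothetical names chosen for the example. `step` stands in for whatever call sends a prompt to the agent and returns its output.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class LoopConfig:
    goal: str                                   # pinned instruction, re-sent on every iteration
    max_iterations: int = 10                    # stop rule: the loop can never run forever
    max_output_chars: int = 2000                # trim tool output so the goal is not buried
    state_path: Path = Path("loop_state.json")  # external progress file (hypothetical name)


def run_loop(
    config: LoopConfig,
    step: Callable[[str], str],         # hypothetical: sends a prompt to the agent
    checkpoint: Callable[[str], bool],  # gate that must pass before the loop ends
) -> dict:
    # Resume from the external file if a previous run left progress behind.
    state = {"iteration": 0, "done": False, "notes": []}
    if config.state_path.exists():
        state = json.loads(config.state_path.read_text())

    while not state["done"] and state["iteration"] < config.max_iterations:
        state["iteration"] += 1
        # The goal always comes first; only the last few notes follow it.
        prompt = config.goal + "\n" + "\n".join(state["notes"][-3:])
        output = step(prompt)[: config.max_output_chars]
        state["notes"].append(output)
        state["done"] = checkpoint(output)
        # Persist progress outside the model's context window.
        config.state_path.write_text(json.dumps(state))
    return state
```

Each component maps to one line or field: the goal is re-sent at the top of every prompt, the checkpoint decides whether the loop may finish, `max_iterations` is the exit rule, and the JSON file carries progress across context resets.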
Not all loops are created equal. The loop design depends entirely on whether "done" is a binary state or a subjective judgment.
For tasks where success is objectively measurable.
Simple, cheap, and reliable. Point the agent at your test suite and let it fix failures.
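A deterministic loop of this kind might look like the sketch below, where "done" simply means the test command exits with code 0. The names `run_tests`, `fix_until_green`, and `attempt_fix` are assumptions made for the example; `attempt_fix` is a placeholder for whatever hands the failure report to the agent.

```python
import subprocess
import sys
from typing import Callable, Sequence, Tuple


def run_tests(command: Sequence[str]) -> Tuple[bool, str]:
    """Run the test command; a zero exit code counts as a pass."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr


def fix_until_green(
    command: Sequence[str],
    attempt_fix: Callable[[str], None],  # hypothetical: passes the report to the agent
    max_attempts: int = 5,
) -> bool:
    """Alternate between running tests and fixing until they pass or attempts run out."""
    for _ in range(max_attempts):
        passed, report = run_tests(command)
        if passed:
            return True
        attempt_fix(report)
    # One final run to see whether the last fix worked.
    return run_tests(command)[0]
```

A call such as `fix_until_green([sys.executable, "-m", "pytest", "-q"], attempt_fix)` would point the loop at a pytest suite, assuming pytest is installed in that environment.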
For tasks where "good" is in the eye of the beholder.
Needs a separate verifier model to judge quality. More expensive but unlocks subjective autonomy.
The most powerful loop design uses two different models in an adversarial setup. One builds, one verifies, and the loop runs until the verifier is satisfied.
The verifier checks every output the builder produces. Hermes self-evolving skills let the verifier get stronger every time you spot something it missed.
Using the same model to build and verify creates blind spots. Cross-model verification catches failures that the builder's own biases would miss. Claude is great at generation; GPT is great at evaluation. Using both exploits their complementary strengths.
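The builder and verifier setup can be sketched as two callables passed into one loop. This is an illustrative outline only: `Verdict`, `adversarial_loop`, `build`, and `verify` are hypothetical names, and in practice `build` and `verify` would wrap calls to two different models.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Verdict:
    approved: bool
    feedback: str  # what the verifier wants changed when it rejects a draft


def adversarial_loop(
    task: str,
    build: Callable[[str, str], str],       # hypothetical builder: (task, feedback) -> draft
    verify: Callable[[str, str], Verdict],  # hypothetical verifier: (task, draft) -> verdict
    max_rounds: int = 4,
) -> Tuple[Optional[str], int]:
    """Return (approved draft, rounds used), or (None, max_rounds) if never approved."""
    feedback = ""
    for round_number in range(1, max_rounds + 1):
        draft = build(task, feedback)
        verdict = verify(task, draft)
        if verdict.approved:
            return draft, round_number
        # Feed the verifier's objections into the next build attempt.
        feedback = verdict.feedback
    return None, max_rounds
```

Keeping the two roles behind separate callables makes it straightforward to back them with different models, and the round cap keeps a disagreement between them from running indefinitely.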
Loops need a host that never sleeps. Hermes is always-on — it can watch your deployed app, catch commits that break production, and fix issues without you stepping in.
Unlike a human who sleeps, Hermes runs loops 24/7. Production incidents at 3 AM? The loop is already fixing them before you wake up.
Every time you spot something the verifier missed, you update the skill. The verifier gets stronger over time, reducing the need for human oversight.
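One simple way to picture a verifier that improves over time is a checklist file that grows whenever a human catches a miss. The sketch below is an assumption for illustration and does not reflect the actual format of Hermes skills; `load_checks`, `record_miss`, and `verifier_prompt` are hypothetical helpers.

```python
import json
from pathlib import Path
from typing import List


def load_checks(path: Path) -> List[str]:
    """Read the verifier's checklist, or an empty list if none exists yet."""
    return json.loads(path.read_text()) if path.exists() else []


def record_miss(path: Path, check: str) -> List[str]:
    """Add a new check after a human spots something the verifier missed."""
    checks = load_checks(path)
    if check not in checks:
        checks.append(check)
        path.write_text(json.dumps(checks, indent=2))
    return checks


def verifier_prompt(path: Path, draft: str) -> str:
    """Build the verifier's instructions from the current checklist."""
    lines = [f"- {check}" for check in load_checks(path)]
    return (
        "Reject the draft if any check fails:\n"
        + "\n".join(lines)
        + "\n\nDraft:\n"
        + draft
    )
```

Because the checklist lives in a file rather than in a prompt someone has to remember to edit, every recorded miss applies to all later verification runs.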
The --goal command replaced hooks with a more flexible model-based check.
For deterministic work, point an always-on agent at your tests and let it fix failures autonomously. For subjective work, pair a builder model with a separate verifier model in an adversarial loop and let the verifier evolve over time.