WTF Is a Loop: What That Worn-Out Word Actually Refers To

On June 7, 2026, Peter Steinberger posted a two-sentence tweet that pulled in 2.2M views within 48 hours:

"You shouldn't be prompting coding agents anymore. You should be designing loops that prompt your agents."

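The replies turned into a fight right away. The two most representative: Varadh Jain asked what this actually looks like in practice, and Matthew Berman answered "nobody knows but him and boris".

Matt Van Horn (Lyft co-founder) did what nobody else had: he used /last30days to dig into the word's history, going through 15 Reddit posts, 21 X posts, 6 YouTube transcripts, 12 Hacker News stories, and 6 GitHub repos. He then wrote up a verdict on what the word "loop" actually refers to.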

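The one-sentence definition first

Loop = cron + a decision-maker in the body.

The scheduling layer really is cron. Boris Cherny runs his own loops on cron, and Claude Code's /loop is cron underneath. If your entire definition of a loop is "something that runs on a schedule", that has existed since 1975 and you can go home.

What cron never had is the middle box. A cron job runs a fixed script. A loop runs a model: it reads the current state, decides the next step, acts, checks whether it worked, and decides whether to continue. The decision is made by the model, not by branches you hard-coded. Stack these loops, let one loop dispatch and supervise others, give them durable shared state, and you have something cron cannot express.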

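The honest framing is neither "loops are new magic" nor "loops are just cron". It is "a loop is cron plus a decision-maker in the body", and all the interesting engineering goes into wrapping that decision so it does not run away.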
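To make the "middle box" concrete, here is a minimal sketch of what a scheduler would invoke on each tick. Everything in it is illustrative: `State`, `toy_decider`, `act`, and `run_once` are hypothetical names, and `toy_decider` is a hard-coded stub standing in for the model call, so the sketch shows only the shape of read, decide, act, check, continue.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class State:
    """Whatever the loop can observe; here, just a list of failing checks."""
    failing: List[str] = field(default_factory=list)
    log: List[str] = field(default_factory=list)


# Hypothetical stand-in for a model call: given the state, name the next action.
Decider = Callable[[State], str]


def toy_decider(state: State) -> str:
    # A real loop would ask a model here; this stub only mimics the interface.
    return f"fix:{state.failing[0]}" if state.failing else "stop"


def act(state: State, action: str) -> None:
    """Apply the chosen action to the state (a real loop would run tools)."""
    target = action.split(":", 1)[1]
    state.failing.remove(target)
    state.log.append(action)


def run_once(state: State, decide: Decider, max_steps: int = 10) -> State:
    """One scheduled tick: read state, decide, act, then decide again or stop."""
    for _ in range(max_steps):
        action = decide(state)
        if action == "stop":
            break
        act(state, action)
    return state


if __name__ == "__main__":
    final = run_once(State(failing=["lint", "unit-tests"]), toy_decider)
    print(final.log)  # ['fix:lint', 'fix:unit-tests']
```

The cron entry only decides when `run_once` starts. Everything that happens between the first `decide` and the final "stop" is the part cron has no vocabulary for.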

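Five stages of loop history: from ReAct to orchestration

The word "loop" hides at least five different things. From oldest to newest:

Stage 1: the academic while-loop of the 2022 ReAct paper. The model reasons, calls a tool, reads the result, and repeats until done. One model, one loop, one human watching.

Stage 2: AutoGPT, 2023. Give it a goal and let it prompt itself. It became famous for "spinning forever and accomplishing nothing". That failure planted the "agents are toys" narrative of the following years.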

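Stage 3: the ralph loop, published by Geoffrey Huntley in July 2025. A bash one-liner feeds the same prompt file to an agent over and over. The real novelty is discipline: every iteration resets the context to a fixed set of anchor files instead of letting the conversation grow. Huntley used it to build a complete programming language for about $297.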
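The context-reset discipline is easier to see in code than in a one-liner. The sketch below is not Huntley's script. It is a Python illustration under assumptions: the `Agent` callable is a hypothetical stand-in for invoking a coding agent, and the file names `PROMPT.md` and `PLAN.md` are made up for the demo. The point it shows is that each iteration rebuilds its context from disk, so progress has to live in files rather than in a growing conversation.

```python
import tempfile
from pathlib import Path
from typing import Callable, List, Optional

# Hypothetical agent interface: takes the full context, returns True when done.
Agent = Callable[[str], bool]


def build_context(prompt_file: Path, anchors: List[Path]) -> str:
    """Rebuild the context from disk; no conversation history is carried over."""
    parts = [p.read_text() for p in anchors]
    parts.append(prompt_file.read_text())
    return "\n\n".join(parts)


def ralph(prompt_file: Path, anchors: List[Path], agent: Agent,
          max_iters: int) -> Optional[int]:
    """Feed the same prompt repeatedly; return the iteration that finished, or None."""
    for i in range(1, max_iters + 1):
        context = build_context(prompt_file, anchors)  # reset on every iteration
        if agent(context):
            return i
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        prompt = Path(d, "PROMPT.md")
        prompt.write_text("Pick the next todo item and do it.")
        plan = Path(d, "PLAN.md")
        plan.write_text("todo: a\ntodo: b\n")

        def fake_agent(context: str) -> bool:
            # Progress is written back to the anchor file, not kept in memory.
            text = plan.read_text()
            if "todo:" not in text:
                return True
            plan.write_text(text.replace("todo:", "done:", 1))
            return False

        print(ralph(prompt, [plan], fake_agent, max_iters=10))  # 3
```

The `max_iters` cap is an addition of this sketch; a bare `while` loop would have no such stop.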

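Stage 4: spring 2026. Codex and Claude Code both shipped /goal, which runs a ralph loop until a small verifier model confirms the task is complete. This is the productized version.

Stage 5: what Boris and Steinberger actually mean, and the part that is genuinely new rather than renamed. Four things changed: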

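The loop, not the task, became the unit of work
Loops started supervising other loops, concurrently and on a schedule
The schedule replaced the human kickoff: loops run on "infrastructure time", not on your attention
Durability became explicit: git-backed state plus crash recovery, because these loops have to survive a restart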
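The durability point can be sketched without any agent at all. The example below swaps in a simpler technique than the git-backed state described above: it checkpoints to a plain JSON file with an atomic rename, and leaves committing that file to git out of scope. The names `load_state`, `save_state`, and `work`, and the `crash_after` parameter, are hypothetical and exist only to simulate a process dying partway through.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import List, Optional


def load_state(path: Path) -> dict:
    """Resume from the last checkpoint, or start fresh if none exists."""
    if path.exists():
        return json.loads(path.read_text())
    return {"next_item": 0, "done": []}


def save_state(path: Path, state: dict) -> None:
    """Write to a temp file, then rename, so a crash never leaves half a checkpoint."""
    fd, tmp = tempfile.mkstemp(dir=str(path.parent))
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def work(path: Path, items: List[str], crash_after: Optional[int] = None) -> dict:
    """Process items one by one, checkpointing after each; optionally simulate a crash."""
    state = load_state(path)
    while state["next_item"] < len(items):
        if crash_after is not None and len(state["done"]) >= crash_after:
            raise RuntimeError("simulated crash")
        state["done"].append(items[state["next_item"]])
        state["next_item"] += 1
        save_state(path, state)
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        ckpt = Path(d, "state.json")
        try:
            work(ckpt, ["a", "b", "c"], crash_after=2)
        except RuntimeError:
            pass  # the process "died"; the checkpoint on disk survives
        print(work(ckpt, ["a", "b", "c"])["done"])  # ['a', 'b', 'c']
```

The second call picks up at item "c" rather than starting over, which is the property a loop needs once nobody is watching the terminal.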

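ralph assumes your terminal stays open. The 2026 loop assumes it won't. So Trash Panda (the Varadh Jain reply above) was right twice: the single-agent ralph loop is old hat, and the multi-agent orchestration loop built on top of it is the new layer.

Getting started: one command

Claude Code shipped /loop, and Boris's own example is the canonical starter: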

/loop babysit all my PRs. Auto-fix build issues, and when comments come in, use a worktree agent to fix them.

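A few days later Boris posted five tips for running Opus for hours to days:

auto mode for permissions: don't let Claude ask at every step
dynamic workflows: let Claude orchestrate hundreds to thousands of agents on a single task
/goal or /loop: let Claude keep going until it is done
Claude Code in the cloud: it keeps running after you close the laptop
Give Claude a way to self-verify end to end

The fifth tip is the one the hype skips and practitioners latch onto: a loop is only as trustworthy as its ability to check its own work.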

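The deep end is Gas Town, which Steve Yegge published in January: 20 to 30 Claude Code instances coordinated by a Mayor agent, with patrol agents running continuous loops and state stored in git so work survives a crash. This is the "continuous orchestration loop that oversees other threads" that Trash Panda could not quite reach at the time. It has already shipped, and it is open source.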
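As a rough picture of one loop supervising others, here is a small asyncio sketch. It is not how Gas Town is implemented. The `supervise` function, the `Worker` type, and the worker names `patrol-a` and `patrol-b` are all hypothetical, and the workers are plain coroutines rather than agent processes. It shows only the supervision pattern: run the workers concurrently and restart one that crashes, up to a cap.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# Hypothetical worker interface: an async callable that returns a status string.
Worker = Callable[[], Awaitable[str]]


async def supervise(workers: Dict[str, Worker], max_restarts: int = 2) -> Dict[str, str]:
    """Run all workers concurrently; restart any that raises, up to max_restarts."""

    async def keep_alive(worker: Worker) -> str:
        status = "never started"
        for _ in range(max_restarts + 1):
            try:
                return await worker()
            except Exception as exc:  # a crash in one worker must not kill the rest
                status = f"failed: {exc}"
        return status

    results = await asyncio.gather(*(keep_alive(w) for w in workers.values()))
    return dict(zip(workers, results))


if __name__ == "__main__":
    calls = {"n": 0}

    async def flaky() -> str:
        calls["n"] += 1
        if calls["n"] == 1:
            raise RuntimeError("crashed")
        return "ok after restart"

    async def steady() -> str:
        return "ok"

    print(asyncio.run(supervise({"patrol-a": flaky, "patrol-b": steady})))
    # {'patrol-a': 'ok after restart', 'patrol-b': 'ok'}
```

The restart cap matters as much as the restart itself: a supervisor that restarts forever is just another loop that does not halt.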

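The real cost center: loop management, not tokens

The best part of the whole thread is where it slides from philosophy into the bill.

One working engineer punctured the agent myth in a single line: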

"Every ai agent i shipped this year is a for-loop, an llm call, and a try/catch around the json parsing. The only thing agentic about it is the anthropic bill at the end of the month."

That bill is no joke. The monthly figure: Uber caps each engineer at $1,500 per tool per month on Claude Code and Cursor, and its annual AI budget was burned through in four months.

"The costliest thing in AI coding is no longer writing code, it's managing the agent loop."

The failure mode production fears most is the loop that never stops:

"Without guardrails, you get infinite loops and billing surprises orders of magnitude over budget."

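So every serious loop document in 2026 converges on the same set of hard stops: a maximum iteration count, no-progress detection, and a token or dollar budget cap. The romantic version of a loop is "write one loop and a thousand agents build your company overnight". The production version is "write one loop and spend most of your effort making sure it halts". Gartner places agentic AI at the peak of inflated expectations, and only about 17% of organizations are actually deploying agents. The gap between the timeline and the receipts is the real state of things.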
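The three hard stops fit in a few lines. The sketch below is one possible arrangement, not a reference implementation: `Limits`, `guarded_loop`, and the `Step` interface are hypothetical, and the "fingerprint" is assumed to be any string that changes when the agent makes progress (a diff hash, a test count, and so on). The default limits are placeholders, not recommendations.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical step interface: returns (progress_fingerprint, cost_in_dollars, done).
Step = Callable[[], Tuple[str, float, bool]]


@dataclass
class Limits:
    max_iters: int = 50
    max_stalled: int = 3      # consecutive iterations with an unchanged fingerprint
    budget_usd: float = 5.0


def guarded_loop(step: Step, limits: Limits) -> str:
    """Run step() until done or until a hard stop fires; return the reason."""
    spent = 0.0
    stalled = 0
    last: Optional[str] = None
    for _ in range(limits.max_iters):
        fingerprint, cost, done = step()
        spent += cost
        if done:
            return "done"
        if spent >= limits.budget_usd:
            return "budget"
        # Count how many times in a row the fingerprint failed to change.
        stalled = stalled + 1 if fingerprint == last else 0
        last = fingerprint
        if stalled >= limits.max_stalled:
            return "no-progress"
    return "max-iters"


if __name__ == "__main__":
    def stuck() -> Tuple[str, float, bool]:
        return ("same diff", 0.01, False)

    print(guarded_loop(stuck, Limits()))  # no-progress
```

Returning the reason as a string, rather than just stopping, gives whatever supervises this loop something to act on: retry after "no-progress", escalate to a human after "budget".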

What actually compounds is the skill, not the loop

Matt's own take lands here:

"The loop is plumbing. The asset is the skill it calls."

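Another of Steinberger's standing views pairs with loops and will outlast them: if you have done the same thing more than once, turn it into a skill; if you have done something hard, turn it into a skill afterwards and the next time is free. A loop with no reusable skills in it is a stranger wrapped in while true. A loop that calls a library of sharp, tested, named skills is a system that compounds.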
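One way to picture "a library of named skills" is a registry the loop dispatches into. This is a toy sketch under assumptions, not the skill format of any particular tool: `SKILLS`, the `skill` decorator, `run_task`, and the example skill `bump-version` are all invented here for illustration.

```python
from typing import Callable, Dict

# Hypothetical in-process registry: skill name -> tested function.
SKILLS: Dict[str, Callable[[str], str]] = {}


def skill(name: str):
    """Register a function under a stable name so later runs can reuse it."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        SKILLS[name] = fn
        return fn
    return register


@skill("bump-version")
def bump_version(version: str) -> str:
    """Increment the patch component of a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def run_task(name: str, arg: str) -> str:
    """Dispatch to a named skill; fail loudly instead of improvising."""
    if name not in SKILLS:
        raise KeyError(f"no skill named {name!r}; write one after doing it by hand")
    return SKILLS[name](arg)


if __name__ == "__main__":
    print(run_task("bump-version", "1.4.9"))  # 1.4.10
```

The loop body stays the same from run to run; what grows is the registry, which is where the compounding happens.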

The practitioners on Reddit who are actually converting put it best:

"A lot of people are rolling their eyes on Twitter, but my ears are perked up."

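The four-line answer to "WTF is a loop"

Matt closed the whole thread with four lines. Here they are:

A loop is cron plus a decision-maker in the body: the model, not a hard-coded branch, decides the next beat.
Lineage is real: 2022 ReAct → 2023 AutoGPT → 2025 ralph → spring 2026 /goal → orchestration. Single-agent ralph is old hat; multi-agent supervision is the new layer.
A loop is only as good as its feedback. Continuous review and validation gates are what make a loop trustworthy.
The expensive resource has moved from tokens to loop management. Cap iterations, detect no-progress, set a dollar budget. The unit that actually compounds inside a loop is the skill, not the prompt. A loop that calls sharp, named skills compounds; a loop that re-derives everything each time just burns money.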

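Steinberger and Boris are describing two sides of the same animal. The people who really get it are the ones who have already built one. The good news is that, this month, getting started is one slash command.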