
Claude Code's Dynamic Workflows: 750k lines migrated in 11 days

Anthropic added a Dynamic Workflows orchestration layer to Claude Code. The headline example: a 750k-line Bun codebase ported from Zig to Rust in 11 days.

Source verified
  [01] Anthropic: Claude updates (Releasebot digest)
  [02] Anthropic: Introducing Claude Opus 4.8
  [03] Anthropic: Agentic coding and persistent returns to expertise

Anthropic added a new orchestration layer to Claude Code called "Dynamic Workflows." The one-line summary: instead of a single task, the agent can now run a multi-step, codebase-scale migration end to end, using the test suite as its own bar for success.

This is the difference between "write me this function" and "port this 750,000-line project to another language." The second has been human-team work for years.

What was announced

Dynamic Workflows is an orchestration capability in Claude Code that runs with Opus 4.8. The idea: you hand it a large goal (say, "migrate this module"), the agent breaks the work into subtasks, executes them in order, and after each step checks whether the code still passes the existing test suite. A human no longer defines success by hand at every step; the criterion is the project's own tests.
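To make that loop concrete, here is a minimal sketch of how I picture it. This is not Anthropic's implementation or the Claude Code API: `Subtask`, `run_workflow`, and the `tests_pass` callback are hypothetical names I made up for illustration, and the toy "tests" just inspect a list.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Subtask:
    # Hypothetical unit of work: one change plus a way to undo it.
    name: str
    apply: Callable[[], None]
    revert: Callable[[], None]


@dataclass
class RunReport:
    completed: List[str] = field(default_factory=list)
    rejected: List[str] = field(default_factory=list)


def run_workflow(subtasks: List[Subtask], tests_pass: Callable[[], bool]) -> RunReport:
    """Apply subtasks in order; keep a change only if the test suite still passes."""
    report = RunReport()
    for task in subtasks:
        task.apply()
        if tests_pass():
            report.completed.append(task.name)
        else:
            # The suite is the success criterion, so a failing step is rolled back.
            task.revert()
            report.rejected.append(task.name)
    return report


# Toy demo: the "codebase" is a list, and the "suite" fails if "broken" is in it.
state: List[str] = []
tasks = [
    Subtask("port_parser", lambda: state.append("parser"), lambda: state.remove("parser")),
    Subtask("port_broken", lambda: state.append("broken"), lambda: state.remove("broken")),
]
report = run_workflow(tasks, tests_pass=lambda: "broken" not in state)
print(report.completed, report.rejected)
```

A real orchestrator would presumably retry a failed step with a different approach rather than just rolling back. The point of the sketch is only that the pass/fail decision comes from the suite, not from a person.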

Anthropic's flagship example is striking: Jarred Sumner used Dynamic Workflows to port the Bun runtime's roughly 750,000-line codebase from Zig to Rust. The result merged in 11 days with 99.8% of tests passing. Per Anthropic, this was a project that could have taken a dedicated team 6–12 months.
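A quick back-of-the-envelope check on those figures, as a sketch. The line count, day count, pass rate, and 6–12 month estimate come from Anthropic's example. The 10,000-test suite size and the 30-day month are my own assumptions, there only to make the percentages tangible.

```python
# Figures quoted in Anthropic's example.
total_lines = 750_000
days = 11
pass_rate = 0.998

# Assumption (not from the source): a suite of 10,000 tests.
assumed_suite_size = 10_000

lines_per_day = total_lines / days
failing_tests = round(assumed_suite_size * (1 - pass_rate))

# Assumption: 30-day months for the 6-12 month team estimate.
speedup_low = round(6 * 30 / days, 1)
speedup_high = round(12 * 30 / days, 1)

print(f"~{round(lines_per_day):,} lines per day")
print(f"{failing_tests} failing tests per {assumed_suite_size:,}")
print(f"{speedup_low}x to {speedup_high}x faster than the team estimate")
```

Even at 99.8%, a suite of that assumed size would still leave about 20 failing tests for someone to triage, so "merged" does not mean "nothing left to fix."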

What changed

Claude Code used to be powerful but essentially "session-based": you steered, it helped step by step. When I wrote about what Opus 4.8 brought, I covered the jump in speed and cost; Dynamic Workflows adds a "duration" dimension on top.

• Codebase scale: it runs migrations of hundreds of thousands of lines from kickoff to merge in a single flow.

• Test-driven stopping point: the agent learns when it's "done" from the test suite, not from a human.

• Enterprise boundary: Claude Managed Agents can now run in a sandbox you control and connect to private MCP servers. Both the environment where the agent runs tools and the services it reaches stay inside the enterprise boundary.

• Central authorization: enterprise-managed MCP connector access arrived, starting with Okta; admins provision a connector once and get centralized authorization across Claude chat, Claude Code, and Cowork.

My first impression

To be honest: I don't have a project where I can test a 750k-line migration on my own machine, so I haven't verified that number with my own eyes. But this is exactly the point I underlined when I wrote about which coding agent I pick as a solo builder: the value of agents is measured not in a single prompt but in long-horizon reliability.

What actually interests me isn't "11 days" but the "99.8% tests passing" figure. The hard part of a migration isn't translating the code; it's not breaking behavior while you translate it. Making the test suite the success bar puts a guardrail around the agent's hallucinations: a confident but wrong translation gets caught wherever the tests exercise it. In my own projects, I wouldn't trust a big migration before trying it on a small module first, with a real test suite, and measuring the outcome.

Practical impact

For the solo maker, the concrete takeaway: a language/framework migration can stop being a "backlog item whose turn never comes" and become plannable work. The "move that old module to a modern stack" jobs I've deferred for years become reachable — if there's a good test suite.

But the condition is clear: if your test suite is weak, this feature is dangerous for you. The agent will say "done" the moment tests pass; behavior the tests don't cover can break silently. So this is a feature that makes writing tests more valuable, not one that replaces it.
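Here is a small, made-up illustration of that silent breakage. Nothing in it comes from the Bun migration. `original_div` and `ported_div` are hypothetical functions standing in for "before" and "after" a port, and the bug is the classic one where integer division floors in one language and truncates toward zero in another.

```python
def original_div(a: int, b: int) -> int:
    # Original behavior: floor division (rounds toward negative infinity).
    return a // b


def ported_div(a: int, b: int) -> int:
    # "Ported" behavior: truncation toward zero, as many languages do by default.
    return int(a / b)


def suite_passes(impl, cases) -> bool:
    """A test suite here is just a list of inputs checked against the original."""
    return all(impl(a, b) == original_div(a, b) for a, b in cases)


# Weak suite: only positive inputs, where floor and truncation agree.
weak_suite = [(7, 2), (9, 3), (10, 4)]

# Stronger suite: adds a negative input, where they differ (-4 vs -3).
stronger_suite = weak_suite + [(-7, 2)]

print("weak suite passes:    ", suite_passes(ported_div, weak_suite))
print("stronger suite passes:", suite_passes(ported_div, stronger_suite))
```

With the weak suite, a test-gated agent would report "done" on a port that is wrong for every negative input. The one extra case is what turns the guardrail into a real one.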

Limits / concerns

Anthropic's example is a single success case, not an average result. The Bun migration started from a well-tested, mature codebase. A typical enterprise monolith may not have that test coverage, and in that case a figure like 99.8% is unlikely, or means much less when you do see it.

Cost is an open question too: a multi-step flow at codebase scale burns serious tokens. Fast mode helps on speed, but the bill for these long tasks isn't easy to predict ahead of time. Going into big migrations without starting from a small pilot and measuring unit cost is risky.
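A rough sketch of the pilot-first estimate I have in mind. Every number below is a placeholder I invented, not real usage data or Anthropic pricing, and the linear scaling is itself an assumption: large migrations may cost more per line because of retries and cross-module context.

```python
def estimate_migration_cost(
    pilot_lines: int,
    pilot_tokens: int,
    total_lines: int,
    price_per_million_tokens: float,
) -> float:
    """Extrapolate total cost from a pilot, assuming tokens scale linearly with lines."""
    tokens_per_line = pilot_tokens / pilot_lines
    total_tokens = tokens_per_line * total_lines
    return total_tokens / 1_000_000 * price_per_million_tokens


# Placeholder inputs (all hypothetical): a 5,000-line pilot that used 40M tokens,
# a 750,000-line target, and a blended price of $10 per million tokens.
estimate = estimate_migration_cost(
    pilot_lines=5_000,
    pilot_tokens=40_000_000,
    total_lines=750_000,
    price_per_million_tokens=10.0,
)
print(f"Estimated cost: ${estimate:,.0f}")
```

I would treat the result as a lower bound to sanity-check a budget rather than a forecast. The useful part is measuring tokens per line on your own code before committing to the full run.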

A note from me

This announcement reminded me of the industry's quiet change of direction: the race is shifting from "the smartest single answer" to "the most reliable long task." What's starting to matter isn't how brightly a model shines in one prompt, but how many hours-long jobs it finishes without breaking.

For me the exciting part is the same as the scary part: teams with test discipline get a clear runway, and teams without it automate their fragility. I care about the habit, not the tool — and this feature rewards exactly that habit.